* [PATCH] x86, sched: allow topologies where NUMA nodes share an LLC
@ 2017-06-08 19:39 Dave Hansen
From: Dave Hansen @ 2017-06-08 19:39 UTC (permalink / raw)
To: linux-kernel
Cc: Dave Hansen, tony.luck, tim.c.chen, peterz, bp, rientjes,
imammedo, torvalds, prarit, toshi.kani, brice.goglin, hpa, mingo
From: Dave Hansen <dave.hansen@linux.intel.com>
Our SMP boot code has a series of assumptions about what NUMA
nodes are that are enforced via topology_sane(). Once upon a
time, we verified that a CPU package only contained a single node
(fixed in cebf15eb0). Today, we verify that SMT siblings and
LLCs do not span nodes.
The SMT siblings assumption is safe, but the LLC assumption is
violated on current hardware.
Remove the "sanity" check on LLC spanning NUMA nodes. Also make
sure to set 'x86_has_numa_in_package = true' which ensures that
we use the x86_numa_in_package_topology[]. The default topology
layers NUMA "outside" of the cache, which is wrong when the cache
spans multiple nodes.
This fixes the warnings, but it does theoretically keep the LLC
from being consulted in scheduling decisions if the LLC is
shared at a boundary that is not also a NUMA node.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Luck, Tony <tony.luck@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: brice.goglin@gmail.com
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
---
b/arch/x86/kernel/smpboot.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff -puN arch/x86/kernel/smpboot.c~x86-numa-nodes-share-llc arch/x86/kernel/smpboot.c
--- a/arch/x86/kernel/smpboot.c~x86-numa-nodes-share-llc 2017-06-01 14:46:40.562159566 -0700
+++ b/arch/x86/kernel/smpboot.c 2017-06-01 15:01:43.994157313 -0700
@@ -460,7 +460,7 @@ static bool match_llc(struct cpuinfo_x86

 	if (per_cpu(cpu_llc_id, cpu1) != BAD_APICID &&
 	    per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2))
-		return topology_sane(c, o, "llc");
+		return true;

 	return false;
 }
@@ -520,7 +520,8 @@ static struct sched_domain_topology_leve

 /*
  * Set if a package/die has multiple NUMA nodes inside.
- * AMD Magny-Cours and Intel Cluster-on-Die have this.
+ * AMD Magny-Cours, Intel Cluster-on-Die, and Intel
+ * Sub-NUMA Clustering have this.
  */
 static bool x86_has_numa_in_package;

@@ -548,9 +549,13 @@ void set_cpu_sibling_map(int cpu)
 		if ((i == cpu) || (has_smt && match_smt(c, o)))
 			link_mask(topology_sibling_cpumask, cpu, i);

-		if ((i == cpu) || (has_mp && match_llc(c, o)))
-			link_mask(cpu_llc_shared_mask, cpu, i);
-
+		if ((i == cpu) || (has_mp && match_llc(c, o))) {
+			/* LLC may be shared across NUMA nodes */
+			if (topology_same_node(c, o))
+				link_mask(cpu_llc_shared_mask, cpu, i);
+			else
+				x86_has_numa_in_package = true;
+		}
 	}

 	/*
_
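The effect of the last hunk can be illustrated outside the kernel. The following is only a userspace sketch of the idea, not the kernel code: `struct cpu_desc`, `struct llc_result`, `build_llc_masks()`, `MAX_CPUS` and `BAD_LLC_ID` are hypothetical names made up for the illustration, and plain 64-bit masks stand in for cpumasks. It links two CPUs into the same LLC mask only when they also sit on the same node, and otherwise records that a cache spans nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS   64        /* illustrative limit, not a kernel constant */
#define BAD_LLC_ID 0xFFFFu   /* hypothetical stand-in for BAD_APICID */

/* Hypothetical per-CPU description used only by this sketch. */
struct cpu_desc {
	uint16_t llc_id;  /* last-level cache the CPU reports */
	int node;         /* NUMA node the CPU belongs to */
};

struct llc_result {
	uint64_t llc_shared_mask[MAX_CPUS]; /* bit i: CPU i shares LLC and node */
	bool has_numa_in_package;           /* some LLC spans more than one node */
};

/* Two CPUs match when both report the same valid LLC id. */
static bool match_llc(const struct cpu_desc *c, const struct cpu_desc *o)
{
	return c->llc_id != BAD_LLC_ID && c->llc_id == o->llc_id;
}

static void build_llc_masks(const struct cpu_desc *cpus, int ncpus,
			    struct llc_result *res)
{
	assert(ncpus > 0 && ncpus <= MAX_CPUS);

	res->has_numa_in_package = false;
	for (int cpu = 0; cpu < ncpus; cpu++) {
		res->llc_shared_mask[cpu] = 0;
		for (int i = 0; i < ncpus; i++) {
			if (i != cpu && !match_llc(&cpus[cpu], &cpus[i]))
				continue;
			/* The LLC may be shared across NUMA nodes. */
			if (cpus[cpu].node == cpus[i].node)
				res->llc_shared_mask[cpu] |= UINT64_C(1) << i;
			else
				res->has_numa_in_package = true;
		}
	}
}
```

With four CPUs behind one cache and split over two nodes, each CPU's mask ends up covering only the two CPUs on its own node, which is the "throw away part of the LLC" trade-off the changelog mentions.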
* Re: [PATCH] x86, sched: allow topologies where NUMA nodes share an LLC
@ 2017-06-08 20:00 ` Peter Zijlstra
From: Peter Zijlstra @ 2017-06-08 20:00 UTC (permalink / raw)
To: Dave Hansen
Cc: linux-kernel, tony.luck, tim.c.chen, bp, rientjes, imammedo,
torvalds, prarit, toshi.kani, brice.goglin, hpa, mingo
On Thu, Jun 08, 2017 at 12:39:28PM -0700, Dave Hansen wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Our SMP boot code has a series of assumptions about what NUMA
> nodes are that are enforced via topology_sane(). Once upon a
> time, we verified that a CPU package only contained a single node
> (fixed in cebf15eb0). Today, we verify that SMT siblings and
> LLCs do not span nodes.
>
> The SMT siblings assumption is safe, but the LLC assumption is
> violated on current hardware.
What does? That does sound broken. How can a cache domain sanely span
memory controllers?
This needs far more explanation.
* RE: [PATCH] x86, sched: allow topologies where NUMA nodes share an LLC
@ 2017-06-08 20:08 ` Luck, Tony
From: Luck, Tony @ 2017-06-08 20:08 UTC (permalink / raw)
To: Peter Zijlstra, Dave Hansen
Cc: linux-kernel@vger.kernel.org, tim.c.chen@linux.intel.com,
bp@alien8.de, rientjes@google.com, imammedo@redhat.com,
torvalds@linux-foundation.org, prarit@redhat.com,
toshi.kani@hp.com, brice.goglin@gmail.com, hpa@linux.intel.com,
mingo@kernel.org
> What does? That does sound broken. How can a cache domain sanely span
> memory controllers?
Think "cluster on die" with cores on the socket split into two clusters, but still sharing LLC.
-Tony
* Re: [PATCH] x86, sched: allow topologies where NUMA nodes share an LLC
@ 2017-06-08 20:20 ` Peter Zijlstra
From: Peter Zijlstra @ 2017-06-08 20:20 UTC (permalink / raw)
To: Luck, Tony
Cc: Dave Hansen, linux-kernel@vger.kernel.org,
tim.c.chen@linux.intel.com, bp@alien8.de, rientjes@google.com,
imammedo@redhat.com, torvalds@linux-foundation.org,
prarit@redhat.com, toshi.kani@hp.com, brice.goglin@gmail.com,
hpa@linux.intel.com, mingo@kernel.org
On Thu, Jun 08, 2017 at 08:08:31PM +0000, Luck, Tony wrote:
> > What does? That does sound broken. How can a cache domain sanely span
> > memory controllers?
>
> Think "cluster on die" with cores on the socket split into two clusters, but still sharing LLC.
The thing is, cluster-on-die works with the current code, and therefore
seems to modify the SRAT and CPUID information in a consistent manner.
Which in turn seems to suggest the LLC really is split for
cluster-on-die.
This is something new, and the Changelog is absolute crap for not
explaining _anything_.
So while SRAT seems to invent new nodes, the CPUID topology bits still
describe the full LLC, now shared across nodes.
Is this accurate? Do these nodes, as described by SRAT, actually have a
memory controller each? And is the LLC still fully integrated across the
nodes? If so, we need to go fix the scheduler domain topology to put a
cache domain across nodes (which is going to be painful).
Just making the warning go away and not explaining things sucks.
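The ordering concern above can be stated as a containment check. This is a simplified model, not the scheduler's domain builder: it assumes each level of the hierarchy is given as one 64-bit span per CPU, and that a lower level is only valid if every CPU's span there is contained in its span one level up. `spans_nest()` is a hypothetical helper named for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when, for every CPU, the span at the inner level is a
 * subset of the span at the outer level. Bit i of a span is CPU i.
 */
static bool spans_nest(const uint64_t *inner, const uint64_t *outer,
		       int ncpus)
{
	for (int cpu = 0; cpu < ncpus; cpu++)
		if (inner[cpu] & ~outer[cpu])
			return false;
	return true;
}
```

For four CPUs sharing one LLC but split over two nodes, `spans_nest(llc, node, 4)` fails while `spans_nest(node, llc, 4)` holds. In this model a full, unsplit cache level could only sit above the node level, not below it.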