* "Cache" sched domains
@ 2011-06-16 12:11 Samuel Thibault
2011-06-16 12:27 ` Peter Zijlstra
0 siblings, 1 reply; 3+ messages in thread
From: Samuel Thibault @ 2011-06-16 12:11 UTC (permalink / raw)
To: mingo, peterz; +Cc: linux-kernel
Hello,
We have an x86 machine whose sockets look like this in hwloc:
┌──────────────────────────────────────────────────────────────────┐
│Socket P#1                                                        │
│┌────────────────────────────────────────────────────────────────┐│
││L3 (16MB)                                                       ││
│└────────────────────────────────────────────────────────────────┘│
│┌────────────────────┐┌────────────────────┐┌────────────────────┐│
││L2 (3072KB)         ││L2 (3072KB)         ││L2 (3072KB)         ││
│└────────────────────┘└────────────────────┘└────────────────────┘│
│┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││
│└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
│┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
││Core P#0 ││Core P#1 ││Core P#2 ││Core P#3 ││Core P#4 ││Core P#5 ││
││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││
│││PU P#0 ││││PU P#4 ││││PU P#8 ││││PU P#12││││PU P#16││││PU P#20│││
││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││
│└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
└──────────────────────────────────────────────────────────────────┘
However, Linux does not build sched domains for the pairs of cores
which share an L2 cache. On s390, IBM added sched domains for books,
that is, sets of cores which share an L2 cache. The support should
probably be added in a generic way for all archs thanks to generic cache
information.
Samuel
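As a rough illustration of the grouping requested above, the sketch below computes, for a given CPU, the bitmask of CPUs that share its L2 cache. The table mirrors Socket P#1 from the hwloc picture (PUs 0/4, 8/12 and 16/20 pair up under the three L2 caches). The names `pu_cache_info`, `socket1` and `l2_shared_mask` are hypothetical and not kernel API; a real implementation would take the ids from the architecture's cache information rather than a hard-coded table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-PU record: logical CPU number and the id of its L2. */
struct pu_cache_info {
	unsigned int cpu;   /* logical CPU number, must be < 32 in this sketch */
	unsigned int l2_id; /* PUs with equal l2_id share an L2 cache */
};

/* Socket P#1 from the hwloc picture: three L2 caches, two cores each. */
const struct pu_cache_info socket1[] = {
	{ 0, 0 },  { 4, 0 },
	{ 8, 1 },  { 12, 1 },
	{ 16, 2 }, { 20, 2 },
};

#define SOCKET1_NR_PUS (sizeof(socket1) / sizeof(socket1[0]))

/*
 * Return a bitmask of all CPUs in 'pus' that share an L2 with 'cpu'
 * (bit n set means CPU n). Returns 0 if 'cpu' is not listed.
 */
uint32_t l2_shared_mask(const struct pu_cache_info *pus, size_t n,
			unsigned int cpu)
{
	uint32_t mask = 0;
	size_t i, j;

	for (i = 0; i < n; i++) {
		if (pus[i].cpu != cpu)
			continue;
		/* Found the CPU: collect every PU with the same L2 id. */
		for (j = 0; j < n; j++) {
			assert(pus[j].cpu < 32);
			if (pus[j].l2_id == pus[i].l2_id)
				mask |= UINT32_C(1) << pus[j].cpu;
		}
		break;
	}
	return mask;
}
```

Calling `l2_shared_mask(socket1, SOCKET1_NR_PUS, 0)` yields a mask with bits 0 and 4 set, which is the span an L2-level sched domain would cover for that CPU.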
* Re: "Cache" sched domains
2011-06-16 12:11 "Cache" sched domains Samuel Thibault
@ 2011-06-16 12:27 ` Peter Zijlstra
2011-06-16 13:20 ` Samuel Thibault
0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2011-06-16 12:27 UTC (permalink / raw)
To: Samuel Thibault
Cc: mingo, linux-kernel, Suresh Siddha, Venkatesh Pallipadi,
Srivatsa Vaddagiri, Paul Turner, Mike Galbraith, Andreas Herrmann,
Heiko Carstens
On Thu, 2011-06-16 at 14:11 +0200, Samuel Thibault wrote:
> Hello,
>
> We have an x86 machine whose sockets look like this in hwloc:
>
> ┌──────────────────────────────────────────────────────────────────┐
> │Socket P#1                                                        │
> │┌────────────────────────────────────────────────────────────────┐│
> ││L3 (16MB)                                                       ││
> │└────────────────────────────────────────────────────────────────┘│
> │┌────────────────────┐┌────────────────────┐┌────────────────────┐│
> ││L2 (3072KB)         ││L2 (3072KB)         ││L2 (3072KB)         ││
> │└────────────────────┘└────────────────────┘└────────────────────┘│
> │┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
> ││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││
> │└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
> │┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
> ││Core P#0 ││Core P#1 ││Core P#2 ││Core P#3 ││Core P#4 ││Core P#5 ││
> ││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││
> │││PU P#0 ││││PU P#4 ││││PU P#8 ││││PU P#12││││PU P#16││││PU P#20│││
> ││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││
> │└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
> └──────────────────────────────────────────────────────────────────┘
Pretty, bonus points for effort there.
> However, Linux does not build sched domains for the pairs of cores
> which share an L2 cache. On s390, IBM added sched domains for books,
> that is, sets of cores which share an L2 cache. The support should
> probably be added in a generic way for all archs thanks to generic cache
> information.
Yeah, sched domain generation is currently somewhat crappy.
I think you'll find you'll get that L2 domain when you enable mc/smt
power savings on !magny-cours due to this particular horror in
arch/x86/kernel/smpboot.c (possibly losing another level due to other
crap and changing scheduler behaviour in ways you might not fancy):
const struct cpumask *cpu_coregroup_mask(int cpu)
{
	struct cpuinfo_x86 *c = &cpu_data(cpu);

	/*
	 * For perf, we return last level cache shared map.
	 * And for power savings, we return cpu_core_map
	 */
	if ((sched_mc_power_savings || sched_smt_power_savings) &&
	    !(cpu_has(c, X86_FEATURE_AMD_DCM)))
		return cpu_core_mask(cpu);
	else
		return cpu_llc_shared_mask(cpu);
}
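To make the branch in that function easier to follow outside the kernel tree, here is a stand-alone model of the same decision. Everything in it is a toy stand-in: `toy_cpu`, `toy_policy` and `toy_coregroup_mask` are made-up names, the masks are plain 32-bit bitmaps rather than `struct cpumask`, and the two booleans only model whether the corresponding power-savings knob is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the per-CPU state consulted by cpu_coregroup_mask(). */
struct toy_cpu {
	uint32_t core_mask;       /* models cpu_core_mask(cpu) */
	uint32_t llc_shared_mask; /* models cpu_llc_shared_mask(cpu) */
	bool amd_dcm;             /* models cpu_has(c, X86_FEATURE_AMD_DCM) */
};

/* Toy stand-ins for the two global power-savings settings. */
struct toy_policy {
	bool mc_power_savings;  /* models sched_mc_power_savings != 0 */
	bool smt_power_savings; /* models sched_smt_power_savings != 0 */
};

/*
 * Same shape as the quoted function: with either power-savings knob set
 * (and no AMD_DCM), return the core map; otherwise return the map of CPUs
 * sharing the last-level cache.
 */
uint32_t toy_coregroup_mask(const struct toy_cpu *c,
			    const struct toy_policy *p)
{
	assert(c && p);
	if ((p->mc_power_savings || p->smt_power_savings) && !c->amd_dcm)
		return c->core_mask;
	return c->llc_shared_mask;
}
```

The point of the model is only that the span of this domain level depends on a power-policy setting, so flipping the knob changes which CPUs are grouped together.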
I recently started reworking all that sched_domain crud and we're almost
at the point where we can remove all legacy 'level' crap. That is,
nothing in the scheduler should (or does, last time I checked) depend
on sd->level anymore.
So the current goal is to change sched_domain_topology to not be such a
silly hard coded list of domains, but build that thing dynamically based
on the system topology and set all the SD_flags correctly.
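One way to picture "build that thing dynamically" is sketched below: start from candidate levels ordered from narrowest to widest, each with the set of CPUs it spans for one particular CPU, and keep only the levels that actually add CPUs over the level below. This is an illustration under assumed names (`topo_level`, `build_levels`) and a 32-bit bitmap; it is not the scheduler's data structure, and it does not attempt to set any SD_flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one candidate level for a single CPU. */
struct topo_level {
	const char *name; /* label for the level, e.g. "L2" */
	uint32_t span;    /* bit n set: CPU n is in this level's span */
};

/*
 * Copy the levels worth keeping from 'in' to 'out', preserving order,
 * and return how many were kept. 'in' must be sorted from narrowest to
 * widest span and 'out' must have room for 'n' entries.
 */
size_t build_levels(const struct topo_level *in, size_t n,
		    struct topo_level *out)
{
	uint32_t prev = 0;
	size_t i, kept = 0;

	assert(in && out);
	for (i = 0; i < n; i++) {
		uint32_t span = in[i].span;

		/* Zero or one CPU in the span: nothing to balance across. */
		if ((span & (span - 1)) == 0)
			continue;
		/* Same CPUs as the last kept level: adds no information. */
		if (span == prev)
			continue;
		out[kept++] = in[i];
		prev = span;
	}
	return kept;
}
```

For CPU 0 of the socket shown earlier in the thread, candidate levels of {0} (the core itself), {0,4} (shared L2), and {0,4,8,12,16,20} for both the L3 and the package would reduce to two levels: the L2 pair and the L3 span.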
If that is something you're willing to work on, that'd be totally
awesome.
* Re: "Cache" sched domains
2011-06-16 12:27 ` Peter Zijlstra
@ 2011-06-16 13:20 ` Samuel Thibault
0 siblings, 0 replies; 3+ messages in thread
From: Samuel Thibault @ 2011-06-16 13:20 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, linux-kernel, Suresh Siddha, Venkatesh Pallipadi,
Srivatsa Vaddagiri, Paul Turner, Mike Galbraith, Andreas Herrmann,
Heiko Carstens
Hello,
Peter Zijlstra, on Thu 16 Jun 2011 14:27:22 +0200, wrote:
> On Thu, 2011-06-16 at 14:11 +0200, Samuel Thibault wrote:
> > ┌──────────────────────────────────────────────────────────────────┐
> > │Socket P#1                                                        │
> > │┌────────────────────────────────────────────────────────────────┐│
> > ││L3 (16MB)                                                       ││
> > │└────────────────────────────────────────────────────────────────┘│
> > │┌────────────────────┐┌────────────────────┐┌────────────────────┐│
> > ││L2 (3072KB)         ││L2 (3072KB)         ││L2 (3072KB)         ││
> > │└────────────────────┘└────────────────────┘└────────────────────┘│
> > │┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
> > ││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││
> > │└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
> > │┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
> > ││Core P#0 ││Core P#1 ││Core P#2 ││Core P#3 ││Core P#4 ││Core P#5 ││
> > ││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││
> > │││PU P#0 ││││PU P#4 ││││PU P#8 ││││PU P#12││││PU P#16││││PU P#20│││
> > ││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││
> > │└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
> > └──────────────────────────────────────────────────────────────────┘
>
> Pretty, bonus points for effort there.
Well, that's all hwloc's credit :)
> So the current goal is to change sched_domain_topology to not be such a
> silly hard coded list of domains, but build that thing dynamically based
> on the system topology and set all the SD_flags correctly.
Ok, great!
> If that is something you're willing to work on, that'd be totally
> awesome.
I'm afraid I do not have time to spend on this.
Samuel