* Using Cpusets with HyperThreads
@ 2005-09-23 7:00 Paul Jackson
2005-09-23 7:10 ` Keith Owens
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Paul Jackson @ 2005-09-23 7:00 UTC (permalink / raw)
To: linux-ia64
This note explains the support provided by cpusets for job placement
on hyperthreaded CPUs in upcoming products, enabling one to control
what can run on the A and B sides of each core, if anything.
How does this look? Is the document itself clear and complete?
Will the following serve your needs? What's missing, wrong-headed
or useless? Is there a better way we should consider?
The cpuset command and library currently shipping in the
latest ProPack 4 versions already include the features described
below, so it is quite unlikely that we would remove any of this. But
there may well be additional features, and improved documentation,
that would be useful.
Your feedback is welcome.
Using Cpusets with HyperThreads
===============================
In addition to their traditional use to control the placement of
jobs on the CPUs and Memory Nodes of a system, cpusets also provide
a convenient mechanism to control the use of hyperthreading (HT).
Some jobs achieve better performance using both of the hyperthread
sides, A and B, of a processor core, and some run better using
just one of the sides, letting the other side idle.
Since each logical (hyperthreaded) processor in a core has a
distinct CPU number, one can easily specify a cpuset that contains
both, or contains just one side, from each of the processor cores
in the cpuset.
Cpusets can be configured to include any combination of the logical
CPUs in a system.
For example, the cpuset configuration file:
    cpus 0-127:2    # the even numbered CPUs 0, 2, 4, ... 126
    mems 0-63       # all memory nodes 0, 1, 2, ... 63
would include the A sides of an HT enabled system, along with
all the memory, on the first 64 nodes. The colon ':' prefixes the
stride. The stride of '2' in this example means use every other
logical CPU.
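To make the stride notation concrete, here is a minimal Python sketch that expands a CPU list into the individual CPU numbers. It is not the cpuset command's own parser. The function name `expand_cpu_list` is made up for illustration, and accepting comma-separated items is an assumption; only the `first-last:stride` form comes from the example above.

```python
def expand_cpu_list(spec):
    """Expand a list such as '0-127:2' or '0-3,8' into sorted CPU numbers.

    Illustrative only: the comma-separated form is an assumption, and this
    is not the parser used by the cpuset command.
    """
    cpus = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split off the optional ':stride' suffix; the default stride is 1.
        rng, _, stride_text = item.partition(":")
        stride = int(stride_text) if stride_text else 1
        if stride < 1:
            raise ValueError(f"stride must be positive: {item!r}")
        # A bare number is treated as a one-element range.
        first_text, dash, last_text = rng.partition("-")
        first = int(first_text)
        last = int(last_text) if dash else first
        if last < first:
            raise ValueError(f"descending range: {item!r}")
        cpus.update(range(first, last + 1, stride))
    return sorted(cpus)


a_sides = expand_cpu_list("0-127:2")
b_sides = expand_cpu_list("1-127:2")
print(a_sides[:4], a_sides[-1])   # [0, 2, 4, 6] 126
print(b_sides[:4], b_sides[-1])   # [1, 3, 5, 7] 127
```

With uniform even/odd numbering, the two lists are disjoint and together cover CPUs 0 through 127.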
The following commands would create a cpuset 'foo' according to
the above example, and run the job 'bar' in that cpuset, given that
'cpuset.cfg' is a file containing the above 2 example lines:
    cpuset -c /foo < cpuset.cfg    # create '/foo' on A sides
    cpuset -i /foo -I bar          # run 'bar' in cpuset /foo
To specify both sides of the first 64 cores, use:
    cpus 0-127
To specify just the B sides, use:
    cpus 1-127:2
The above assumes that CPUs are uniformly numbered, with the even
numbers for the A side and odd numbers for the B side. This is
usually the case, but not guaranteed. One could still place a
job on a system that was not uniformly numbered, but currently
it would involve a longer argument list to the 'cpus' option,
explicitly listing the desired CPUs. When time permits, we can add
more options to the cpuset command and libcpuset C interfaces, to
make it convenient to manage hyperthread placement on non-uniformly
numbered systems.
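As a rough illustration of the explicit listing mentioned above, the following Python sketch builds a 'cpus' line for the A sides of a machine whose numbering is not even/odd. The sibling table is entirely hypothetical, `compress_cpu_list` is a name made up for this note, and the comma-separated range output is an assumption about what the 'cpus' option accepts.

```python
def compress_cpu_list(cpus):
    """Render CPU numbers as comma-separated ranges, e.g. '0-3,8'."""
    ordered = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next number is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        if j == i:
            parts.append(str(ordered[i]))
        else:
            parts.append(f"{ordered[i]}-{ordered[j]}")
        i = j + 1
    return ",".join(parts)


# Hypothetical sibling table: core index -> (A-side CPU, B-side CPU).
# The numbering is deliberately not the usual even/odd pattern.
siblings = {0: (0, 4), 1: (1, 5), 2: (2, 6), 3: (3, 7), 4: (8, 9)}

a_side_cpus = [a for a, _b in siblings.values()]
print("cpus", compress_cpu_list(a_side_cpus))   # cpus 0-3,8
```

On a real system the sibling table would have to come from whatever topology information the platform exposes; nothing here discovers it.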
We do not need to create a separate cpuset with just the B side
CPUs to avoid having something run there. Tasks can only run where
there are cpusets allowing it.
If there is no cpuset for the B sides except the all-encompassing
root cpuset, and if only root can put tasks in that cpuset, then
no one other than root can run on the B sides.
The dplace command can be used to manage more detailed placement of
job tasks within such a cpuset. Since dplace numbering of CPUs is
relative to the cpuset, it does not affect the dplace configuration
whether the cpuset includes both sides of hyperthreaded cores,
or just one side, or even is on a system that does not support
hyperthreading.
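The idea of cpuset-relative numbering can be sketched in a few lines of Python. This assumes that relative CPU n simply means the n-th CPU of the cpuset in ascending system order; that mapping is an assumption made for illustration, not a statement of how dplace is implemented, and `relative_to_system` is a made-up name.

```python
def relative_to_system(cpuset_cpus, relative_cpu):
    """Map a cpuset-relative CPU index to a system CPU number.

    Assumption: relative index n is the n-th CPU of the cpuset in
    ascending order of system CPU number.
    """
    ordered = sorted(cpuset_cpus)
    if not 0 <= relative_cpu < len(ordered):
        raise IndexError(f"relative CPU {relative_cpu} not in cpuset")
    return ordered[relative_cpu]


a_sides_only = range(0, 16, 2)   # cpuset holding just the A sides
both_sides = range(0, 8)         # cpuset holding both sides of 4 cores

# The same relative indices work unchanged in either cpuset.
for rel in (0, 1, 2):
    print(rel, relative_to_system(a_sides_only, rel),
          relative_to_system(both_sides, rel))
# 0 0 0
# 1 2 1
# 2 4 2
```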
Typically, the logical numbering of CPUs puts the even numbered
CPUs on the A sides, and the odd numbered CPUs on the B sides. The
stride suffix (":2", above) makes it easy to specify that only
every other side will be used. If the CPU number range starts with
an even number, this will be the A sides, and if the range starts
with an odd number, this will be the B sides.
Use the following steps, for example, to set up a job to run only
on the A sides of its hyperthreaded cores, and to ensure that
nothing runs on the B sides (they remain idle).
 1. The whole system is covered by a root cpuset (always the case).
 2. A boot cpuset is defined to keep the kernel, system daemon and
    user login session threads off other CPUs.
 3. The sys admin or batch scheduler with root permission creates
    a cpuset that includes only the A sides of the processors to be
    used for this job.
 4. The sys admin or batch scheduler does not create any cpuset
    with the B side CPUs in these processors. Then nothing disruptive
    ever runs on the corresponding B side CPUs.
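The effect of these steps can be modelled with plain set arithmetic. The Python sketch below is only a toy model of the bookkeeping, not anything the kernel or the cpuset command runs; the 16-CPU machine, the choice of boot CPUs and the name `idle_cpus` are all hypothetical.

```python
# Toy model of steps 1-4 on a hypothetical 16-CPU machine.
root_cpus = set(range(0, 16))                  # step 1: root cpuset covers every CPU
boot_cpus = {0, 1}                             # step 2: boot cpuset (hypothetical choice)
job_cpusets = {"/foo": set(range(2, 16, 2))}   # step 3: A sides only
# Step 4: no cpuset is created for the matching B sides.


def idle_cpus(root, boot, jobs):
    """Return the CPUs that belong to no cpuset other than the root cpuset."""
    used = set(boot)
    for cpus in jobs.values():
        used |= cpus
    return sorted(root - used)


print(idle_cpus(root_cpus, boot_cpus, job_cpusets))
# [3, 5, 7, 9, 11, 13, 15]
```

The CPUs printed are exactly the B sides of the cores used by '/foo'; only a task placed directly in the root cpuset could run there.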
This is different from cpusets on IRIX. On IRIX, not all CPUs were
necessarily included in cpusets, and not all jobs were placed in
cpusets. Jobs not in a cpuset could run without constraint on the
CPUs not in cpusets. So, on IRIX, one would have to also create a
cpuset for the B sides, to ensure that other jobs did not run there.
The cpuset model for Linux 2.6 kernels is different. If a site uses
a bootcpuset to confine the traditional Unix load, then nothing
will run on the other CPUs in the system, except when those CPUs
are included in a cpuset that has a job assigned to it. These CPUs
are of course in the root cpuset, but this cpuset would normally
only be usable by a system administrator or batch scheduler with
root permissions. This prevents anyone without root permission
from running a task on those CPUs, unless an administrator or
service with root permission allows it.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: Using Cpusets with HyperThreads
From: Keith Owens @ 2005-09-23 7:10 UTC (permalink / raw)
To: linux-ia64
On Fri, 23 Sep 2005 00:00:16 -0700,
Paul Jackson <pj@sgi.com> wrote:
> cpus 0-127:2 # the even numbered CPUs 0, 2, 4, ... 126
> mems 0-63 # all memory nodes 0, 1, 2, ... 63
>
> would include the A sides of an HT enabled system, along with
> all the memory, on the first 64 nodes. The colon ':' prefixes the
> stride. The stride of '2' in this example means use every other
> logical CPU.
I prefer the cron usage for striding, 0-127/2 or even */2.
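For comparison, the cron-style forms map onto the colon form mechanically. The Python sketch below is purely illustrative: `cron_to_colon` and its `ncpus` parameter are made-up names, and reading '*' as "all CPUs from 0 to ncpus - 1" is an assumption borrowed from cron, not something either syntax in this thread defines.

```python
def cron_to_colon(spec, ncpus):
    """Rewrite '0-127/2' or '*/2' into the colon form, e.g. '0-127:2'.

    ncpus is the number of logical CPUs, used only to expand '*'.
    """
    rng, slash, stride = spec.partition("/")
    if rng == "*":
        rng = f"0-{ncpus - 1}"
    return f"{rng}:{stride}" if slash else rng


print(cron_to_colon("0-127/2", 128))   # 0-127:2
print(cron_to_colon("*/2", 128))       # 0-127:2
print(cron_to_colon("*", 128))         # 0-127
```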
* Re: Using Cpusets with HyperThreads
From: Mark Goodwin @ 2005-09-23 11:10 UTC (permalink / raw)
To: linux-ia64
On Fri, 23 Sep 2005, Paul Jackson wrote:
> This note explains the support provided by cpusets for job placement
> on hyperthreaded CPUs in upcoming products, enabling one to control
> what can run on the A and B sides of each core, if anything.
do we need to accommodate a more generic specification? e.g how about
a system with K sockets per node, L cores per socket and M threads per
core? And if that's not complicated enough, how about a really large
system where K, L and/or M are not necessarily constant?
-- Mark
* Re: Using Cpusets with HyperThreads
From: Paul Jackson @ 2005-09-23 11:18 UTC (permalink / raw)
To: linux-ia64
> how about a system with K sockets per node, L cores per socket and M
> threads per core? ... where K, L and/or M are not necessarily constant?
Though I introduced this note by speaking of A and B sides, the
format of CPU number lists allows for specifying any particular
set of logical processors.
As systems with more varied architectures become available, I
suspect that we will add some convenience options to the cpuset
library and command interfaces, when we see what would be useful.
* RE: Using Cpusets with HyperThreads
From: Luck, Tony @ 2005-09-23 17:26 UTC (permalink / raw)
To: linux-ia64
Keith Owens wrote:
>On Fri, 23 Sep 2005 00:00:16 -0700,
>Paul Jackson <pj@sgi.com> wrote:
>> cpus 0-127:2 # the even numbered CPUs 0, 2, 4, ... 126
>> mems 0-63 # all memory nodes 0, 1, 2, ... 63
>>
>> would include the A sides of an HT enabled system, along with
>> all the memory, on the first 64 nodes. The colon ':' prefixes the
>> stride. The stride of '2' in this example means use every other
>> logical CPU.
>
>I prefer the cron usage for striding, 0-127/2 or even */2.
Perhaps it would be better to avoid all assumptions that even
cpu numbers correspond to "side A" and odd to "side B" and just
provide a syntax that allows you to pick some subset of cores
or threads from a range of sockets and get people used to something
like:
cpus S(0-15).t=0
to mean get the cpu numbers of all the thread zeros in any cores in
sockets 0-15.
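One way such a selector could be resolved is sketched below in Python. Everything in it is hypothetical: the topology table, the `Cpu` record, the `select` function, and the tiny grammar, which handles only the single form `S(first-last).t=N` from the example above.

```python
import re
from typing import NamedTuple


class Cpu(NamedTuple):
    number: int   # logical CPU number
    socket: int
    core: int
    thread: int


def select(spec, topology):
    """Resolve a selector of the form 'S(first-last).t=N' to CPU numbers."""
    m = re.fullmatch(r"S\((\d+)-(\d+)\)\.t=(\d+)", spec)
    if m is None:
        raise ValueError(f"unsupported selector: {spec!r}")
    first, last, thread = map(int, m.groups())
    return sorted(c.number for c in topology
                  if first <= c.socket <= last and c.thread == thread)


# Hypothetical machine: 4 sockets, 2 cores per socket, 2 threads per core,
# numbered so that all thread-1 CPUs follow the thread-0 CPUs (not even/odd).
topology = [Cpu(number=t * 8 + s * 2 + c, socket=s, core=c, thread=t)
            for t in range(2) for s in range(4) for c in range(2)]

print(select("S(0-1).t=0", topology))   # [0, 1, 2, 3]
print(select("S(0-1).t=1", topology))   # [8, 9, 10, 11]
```

Because the selector names sockets and threads rather than CPU numbers, the same string stays valid however the logical CPUs happen to be numbered.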
-Tony
* Re: Using Cpusets with HyperThreads
From: Paul Jackson @ 2005-09-23 17:38 UTC (permalink / raw)
To: linux-ia64
Tony wrote:
> cpus S(0-15).t=0
Yeah - something like that. Well said. The "/slice" is useful enough
now on uniform architectures, but some way to do what you describe will
be well worth doing too, once I think on it some more.
Thanks.
* Re: Using Cpusets with HyperThreads
From: Zoltan Menyhart @ 2005-09-27 13:46 UTC (permalink / raw)
To: linux-ia64
> do we need to accommodate a more generic specification? e.g how about
> a system with K sockets per node, L cores per socket and M threads per
> core? And if that's not complicated enough, how about a really large
> system where K, L and/or M are not necessarily constant?
I think it is a good idea to be able to handle the general case.
The appropriate shift and mask values could be dynamically established
at the boot time.
This could allow us to maintain a single kernel for machines with
different (generations of) processors.
Obviously, having more choice than just between the A and B sides,
we need a richer set of options for the CPUsets, like:
- I need N out of M cores of the sockets x...y
+ I want to prevent the other applications from using the rest of
the cores
- I need the max. number of CPUs which are not farther from each
other than X
+ Use as many cores, as the HW can provide with Y memory bandwidth
for each
I think we should add the "locality information" into
/sys/devices/system/node/node<x>
like:
/sys/devices/system/node/node<x>/socket<y>/core<z>/cpu<n>
For compatibility reasons, we can keep the entries like:
/sys/devices/system/node/node<x>/cpu<n>
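If the locality information were exposed in the proposed layout, a tool could group logical CPUs by core just by parsing path names. The Python sketch below works on a hard-coded list of example paths; the socket/core directory layout is only the proposal made in this message, and `group_by_core` is a made-up name.

```python
import re
from collections import defaultdict

# Pattern for the *proposed* layout only.
PATTERN = re.compile(
    r"/sys/devices/system/node/node(\d+)/socket(\d+)/core(\d+)/cpu(\d+)$")


def group_by_core(paths):
    """Map (node, socket, core) -> sorted list of logical CPU numbers."""
    cores = defaultdict(list)
    for path in paths:
        m = PATTERN.match(path)
        if m is None:
            continue   # skip entries that are not in the proposed layout
        node, socket, core, cpu = map(int, m.groups())
        cores[(node, socket, core)].append(cpu)
    return {key: sorted(cpus) for key, cpus in cores.items()}


# Hypothetical example paths, not read from a real system.
paths = [
    "/sys/devices/system/node/node0/socket0/core0/cpu0",
    "/sys/devices/system/node/node0/socket0/core0/cpu1",
    "/sys/devices/system/node/node0/socket0/core1/cpu2",
    "/sys/devices/system/node/node0/socket0/core1/cpu3",
    "/sys/devices/system/node/node0/cpu0",   # compatibility entry, ignored here
]

for key, cpus in sorted(group_by_core(paths).items()):
    print(key, cpus)
# (0, 0, 0) [0, 1]
# (0, 0, 1) [2, 3]
```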
Thanks,
Zoltan Menyhart
* Re: Using Cpusets with HyperThreads
From: Paul Jackson @ 2005-09-27 16:37 UTC (permalink / raw)
To: linux-ia64
Zoltan wrote:
> I think it is a good idea to be able to handle the general case.
One can certainly handle the general case, by explicitly
listing the selected CPUs.
The trade-off is in finding a middle path that is sufficiently
general for most cases, but still easier to use than the brute
force listing of the selected CPUs.
I am open to adding a cpuset command option for that middle
path, but I will probably need more inspiration to discover
a good option, and more real life feedback to know what will
be most useful.
> I think we should add the "locality information" into ...
Yes - exposing "locality information" is half of this
puzzle. I wish I had been able to spend more time on this
already.