From: Matthew Dobson <colpatch@us.ibm.com>
To: Paul Jackson <pj@sgi.com>
Cc: Rick Lindsley <ricklind@us.ibm.com>,
"Martin J. Bligh" <mbligh@aracnet.com>,
Simon.Derr@bull.net, pwil3058@bigpond.net.au,
frankeh@watson.ibm.com, dipankar@in.ibm.com,
Andrew Morton <akpm@osdl.org>,
ckrm-tech@lists.sourceforge.net, efocht@hpce.nec.com,
LSE Tech <lse-tech@lists.sourceforge.net>,
hch@infradead.org, steiner@sgi.com,
Jesse Barnes <jbarnes@sgi.com>,
sylvain.jeaugey@bull.net, djh@sgi.com,
LKML <linux-kernel@vger.kernel.org>, Andi Kleen <ak@suse.de>,
sivanich@sgi.com
Subject: Re: [ckrm-tech] Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
Date: Mon, 11 Oct 2004 15:06:55 -0700 [thread overview]
Message-ID: <1097532415.4038.50.camel@arrakis> (raw)
In-Reply-To: <20041009191556.06e09c67.pj@sgi.com>
On Sat, 2004-10-09 at 19:15, Paul Jackson wrote:
> Rick replying to Paul:
> > And what I'm hearing is that if you're a job running in a set of shared
> > resources (i.e., non-exclusive) then by definition you are *not* a job
> > who cares about which processor you run on. I can't think of a situation
> > where I'd care about the physical locality, and the proximity of memory
> > and other nodes, but NOT care that other tasks might steal my cycles.
>
> There are at least these situations:
> 1) proximity to special hardware (graphics, networking, storage, ...)
> 2) non-dedicated tightly coupled multi-threaded apps (OpenMP, MPI)
> 3) batch managers switching resources between jobs
>
> On (2), if say you want to run eight copies of an application, on a
> system that only has eight CPUs, where each copy of the app is an
> eight-way tightly coupled app, they will go much faster if each app is
> placed across all 8 CPUs, one thread per CPU, than if they are placed
> willy-nilly. Or a bit more realistically, if you have a random input
> queue of such tightly coupled apps, each with a predetermined number of
> threads between one and eight, you will get more work done by pinning
> the threads of any given app on different CPUs. The users submitting
> the jobs may well not care which CPUs are used for their job, but an
> intermediate batch manager probably will care, as it may be solving the
> knapsack problem of how to fit a stream of varying sized jobs onto a
> given size of hardware.
>
> On (3), a batch manager might say have two small cpusets, and also one
> larger cpuset that is the two small ones combined. It might run one job
> in each of the two small cpusets for a while, then suspend these two
> jobs, in order to run a third job in the larger cpuset. The two small
> cpusets don't go away while the third job runs -- you don't want to lose
> or have to tear down and rebuild the detailed inter-cpuset placement of
> the two small jobs while they are suspended.
I think these situations, particularly the first two, are the times you
*want* to use the cpus_allowed mechanism. Pinning a specific thread to
a specific processor (cases (1) & (2)) is *exactly* why the cpus_allowed
mechanism was put into the kernel.
And (3) can pretty easily be achieved by using a combination of
sched_domains and cpus_allowed. In your example of one 4 CPU cpuset and
two 2 CPU sub cpusets (cpu-subsets? :), one could easily create a 4 CPU
domain for the larger job and two 2 CPU domains for the smaller jobs.
Those 2 2 CPU subdomains can be created & destroyed at will, or they
could be simply tagged as "exclusive" when you don't want tasks moving
back and forth between them, and tagged as "non-exclusive" when you want
tasks to be freely balanced across all 4 CPUs in the larger parent
domain.
One of the cool thing about using sched_domains as your partitioning
element is that in reality, tasks run on *CPUs*, not *domains*. So if
you have threads 'a1' & 'a2' running on CPUs 0 & 1 (small job 'a') and
threads 'b1' & 'b2' running on CPUs 2 & 3 (small job 'b'), you can
suspend threads a1, a2, b1 & b2 and remove the domains they were running
in to allow job A (big job with threads A1, A2, A3, & A4) to run on the
larger 4 CPU domain. When you then suspend A1-A4 again to allow the
smaller jobs to proceed, you can pretty trivially create the 2 CPU
domains underneath the 4 CPU domain and resume the jobs. Those jobs (a
& b) have been suspended on the CPUs they were originally running on,
and thus will resume on the same CPUs without any extra effort. They
will simply run on those CPUs, and at load balance time, the domains
attached to those CPUs will be consulted to determine where the tasks
can be relocated to if there is a heavy load. The domains will tell the
scheduler that the tasks cannot be relocated outside the 2 CPUs in each
respective domain. Viola! (sorta ;)
-Matt
next prev parent reply other threads:[~2004-10-11 22:14 UTC|newest]
Thread overview: 234+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-08-05 10:08 [PATCH] new bitmap list format (for cpusets) Paul Jackson
2004-08-05 10:10 ` [PATCH] cpusets - big numa cpu and memory placement Paul Jackson
2004-08-05 20:55 ` [Lse-tech] " Martin J. Bligh
2004-08-06 2:05 ` Paul Jackson
2004-08-06 3:24 ` Martin J. Bligh
2004-08-06 8:31 ` Paul Jackson
2004-08-06 15:30 ` Erich Focht
2004-08-06 15:35 ` Martin J. Bligh
2004-08-06 15:48 ` Hubertus Franke
2004-08-07 6:30 ` Paul Jackson
2004-08-07 6:45 ` Paul Jackson
2004-08-06 15:49 ` Hubertus Franke
2004-08-06 15:52 ` Hubertus Franke
2004-08-06 15:55 ` Erich Focht
2004-08-07 6:10 ` Paul Jackson
2004-08-07 15:22 ` Erich Focht
2004-08-07 18:59 ` Paul Jackson
2004-08-08 3:17 ` Paul Jackson
2004-08-08 14:50 ` Martin J. Bligh
2004-08-11 0:43 ` Paul Jackson
2004-08-11 9:40 ` Erich Focht
2004-08-11 14:49 ` Martin J. Bligh
2004-08-11 17:50 ` Paul Jackson
2004-08-11 21:12 ` Shailabh Nagar
2004-08-12 7:15 ` Paul Jackson
2004-08-12 12:58 ` Jack Steiner
2004-08-12 14:50 ` Martin J. Bligh
2004-08-11 15:12 ` Shailabh Nagar
2004-08-08 20:22 ` Shailabh Nagar
2004-08-09 15:57 ` Hubertus Franke
2004-08-10 11:31 ` [ckrm-tech] " Paul Jackson
2004-08-10 22:38 ` Shailabh Nagar
2004-08-11 10:42 ` Erich Focht
2004-08-11 14:56 ` Shailabh Nagar
2004-08-14 8:51 ` Paul Jackson
2004-08-08 19:58 ` Shailabh Nagar
2004-10-01 23:41 ` Andrew Morton
2004-10-02 6:06 ` Paul Jackson
2004-10-02 14:55 ` Dipankar Sarma
2004-10-02 16:14 ` Hubertus Franke
2004-10-02 18:04 ` Paul Jackson
2004-10-02 23:21 ` Peter Williams
2004-10-02 23:44 ` Hubertus Franke
2004-10-03 0:00 ` Peter Williams
2004-10-03 3:44 ` Paul Jackson
2004-10-05 3:13 ` [ckrm-tech] " Matthew Helsley
2004-10-05 8:30 ` Hubertus Franke
2004-10-05 14:20 ` Paul Jackson
2004-10-03 2:59 ` Paul Jackson
2004-10-03 3:19 ` Paul Jackson
2004-10-03 3:53 ` Peter Williams
2004-10-03 4:47 ` Paul Jackson
2004-10-03 5:12 ` Peter Williams
2004-10-03 5:39 ` Paul Jackson
2004-10-03 4:02 ` Paul Jackson
2004-10-03 3:39 ` Paul Jackson
2004-10-03 14:36 ` Martin J. Bligh
2004-10-03 15:39 ` Paul Jackson
2004-10-03 23:53 ` Martin J. Bligh
2004-10-04 0:02 ` Martin J. Bligh
2004-10-04 0:53 ` Paul Jackson
2004-10-04 3:56 ` Martin J. Bligh
2004-10-04 4:24 ` Paul Jackson
2004-10-04 15:03 ` Martin J. Bligh
2004-10-04 15:53 ` [ckrm-tech] " Paul Jackson
2004-10-04 18:17 ` Martin J. Bligh
2004-10-04 20:25 ` Paul Jackson
2004-10-04 22:15 ` Martin J. Bligh
2004-10-05 9:17 ` Paul Jackson
2004-10-05 10:01 ` Paul Jackson
2004-10-05 22:24 ` Matthew Dobson
2004-10-05 9:26 ` Simon Derr
2004-10-05 9:58 ` Paul Jackson
2004-10-05 19:34 ` Martin J. Bligh
2004-10-06 0:28 ` Paul Jackson
2004-10-06 1:16 ` Martin J. Bligh
2004-10-06 2:08 ` Paul Jackson
2004-10-06 22:59 ` Matthew Dobson
2004-10-06 23:23 ` Peter Williams
2004-10-07 0:16 ` Rick Lindsley
2004-10-07 18:27 ` Paul Jackson
2004-10-07 8:51 ` Paul Jackson
2004-10-07 10:53 ` Rick Lindsley
2004-10-07 14:41 ` Martin J. Bligh
[not found] ` <20041007072842.2bafc320.pj@sgi.com>
2004-10-07 19:05 ` Rick Lindsley
2004-10-10 2:15 ` [ckrm-tech] " Paul Jackson
2004-10-11 22:06 ` Matthew Dobson [this message]
2004-10-11 22:58 ` Paul Jackson
2004-10-12 21:22 ` Matthew Dobson
2004-10-12 8:50 ` Simon Derr
2004-10-12 21:25 ` Matthew Dobson
2004-10-10 2:28 ` Paul Jackson
2004-10-09 0:06 ` Matthew Dobson
[not found] ` <4165A31E.4070905@watson.ibm.com>
2004-10-08 13:14 ` Paul Jackson
2004-10-08 15:42 ` Hubertus Franke
2004-10-08 18:23 ` Paul Jackson
2004-10-09 1:00 ` Matthew Dobson
2004-10-09 20:08 ` [Lse-tech] " Paul Jackson
2004-10-11 22:16 ` Matthew Dobson
2004-10-11 22:42 ` Paul Jackson
2004-10-10 0:05 ` Paul Jackson
2004-10-11 22:18 ` Matthew Dobson
2004-10-11 22:39 ` Paul Jackson
2004-10-09 0:51 ` Matthew Dobson
2004-10-10 0:50 ` [Lse-tech] " Paul Jackson
2004-10-10 0:59 ` Paul Jackson
2004-10-09 0:22 ` Matthew Dobson
2004-10-12 22:24 ` [Lse-tech] " Hanna Linder
2004-10-13 20:56 ` Matthew Dobson
2004-10-07 12:47 ` [Lse-tech] " Simon Derr
2004-10-07 14:49 ` Martin J. Bligh
2004-10-07 17:54 ` Paul Jackson
2004-10-07 18:13 ` Martin J. Bligh
2004-10-08 9:23 ` Erich Focht
2004-10-08 9:50 ` Andrew Morton
2004-10-08 10:40 ` Erich Focht
2004-10-08 14:26 ` Martin J. Bligh
2004-10-08 9:53 ` Nick Piggin
2004-10-08 11:40 ` Erich Focht
2004-10-08 14:24 ` Martin J. Bligh
2004-10-08 22:37 ` Erich Focht
2004-10-14 10:35 ` Eric W. Biederman
2004-10-14 11:22 ` Erich Focht
2004-10-14 11:23 ` Paul Jackson
2004-10-14 19:39 ` Paul Jackson
2004-10-14 22:38 ` Hubertus Franke
2004-10-15 1:26 ` Paul Jackson
2004-10-07 18:25 ` Andrew Morton
2004-10-07 19:52 ` Paul Jackson
2004-10-07 21:04 ` [ckrm-tech] " Matthew Helsley
2004-10-10 3:22 ` Paul Jackson
2004-10-07 19:16 ` Rick Lindsley
2004-10-10 2:35 ` Paul Jackson
2004-10-10 5:12 ` [ckrm-tech] " Paul Jackson
2004-10-08 23:48 ` Matthew Dobson
2004-10-09 0:18 ` Nick Piggin
2004-10-11 23:00 ` Matthew Dobson
2004-10-11 23:09 ` Nick Piggin
2004-10-05 22:33 ` Matthew Dobson
2004-10-06 3:01 ` Paul Jackson
2004-10-06 23:12 ` Matthew Dobson
2004-10-07 8:59 ` [ckrm-tech] " Paul Jackson
2004-10-04 0:45 ` Paul Jackson
2004-10-04 11:44 ` Rick Lindsley
2004-10-04 22:46 ` [ckrm-tech] " Paul Jackson
2004-10-05 22:19 ` Matthew Dobson
2004-10-06 2:39 ` Paul Jackson
2004-10-06 23:21 ` Matthew Dobson
2004-10-07 9:41 ` [ckrm-tech] " Paul Jackson
2004-10-06 2:47 ` Paul Jackson
2004-10-06 9:43 ` Simon Derr
2004-10-06 13:27 ` Paul Jackson
2004-10-06 21:55 ` Peter Williams
2004-10-06 22:49 ` Paul Jackson
2004-10-06 8:02 ` Simon Derr
2005-02-07 23:59 ` Matthew Dobson
2005-02-08 0:20 ` Andrew Morton
2005-02-08 0:34 ` Paul Jackson
2005-02-08 9:54 ` Dinakar Guniguntala
2005-02-08 9:49 ` Nick Piggin
2005-02-08 16:13 ` Martin J. Bligh
2005-02-08 23:26 ` Nick Piggin
2005-02-09 4:23 ` Paul Jackson
2005-02-08 19:32 ` Matthew Dobson
2005-02-09 2:53 ` Nick Piggin
2005-02-08 19:00 ` Matthew Dobson
2005-02-08 20:42 ` Paul Jackson
2005-02-08 22:14 ` Matthew Dobson
2005-02-08 23:58 ` Shailabh Nagar
2005-02-09 0:27 ` Paul Jackson
2005-02-09 0:24 ` Paul Jackson
2005-02-09 17:59 ` [ckrm-tech] " Chandra Seetharaman
2005-02-11 2:46 ` Chandra Seetharaman
2005-02-11 9:21 ` Paul Jackson
2005-02-12 1:37 ` Chandra Seetharaman
2005-02-12 6:16 ` Paul Jackson
2005-02-11 16:54 ` Jesse Barnes
2005-02-11 18:42 ` Chandra Seetharaman
2005-02-11 18:50 ` Jesse Barnes
2005-02-08 16:15 ` Martin J. Bligh
2005-02-08 22:17 ` Matthew Dobson
2004-10-03 16:02 ` Paul Jackson
2004-10-03 23:47 ` Martin J. Bligh
2004-10-04 3:33 ` Paul Jackson
2004-10-03 20:10 ` Tim Hockin
2004-10-04 1:56 ` Paul Jackson
2004-10-03 3:35 ` Paul Jackson
2004-10-03 20:21 ` Erich Focht
2004-10-03 20:48 ` Andrew Morton
2004-10-04 14:05 ` Erich Focht
2004-10-04 14:57 ` Martin J. Bligh
2004-10-04 15:30 ` Paul Jackson
2004-10-04 15:41 ` Martin J. Bligh
2004-10-04 16:02 ` Paul Jackson
2004-10-04 18:19 ` Martin J. Bligh
2004-10-04 18:29 ` Paul Jackson
2004-10-04 15:38 ` Paul Jackson
2004-10-04 16:46 ` Paul Jackson
2004-10-04 3:41 ` Paul Jackson
2004-10-04 13:58 ` Hubertus Franke
2004-10-04 14:13 ` Simon Derr
2004-10-04 14:15 ` Erich Focht
2004-10-04 15:23 ` Paul Jackson
2004-10-04 14:37 ` Paul Jackson
2004-10-02 15:46 ` [ckrm-tech] " Marc E. Fiuczynski
2004-10-02 16:17 ` Hubertus Franke
2004-10-02 17:53 ` Paul Jackson
2004-10-02 18:16 ` Hubertus Franke
2004-10-02 19:14 ` Paul Jackson
2004-10-02 23:29 ` Peter Williams
2004-10-02 23:51 ` Hubertus Franke
2004-10-02 20:40 ` Andrew Morton
2004-10-02 23:08 ` Hubertus Franke
2004-10-02 22:26 ` Alan Cox
2004-10-03 2:49 ` Paul Jackson
2004-10-03 12:19 ` Hubertus Franke
2004-10-03 3:25 ` Paul Jackson
2004-10-03 2:26 ` Paul Jackson
2004-10-03 14:11 ` Paul Jackson
2004-10-02 17:47 ` Paul Jackson
2004-08-05 20:47 ` [Lse-tech] [PATCH] new bitmap list format (for cpusets) Martin J. Bligh
2004-08-05 21:45 ` Paul Jackson
[not found] ` <Pine.A41.4.53.0408060930100.20680@isabelle.frec.bull.fr>
2004-08-06 10:14 ` Paul Jackson
2004-08-09 8:01 ` Paul Jackson
2004-08-09 14:49 ` Martin J. Bligh
2004-08-10 23:43 ` Paul Jackson
2004-08-11 13:11 ` Dinakar Guniguntala
2004-08-11 16:17 ` Paul Jackson
2004-08-11 18:05 ` Dinakar Guniguntala
2004-08-11 20:40 ` Paul Jackson
2004-08-12 9:48 ` Dinakar Guniguntala
2004-08-12 10:11 ` Paul Jackson
2004-08-12 12:34 ` Dinakar Guniguntala
-- strict thread matches above, loose matches on Subject: below --
2004-10-05 6:05 [ckrm-tech] Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement Stan Hoeppner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1097532415.4038.50.camel@arrakis \
--to=colpatch@us.ibm.com \
--cc=Simon.Derr@bull.net \
--cc=ak@suse.de \
--cc=akpm@osdl.org \
--cc=ckrm-tech@lists.sourceforge.net \
--cc=dipankar@in.ibm.com \
--cc=djh@sgi.com \
--cc=efocht@hpce.nec.com \
--cc=frankeh@watson.ibm.com \
--cc=hch@infradead.org \
--cc=jbarnes@sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lse-tech@lists.sourceforge.net \
--cc=mbligh@aracnet.com \
--cc=pj@sgi.com \
--cc=pwil3058@bigpond.net.au \
--cc=ricklind@us.ibm.com \
--cc=sivanich@sgi.com \
--cc=steiner@sgi.com \
--cc=sylvain.jeaugey@bull.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.