public inbox for linux-kernel@vger.kernel.org
From: Paul Jackson <pj@sgi.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: dino@in.ibm.com, Simon.Derr@bull.net,
	linux-kernel@vger.kernel.org, lse-tech@lists.sourceforge.net,
	akpm@osdl.org, dipankar@in.ibm.com, colpatch@us.ibm.com
Subject: Re: [RFC PATCH] Dynamic sched domains aka Isolated cpusets
Date: Tue, 19 Apr 2005 13:34:31 -0700
Message-ID: <20050419133431.2e389d57.pj@sgi.com>
In-Reply-To: <1113897440.5074.62.camel@npiggin-nld.site>

Nick wrote:
> Well the scheduler simply can't handle it, so it is not so much a
> matter of pushing - you simply can't use partitioned domains and
> meaningfully have a cpuset above them.

Translating that into cpuset-speak, I think what you mean is that I
can't have partitioned sched domains and have a task attached to a
cpuset above them, if it matters to me that the task can actually use
all the CPUs in its larger cpuset.

But what you actually said was that I cannot have a cpuset above them.

I certainly _can_ have a cpuset above the cpusets that define the
partitioned domains.  I _have_ to have that, or toss the entire
hierarchical cpuset design.  The top cpuset encompasses all the CPUs on
the system, and is above all others.

Let's see if the following example helps clear up these confusions.

Let's say we started out as one big happy family, with a single top
cpuset, and a single sched domain, each encompassing the entire machine.
All tasks are attached to that cpuset and load balanced and scheduled in
that sched domain.  Any task can be run anywhere.

Then some yahoo comes along and decides to complicate things.  They
create my two cpusets Alpha and Beta, each covering half the system.
They create two partitioned sched domains corresponding to Alpha and
Beta, respectively.  They move almost every task into one of Alpha or
Beta, expecting henceforth that each such moved task will only run on
whichever half of the system it was placed in.  For instance, if they
moved init into Alpha, that means they _want_ the init task to be
constrained to the Alpha half of the system, even if every CPU in Beta
has been idle for the last 5 hours.

So far, all fine and dandy.

But they leave behind a few tasks still attached to the top cpuset, with
those tasks' cpus_allowed still allowing any CPU in the system.  They
actually don't give a rat's patootie about these few tasks, because they
consume less than 10 seconds each per day, and so long as they are
allowed their few CPU cycles when they want them, all is well.  They
could have moved these tasks as well into Alpha or Beta, but they wanted
to be annoying and see if they could concoct a test case that would
break something here.  Or maybe they were just forgetful.

What breaks?  You seem to be telling me that this is verboten, but I
don't see yet where the problem is.

My timid guess is that about all that breaks is that each of these stray
tasks will be forever after stuck in whichever one of Alpha or Beta it
happened to be in at the point of the Great Divide.  If say one of these
tasks happened to be on the Beta side at that point, the Beta domain
scheduler will never let an Alpha CPU see that task, leaving the task to
only ever be picked up by a Beta CPU (even though the task's cpuset and
cpus_allowed would have allowed an Alpha CPU, in theory).
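
To make that guess concrete, here is a toy model in Python.  It is not
kernel code, just a sketch under stated assumptions: a hypothetical
8-CPU machine with Alpha as CPUs 0-3 and Beta as CPUs 4-7, and a load
balancer that only ever moves a task within the sched domain containing
the CPU the task is currently on.  The names reachable_cpus, ALPHA, BETA
and DOMAINS are made up for the illustration.

```python
# Toy model of partitioned sched domains; not kernel code.
# Assumption: 8 CPUs, split into two disjoint domains.
ALPHA = frozenset(range(0, 4))
BETA = frozenset(range(4, 8))
DOMAINS = [ALPHA, BETA]


def reachable_cpus(current_cpu, cpus_allowed, domains=DOMAINS):
    """CPUs a task can ever run on, if balancing never crosses domains."""
    for span in domains:
        if current_cpu in span:
            # The task is confined to its current domain, then further
            # limited by its own cpus_allowed mask.
            return frozenset(cpus_allowed) & span
    raise ValueError(f"CPU {current_cpu} is not in any sched domain")


# A stray task left in the top cpuset: cpus_allowed covers the whole
# machine, but it happened to be on CPU 5 (Beta side) at the Great Divide.
stray_allowed = ALPHA | BETA
stray_reach = reachable_cpus(5, stray_allowed)
print(sorted(stray_reach))  # prints [4, 5, 6, 7]
```

In this model the stray task's cpus_allowed still names every CPU, yet
the set of CPUs it can actually reach shrinks to the Beta half.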

Translating this back into a language my users might speak, I guess
this means I tell them:
 * No scheduling or load balancing is done across partitioned scheduler domains.
 * Even if one such domain is hugely oversubscribed, and another totally
   idle, no task in one will run in the other.  If that's what you want,
   then go for it.
 * Tasks left attached to cpusets higher up in the hierarchy don't get
   moved or load balanced between partitioned sched domains below their cpuset.
   They will get stuck in one of the domains, willy-nilly.  So if it matters
   to you in the slightest which of the partitions a task runs in, attach
   it appropriately, to one of the cpusets that define the partitioned
   scheduler domains, or below.
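
The last rule could be turned into a simple sanity check.  The Python
sketch below is only an illustration, not an existing tool or kernel
interface: the function straddling_tasks, the task names and the CPU
numbers are all hypothetical.  It flags any task whose cpus_allowed
overlaps more than one partitioned domain, which are the tasks that
would get stuck in one domain willy-nilly.

```python
# Toy check; not kernel code. Task names and CPU numbers are hypothetical.
def straddling_tasks(tasks, domains):
    """Return names of tasks whose cpus_allowed overlaps more than one domain."""
    flagged = []
    for name, cpus_allowed in tasks.items():
        # Count how many partitioned domains this task's mask touches.
        overlaps = sum(1 for span in domains if span & set(cpus_allowed))
        if overlaps > 1:
            flagged.append(name)
    return sorted(flagged)


domains = [frozenset(range(0, 4)), frozenset(range(4, 8))]  # Alpha, Beta
tasks = {
    "init": {0, 1, 2, 3},            # attached to Alpha
    "batch_job": {4, 5, 6, 7},       # attached to Beta
    "stray_daemon": set(range(8)),   # left behind in the top cpuset
}
flagged = straddling_tasks(tasks, domains)
print(flagged)  # prints ['stray_daemon']
```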

In short, perhaps you were trying to make my life, or at least my efforts
to understand this, simple, by telling me that I simply can't have any
cpusets above partitioned sched domains.  The literal translation of that
into cpuset-speak throws out the entire cpuset architecture.  So I have to
push back and figure out in more detail what really matters here.

Am I anywhere close?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

