public inbox for linux-kernel@vger.kernel.org
From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
To: Paul Jackson <pj@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	dino@in.ibm.com, akpm@osdl.org, mbligh@google.com,
	menage@google.com, Simon.Derr@bull.net,
	linux-kernel@vger.kernel.org, rohitseth@google.com, holt@sgi.com,
	dipankar@in.ibm.com, suresh.b.siddha@intel.com
Subject: Re: [RFC] cpuset: add interface to isolated cpus
Date: Sun, 22 Oct 2006 22:40:03 -0700
Message-ID: <20061022224002.C2526@unix-os.sc.intel.com>
In-Reply-To: <20061022225108.21716614.pj@sgi.com>; from pj@sgi.com on Sun, Oct 22, 2006 at 10:51:08PM -0700

On Sun, Oct 22, 2006 at 10:51:08PM -0700, Paul Jackson wrote:
> Nick wrote:
> > A cool option would be to determine the partitions according to the
> > disjoint set of unions of cpus_allowed masks of all tasks. I see this
> > getting computationally expensive though, probably O(tasks*CPUs)... I
> > guess that isn't too bad.
> 
> Yeah - if that would work, from the practical perspective of providing
> us with a useful partitioning (get those humongous sched domains carved
> down to a reasonable size) then that would be cool.
> 
> I'm guessing that in practice, it would be annoying to use.  One would
> end up with stray tasks that happened to be sitting in one of the bigger
> cpusets and that did not have their cpus_allowed narrowed, stopping us
> from getting a useful partitioning.  Perhaps ancillary tasks associated
> with the batch scheduler, or some paused tasks in an inactive job that,
> were it active, would need load balancing across a big swath of cpus.
> These would be tasks that we really didn't need to load balance, but they
> would appear as if they needed it because of their fat cpus_allowed.
> 
> Users (admins) would have to hunt down these tasks that were getting in
> the way of a nice partitioning and whack their cpus_allowed down to
> size.
> 
> So essentially, one would end up with another userspace API, backdoor
> again.  Like those magic doors in the libraries of wealthy protagonists
> in mystery novels, where you have to open a particular book and pull
> the lamp cord to get the door to appear and open.

Also, we need to be careful about malicious users skewing the system's
partitioning through the cpus_allowed masks of their own tasks.
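
To make the concern concrete, here is a minimal sketch of the partitioning
Nick describes: CPUs end up in the same partition whenever a chain of
overlapping cpus_allowed masks connects them. It is a userspace model in
Python, not kernel code. The function name `partition_cpus`, the use of plain
integers as bitmasks, and the union-find approach are all assumptions made
for illustration; nothing in this thread specifies an implementation.

```python
def partition_cpus(task_masks, ncpus):
    """Group CPUs into disjoint partitions.

    task_masks: iterable of ints, one bitmask per task (bit c set means
                the task may run on CPU c). Stands in for cpus_allowed.
    ncpus:      number of CPUs in the (hypothetical) system.

    Cost is roughly O(tasks * CPUs), as each mask is scanned once.
    """
    parent = list(range(ncpus))  # union-find forest over CPU ids

    def find(c):
        # Follow parent links to the root, halving the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for mask in task_masks:
        cpus = [c for c in range(ncpus) if (mask >> c) & 1]
        # Every CPU in one task's mask must share a partition.
        for c in cpus[1:]:
            ra, rb = find(cpus[0]), find(c)
            if ra != rb:
                parent[rb] = ra

    groups = {}
    for c in range(ncpus):
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


# Three well-behaved tasks give a useful partitioning of 8 CPUs.
pinned = [0b00000011, 0b00000110, 0b11000000]
nice_split = partition_cpus(pinned, 8)

# One task with a fat mask (stray or malicious) merges everything.
collapsed = partition_cpus(pinned + [0b11111111], 8)
```

In this model a single task whose mask spans all CPUs is enough to collapse
every partition into one, which is both the usability problem Paul raises
and the abuse case mentioned above.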

> 
> Automatic chokes and transmissions are great - if they work.  If not,
> give me a knob and a stick.
> 
> ===
> 
> Another idea for a cpuset-based API to this ...
> 
> From our internal perspective, it's all about getting the sched domain
> partitions cut down to a reasonable size, for performance reasons.

We need to consider resource partitioning too, and this is probably
where Google's interest is coming from?

> 
> But from the user's perspective, the deal we are asking them to
> consider is to trade in fully automatic, all tasks across all cpus,
> load balancing, in return for better performance.
> 
> Big system admins would often be quite happy to mark the top cpuset
> as "no need to load balance tasks in this cpuset."  They would
> take responsibility for moving any non-trivial, unpinned tasks into
> lower cpusets (or not be upset if something left behind wasn't load
> balancing.)
> 
> And the batch scheduler would be quite happy to mark its top cpuset as
> "no need to load balance".  It could mark any cpusets holding inactive
> jobs the same way.
> 
> This "no need to load balance" flag would be advisory.  The kernel
> might load balance anyway.  For example if the batch scheduler were
> running under a top cpuset that was -not- so marked, we'd still have
> to load balance everyone.  The batch scheduler wouldn't care.  It would
> have done its duty, to mark which of its cpusets didn't need balancing.
> 
> All we need from them is the ok to not load balance certain cpusets,
> and the rest is easy enough.  If they give us such ok on enough of the
> big cpusets, we give back a nice performance improvement.
> 
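
The advisory "no need to load balance" flag quoted above could behave
roughly as in the following sketch. It is a userspace model only: the
`Cpuset` class, the `no_balance` field and the `balance_domains` function are
hypothetical names introduced here, and the model assumes sibling cpusets do
not overlap (overlapping siblings would need their CPU sets merged).

```python
from dataclasses import dataclass, field


@dataclass
class Cpuset:
    name: str
    cpus: frozenset
    no_balance: bool = False  # hypothetical advisory flag
    children: list = field(default_factory=list)


def balance_domains(cs):
    """Return the CPU sets that still have to be load balanced.

    If a cpuset is not marked, all of its CPUs are balanced together and
    any flags on its descendants are moot. If it is marked, only its
    unmarked descendants contribute domains.
    """
    if not cs.no_balance:
        return [cs.cpus]
    domains = []
    for child in cs.children:
        domains.extend(balance_domains(child))
    return domains


def build_tree(top_marked):
    job_active = Cpuset("job_active", frozenset({0, 1}))
    job_paused = Cpuset("job_paused", frozenset({2, 3}), no_balance=True)
    batch = Cpuset("batch", frozenset({0, 1, 2, 3}), no_balance=True,
                   children=[job_active, job_paused])
    interactive = Cpuset("interactive", frozenset({4, 5, 6, 7}))
    return Cpuset("top", frozenset(range(8)), no_balance=top_marked,
                  children=[batch, interactive])


# Top cpuset marked: only the unmarked cpusets below it form domains.
marked = balance_domains(build_tree(top_marked=True))

# Top cpuset not marked: everything is balanced, the batch flags are moot.
unmarked = balance_domains(build_tree(top_marked=False))
```

With the top cpuset marked, the model yields two small domains (the active
job and the interactive cpuset) and skips the paused job entirely. With the
top cpuset unmarked, it falls back to a single domain spanning all CPUs,
matching the "kernel might load balance anyway" case.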
