From: "Gregory Haskins" <ghaskins@novell.com>
To: "Paul Jackson" <pj@sgi.com>
Cc: <a.p.zijlstra@chello.nl>, <mingo@elte.hu>,
<dmitry.adamushko@gmail.com>, <rostedt@goodmis.org>,
<menage@google.com>, <rientjes@google.com>, <tong.n.li@intel.com>,
<tglx@linutronix.de>, <akpm@linux-foundation.org>,
<dhaval@linux.vnet.ibm.com>, <vatsa@linux.vnet.ibm.com>,
<sgrubb@redhat.com>, <linux-kernel@vger.kernel.org>,
<ebiederm@xmission.com>, <nickpiggin@yahoo.com.au>
Subject: Re: scheduler scalability - cgroups, cpusets and load-balancing
Date: Tue, 29 Jan 2008 13:36:50 -0700
Message-ID: <479F4812.BA47.005A.0@novell.com>
In-Reply-To: <20080129130403.92d0a1fe.pj@sgi.com>
>>> On Tue, Jan 29, 2008 at 2:04 PM, in message
<20080129130403.92d0a1fe.pj@sgi.com>, Paul Jackson <pj@sgi.com> wrote:
> Gregory wrote:
>> IMHO it works well the way it is: The user selects the class for a
>> particular task using sched_setscheduler(), and they select the cpuset
>> (or inherit it) that defines its execution scope. If that scope has
>> balancing enabled, the policy for the member classes is in effect.
>
> Ok.
>
> For the various classes of schedulers (sched_class's), it's fine by me
> if sched domains are polymorphic, supporting all classes, and it is
> left to each task to self-select the scheduling class of its preference.
>
> For the batch scheduler case, this -must- be imposable from outside
> the task, by the batch scheduler that is overseeing the job, and it
> must support the batch scheduler being able to disable all the
> balancers in selected cpusets (selected sched_domains).
>
> We have that now. Each of us only knew of part of the solution,
> but we managed to arrive at the desired answer even so ... amazing.
>
> The batch scheduler just has to arrange to get 'sched_load_balance'
> turned off in a cpuset and all overlapping cpusets, and then the
> CPUS in that cpuset will not belong to -any- sched_domain, and hence
> (could you verify I'm right in this detail?) won't be balanced by any
> sched_class.
I am a little fuzzy on how this would work, so I can't say for certain. :) But that seems accurate.
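
To make the rule Paul describes concrete, here is a toy model in Python. It is not kernel code: the function name and the data layout are hypothetical. It only assumes the behaviour stated above, namely that a CPU ends up outside every sched_domain when 'sched_load_balance' is off in every cpuset that contains it.

```python
# Toy model, not kernel code: names and data layout are hypothetical.
def unbalanced_cpus(cpusets):
    """Return the CPUs that no balance-enabled cpuset covers.

    cpusets maps a cpuset name to (set_of_cpus, sched_load_balance_flag).
    """
    all_cpus = set()
    balanced = set()
    for cpus, load_balance in cpusets.values():
        all_cpus |= cpus
        if load_balance:
            balanced |= cpus
    # CPUs left over are covered only by cpusets with balancing turned off.
    return all_cpus - balanced


cpusets = {
    "root":  ({0, 1, 2, 3}, False),  # overlapping parent, balancing off
    "batch": ({2, 3}, False),        # batch scheduler turned balancing off
    "other": ({0, 1}, True),         # still balanced
}
print(sorted(unbalanced_cpus(cpusets)))  # [2, 3]
```

In this sketch CPUs 2 and 3 fall outside every balanced cpuset, which is the condition under which no sched_class balancer would see them.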
>
> I should update the documentation for sched_load_balance, changing it
> from saying that you get realtime by turning off sched_load_balance in
> the RT cpuset, to saying that you get realtime by (1) turning off
> sched_load_balance in any overlapping cpusets, including all
> encompassing parent cpusets, (2) leaving sched_load_balance on in the
> RT cpuset itself, and (3) having those realtime tasks each self-select
> (elect) the desired SCHED_* using sched_setscheduler().
>
> Condition (1) above is a tad difficult to understand, but serviceable,
> I guess. The combination of (1) and (2) results in a separate
> sched_domain just for the CPUs in the RT cpuset.
Technically you only need (2). I normally run my 4-8 core development systems in the single default global cpuset. Customers typically do use multiple sets, but we only use the vanilla balanced variety.
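
The difference between needing only (2) and needing (1) plus (2) can be sketched with another toy model in Python. Again this is not kernel code, and the function name and data layout are hypothetical. It assumes only that overlapping balance-enabled cpusets collapse into a single sched_domain.

```python
# Toy model, not kernel code: names and data layout are hypothetical.
def sched_domain_partitions(cpusets):
    """Merge overlapping balance-enabled cpusets into disjoint CPU groups.

    cpusets maps a cpuset name to (set_of_cpus, sched_load_balance_flag).
    """
    partitions = []
    for cpus, load_balance in cpusets.values():
        if not load_balance:
            continue
        merged = set(cpus)
        remaining = []
        for part in partitions:
            if part & merged:
                merged |= part          # overlap: fold into the same group
            else:
                remaining.append(part)  # disjoint: keep as its own group
        remaining.append(merged)
        partitions = remaining
    return sorted(sorted(p) for p in partitions)


# Only (2): the parent still balances, so the RT CPUs share one big domain.
print(sched_domain_partitions({
    "root": ({0, 1, 2, 3}, True),
    "rt":   ({2, 3}, True),
}))  # [[0, 1, 2, 3]]

# (1) and (2): parent balancing off, so the RT cpuset gets its own domain.
print(sched_domain_partitions({
    "root":    ({0, 1, 2, 3}, False),
    "rt":      ({2, 3}, True),
    "general": ({0, 1}, True),
}))  # [[0, 1], [2, 3]]
```

In both cases the RT CPUs sit inside some domain, so balancing still happens; condition (1) only decides whether that domain is separate from the rest of the system.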
>
>> (on this topic, note that I do not know if the RT-balancer will
>> respect the cpuset concept of "balance-enabled" anyway. That might
>> have to be fixed)
>
> Er eh ... it has no choice. If the user space code has configured a
> cpuset with 'sched_load_balance' turned off in that cpuset and all
> overlapping cpusets, then there will not even be a sched_domain
> covering those CPUs, and hence no balancer, RT or other class, will
> even see those CPUs.
>
> Unless I really don't understand the kernel/sched.c sched_domain code
> (a distinct possibility), if some CPU is not in any sched_domain, then
> it won't get balanced, RT or otherwise.
Heh... I can't quite wrap my head around that, but it sounds like you are correct. The only thing I was really pointing out is that the RT code doesn't necessarily look at sched-domain flags before making balancing decisions. So as long as that is not a requirement, I think we are all set.