From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Paul Jackson <pj@sgi.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
vatsa@linux.vnet.ibm.com, dhaval@linux.vnet.ibm.com,
nickpiggin@yahoo.com.au, ebiederm@xmission.com,
akpm@linux-foundation.org, sgrubb@redhat.com,
rostedt@goodmis.org, ghaskins@novell.com,
dmitry.adamushko@gmail.com, tong.n.li@intel.com,
tglx@linutronix.de, menage@google.com, rientjes@google.com
Subject: Re: scheduler scalability - cgroups, cpusets and load-balancing
Date: Tue, 29 Jan 2008 12:31:24 +0100
Message-ID: <1201606284.28547.114.camel@lappy>
In-Reply-To: <20080129051353.4628c9eb.pj@sgi.com>
On Tue, 2008-01-29 at 05:13 -0600, Paul Jackson wrote:
> Peter wrote:
> > Thanks for the link. Yes I think your last suggestion of creating
> > rt-domains ( http://lkml.org/lkml/2007/10/23/419 ) is a good one.
>
> We now have a per-cpuset Boolean flag file called 'sched_load_balance'.

SD_LOAD_BALANCE, right?

> In the default case, this flag is set on, and the kernel does its
> usual load balancing across all CPUs in that cpuset. This means, under
> the covers, that there exists some sched domain such that all CPUs in
> that cpuset are in that same sched domain. That sched domain might
> contain additional CPUs from outside that cpuset as well. Indeed,
> in the default vanilla configuration, that sched domain contains all
> CPUs in the system.
>
> If we turn the sched_load_balance flag off for some cpuset, we are
> telling the kernel it's ok not to load balance on the CPUs in that
> cpuset (unless those CPUs are in some other cpuset that needed load
> balancing anyway.)
>
> This 'sched_load_balance' flag is, thus far, "the" cpuset hook
> supporting realtime. One can use it to configure a system so that
> the kernel does not do normal load balancing on select CPUs, such
> as those CPUs dedicated to realtime use.

Ah, here I disagree. It is possible to do (hard) realtime scheduling
over multiple cpus; the only drawback is that it requires a very strong
load-balancer, making it unsuitable for large numbers of cpus.

( of course, having a strong rt load balancer on a large cpuset doesn't
harm, as long as there are no rt tasks to balance )

So if we have a system like so:

          __A__
         /  |  \
       B1   B2  B3
      /  \
    C1    C2

  A  comprises cpus 0-127, !SD_LOAD_BALANCE
  B1 comprises cpus 0-63, !SD_LOAD_BALANCE
  B2 comprises cpus 64-119
  B3 comprises cpus 120-127
  C1 comprises cpus 0-3
  C2 comprises cpus 5-63

We end up with 4 disjoint load-balanced sets.
I would then attach the rt balance information to: C1, C2, B2, B3.
If, for example, B1 were load-balanced, we'd have only 3 disjoint
sets left: B1, B2 and B3, and the rt balance data would be attached there.
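
To make the example concrete, here is a rough user-space sketch of how
those disjoint sets could be found. It is not kernel code: the Cpuset
class, balanced_sets() and example() are made-up names for illustration,
and it assumes sibling cpusets never overlap, so the topmost cpuset with
load balancing enabled covers everything below it.

```python
from dataclasses import dataclass, field


@dataclass
class Cpuset:
    # Hypothetical model of a cpuset: a name, its cpus, the
    # sched_load_balance flag, and its child cpusets.
    name: str
    cpus: frozenset
    load_balance: bool = True
    children: list = field(default_factory=list)


def balanced_sets(cs):
    """Return [(name, cpus)] for the topmost load-balanced cpusets under cs.

    A cpuset with the flag on is balanced as a whole, so its children add
    nothing new. With the flag off, only its children can form balanced sets.
    """
    if cs.load_balance:
        return [(cs.name, cs.cpus)]
    found = []
    for child in cs.children:
        found.extend(balanced_sets(child))
    return found


def example(b1_balanced=False):
    # The hierarchy from the mail; A is never load-balanced here.
    c1 = Cpuset("C1", frozenset(range(0, 4)))
    c2 = Cpuset("C2", frozenset(range(5, 64)))
    b1 = Cpuset("B1", frozenset(range(0, 64)), b1_balanced, [c1, c2])
    b2 = Cpuset("B2", frozenset(range(64, 120)))
    b3 = Cpuset("B3", frozenset(range(120, 128)))
    return Cpuset("A", frozenset(range(0, 128)), False, [b1, b2, b3])


if __name__ == "__main__":
    for name, cpus in balanced_sets(example()):
        print(name, min(cpus), max(cpus))
```

With B1 not balanced this yields C1, C2, B2 and B3; calling
example(b1_balanced=True) yields B1, B2 and B3 instead. The rt balance
data would hang off each returned set.
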
> It sounds like Peter is reminding us that we really have three choices
> for handling a given CPU's load balancing:
> 1) normal kernel scheduler load balancing,
> 2) RT load balancing, or
> 3) no load balancing whatsoever.
>
> If that's the case (if we really need choice 3) then a single Boolean
> flag, such as sched_load_balance, is not sufficient to select from
> the three choices, and it might make sense to add a second per-cpuset
> Boolean flag, say "sched_rt_balance", default off, which if turned on,
> enabled choice 2.
>
> If that's not the case (we only need choices 1 and 2) then -logically-
> we could overload the meaning of the current sched_load_balance,
> to mean, if turned off, not only to stop doing normal balancing, but
> to further mean that we should commence RT balancing. However bits
> aren't -that- precious here, and this sounds unnecessarily confusing.
>
> So ... would a new per-cpuset Boolean flag such as sched_rt_balance be
> appropriate and sufficient to mark those cpusets whose set of CPUs
> required RT balancing?

So, I don't think we need that. I think we can make do with the single
flag; we just need to find these disjoint sets and stick our rt-domain
there.