From: Peter Zijlstra <peterz@infradead.org>
To: Alex Shi <alex.shi@linaro.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Ingo Molnar <mingo@kernel.org>, Mike Galbraith <efault@gmx.de>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Morten Rasmussen <morten.rasmussen@arm.com>,
Amit Kucheria <amit.kucheria@linaro.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: top-down balance purpose discussion -- resend
Date: Tue, 21 Jan 2014 15:57:08 +0100 [thread overview]
Message-ID: <20140121145708.GY31570@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <52DE7E6A.90401@linaro.org>
On Tue, Jan 21, 2014 at 10:04:26PM +0800, Alex Shi wrote:
>
> The current scheduler load balance is bottom-up: each CPU has to
> initiate balancing by itself.
>
> 1. Take an integrated system with smt/core/cpu/numa, i.e. 4 levels of
> scheduler domains. If there are just 2 tasks in the whole system, both
> running on cpu0, the current load balance first pulls a task to
> another smt sibling in the smt domain, then to another core, then to
> another cpu, and finally to another numa node. In total it takes 4
> task migrations to balance the system.
Except that the idle load balancer, and especially the newidle pass,
can bypass this entirely.
If you do the packing right in the newidle pass, you'd get there in 1
step.
> Generally, the task moving complexity is
> O(nm log n), n := nr_cpus, m := nr_tasks
>
> There is an excellent summary and explanation of this in
> kernel/sched/fair.c:4605
Which is a perfectly fine scheme for a busy system.
> Another weakness of the current LB is that every cpu repeatedly has
> to gather the other cpus' load info and try to figure out the busiest
> sched group/queue at every sched domain level. But that is often
> wasted effort, since it may not result in a task migration. One of
> the reasons is that a cpu can only pull tasks, not push them.
This doesn't make sense.. and in fact, we do a limited amount of 3rd
party movements.
Whatever you do, you have to repeat the information gathering anyhow,
because it constantly changes.
Trying to serialize that doesn't make any kind of sense. The only thing
you want is that the system converges.
Skipped the rest because it seems built on a premise I don't agree
with. That 4-move thing is just silly for an idle system, and we
shouldn't do that.
I also very much do not want a single CPU balancing the entire system;
that's the antithesis of scalable.
Thread overview: 4+ messages
2014-01-21 14:04 top-down balance purpose discussion -- resend Alex Shi
2014-01-21 14:57 ` Peter Zijlstra [this message]
2014-01-22 7:40 ` Alex Shi
2014-01-24 7:29 ` Alex Shi