From: Peter Williams <pwil3058@bigpond.net.au>
To: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@osdl.org>, Mike Galbraith <efault@gmx.de>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Ingo Molnar <mingo@elte.hu>, Con Kolivas <kernel@kolivas.org>,
"Chen, Kenneth W" <kenneth.w.chen@intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched: smpnice work around for active_load_balance()
Date: Thu, 30 Mar 2006 12:14:57 +1100 [thread overview]
Message-ID: <442B3111.5030808@bigpond.net.au> (raw)
In-Reply-To: <20060329165052.C11376@unix-os.sc.intel.com>
Siddha, Suresh B wrote:
> On Thu, Mar 30, 2006 at 10:40:24AM +1100, Peter Williams wrote:
>> Siddha, Suresh B wrote:
>>> On Wed, Mar 29, 2006 at 02:42:45PM +1100, Peter Williams wrote:
>>>> I meant that it doesn't explicitly address your problem. What it does
>>>> is ASSUME that failure of load balancing to move tasks is because there
>>>> was exactly one task on the source run queue and that this makes it a
>>>> suitable candidate to have that single task moved elsewhere in the blind
>>>> hope that it may fix an HT/MC imbalance that may or may not exist. In
>>>> my mind this is very close to random.
>>> That so-called assumption happens only when load balancing has
>>> failed more than the domain-specific cache_nice_tries. The only reasons
>>> it can fail so many times are that all tasks are pinned or that only a single
>>> task is running on that particular CPU. The load balancing code takes care of both
>>> these scenarios..
>>>
>>> sched groups cpu_power controls the mechanism of implementing HT/MC
>>> optimizations in addition to active balance code... There is no randomness
>>> in this.
>> The above explanation just increases my belief in the randomness of this
>> solution. This code is mostly done without locks and is therefore very
>> racy and any assumptions made based on the number of times load
>> balancing has failed etc. are highly speculative.
>
> Isn't it the same case with regular cpu load calculations during load
> balance?
Yes. Which is why move_tasks() is designed to cope.
But this doesn't affect the argument w.r.t. your code.
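For readers following along, the trigger being debated can be sketched roughly as below. This is a simplified user-space model, not the kernel code: the field names mirror the 2.6 sched_domain fields (nr_balance_failed, cache_nice_tries), but the struct, the function, and the exact slack constant are illustrative.

```c
/* Simplified model of the active-balance trigger under discussion.
 * Not kernel code; names and the "+2" slack are illustrative. */
struct sched_domain_model {
    unsigned int nr_balance_failed; /* consecutive failed balance attempts */
    unsigned int cache_nice_tries;  /* domain-specific failure tolerance */
};

/* After a load_balance() attempt that moved nothing, decide whether to
 * fall back to active load balancing (waking the migration thread to
 * push the single running task elsewhere). */
int should_kick_active_balance(const struct sched_domain_model *sd)
{
    return sd->nr_balance_failed > sd->cache_nice_tries + 2;
}
```

The raciness Peter objects to is visible here: nr_balance_failed is a heuristic counter, not a synchronized measurement, so crossing the threshold is evidence of a stuck situation, not proof of one.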
>
>> And even if there is only one task on the CPU there's no guarantee that
>> that CPU is in a package that meets the other requirements to make the
>> move desirable. So there's a good probability that you'll be moving
>> tasks unnecessarily.
>
> sched groups cpu_power and domain topology information cleanly
> encapsulates the imbalance identification and source/destination groups
> to fix the imbalance.
But you don't look at the rest of the queues in the package to see if
the move is REALLY required.
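The cpu_power mechanism Suresh refers to can be sketched as follows. This is an illustration of the idea only: the constant and struct are hypothetical stand-ins, not the kernel's actual values. The point is that an SMT sibling pair advertises a cpu_power only slightly above one full core, so two tasks sharing a pair look heavier, per unit of capacity, than one task on an idle core's package.

```c
/* Sketch of how sched-group cpu_power encodes HT/MC topology in the
 * imbalance check.  LOAD_SCALE and the values below are illustrative. */
#define LOAD_SCALE 128

struct group_model {
    unsigned long load;      /* sum of runqueue loads in the group */
    unsigned long cpu_power; /* capacity, relative to LOAD_SCALE */
};

/* Capacity-scaled load, as used to pick source/destination groups. */
unsigned long scaled_load(const struct group_model *g)
{
    return g->load * LOAD_SCALE / g->cpu_power;
}
```

With a single core at cpu_power 128 and an SMT pair at, say, 140, two tasks on the pair produce a higher scaled load than one task on the core, which is what makes the lightly loaded group pull.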
>
>> It's a poor solution and it's being inflicted on architectures that
>> don't need it. Even if cache_nice_tries is used to suppress this
>> behaviour on architectures that don't need it they have to carry the
>> code in their kernel.
>
> We can clearly throw CONFIG_SCHED_MC/SMT around that code.. Nick/Ingo
> do you see any issue?
That just makes it a poor solution and ugly. :-)
>
>>>
>>>> Also back to front and inefficient.
>>> HT/MC imbalance is detected in a normal way.. A lightly loaded group
>>> finds an imbalance and tries to pull some load from a busy group (which
>>> is inline with normal load balance)... pull fails because the only task
>>> on that cpu is busy running and needs to go off the cpu (which is triggered
>>> by active load balance)... Scheduler load balance is generally done by a
>>> pull mechanism and here (HT/MC) it is still a pull mechanism (triggering a
>>> final push only because of the single running task)
>>>
>>> If you have any better generic and simple method, please let us know.
>> I gave an example in a previous e-mail. Basically, at the end of
>> scheduler_tick(), if rebalance_tick() doesn't move any tasks (it would be
>> foolish to contemplate moving tasks off the queue just after you've moved
>> some there) and the run queue has exactly one running task and it's time
>> for a HT/MC rebalance check on the package that this run queue belongs
>> to, then check that package to see if it meets the rest of the criteria
>> for needing to lose some tasks. If it does, look for a package that is a
>> suitable recipient for the moved task and, if you find one, mark this
>> run queue as needing active load balancing and arrange for its migration
>> thread to be started.
>>
>> Simple, direct and amenable to being only built on architectures that
>> need the functionality.
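Peter's proposed end-of-tick check can be sketched as below. Every type and helper here is a hypothetical stand-in for the checks his paragraph describes; none of it is existing kernel code, and the package-level predicates are left as opaque flags.

```c
#include <stdbool.h>

/* Hypothetical model of the proposed end-of-scheduler_tick() check. */
struct rq_model {
    int nr_running;
    bool package_overloaded;  /* package meets the "should shed a task" criteria */
    bool recipient_available; /* some other package is a suitable recipient */
};

/* True when this run queue should be marked for active load balancing
 * under the proposed scheme: nothing was just pulled here, exactly one
 * task is running, and the package-level checks pass. */
bool want_active_balance(const struct rq_model *rq, int tasks_just_moved)
{
    if (tasks_just_moved > 0) /* don't shed tasks right after pulling some */
        return false;
    if (rq->nr_running != 1)  /* only the single-runner case matters */
        return false;
    return rq->package_overloaded && rq->recipient_available;
}
```

The contrast with the current code is that both package-level predicates are evaluated before active balancing is armed, rather than being inferred from a failure counter.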
>
> First of all we will be doing unnecessary checks to see if there is
> an imbalance.. Current code triggers the checks and movement only when
> it is necessary.. And second, finding the correct destination cpu in the
> presence of SMT and MC is really complicated.. Look at different examples
> in the OLS paper.. Domain topology provides all this info with no added
> complexity...
>
>> Another (more complex) solution that would also allow improvements to
>> other HT related code (e.g. the sleeping dependent code) would be to
>> modify the load balancing code so that all CPUs in a package share a run
>> queue and load balancing is then done between packages. As long as the
>> number of CPUs in a package is small this shouldn't have scalability
>> issues. The big disadvantage of this approach is its complexity which
>> is probably too great to contemplate doing it in 2.6.X kernels.
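The shape of that more invasive alternative might look like the sketch below: one run queue shared by every CPU in a package, with balancing done between packages. This is purely an illustration of the data layout implied by the paragraph above; the names, the capacity bound, and the load metric are all invented for the example.

```c
/* Illustrative layout for a per-package shared run queue.
 * All names and limits are hypothetical. */
#define MAX_PACKAGE_CPUS 4

struct package_rq_model {
    int nr_cpus;                   /* SMT siblings + cores in the package (small) */
    int cpu_ids[MAX_PACKAGE_CPUS]; /* CPUs drawing tasks from this queue */
    int nr_running;                /* tasks queued for the whole package */
};

/* Per-package load used when balancing between packages; real code
 * would weight this by nice values etc. */
int package_load(const struct package_rq_model *p)
{
    return p->nr_running;
}
```

Because nr_cpus is small, contention on the shared queue's lock stays bounded, which is the scalability argument Peter makes.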
>
> In the presence of SMT and MC, implementing a power-savings scheduler
> policy will present more challenges...
And I would recommend a similar approach to what I've suggested above.
They could probably be combined into a single neat well encapsulated
solution.
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce