From: Yuyang Du <yuyang.du@intel.com>
To: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Mike Galbraith <umgwanakikbuti@gmail.com>,
Rabin Vincent <rabin.vincent@axis.com>,
"mingo@redhat.com" <mingo@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Paul Turner <pjt@google.com>, Ben Segall <bsegall@google.com>
Subject: Re: [PATCH?] Livelock in pick_next_task_fair() / idle_balance()
Date: Fri, 10 Jul 2015 07:24:16 +0800
Message-ID: <20150709232416.GI5197@intel.com>
In-Reply-To: <20150709143219.GB8668@e105550-lin.cambridge.arm.com>
Hi,
On Thu, Jul 09, 2015 at 03:32:20PM +0100, Morten Rasmussen wrote:
> On Mon, Jul 06, 2015 at 06:31:44AM +0800, Yuyang Du wrote:
> > On Fri, Jul 03, 2015 at 06:38:31PM +0200, Peter Zijlstra wrote:
> > > > I'm not against having a policy that sits somewhere in between, we just
> > > > have to agree it is the right policy and clean up the load-balance code
> > > > such that the implemented policy is clear.
> > >
> > > Right, for balancing it's a tricky question, but mixing them without
> > > intent is, as you say, a bit of a mess.
> > >
> > > So clearly blocked load doesn't make sense for (new)idle balancing. OTOH
> > > it does make some sense for the regular periodic balancing, because
> > > there we really do care mostly about the averages, esp. so when we're
> > > overloaded -- but there are issues there too.
> > >
> > > Now we can't track them both (or rather we could, but overhead).
> > >
> > > I like Yuyang's load tracking rewrite, but it changes exactly this part,
> > > and I'm not sure I understand the full ramifications of that yet.
>
> I don't think anybody does ;-) But I think we should try to make it
> work.
>
> > Thanks. It would be a pure average policy, which is non-perfect like now,
> > and certainly needs a mixing like now, but it is worth a starter, because
> > it is simple and reasonable, and based on it, the other parts can be simple
> > and reasonable.
>
> I think we all agree on the benefits of taking blocked load into
> account but also that there are some policy questions to be addressed.
>
> > > One way out would be to split the load balancer into 3 distinct regions;
> > >
> > > 1) get a task on every CPU, screw everything else.
> > > 2) get each CPU fully utilized, still ignoring 'load'
> > > 3) when everybody is fully utilized, consider load.
>
> Seems very reasonable to me. We more or less follow that idea in the
> energy-model driven scheduling patches, at least 2) and 3).
>
> The difficult bit is detecting when to transition between 2) and 3). If
> you want to enforce smp_nice you have to start worrying about task
> priority as soon as one cpu is fully utilized.
>
> For example, a fully utilized cpu has two high priority tasks while all
> other cpus are running low priority tasks and are not fully utilized.
> The utilization imbalance may be too small to cause any tasks to be
> migrated, so we end up giving fewer cycles to the high priority tasks.
>
> > > If we make find_busiest_foo() select one of these 3, and make
> > > calculate_imbalance() invariant to the metric passed in, and have things
> > > like cpu_load() and task_load() return different, but coherent, numbers
> > > depending on which region we're in, this almost sounds 'simple'.
> > >
> > > The devil is in the details, and the balancer is a hairy nest of details
> > > which will make the above non-trivial.
>
> Yes, but if we have an overall policy like the one you propose we can at
> least make it complicated and claim that we think we know what it is
> supposed to do ;-)
>
> I agree that there is some work to be done in find_busiest_*() and
> calculate_imbalance() + friends. Maybe step one should be to clean them
> up a bit.
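To make the three regions quoted above concrete, here is a rough sketch of how a balancer could classify which region it is in. Everything in it is an assumption for illustration: enum balance_region, struct cpu_stat and pick_region() are hypothetical names, not existing scheduler code, and the check deliberately ignores task priority, so it does not address the smp_nice concern about when to move from 2) to 3).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; none of this exists in kernel/sched/fair.c. */
enum balance_region {
	REGION_SPREAD_TASKS,	/* 1) get a task on every CPU */
	REGION_FILL_CAPACITY,	/* 2) get each CPU fully utilized */
	REGION_BALANCE_LOAD,	/* 3) everybody fully utilized: consider load */
};

/* Assumed per-CPU snapshot, fields invented for this sketch. */
struct cpu_stat {
	unsigned int nr_running;	/* runnable tasks on this CPU */
	unsigned long util;		/* current utilization */
	unsigned long capacity;		/* utilization at which the CPU is full */
};

/* Pick the lowest-numbered region that still has work to do. */
static enum balance_region pick_region(const struct cpu_stat *cpus, size_t n)
{
	bool any_idle = false;
	bool any_spare = false;

	for (size_t i = 0; i < n; i++) {
		if (cpus[i].nr_running == 0)
			any_idle = true;
		if (cpus[i].util < cpus[i].capacity)
			any_spare = true;
	}

	if (any_idle)
		return REGION_SPREAD_TASKS;
	if (any_spare)
		return REGION_FILL_CAPACITY;
	return REGION_BALANCE_LOAD;
}
```

A find_busiest_*() along these lines would call pick_region() first and then compare CPUs by task count, utilization, or load depending on the result. Whether a system-wide check like this is the right granularity (as opposed to per sched domain) is left open here.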
The consensus seems to be that we move step by step and start working right now:
1) Based on the "Rewrite" patch, let me add cfs_rq->runnable_load_avg. Then
we will have everything up to date: load.weight, runnable_load_avg, and
load_avg (runnable + blocked), ranging from pure "now" to pure average.
The runnable_load_avg will be used the same way as it is now, so there will
not be a shred of ramification. As long as the code is cleaned up and
simplified, it is a win.
2) Let's clean up the load balancing code a bit and, if needed, change the
obvious things; otherwise leave it unchanged.
3) Polish/complicate the policies, :)
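To illustrate how the three signals in 1) relate to each other, here is a toy model. It is only a sketch under my own assumptions: struct toy_entity, struct toy_rq_load and toy_sum_load() are made-up names, and the real averages are maintained incrementally with decay rather than recomputed by summing over entities.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are not the kernel's data structures. */
struct toy_entity {
	unsigned long weight;	/* instantaneous weight of the entity */
	unsigned long load_avg;	/* its time-averaged load contribution */
	bool on_rq;		/* true if runnable, false if blocked */
};

struct toy_rq_load {
	unsigned long load_weight;	 /* "pure now": runnable weights */
	unsigned long runnable_load_avg; /* averages of runnable entities */
	unsigned long load_avg;		 /* averages of runnable + blocked */
};

/* Sum the three signals over all entities attached to one runqueue. */
static struct toy_rq_load toy_sum_load(const struct toy_entity *se, size_t n)
{
	struct toy_rq_load out = { 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		/* Blocked entities still count toward the full average. */
		out.load_avg += se[i].load_avg;
		if (se[i].on_rq) {
			out.load_weight += se[i].weight;
			out.runnable_load_avg += se[i].load_avg;
		}
	}
	return out;
}
```

With two runnable entities and one blocked one, load_weight and runnable_load_avg see only the two runnable entities, while load_avg also includes the blocked one. That gap is the blocked load discussed earlier in the thread.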
What do you think?
Thanks,
Yuyang