From: Peter Zijlstra <peterz@infradead.org>
To: Yuyang Du <yuyang.du@intel.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>, linux-kernel@vger.kernel.org
Subject: Re: 4.3 group scheduling regression
Date: Mon, 12 Oct 2015 13:47:23 +0200
Message-ID: <20151012114723.GL3816@twins.programming.kicks-ass.net>
In-Reply-To: <20151012021230.GK11102@intel.com>
On Mon, Oct 12, 2015 at 10:12:31AM +0800, Yuyang Du wrote:
> On Mon, Oct 12, 2015 at 11:12:06AM +0200, Peter Zijlstra wrote:
> > So in the old code we had 'magic' to deal with the case where a cgroup
> > was consuming less than 1 cpu's worth of runtime. For example, a single
> > task running in the group.
> >
> > In that scenario it might be possible that the group entity weight:
> >
> > se->weight = (tg->shares * cfs_rq->weight) / tg->weight;
> >
> > strongly deviates from tg->shares; you want the single task to reflect
> > the full group shares to the next level, due to the whole distributed
> > approximation stuff.
>
> Yeah, I thought so.
>
> > I see you've deleted all that code; see the former
> > __update_group_entity_contrib().
>
> Probably not there, it actually was an icky way to adjust things.

Yeah, no argument there.

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4df37a4..b184da0 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2370,7 +2370,7 @@ static inline long calc_tg_weight(struct task_group *tg, struct cfs_rq *cfs_rq)
>  	 */
>  	tg_weight = atomic_long_read(&tg->load_avg);
>  	tg_weight -= cfs_rq->tg_load_avg_contrib;
> -	tg_weight += cfs_rq_load_avg(cfs_rq);
> +	tg_weight += cfs_rq->load.weight;
>
>  	return tg_weight;
>  }
> @@ -2380,7 +2380,7 @@ static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
>  	long tg_weight, load, shares;
>
>  	tg_weight = calc_tg_weight(tg, cfs_rq);
> -	load = cfs_rq_load_avg(cfs_rq);
> +	load = cfs_rq->load.weight;
>
>  	shares = (tg->shares * load);
>  	if (tg_weight)

Aah, yes very much so. I completely overlooked that :-(

When calculating shares we very much want the current load, not the load
average.
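
To make the difference between the two loads concrete, here is a minimal
userspace sketch of the share calculation from the quoted hunks. It is not
the kernel code: model_calc_shares() and its flat parameter list are made up
for illustration, the clamping to MIN_SHARES (assumed here to be 2) and to
tg->shares is an assumption about how the rest of calc_cfs_shares() looks,
and the numbers in the note after the block are invented.

```c
/* Userspace model of calc_tg_weight() + calc_cfs_shares(); hypothetical names. */
#define MODEL_MIN_SHARES 2L	/* assumption: stands in for the kernel's MIN_SHARES */

/*
 * tg_shares:   tg->shares
 * tg_load_avg: tg->load_avg, the sum of every CPU's contribution
 * contrib:     cfs_rq->tg_load_avg_contrib, this CPU's part of that sum
 * load:        the local load fed in: either the load average or load.weight
 */
static long model_calc_shares(long tg_shares, long tg_load_avg,
			      long contrib, long load)
{
	/* Swap this CPU's stale contribution for the local load passed in. */
	long tg_weight = tg_load_avg - contrib + load;
	long shares = tg_shares * load;

	if (tg_weight)
		shares /= tg_weight;

	/* Assumed clamping: never below the minimum, never above tg->shares. */
	if (shares < MODEL_MIN_SHARES)
		shares = MODEL_MIN_SHARES;
	if (shares > tg_shares)
		shares = tg_shares;

	return shares;
}
```

Take tg->shares = 1024, a group-wide load_avg of 1100 of which 100 is this
CPU's contribution, and a single weight-1024 task that has only just become
runnable, so its load average is still around 100. Feeding in the average,
model_calc_shares(1024, 1100, 100, 100) gives 93; feeding in the current
weight, model_calc_shares(1024, 1100, 100, 1024) gives 518.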

Also, should we do the below? At this point se->on_rq is still 0 so
reweight_entity() will not update (dequeue/enqueue) the accounting, but
we'll have just accounted the 'old' load.weight.

Doing it this way around we'll first update the weight and then account
it, which seems more accurate.

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 700eb548315f..d2efef565aed 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3009,8 +3009,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 */
 	update_curr(cfs_rq);
 	enqueue_entity_load_avg(cfs_rq, se);
-	account_entity_enqueue(cfs_rq, se);
 	update_cfs_shares(cfs_rq);
+	account_entity_enqueue(cfs_rq, se);
 
 	if (flags & ENQUEUE_WAKEUP) {
 		place_entity(cfs_rq, se, 0);