From: Peter Zijlstra <peterz@infradead.org>
To: mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org
Cc: torvalds@linux-foundation.org, vincent.guittot@linaro.org,
efault@gmx.de, pjt@google.com, clm@fb.com,
dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
bsegall@google.com, yuyang.du@intel.com, peterz@infradead.org
Subject: [RFC][PATCH 02/14] sched/fair: Add comment to calc_cfs_shares()
Date: Fri, 12 May 2017 18:44:18 +0200 [thread overview]
Message-ID: <20170512171335.603832930@infradead.org> (raw)
In-Reply-To: <20170512164416.108843033@infradead.org>
[-- Attachment #1: peterz-sched-comment-calc_cfs_shares.patch --]
[-- Type: text/plain, Size: 2859 bytes --]
Explain the magic equation in calc_cfs_shares() a bit better.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/sched/fair.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2633,6 +2633,67 @@ account_entity_dequeue(struct cfs_rq *cf
#ifdef CONFIG_FAIR_GROUP_SCHED
# ifdef CONFIG_SMP
+/*
+ * All this does is approximate the hierarchical proportion which includes that
+ * global sum we all love to hate.
+ *
+ * That is, the weight of a group entity, is the proportional share of the
+ * group weight based on the group runqueue weights. That is:
+ *
+ * tg->weight * grq->load.weight
+ * ge->load.weight = ----------------------------- (1)
+ * \Sum grq->load.weight
+ *
+ * Now, because computing that sum is prohibitively expensive (been there,
+ * done that) we approximate it with this average stuff. The average moves
+ * slower and therefore the approximation is cheaper and more stable.
+ *
+ * So instead of the above, we substitute:
+ *
+ * grq->load.weight -> grq->avg.load_avg (2)
+ *
+ * which yields the following:
+ *
+ * tg->weight * grq->avg.load_avg
+ * ge->load.weight = ------------------------------ (3)
+ * tg->load_avg
+ *
+ * Where: tg->load_avg ~= \Sum grq->avg.load_avg
+ *
+ * That is shares_avg, and it is right (given the approximation (2)).
+ *
+ * The problem with it is that because the average is slow -- it was designed
+ * to be exactly that of course -- this leads to transients in boundary
+ * conditions. Specifically, the case where the group was idle and we start
+ * one task. It takes time for our CPU's grq->avg.load_avg to build up,
+ * yielding bad latency etc..
+ *
+ * Now, in that special case (1) reduces to:
+ *
+ * tg->weight * grq->load.weight
+ * ge->load.weight = ----------------------------- = tg->weight (4)
+ * grq->load.weight
+ *
+ * That is, the sum collapses because all other CPUs are idle; the UP scenario.
+ *
+ * So what we do is modify our approximation (3) to approach (4) in the (near)
+ * UP case, like:
+ *
+ * ge->load.weight =
+ *
+ * tg->weight * grq->load.weight
+ * --------------------------------------------------- (5)
+ * tg->load_avg - grq->avg.load_avg + grq->load.weight
+ *
+ *
+ * And that is shares_weight and is icky. In the (near) UP case it approaches
+ * (4) while in the normal case it approaches (3). It consistently
+ * overestimates the ge->load.weight and therefore:
+ *
+ * \Sum ge->load.weight >= tg->weight
+ *
+ * hence icky!
+ */
static long calc_cfs_shares(struct cfs_rq *cfs_rq)
{
long tg_weight, tg_shares, load, shares;
Thread overview: 24+ messages
2017-05-12 16:44 [RFC][PATCH 00/14] sched/fair: A bit of a cgroup/PELT overhaul (again) Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 01/14] sched/fair: Clean up calc_cfs_shares() Peter Zijlstra
2017-05-16 8:02 ` Vincent Guittot
2017-05-12 16:44 ` Peter Zijlstra [this message]
2017-05-12 16:44 ` [RFC][PATCH 03/14] sched/fair: Remove se->load.weight from se->avg.load_sum Peter Zijlstra
2017-05-17 7:04 ` Vincent Guittot
2017-05-17 9:50 ` Vincent Guittot
2017-05-17 14:20 ` Peter Zijlstra
2017-09-29 20:11 ` [tip:sched/core] sched/fair: Use reweight_entity() for set_user_nice() tip-bot for Vincent Guittot
2017-05-12 16:44 ` [RFC][PATCH 04/14] sched/fair: More accurate reweight_entity() Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 05/14] sched/fair: Change update_load_avg() arguments Peter Zijlstra
2017-05-17 10:46 ` Vincent Guittot
2017-05-12 16:44 ` [RFC][PATCH 06/14] sched/fair: Move enqueue migrate handling Peter Zijlstra
2017-05-29 13:41 ` Vincent Guittot
2017-05-12 16:44 ` [RFC][PATCH 07/14] sched/fair: Rewrite cfs_rq->removed_*avg Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 08/14] sched/fair: Rewrite PELT migration propagation Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 09/14] sched/fair: Propagate an effective runnable_load_avg Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 10/14] sched/fair: more obvious Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 11/14] sched/fair: Synchonous PELT detach on load-balance migrate Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 12/14] sched/fair: Cure calc_cfs_shares() vs reweight_entity() Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 13/14] sched/fair: Align PELT windows between cfs_rq and its se Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 14/14] sched/fair: More accurate async detach Peter Zijlstra
2017-05-16 22:02 ` [RFC][PATCH 00/14] sched/fair: A bit of a cgroup/PELT overhaul (again) Tejun Heo
2017-05-17 6:53 ` Peter Zijlstra