public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org
Cc: torvalds@linux-foundation.org, vincent.guittot@linaro.org,
	efault@gmx.de, pjt@google.com, clm@fb.com,
	dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
	bsegall@google.com, yuyang.du@intel.com, peterz@infradead.org
Subject: [RFC][PATCH 02/14] sched/fair: Add comment to calc_cfs_shares()
Date: Fri, 12 May 2017 18:44:18 +0200	[thread overview]
Message-ID: <20170512171335.603832930@infradead.org> (raw)
In-Reply-To: <20170512164416.108843033@infradead.org>

[-- Attachment #1: peterz-sched-comment-calc_cfs_shares.patch --]
[-- Type: text/plain, Size: 2859 bytes --]

Explain the magic equation in calc_cfs_shares() a bit better.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/sched/fair.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2633,6 +2633,67 @@ account_entity_dequeue(struct cfs_rq *cf
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 # ifdef CONFIG_SMP
+/*
+ * All this does is approximate the hierarchical proportion which includes that
+ * global sum we all love to hate.
+ *
+ * That is, the weight of a group entity is the proportional share of the
+ * group weight based on the group runqueue weights. That is:
+ *
+ *                     tg->weight * grq->load.weight
+ *   ge->load.weight = -----------------------------               (1)
+ *                         \Sum grq->load.weight
+ *
+ * Now, because computing that sum is prohibitively expensive (been there,
+ * done that) we approximate it with this average stuff. The average
+ * moves slower and therefore the approximation is cheaper and more stable.
+ *
+ * So instead of the above, we substitute:
+ *
+ *   grq->load.weight -> grq->avg.load_avg                         (2)
+ *
+ * which yields the following:
+ *
+ *                     tg->weight * grq->avg.load_avg
+ *   ge->load.weight = ------------------------------              (3)
+ *                              tg->load_avg
+ *
+ * Where: tg->load_avg ~= \Sum grq->avg.load_avg
+ *
+ * That is shares_avg, and it is right (given the approximation (2)).
+ *
+ * The problem with it is that because the average is slow -- it was designed
+ * to be exactly that of course -- this leads to transients in boundary
+ * conditions. Specifically, the case where the group was idle and we start
+ * one task. It takes time for our CPU's grq->avg.load_avg to build up,
+ * yielding bad latency etc.
+ *
+ * Now, in that special case (1) reduces to:
+ *
+ *                     tg->weight * grq->load.weight
+ *   ge->load.weight = ----------------------------- = tg->weight   (4)
+ *                           grq->load.weight
+ *
+ * That is, the sum collapses because all other CPUs are idle; the UP scenario.
+ *
+ * So what we do is modify our approximation (3) to approach (4) in the (near)
+ * UP case, like:
+ *
+ *   ge->load.weight =
+ *
+ *              tg->weight * grq->load.weight
+ *     ---------------------------------------------------         (5)
+ *     tg->load_avg - grq->avg.load_avg + grq->load.weight
+ *
+ *
+ * And that is shares_weight and is icky. In the (near) UP case it approaches
+ * (4) while in the normal case it approaches (3). It consistently
+ * overestimates the ge->load.weight and therefore:
+ *
+ *   \Sum ge->load.weight >= tg->weight
+ *
+ * hence icky!
+ */
 static long calc_cfs_shares(struct cfs_rq *cfs_rq)
 {
 	long tg_weight, tg_shares, load, shares;


Thread overview: 24+ messages
2017-05-12 16:44 [RFC][PATCH 00/14] sched/fair: A bit of a cgroup/PELT overhaul (again) Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 01/14] sched/fair: Clean up calc_cfs_shares() Peter Zijlstra
2017-05-16  8:02   ` Vincent Guittot
2017-05-12 16:44 ` Peter Zijlstra [this message]
2017-05-12 16:44 ` [RFC][PATCH 03/14] sched/fair: Remove se->load.weight from se->avg.load_sum Peter Zijlstra
2017-05-17  7:04   ` Vincent Guittot
2017-05-17  9:50     ` Vincent Guittot
2017-05-17 14:20       ` Peter Zijlstra
2017-09-29 20:11       ` [tip:sched/core] sched/fair: Use reweight_entity() for set_user_nice() tip-bot for Vincent Guittot
2017-05-12 16:44 ` [RFC][PATCH 04/14] sched/fair: More accurate reweight_entity() Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 05/14] sched/fair: Change update_load_avg() arguments Peter Zijlstra
2017-05-17 10:46   ` Vincent Guittot
2017-05-12 16:44 ` [RFC][PATCH 06/14] sched/fair: Move enqueue migrate handling Peter Zijlstra
2017-05-29 13:41   ` Vincent Guittot
2017-05-12 16:44 ` [RFC][PATCH 07/14] sched/fair: Rewrite cfs_rq->removed_*avg Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 08/14] sched/fair: Rewrite PELT migration propagation Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 09/14] sched/fair: Propagate an effective runnable_load_avg Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 10/14] sched/fair: more obvious Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 11/14] sched/fair: Synchonous PELT detach on load-balance migrate Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 12/14] sched/fair: Cure calc_cfs_shares() vs reweight_entity() Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 13/14] sched/fair: Align PELT windows between cfs_rq and its se Peter Zijlstra
2017-05-12 16:44 ` [RFC][PATCH 14/14] sched/fair: More accurate async detach Peter Zijlstra
2017-05-16 22:02 ` [RFC][PATCH 00/14] sched/fair: A bit of a cgroup/PELT overhaul (again) Tejun Heo
2017-05-17  6:53   ` Peter Zijlstra
