public inbox for linux-kernel@vger.kernel.org
From: kenchen@google.com (Ken Chen)
To: a.p.zijlstra@chello.nl, mingo@elte.hu
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH] sched: fix sched-domain avg_load calculation.
Date: Thu,  7 Apr 2011 17:23:22 -0700 (PDT)
Message-ID: <20110408002322.3A0D812217F@elm.corp.google.com>

In find_busiest_group(), the sched-domain avg_load is not calculated
at all if there is a group imbalance within the domain.  This causes
an erroneous imbalance calculation: calculate_imbalance() sees
sds->avg_load = 0 and dumps the entire sds->max_load into the
imbalance variable, which is later used to migrate the entire load
from the busiest CPU to the puller CPU.  This has two really bad
effects:

1. A stampede of task migrations that cannot break out of the bad
   state because of a positive feedback loop: large load delta ->
   heavier load migration -> larger imbalance, and the cycle goes on.

2. Severe imbalance in CPU queue depth.  This causes really long
   scheduling latency blips, which badly affect applications with
   tight latency requirements.

The fix is to have the kernel calculate the domain avg_load in both
cases.  This ensures that the imbalance calculation is always
sensible and that the target is usually halfway between the busiest
and puller CPUs.

Signed-off-by: Ken Chen <kenchen@google.com>
---
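The effect described in the changelog can be sketched with a small
user-space model.  This is not the kernel code: struct lb_stats_model
and the model_* helpers are hypothetical names, SCHED_LOAD_SCALE is
assumed to be 1024, and the real calculate_imbalance() has further
steps (capacity clamping, small-imbalance handling) that are left out
here.  The sketch only assumes that the imbalance is the smaller of
"busiest above average" and "puller below average", computed in
unsigned arithmetic.

```c
#include <assert.h>

#define SCHED_LOAD_SCALE 1024UL	/* assumed value, for illustration only */

/* Hypothetical, trimmed-down stand-in for the kernel's struct sd_lb_stats. */
struct lb_stats_model {
	unsigned long total_load;	/* sum of group loads in the domain */
	unsigned long total_pwr;	/* sum of group cpu_power in the domain */
	unsigned long avg_load;		/* domain average; 0 if never computed */
	unsigned long max_load;		/* load of the busiest group */
	unsigned long this_load;	/* load of the pulling group */
	unsigned long busiest_pwr;	/* cpu_power of the busiest group */
	unsigned long this_pwr;		/* cpu_power of the pulling group */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Same formula as the line this patch moves earlier in the function. */
static void model_compute_avg_load(struct lb_stats_model *s)
{
	assert(s->total_pwr != 0);
	s->avg_load = (SCHED_LOAD_SCALE * s->total_load) / s->total_pwr;
}

/*
 * Simplified imbalance: the smaller of what the busiest group has above
 * the average and what the puller has below it.  If avg_load was never
 * computed (0) and this_load is non-zero, the second term wraps around
 * to a huge unsigned value, so the first term wins and the result is
 * the whole of max_load.
 */
static unsigned long model_imbalance(const struct lb_stats_model *s)
{
	unsigned long max_pull = s->max_load - s->avg_load;
	unsigned long room = s->avg_load - s->this_load;

	return min_ul(max_pull * s->busiest_pwr,
		      room * s->this_pwr) / SCHED_LOAD_SCALE;
}
```

With max_load = 3072, this_load = 1024, total_load = 4096, total_pwr =
2048 and both cpu_power values at 1024, the model returns 3072 (all of
max_load) while avg_load is left at 0, and 1024 once
model_compute_avg_load() has set avg_load to 2048, which leaves both
groups at 2048.
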
 kernel/sched_fair.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index c7ec5c8..c46568a 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -3127,6 +3127,8 @@ find_busiest_group(
 	if (!sds.busiest || sds.busiest_nr_running == 0)
 		goto out_balanced;
 
+	sds.avg_load = (SCHED_LOAD_SCALE * sds.total_load) / sds.total_pwr;
+
 	/*
 	 * If the busiest group is imbalanced the below checks don't
 	 * work because they assumes all things are equal, which typically
@@ -3151,7 +3153,6 @@ find_busiest_group(
 	 * Don't pull any tasks if this group is already above the domain
 	 * average load.
 	 */
-	sds.avg_load = (SCHED_LOAD_SCALE * sds.total_load) / sds.total_pwr;
 	if (sds.this_load >= sds.avg_load)
 		goto out_balanced;
 
-- 
1.7.3.1


Thread overview: 5+ messages
2011-04-08  0:23 Ken Chen [this message]
2011-04-08 11:15 ` [PATCH] sched: fix sched-domain avg_load calculation Peter Zijlstra
2011-04-08 19:29   ` Ken Chen
2011-04-11 10:46 ` [tip:sched/urgent] sched: Fix " tip-bot for Ken Chen
2011-04-11 11:00   ` Peter Zijlstra