From: tip-bot for Jason Low <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@kernel.org,
preeti@linux.vnet.ibm.com, peterz@infradead.org,
jason.low2@hp.com, tglx@linutronix.de
Subject: [tip:sched/core] sched: Fix next_balance logic in rebalance_domains() and idle_balance()
Date: Mon, 19 May 2014 06:09:39 -0700 [thread overview]
Message-ID: <tip-df6b728416e32fe7fe04e5c81ca9ca2d6416735a@git.kernel.org> (raw)
In-Reply-To: <1399596562.2200.7.camel@j-VirtualBox>
Commit-ID: df6b728416e32fe7fe04e5c81ca9ca2d6416735a
Gitweb: http://git.kernel.org/tip/df6b728416e32fe7fe04e5c81ca9ca2d6416735a
Author: Jason Low <jason.low2@hp.com>
AuthorDate: Thu, 8 May 2014 17:49:22 -0700
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 May 2014 22:02:41 +0900
sched: Fix next_balance logic in rebalance_domains() and idle_balance()
Currently, in idle_balance(), we update rq->next_balance when we pull_tasks.
However, it is also important to update this in the !pulled_tasks case too.
When the CPU is "busy" (the CPU isn't idle), rq->next_balance gets computed
using sd->busy_factor (so we increase the balance interval when the CPU is
busy). However, when the CPU goes idle, rq->next_balance could still be set
to a large value that was computed with the sd->busy_factor.
Thus, we need to also update rq->next_balance in idle_balance() in the cases
where !pulled_tasks too, so that rq->next_balance gets updated without taking
the busy_factor into account when the CPU is about to go idle.
This patch makes rq->next_balance get updated independently of whether or
not we pulled_task. Also, we add logic to ensure that we always traverse
at least 1 of the sched domains to get a proper next_balance value for
updating rq->next_balance.
Additionally, since load_balance() modifies the sd->balance_interval, we
need to re-obtain the sched domain's interval after the call to
load_balance() in rebalance_domains() before we update rq->next_balance.
This patch adds and uses 2 new helper functions, update_next_balance() and
get_sd_balance_interval() to update next_balance and obtain the sched
domain's balance_interval.
Cc: daniel.lezcano@linaro.org
Cc: alex.shi@linaro.org
Cc: efault@gmx.de
Cc: vincent.guittot@linaro.org
Cc: morten.rasmussen@arm.com
Cc: aswin@hp.com
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1399596562.2200.7.camel@j-VirtualBox
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
kernel/sched/fair.c | 69 +++++++++++++++++++++++++++++++++++------------------
1 file changed, 46 insertions(+), 23 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 429164d..26ec668 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6672,17 +6672,44 @@ out:
return ld_moved;
}
+static inline unsigned long
+get_sd_balance_interval(struct sched_domain *sd, int cpu_busy)
+{
+ unsigned long interval = sd->balance_interval;
+
+ if (cpu_busy)
+ interval *= sd->busy_factor;
+
+ /* scale ms to jiffies */
+ interval = msecs_to_jiffies(interval);
+ interval = clamp(interval, 1UL, max_load_balance_interval);
+
+ return interval;
+}
+
+static inline void
+update_next_balance(struct sched_domain *sd, int cpu_busy, unsigned long *next_balance)
+{
+ unsigned long interval, next;
+
+ interval = get_sd_balance_interval(sd, cpu_busy);
+ next = sd->last_balance + interval;
+
+ if (time_after(*next_balance, next))
+ *next_balance = next;
+}
+
/*
* idle_balance is called by schedule() if this_cpu is about to become
* idle. Attempts to pull tasks from other CPUs.
*/
static int idle_balance(struct rq *this_rq)
{
+ unsigned long next_balance = jiffies + HZ;
+ int this_cpu = this_rq->cpu;
struct sched_domain *sd;
int pulled_task = 0;
- unsigned long next_balance = jiffies + HZ;
u64 curr_cost = 0;
- int this_cpu = this_rq->cpu;
idle_enter_fair(this_rq);
@@ -6692,8 +6719,15 @@ static int idle_balance(struct rq *this_rq)
*/
this_rq->idle_stamp = rq_clock(this_rq);
- if (this_rq->avg_idle < sysctl_sched_migration_cost)
+ if (this_rq->avg_idle < sysctl_sched_migration_cost) {
+ rcu_read_lock();
+ sd = rcu_dereference_check_sched_domain(this_rq->sd);
+ if (sd)
+ update_next_balance(sd, 0, &next_balance);
+ rcu_read_unlock();
+
goto out;
+ }
/*
* Drop the rq->lock, but keep IRQ/preempt disabled.
@@ -6703,15 +6737,16 @@ static int idle_balance(struct rq *this_rq)
update_blocked_averages(this_cpu);
rcu_read_lock();
for_each_domain(this_cpu, sd) {
- unsigned long interval;
int continue_balancing = 1;
u64 t0, domain_cost;
if (!(sd->flags & SD_LOAD_BALANCE))
continue;
- if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
+ if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
+ update_next_balance(sd, 0, &next_balance);
break;
+ }
if (sd->flags & SD_BALANCE_NEWIDLE) {
t0 = sched_clock_cpu(this_cpu);
@@ -6727,9 +6762,7 @@ static int idle_balance(struct rq *this_rq)
curr_cost += domain_cost;
}
- interval = msecs_to_jiffies(sd->balance_interval);
- if (time_after(next_balance, sd->last_balance + interval))
- next_balance = sd->last_balance + interval;
+ update_next_balance(sd, 0, &next_balance);
/*
* Stop searching for tasks to pull if there are
@@ -6753,15 +6786,11 @@ static int idle_balance(struct rq *this_rq)
if (this_rq->cfs.h_nr_running && !pulled_task)
pulled_task = 1;
- if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
- /*
- * We are going idle. next_balance may be set based on
- * a busy processor. So reset next_balance.
- */
+out:
+ /* Move the next balance forward */
+ if (time_after(this_rq->next_balance, next_balance))
this_rq->next_balance = next_balance;
- }
-out:
/* Is there a task of a high priority class? */
if (this_rq->nr_running != this_rq->cfs.h_nr_running)
pulled_task = -1;
@@ -7044,16 +7073,9 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
break;
}
- interval = sd->balance_interval;
- if (idle != CPU_IDLE)
- interval *= sd->busy_factor;
-
- /* scale ms to jiffies */
- interval = msecs_to_jiffies(interval);
- interval = clamp(interval, 1UL, max_load_balance_interval);
+ interval = get_sd_balance_interval(sd, idle != CPU_IDLE);
need_serialize = sd->flags & SD_SERIALIZE;
-
if (need_serialize) {
if (!spin_trylock(&balancing))
goto out;
@@ -7069,6 +7091,7 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
idle = idle_cpu(cpu) ? CPU_IDLE : CPU_NOT_IDLE;
}
sd->last_balance = jiffies;
+ interval = get_sd_balance_interval(sd, idle != CPU_IDLE);
}
if (need_serialize)
spin_unlock(&balancing);
next prev parent reply other threads:[~2014-05-19 13:10 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-28 22:45 [PATCH 0/2] sched: Idle load balance fixes Jason Low
2014-04-28 22:45 ` [PATCH 1/2] sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance() Jason Low
2014-05-08 10:42 ` [tip:sched/core] sched: Fix updating rq-> max_idle_balance_cost " tip-bot for Jason Low
2014-04-28 22:45 ` [PATCH 2/2] sched: Fix next_balance logic in rebalance_domains() and idle_balance() Jason Low
2014-05-08 16:59 ` Jason Low
2014-05-08 17:38 ` Ingo Molnar
2014-05-08 18:15 ` Jason Low
2014-05-08 22:14 ` Jason Low
2014-05-09 0:49 ` Jason Low
2014-05-09 9:08 ` Peter Zijlstra
2014-05-09 17:29 ` Jason Low
2014-05-19 13:09 ` tip-bot for Jason Low [this message]
2014-05-22 12:27 ` [tip:sched/core] sched: Fix the rq-> " tip-bot for Jason Low
2014-05-08 18:06 ` [PATCH 2/2] sched: Fix " Peter Zijlstra
2014-04-30 7:54 ` [PATCH 0/2] sched: Idle load balance fixes Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-df6b728416e32fe7fe04e5c81ca9ca2d6416735a@git.kernel.org \
--to=tipbot@zytor.com \
--cc=hpa@zytor.com \
--cc=jason.low2@hp.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=preeti@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.