All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-Stable <stable@vger.kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 23/26] sched/fair: Fix load_balance() affinity redo path
Date: Thu, 20 Jul 2017 22:21:41 +0100	[thread overview]
Message-ID: <20170720212144.18453-24-mgorman@techsingularity.net> (raw)
In-Reply-To: <20170720212144.18453-1-mgorman@techsingularity.net>

From: Jeffrey Hugo <jhugo@codeaurora.org>

commit 65a4433aebe36c8c6abeb69b99ef00274b971c6c upstream.

If load_balance() fails to migrate any tasks because all tasks were
affined, load_balance() removes the source CPU from consideration and
attempts to redo and balance among the new subset of CPUs.

There is a bug in this code path where the algorithm considers all active
CPUs in the system (minus the source that was just masked out).  This is
not valid for two reasons: some active CPUs may not be in the current
scheduling domain and one of the active CPUs is dst_cpu. These CPUs should
not be considered, as we cannot pull load from them.

Instead of failing out of load_balance(), we may end up redoing the search
with no valid CPUs and incorrectly concluding the domain is balanced.
Additionally, if the group_imbalance flag was just set, it may also be
incorrectly unset, thus the flag will not be seen by other CPUs in future
load_balance() runs as that algorithm intends.

Fix the check by removing CPUs not in the current domain and the dst_cpu
from considertation, thus limiting the evaluation to valid remaining CPUs
from which load might be migrated.

Co-authored-by: Austin Christ <austinwc@codeaurora.org>
Co-authored-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Austin Christ <austinwc@codeaurora.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Timur Tabi <timur@codeaurora.org>
Link: http://lkml.kernel.org/r/1496863138-11322-2-git-send-email-jhugo@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bafa4e04b850..5f8c91bdac3c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6619,10 +6619,10 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 		 * our sched_group. We may want to revisit it if we couldn't
 		 * meet load balance goals by pulling other tasks on src_cpu.
 		 *
-		 * Also avoid computing new_dst_cpu if we have already computed
-		 * one in current iteration.
+		 * Avoid computing new_dst_cpu for NEWLY_IDLE or if we have
+		 * already computed one in current iteration.
 		 */
-		if (!env->dst_grpmask || (env->flags & LBF_DST_PINNED))
+		if (env->idle == CPU_NEWLY_IDLE || (env->flags & LBF_DST_PINNED))
 			return 0;
 
 		/* Prevent to re-select dst_cpu via env's cpus */
@@ -7973,14 +7973,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		.tasks		= LIST_HEAD_INIT(env.tasks),
 	};
 
-	/*
-	 * For NEWLY_IDLE load_balancing, we don't need to consider
-	 * other cpus in our group
-	 */
-	if (idle == CPU_NEWLY_IDLE)
-		env.dst_grpmask = NULL;
-
-	cpumask_copy(cpus, cpu_active_mask);
+	cpumask_and(cpus, sched_domain_span(sd), cpu_active_mask);
 
 	schedstat_inc(sd->lb_count[idle]);
 
@@ -8102,7 +8095,15 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		/* All tasks on this runqueue were pinned by CPU affinity */
 		if (unlikely(env.flags & LBF_ALL_PINNED)) {
 			cpumask_clear_cpu(cpu_of(busiest), cpus);
-			if (!cpumask_empty(cpus)) {
+			/*
+			 * Attempting to continue load balancing at the current
+			 * sched_domain level only makes sense if there are
+			 * active CPUs remaining as possible busiest CPUs to
+			 * pull load from which are not contained within the
+			 * destination group that is receiving any migrated
+			 * load.
+			 */
+			if (!cpumask_subset(cpus, env.dst_grpmask)) {
 				env.loop = 0;
 				env.loop_break = sched_nr_migrate_break;
 				goto redo;
@@ -8398,6 +8399,13 @@ static int active_load_balance_cpu_stop(void *data)
 			.src_cpu	= busiest_rq->cpu,
 			.src_rq		= busiest_rq,
 			.idle		= CPU_IDLE,
+			/*
+			 * can_migrate_task() doesn't need to compute new_dst_cpu
+			 * for active balancing. Since we have CPU_IDLE, but no
+			 * @dst_grpmask we need to make that test go away with lying
+			 * about DST_PINNED.
+			 */
+			.flags		= LBF_DST_PINNED,
 		};
 
 		schedstat_inc(sd->alb_count);
-- 
2.13.1

  parent reply	other threads:[~2017-07-20 21:21 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-20 21:21 [PATCH 00/26] Performance-related backports for 4.12.2 Mel Gorman
2017-07-20 21:21 ` [PATCH 01/26] sched/topology: Refactor function build_overlap_sched_groups() Mel Gorman
2017-07-20 21:21 ` [PATCH 02/26] sched/topology: Fix building of overlapping sched-groups Mel Gorman
2017-07-20 21:21 ` [PATCH 03/26] sched/topology: Simplify build_overlap_sched_groups() Mel Gorman
2017-07-20 21:21 ` [PATCH 04/26] sched/debug: Print the scheduler topology group mask Mel Gorman
2017-07-20 21:21 ` [PATCH 05/26] sched/topology: Verify the first group matches the child domain Mel Gorman
2017-07-20 21:21 ` [PATCH 06/26] sched/topology: Optimize build_group_mask() Mel Gorman
2017-07-20 21:21 ` [PATCH 07/26] sched/topology: Move comment about asymmetric node setups Mel Gorman
2017-07-20 21:21 ` [PATCH 08/26] sched/topology: Remove FORCE_SD_OVERLAP Mel Gorman
2017-07-20 21:21 ` [PATCH 09/26] sched/topology: Fix overlapping sched_group_mask Mel Gorman
2017-07-20 21:21 ` [PATCH 10/26] sched/topology: Small cleanup Mel Gorman
2017-07-20 21:21 ` [PATCH 11/26] sched/topology: Add sched_group_capacity debugging Mel Gorman
2017-07-20 21:21 ` [PATCH 12/26] sched/topology: Fix overlapping sched_group_capacity Mel Gorman
2017-07-20 21:21 ` [PATCH 13/26] sched/topology: Add a few comments Mel Gorman
2017-07-20 21:21 ` [PATCH 14/26] sched/topology: Rewrite get_group() Mel Gorman
2017-07-20 21:21 ` [PATCH 15/26] sched/topology: Simplify sched_group_mask() usage Mel Gorman
2017-07-20 21:21 ` [PATCH 16/26] sched/topology: Rename sched_group_mask() Mel Gorman
2017-07-20 21:21 ` [PATCH 17/26] sched/topology: Rename sched_group_cpus() Mel Gorman
2017-07-20 21:21 ` [PATCH 18/26] vtime, sched/cputime: Remove vtime_account_user() Mel Gorman
2017-07-20 21:21 ` [PATCH 19/26] sched/cputime: Always set tsk->vtime_snap_whence after accounting vtime Mel Gorman
2017-07-20 21:21 ` [PATCH 20/26] sched/cputime: Rename vtime fields Mel Gorman
2017-07-20 21:21 ` [PATCH 21/26] sched/cputime: Move the vtime task fields to their own struct Mel Gorman
2017-07-20 21:21 ` [PATCH 22/26] sched/cputime: Accumulate vtime on top of nsec clocksource Mel Gorman
2017-07-20 21:21 ` Mel Gorman [this message]
2017-07-20 21:21 ` [PATCH 24/26] percpu_counter: Rename __percpu_counter_add to percpu_counter_add_batch Mel Gorman
2017-07-20 21:21 ` [PATCH 25/26] writeback: rework wb_[dec|inc]_stat family of functions Mel Gorman
2017-07-20 21:21 ` [PATCH 26/26] kernel/fork.c: virtually mapped stacks: do not disable interrupts Mel Gorman
2017-07-24 16:44 ` [PATCH 00/26] Performance-related backports for 4.12.2 Mel Gorman
2017-07-24 23:29   ` Greg KH
2017-07-25  8:14     ` Mel Gorman
2017-07-25 15:21       ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170720212144.18453-24-mgorman@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.