public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched: smpnice work around for active_load_balance()
@ 2006-03-28  6:00 Peter Williams
  2006-03-28 19:25 ` Siddha, Suresh B
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Williams @ 2006-03-28  6:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Siddha, Suresh B, Mike Galbraith, Nick Piggin, Ingo Molnar,
	Con Kolivas, Chen, Kenneth W, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 2019 bytes --]

Problem:

It is undesirable for HT/MC packages to have more than one of their CPUs
busy if there are other packages that have all of their CPUs idle.  Fixing
this involves moving the only running task (i.e. the one actually on the
CPU) off onto another CPU, and is achieved by using active_load_balance()
and relying on the fact that (when it starts) the queue's migration thread
will preempt the sole running task and (therefore) make it movable.  The
migration thread then moves it to an idle package.

Unfortunately, the mechanism for setting a run queue's active_balance
flag is buried deep inside load_balance() and relies heavily on
find_busiest_group() and find_busiest_queue() reporting success even if
the busiest queue has only one task running.  This requirement is not
currently met.

Solution:

The long term solution to this problem is to provide an alternative
mechanism for triggering active load balancing for run queues that need it.

However, in the meantime, this patch modifies find_busiest_group() so 
that (when idle != NEWLY_IDLE) it prefers groups containing at least one 
CPU with more than one running task over those without, and modifies 
find_busiest_queue() so that (when idle != NEWLY_IDLE) it prefers queues 
with more than one running task over those with only one.  This means 
that the "busiest" queue found by load_balance() will have one or fewer 
runnable tasks only if no queue had more than one runnable task.  When 
called from load_balance_newidle() both functions retain their existing 
behaviour.  These measures prevent high load weight tasks that have a 
CPU to themselves from suppressing load balancing between other queues.

NB This patch should be backed out when a proper fix for the problems 
inherent in HT and MC systems is implemented.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


[-- Attachment #2: smpnice-active_load_balance --]
[-- Type: text/plain, Size: 2175 bytes --]

Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c	2006-03-27 17:02:53.000000000 +1100
+++ MM-2.6.X/kernel/sched.c	2006-03-28 16:57:58.000000000 +1100
@@ -2098,6 +2098,7 @@ find_busiest_group(struct sched_domain *
 	unsigned long max_pull;
 	unsigned long busiest_load_per_task, busiest_nr_running;
 	unsigned long this_load_per_task, this_nr_running;
+	unsigned int busiest_has_loaded_cpus = idle == NEWLY_IDLE;
 	int load_idx;
 
 	max_load = this_load = total_load = total_pwr = 0;
@@ -2152,7 +2153,15 @@ find_busiest_group(struct sched_domain *
 			this = group;
 			this_nr_running = sum_nr_running;
 			this_load_per_task = sum_weighted_load;
-		} else if (avg_load > max_load && nr_loaded_cpus) {
+		} else if (nr_loaded_cpus) {
+			if (avg_load > max_load || !busiest_has_loaded_cpus) {
+				max_load = avg_load;
+				busiest = group;
+				busiest_nr_running = sum_nr_running;
+				busiest_load_per_task = sum_weighted_load;
+				busiest_has_loaded_cpus = 1;
+			}
+		} else if (!busiest_has_loaded_cpus && avg_load < max_load) {
 			max_load = avg_load;
 			busiest = group;
 			busiest_nr_running = sum_nr_running;
@@ -2161,7 +2170,7 @@ find_busiest_group(struct sched_domain *
 		group = group->next;
 	} while (group != sd->groups);
 
-	if (!busiest || this_load >= max_load)
+	if (!busiest || this_load >= max_load || busiest_nr_running == 0)
 		goto out_balanced;
 
 	avg_load = (SCHED_LOAD_SCALE * total_load) / total_pwr;
@@ -2265,12 +2274,19 @@ static runqueue_t *find_busiest_queue(st
 {
 	unsigned long max_load = 0;
 	runqueue_t *busiest = NULL, *rqi;
+	unsigned int busiest_is_loaded = idle == NEWLY_IDLE;
 	int i;
 
 	for_each_cpu_mask(i, group->cpumask) {
 		rqi = cpu_rq(i);
 
-		if (rqi->raw_weighted_load > max_load && rqi->nr_running > 1) {
+		if (rqi->nr_running > 1) {
+			if (rqi->raw_weighted_load > max_load || !busiest_is_loaded) {
+				max_load = rqi->raw_weighted_load;
+				busiest = rqi;
+				busiest_is_loaded = 1;
+			}
+		} else if (!busiest_is_loaded && rqi->raw_weighted_load > max_load) {
 			max_load = rqi->raw_weighted_load;
 			busiest = rqi;
 		}


Thread overview: 32+ messages
2006-03-28  6:00 [PATCH] sched: smpnice work around for active_load_balance() Peter Williams
2006-03-28 19:25 ` Siddha, Suresh B
2006-03-28 22:44   ` Peter Williams
2006-03-29  2:14     ` Peter Williams
2006-03-29  2:52     ` Siddha, Suresh B
2006-03-29  3:42       ` Peter Williams
2006-03-29 22:52         ` Siddha, Suresh B
2006-03-29 23:40           ` Peter Williams
2006-03-30  0:50             ` Siddha, Suresh B
2006-03-30  1:14               ` Peter Williams
2006-04-02  4:48                 ` smpnice loadbalancing with high priority tasks Siddha, Suresh B
2006-04-02  7:08                   ` Peter Williams
2006-04-04  0:24                     ` Siddha, Suresh B
2006-04-04  1:22                       ` Peter Williams
2006-04-04  1:34                         ` Peter Williams
2006-04-04  2:11                         ` Siddha, Suresh B
2006-04-04  3:24                           ` Peter Williams
2006-04-04  4:34                             ` Peter Williams
2006-04-06  2:14                             ` Peter Williams
2006-04-20  1:24                     ` [patch] smpnice: don't consider sched groups which are lightly loaded for balancing Siddha, Suresh B
2006-04-20  5:19                       ` Peter Williams
2006-04-20 16:54                         ` Siddha, Suresh B
2006-04-20 23:11                           ` Peter Williams
2006-04-20 23:49                           ` Andrew Morton
2006-04-21  0:25                             ` Siddha, Suresh B
2006-04-21  0:28                             ` Peter Williams
2006-04-21  1:25                               ` Andrew Morton
2006-04-20 17:04                         ` Siddha, Suresh B
2006-04-21  0:00                           ` Peter Williams
2006-04-03  1:04             ` [PATCH] sched: smpnice work around for active_load_balance() Peter Williams
2006-04-03 16:57               ` Siddha, Suresh B
2006-04-03 23:11                 ` Peter Williams
