From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
To: Peter Williams <pwil3058@bigpond.net.au>
Cc: Andrew Morton <akpm@osdl.org>,
"Siddha, Suresh B" <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Ingo Molnar <mingo@elte.hu>,
"Chen, Kenneth W" <kenneth.w.chen@intel.com>,
Con Kolivas <kernel@kolivas.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: prevent high load weight tasks suppressing balancing
Date: Mon, 27 Mar 2006 13:52:04 -0800 [thread overview]
Message-ID: <20060327135204.B12364@unix-os.sc.intel.com> (raw)
In-Reply-To: <4427873A.4010601@bigpond.net.au>; from pwil3058@bigpond.net.au on Mon, Mar 27, 2006 at 05:33:30PM +1100
This breaks HT and MC optimizations.. Consider a DP system with each
physical processor having two HT logical threads.. if there are two
runnable processes running on package-0, with this patch scheduler
will never move one of those processes to package-1..
thanks,
suresh
On Mon, Mar 27, 2006 at 05:33:30PM +1100, Peter Williams wrote:
> Problem:
>
> On systems with more than 2 CPUs it is possible for a single task with a
> high smpnice load weight to suppress load balancing on other CPUs (to
> the one that it's running on) if it is the only runnable task on its
> CPU. E.g. consider a 4-way system (simple SMP system with no HT and
> cores) scenario where a high priority task (nice-20) is running on P0
> and two normal priority tasks running on P1. load balance with smp nice
> code will never be able to detect an imbalance and hence will never move
> one of the normal priority tasks on P1 to idle cpus P2 or P3 as P0 will
> always be identified as the busiest CPU but it has no tasks that can be
> moved.
>
> Solution:
>
> Make sure that only CPUs with tasks that can be moved get selected as
> the busiest queue. This involves ensuring that find_busiest_group()
> only considers groups that have at least one CPU with more than one task
> running as candidates for the busiest group and that
> find_busiest_queue() only considers CPUs that have more than one task
> running as candidates for the busiest run queue.
>
> One effect of this is that load balancing will be abandoned earlier in
> the sequence (i.e. before the double run queue locks are taken prior to
> calling move_tasks() rather than in move_tasks() itself) when there are
> no tasks that can be moved than would be the case without this patch.
>
> Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
>
> Peter
> PS This doesn't take into account tasks that can't be moved because they
> are pinned to a particular CPU. At this stage, I don't think that it's
> worth the effort to make the changes that would enable this.
> --
> Peter Williams pwil3058@bigpond.net.au
>
> "Learning, n. The kind of ignorance distinguishing the studious."
> -- Ambrose Bierce
> Index: MM-2.6.X/kernel/sched.c
> ===================================================================
> --- MM-2.6.X.orig/kernel/sched.c 2006-03-27 16:00:12.000000000 +1100
> +++ MM-2.6.X/kernel/sched.c 2006-03-27 17:02:53.000000000 +1100
> @@ -2115,6 +2115,7 @@ find_busiest_group(struct sched_domain *
> int local_group;
> int i;
> unsigned long sum_nr_running, sum_weighted_load;
> + unsigned int nr_loaded_cpus = 0; /* where nr_running > 1 */
>
> local_group = cpu_isset(this_cpu, group->cpumask);
>
> @@ -2135,6 +2136,8 @@ find_busiest_group(struct sched_domain *
>
> avg_load += load;
> sum_nr_running += rq->nr_running;
> + if (rq->nr_running > 1)
> + ++nr_loaded_cpus;
> sum_weighted_load += rq->raw_weighted_load;
> }
>
> @@ -2149,7 +2152,7 @@ find_busiest_group(struct sched_domain *
> this = group;
> this_nr_running = sum_nr_running;
> this_load_per_task = sum_weighted_load;
> - } else if (avg_load > max_load) {
> + } else if (avg_load > max_load && nr_loaded_cpus) {
> max_load = avg_load;
> busiest = group;
> busiest_nr_running = sum_nr_running;
> @@ -2158,7 +2161,7 @@ find_busiest_group(struct sched_domain *
> group = group->next;
> } while (group != sd->groups);
>
> - if (!busiest || this_load >= max_load || busiest_nr_running <= 1)
> + if (!busiest || this_load >= max_load)
> goto out_balanced;
>
> avg_load = (SCHED_LOAD_SCALE * total_load) / total_pwr;
> @@ -2260,16 +2263,16 @@ out_balanced:
> static runqueue_t *find_busiest_queue(struct sched_group *group,
> enum idle_type idle)
> {
> - unsigned long load, max_load = 0;
> - runqueue_t *busiest = NULL;
> + unsigned long max_load = 0;
> + runqueue_t *busiest = NULL, *rqi;
> int i;
>
> for_each_cpu_mask(i, group->cpumask) {
> - load = weighted_cpuload(i);
> + rqi = cpu_rq(i);
>
> - if (load > max_load) {
> - max_load = load;
> - busiest = cpu_rq(i);
> + if (rqi->raw_weighted_load > max_load && rqi->nr_running > 1) {
> + max_load = rqi->raw_weighted_load;
> + busiest = rqi;
> }
> }
>
next parent reply other threads:[~2006-03-27 21:52 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4427873A.4010601@bigpond.net.au>
2006-03-27 21:52 ` Siddha, Suresh B [this message]
2006-03-27 23:21 ` [PATCH] sched: prevent high load weight tasks suppressing balancing Peter Williams
2006-03-28 3:15 ` Siddha, Suresh B
2006-03-28 4:17 ` Peter Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060327135204.B12364@unix-os.sc.intel.com \
--to=suresh.b.siddha@intel.com \
--cc=akpm@osdl.org \
--cc=efault@gmx.de \
--cc=kenneth.w.chen@intel.com \
--cc=kernel@kolivas.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=nickpiggin@yahoo.com.au \
--cc=pwil3058@bigpond.net.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox