* Re: cpu scheduler merge plans
2006-03-22 23:51 cpu scheduler merge plans Andrew Morton
@ 2006-03-22 22:57 ` kernel
2006-03-23 1:37 ` Siddha, Suresh B
2006-03-23 0:31 ` Nick Piggin
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: kernel @ 2006-03-22 22:57 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Nick Piggin, Peter Williams, Siddha, Suresh B,
Chen, Kenneth W, Mike Galbraith, linux-kernel
Quoting Andrew Morton <akpm@osdl.org>:
>
> So it's that time again. We need to decide which of the queued sched
> patches should be merged into 2.6.17.
>
> I have:
>
> sched-fix-task-interactivity-calculation.patch
> small-schedule-microoptimization.patch
> #
> sched-implement-smpnice.patch
> sched-smpnice-apply-review-suggestions.patch
> sched-smpnice-fix-average-load-per-run-queue-calculations.patch
> sched-store-weighted-load-on-up.patch
> sched-add-discrete-weighted-cpu-load-function.patch
> sched-add-above-background-load-function.patch
> # Suresh had problems
> # con:
> sched-cleanup_task_activated.patch
> sched-make_task_noninteractive_use_sleep_type.patch
> sched-dont_decrease_idle_sleep_avg.patch
> sched-include_noninteractive_sleep_in_idle_detect.patch
> sched-remove-on-runqueue-requeueing.patch
> sched-activate-sched-batch-expired.patch
> sched-reduce-overhead-of-calc_load.patch
> #
> sched-fix-interactive-task-starvation.patch
I'd like to see all of these up to this point go in. I can't comment on the
below directly.
> #
> # "strange load balancing problems": pwil3058@bigpond.net.au
> sched-new-sched-domain-for-representing-multi-core.patch
> sched-fix-group-power-for-allnodes_domains.patch
> x86-dont-use-cpuid2-to-determine-cache-info-if-cpuid4-is-supported.patch
>
>
> I'm not sure what the "Suresh had problems" comment refers to - perhaps a
> now-removed patch.
On previous versions of smp nice Suresh found some throughput issues. Peter has
addressed these as far as I'm aware, but we really need Suresh to check all
those again.
>
> afaik, the load balancing problem which Peter observed remains unresolved.
That was a multicore enabled balancing problem which he reported went away on a
later -mm.
Cheers,
Con
^ permalink raw reply [flat|nested] 14+ messages in thread
* cpu scheduler merge plans
@ 2006-03-22 23:51 Andrew Morton
2006-03-22 22:57 ` kernel
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Andrew Morton @ 2006-03-22 23:51 UTC (permalink / raw)
To: Ingo Molnar, Nick Piggin, Con Kolivas, Peter Williams,
Siddha, Suresh B, Chen, Kenneth W, Mike Galbraith
Cc: linux-kernel
So it's that time again. We need to decide which of the queued sched
patches should be merged into 2.6.17.
I have:
sched-fix-task-interactivity-calculation.patch
small-schedule-microoptimization.patch
#
sched-implement-smpnice.patch
sched-smpnice-apply-review-suggestions.patch
sched-smpnice-fix-average-load-per-run-queue-calculations.patch
sched-store-weighted-load-on-up.patch
sched-add-discrete-weighted-cpu-load-function.patch
sched-add-above-background-load-function.patch
# Suresh had problems
# con:
sched-cleanup_task_activated.patch
sched-make_task_noninteractive_use_sleep_type.patch
sched-dont_decrease_idle_sleep_avg.patch
sched-include_noninteractive_sleep_in_idle_detect.patch
sched-remove-on-runqueue-requeueing.patch
sched-activate-sched-batch-expired.patch
sched-reduce-overhead-of-calc_load.patch
#
sched-fix-interactive-task-starvation.patch
#
# "strange load balancing problems": pwil3058@bigpond.net.au
sched-new-sched-domain-for-representing-multi-core.patch
sched-fix-group-power-for-allnodes_domains.patch
x86-dont-use-cpuid2-to-determine-cache-info-if-cpuid4-is-supported.patch
I'm not sure what the "Suresh had problems" comment refers to - perhaps a
now-removed patch.
afaik, the load balancing problem which Peter observed remains unresolved.
Has smpnice had appropriate testing for regressions?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cpu scheduler merge plans
2006-03-22 23:51 cpu scheduler merge plans Andrew Morton
2006-03-22 22:57 ` kernel
@ 2006-03-23 0:31 ` Nick Piggin
2006-03-23 1:16 ` Peter Williams
2006-03-23 5:03 ` cpu scheduler merge plans Ingo Molnar
3 siblings, 0 replies; 14+ messages in thread
From: Nick Piggin @ 2006-03-23 0:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Con Kolivas, Peter Williams, Siddha, Suresh B,
Chen, Kenneth W, Mike Galbraith, linux-kernel
Andrew Morton wrote:
> So it's that time again. We need to decide which of the queued sched
> patches should be merged into 2.6.17.
>
> I have:
>
> sched-fix-task-interactivity-calculation.patch
> small-schedule-microoptimization.patch
> #
> sched-implement-smpnice.patch
> sched-smpnice-apply-review-suggestions.patch
> sched-smpnice-fix-average-load-per-run-queue-calculations.patch
> sched-store-weighted-load-on-up.patch
> sched-add-discrete-weighted-cpu-load-function.patch
> sched-add-above-background-load-function.patch
> # Suresh had problems
I really need to review smpnice. I'll try to get on to that soon.
I don't have any problems with the non-multiprocessor stuff
(Con's and Mike's patches).
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cpu scheduler merge plans
2006-03-22 23:51 cpu scheduler merge plans Andrew Morton
2006-03-22 22:57 ` kernel
2006-03-23 0:31 ` Nick Piggin
@ 2006-03-23 1:16 ` Peter Williams
2006-03-24 23:45 ` more smpnice patch issues Siddha, Suresh B
2006-03-23 5:03 ` cpu scheduler merge plans Ingo Molnar
3 siblings, 1 reply; 14+ messages in thread
From: Peter Williams @ 2006-03-23 1:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Nick Piggin, Con Kolivas, Siddha, Suresh B,
Chen, Kenneth W, Mike Galbraith, linux-kernel
Andrew Morton wrote:
> So it's that time again. We need to decide which of the queued sched
> patches should be merged into 2.6.17.
>
> I have:
>
> sched-fix-task-interactivity-calculation.patch
> small-schedule-microoptimization.patch
> #
> sched-implement-smpnice.patch
> sched-smpnice-apply-review-suggestions.patch
> sched-smpnice-fix-average-load-per-run-queue-calculations.patch
> sched-store-weighted-load-on-up.patch
> sched-add-discrete-weighted-cpu-load-function.patch
> sched-add-above-background-load-function.patch
> # Suresh had problems
> # con:
> sched-cleanup_task_activated.patch
> sched-make_task_noninteractive_use_sleep_type.patch
> sched-dont_decrease_idle_sleep_avg.patch
> sched-include_noninteractive_sleep_in_idle_detect.patch
> sched-remove-on-runqueue-requeueing.patch
> sched-activate-sched-batch-expired.patch
> sched-reduce-overhead-of-calc_load.patch
> #
> sched-fix-interactive-task-starvation.patch
> #
> # "strange load balancing problems": pwil3058@bigpond.net.au
> sched-new-sched-domain-for-representing-multi-core.patch
> sched-fix-group-power-for-allnodes_domains.patch
> x86-dont-use-cpuid2-to-determine-cache-info-if-cpuid4-is-supported.patch
>
>
> I'm not sure what the "Suresh had problems" comment refers to - perhaps a
> now-removed patch.
>
> afaik, the load balancing problem which Peter observed remains unresolved.
I have not seen this problem on recent -mm kernels (-rc6-mm1 and
-rc6-mm2 even with SCHED_MC configured in) so it would appear that it's
fixed. The only worrying thing is that we don't know what fixed it.
>
> Has smpnice had appropriate testing for regressions?
There've been no reported problems for quite a while so my (biased)
answer would be "yes".
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cpu scheduler merge plans
2006-03-22 22:57 ` kernel
@ 2006-03-23 1:37 ` Siddha, Suresh B
2006-03-23 22:06 ` Peter Williams
0 siblings, 1 reply; 14+ messages in thread
From: Siddha, Suresh B @ 2006-03-23 1:37 UTC (permalink / raw)
To: kernel
Cc: Andrew Morton, Ingo Molnar, Nick Piggin, Peter Williams,
Siddha, Suresh B, Chen, Kenneth W, Mike Galbraith, linux-kernel
On Thu, Mar 23, 2006 at 09:57:06AM +1100, kernel@kolivas.org wrote:
> Quoting Andrew Morton <akpm@osdl.org>:
> > #
> > # "strange load balancing problems": pwil3058@bigpond.net.au
> > sched-new-sched-domain-for-representing-multi-core.patch
> > sched-fix-group-power-for-allnodes_domains.patch
> > x86-dont-use-cpuid2-to-determine-cache-info-if-cpuid4-is-supported.patch
I'd like to see the three patches above in 2.6.17. Peter's "strange load
balancing problems" seem to be a false alarm (this patch will have
minimal impact on a single-core CPU because of domain degeneration) and
don't happen on recent -mm kernels.
> >
> >
> > I'm not sure what the "Suresh had problems" comment refers to - perhaps a
> > now-removed patch.
>
> On previous versions of smp nice Suresh found some throughput issues. Peter has
> addressed these as far as I'm aware, but we really need Suresh to check all
> those again.
I am just back from vacation. I will soon review and provide feedback.
> >
> > afaik, the load balancing problem which Peter observed remains unresolved.
>
> That was a multicore enabled balancing problem which he reported went away on a
> later -mm.
thanks,
suresh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cpu scheduler merge plans
2006-03-22 23:51 cpu scheduler merge plans Andrew Morton
` (2 preceding siblings ...)
2006-03-23 1:16 ` Peter Williams
@ 2006-03-23 5:03 ` Ingo Molnar
2006-03-23 5:13 ` Andrew Morton
3 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2006-03-23 5:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Nick Piggin, Con Kolivas, Peter Williams, Siddha, Suresh B,
Chen, Kenneth W, Mike Galbraith, linux-kernel
* Andrew Morton <akpm@osdl.org> wrote:
> So it's that time again. We need to decide which of the queued sched
> patches should be merged into 2.6.17.
>
> I have:
>
> sched-fix-task-interactivity-calculation.patch
> small-schedule-microoptimization.patch
> #
> sched-implement-smpnice.patch
> sched-smpnice-apply-review-suggestions.patch
> sched-smpnice-fix-average-load-per-run-queue-calculations.patch
> sched-store-weighted-load-on-up.patch
> sched-add-discrete-weighted-cpu-load-function.patch
> sched-add-above-background-load-function.patch
>
> # Suresh had problems
> # con:
> sched-cleanup_task_activated.patch
> sched-make_task_noninteractive_use_sleep_type.patch
> sched-dont_decrease_idle_sleep_avg.patch
> sched-include_noninteractive_sleep_in_idle_detect.patch
> sched-remove-on-runqueue-requeueing.patch
> sched-activate-sched-batch-expired.patch
> sched-reduce-overhead-of-calc_load.patch
> #
> sched-fix-interactive-task-starvation.patch
> #
> # "strange load balancing problems": pwil3058@bigpond.net.au
> sched-new-sched-domain-for-representing-multi-core.patch
> sched-fix-group-power-for-allnodes_domains.patch
> x86-dont-use-cpuid2-to-determine-cache-info-if-cpuid4-is-supported.patch
strange as it may sound, all of these patches are fine with me. I really
tried to find a questionable one (out of principle) but failed ;-)
there are two main risk areas: smpnice and the interactivity changes.
[multi-core support ought to be risk-free] ['risk' here means some 'oh
sh*t' conceptual problem that could cause big head-scratching shortly
before 2.6.17 is released, not some easy to fix regression.]
Smpnice got a lot of attention (and testing) and it's still a feature
well worth having. The biggest risk comes from its relative complexity,
but not doing the merge now won't reduce that risk. The biggest plus
compared to the previous iteration is that smpnice is now essentially a
NOP for same-nice-level workloads.
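One way to read the "essentially a NOP" remark, sketched in C under the assumption that every task at a given nice level contributes the same weight to its run queue. The names and the weight value are illustrative and are not taken from the smpnice patches. Under that assumption a queue's weighted load is its task count times a constant, so comparing weighted loads orders the queues exactly as comparing task counts would:

```c
/* Illustrative per-task weight for one nice level (assumed value, not from the thread). */
#define DEMO_TASK_WEIGHT 128UL

/* Weighted load of a queue whose tasks all sit at the same nice level. */
unsigned long demo_weighted_load(unsigned int nr_running)
{
	return (unsigned long)nr_running * DEMO_TASK_WEIGHT;
}

/* Returns 1 if queue a looks busier than queue b by weighted load. */
int demo_busier_by_weight(unsigned int nr_a, unsigned int nr_b)
{
	return demo_weighted_load(nr_a) > demo_weighted_load(nr_b);
}

/* Returns 1 if queue a looks busier than queue b by plain task count. */
int demo_busier_by_count(unsigned int nr_a, unsigned int nr_b)
{
	return nr_a > nr_b;
}
```

For any pair of task counts the two comparisons agree, which is the sense in which the weighting would change nothing for a same-nice workload.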
The interactivity changes had less testing (being relatively young), but
they are pretty well split up and they should solve the worst of the
starvation problems. So if any of those causes problems, it will be an
easy revert.
All in all, i'm not worried about any of these changes.
> I'm not sure what the "Suresh had problems" comment refers to -
> perhaps a now-removed patch.
i think that got resolved with a retest.
> afaik, the load balancing problem which Peter observed remains
> unresolved.
this seems resolved too.
> Has smpnice had appropriate testing for regressions?
it's all green again, and it seems all parties that reported regressions
before have retested and there are no outstanding complaints. Having it in
-mm longer probably won't increase test coverage any further. (in
fact bitrot will probably degrade its test status, so i wouldn't wait any
longer with it. We've got the spotlight on it now, so let's try it
upstream while it's still hot and within testers' attention span.)
Ingo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cpu scheduler merge plans
2006-03-23 5:03 ` cpu scheduler merge plans Ingo Molnar
@ 2006-03-23 5:13 ` Andrew Morton
0 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2006-03-23 5:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: nickpiggin, kernel, pwil3058, suresh.b.siddha, kenneth.w.chen,
efault, linux-kernel
Ingo Molnar <mingo@elte.hu> wrote:
>
> it's all green again
>
OK...
It'll take as long as a week to get that far into the -mm queue anyway
(sched is staged 70% of the way through). So unless we hear differently
between now and then, off it all goes. Thanks.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cpu scheduler merge plans
2006-03-23 1:37 ` Siddha, Suresh B
@ 2006-03-23 22:06 ` Peter Williams
0 siblings, 0 replies; 14+ messages in thread
From: Peter Williams @ 2006-03-23 22:06 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: kernel, Andrew Morton, Ingo Molnar, Nick Piggin, Chen, Kenneth W,
Mike Galbraith, linux-kernel
Siddha, Suresh B wrote:
> On Thu, Mar 23, 2006 at 09:57:06AM +1100, kernel@kolivas.org wrote:
>
>>Quoting Andrew Morton <akpm@osdl.org>:
>>
>>>#
>>># "strange load balancing problems": pwil3058@bigpond.net.au
>>>sched-new-sched-domain-for-representing-multi-core.patch
>>>sched-fix-group-power-for-allnodes_domains.patch
>>>x86-dont-use-cpuid2-to-determine-cache-info-if-cpuid4-is-supported.patch
>
>
> I'd like to see the three patches above in 2.6.17. Peter's "strange load
> balancing problems" seem to be a false alarm (this patch will have
> minimal impact on a single-core CPU because of domain degeneration) and
> don't happen on recent -mm kernels.
I agree. This is no longer a problem and certainly shouldn't prevent
the above patches going in to 2.6.17.
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
^ permalink raw reply [flat|nested] 14+ messages in thread
* more smpnice patch issues
2006-03-23 1:16 ` Peter Williams
@ 2006-03-24 23:45 ` Siddha, Suresh B
2006-03-25 0:56 ` Peter Williams
2006-03-26 23:24 ` more smpnice patch issues Peter Williams
0 siblings, 2 replies; 14+ messages in thread
From: Siddha, Suresh B @ 2006-03-24 23:45 UTC (permalink / raw)
To: Peter Williams
Cc: Andrew Morton, Ingo Molnar, Nick Piggin, Con Kolivas,
Siddha, Suresh B, Chen, Kenneth W, Mike Galbraith, linux-kernel
more issues with smpnice patch...
a) consider a 4-way system (simple SMP, no HT or multi-core) where a
high priority task (nice -20) is running on P0 and two normal priority
tasks are running on P1. load balancing with the smpnice code
will never be able to detect an imbalance and hence will never move one of
the normal priority tasks on P1 to the idle CPUs P2 or P3.
b) smpnice seems to break this patch..
[PATCH] sched: allow the load to grow upto its cpu_power
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c117f1b4d14380baeed9c883f765ee023da8761
example scenario for this case: consider a NUMA system with two nodes, each
node containing four processors. if there are two processes in node-0 and
node-1 is completely idle, your patch will move one of those processes to
node-1 whereas the previous behavior would keep both processes in node-0.
(in this case, in your code max_load will be less than busiest_load_per_task)
smpnice patch has complicated the load balance code... Very difficult
to comprehend the side effects of this patch in the presence of different
priority tasks...
thanks,
suresh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: more smpnice patch issues
2006-03-24 23:45 ` more smpnice patch issues Siddha, Suresh B
@ 2006-03-25 0:56 ` Peter Williams
2006-03-25 1:53 ` Peter Williams
2006-03-26 23:24 ` more smpnice patch issues Peter Williams
1 sibling, 1 reply; 14+ messages in thread
From: Peter Williams @ 2006-03-25 0:56 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Andrew Morton, Ingo Molnar, Nick Piggin, Con Kolivas,
Chen, Kenneth W, Mike Galbraith, linux-kernel
Siddha, Suresh B wrote:
> more issues with smpnice patch...
>
> a) consider a 4-way system (simple SMP system with no HT and cores) scenario
> where a high priority task (nice -20) is running on P0 and two normal
> priority tasks running on P1. load balance with smp nice code
> will never be able to detect an imbalance and hence will never move one of
> the normal priority tasks on P1 to idle cpus P2 or P3.
Why?
>
> b) smpnice seems to break this patch..
>
> [PATCH] sched: allow the load to grow upto its cpu_power
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c117f1b4d14380baeed9c883f765ee023da8761
>
> example scenario for this case: consider a numa system with two nodes, each
> node containing four processors. if there are two processes in node-0 and with
> node-1 being completely idle, your patch will move one of those processes to
> node-1 whereas the previous behavior will retain those two processes in node-0..
> (in this case, in your code max_load will be less than busiest_load_per_task)
>
> smpnice patch has complicated the load balance code... Very difficult
> to comprehend the side effects of this patch in the presence of different
> priority tasks...
That is NOT true. Without smpnice whether the "right thing" happens
with non zero nice tasks is largely down to luck. With smpnice the
result is far more predictable.
The PURPOSE of smpnice IS to alter load balancing in the face of the use
of non zero nice tasks. The reason for doing this is so that nice
reliably affects the allocation of CPU resources on SMP machines. I.e.
changes in load balancing behaviour as a result of tasks having nice
values other than zero are the desired result and NOT a side effect.
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: more smpnice patch issues
2006-03-25 0:56 ` Peter Williams
@ 2006-03-25 1:53 ` Peter Williams
2006-03-25 3:40 ` [PATCH] sched: make sure busiest group and run queue are pullable Peter Williams
0 siblings, 1 reply; 14+ messages in thread
From: Peter Williams @ 2006-03-25 1:53 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Andrew Morton, Ingo Molnar, Nick Piggin, Con Kolivas,
Chen, Kenneth W, Mike Galbraith, linux-kernel
Peter Williams wrote:
> Siddha, Suresh B wrote:
>> more issues with smpnice patch...
>>
>> a) consider a 4-way system (simple SMP system with no HT and cores)
>> scenario
>> where a high priority task (nice -20) is running on P0 and two normal
>> priority tasks running on P1. load balance with smp nice code
>> will never be able to detect an imbalance and hence will never move
>> one of the normal priority tasks on P1 to idle cpus P2 or P3.
>
> Why?
OK, I think I know why. The load balancing code will always decide that
P0 is the busiest CPU, right?
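The scenario can be reproduced with a toy model. This is a minimal sketch and not kernel code: the struct only echoes the field names used in kernel/sched.c, and the weights are hypothetical values chosen so that one nice -20 task outweighs two nice 0 tasks.

```c
#include <stddef.h>

/* Toy stand-in for a run queue; not the kernel's runqueue_t. */
struct toy_rq {
	unsigned int nr_running;          /* runnable tasks on this CPU */
	unsigned long raw_weighted_load;  /* sum of the tasks' nice-derived weights */
};

/* Hypothetical weights: only their ratio matters for the illustration. */
#define TOY_WEIGHT_NICE_0    128UL
#define TOY_WEIGHT_NICE_M20  (5 * TOY_WEIGHT_NICE_0)

/* Pick the queue with the largest weighted load, ignoring its task count. */
int toy_busiest_by_weight(const struct toy_rq *rqs, size_t n)
{
	unsigned long max_load = 0;
	int busiest = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (rqs[i].raw_weighted_load > max_load) {
			max_load = rqs[i].raw_weighted_load;
			busiest = (int)i;
		}
	}
	return busiest;
}

/* A queue can only give up a task if it holds more than the running one. */
int toy_can_pull_from(const struct toy_rq *rq)
{
	return rq->nr_running > 1;
}
```

With P0 = {1 task, TOY_WEIGHT_NICE_M20}, P1 = {2 tasks, 2 * TOY_WEIGHT_NICE_0} and P2, P3 empty, toy_busiest_by_weight() selects P0, from which nothing can be pulled, so P1's second task is never considered.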
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] sched: make sure busiest group and run queue are pullable
2006-03-25 1:53 ` Peter Williams
@ 2006-03-25 3:40 ` Peter Williams
0 siblings, 0 replies; 14+ messages in thread
From: Peter Williams @ 2006-03-25 3:40 UTC (permalink / raw)
To: Andrew Morton
Cc: Siddha, Suresh B, Ingo Molnar, Nick Piggin, Con Kolivas,
Chen, Kenneth W, Mike Galbraith, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1910 bytes --]
Peter Williams wrote:
> Peter Williams wrote:
>> Siddha, Suresh B wrote:
>>> more issues with smpnice patch...
>>>
>>> a) consider a 4-way system (simple SMP system with no HT and cores)
>>> scenario
>>> where a high priority task (nice -20) is running on P0 and two normal
>>> priority tasks running on P1. load balance with smp nice code
>>> will never be able to detect an imbalance and hence will never move
>>> one of the normal priority tasks on P1 to idle cpus P2 or P3.
>>
>> Why?
>
> OK, I think I know why. The load balancing code will always decide that
> P0 is the busiest CPU, right?
Attached is a patch that addresses this problem. The strategies
employed are:
1. for find_busiest_group() only consider groups that have at least one
CPU with more than one task running as candidates for "busiest", and
2. for find_busiest_queue() only consider queues that have more than one
running task as candidates for "busiest".
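The two strategies can be sketched outside the kernel as follows. This is a simplified illustration only: sketch_rq and both function names are made up for the example, and the attached patch remains the authoritative version.

```c
#include <stddef.h>

/* Simplified stand-in for a run queue; not the kernel's runqueue_t. */
struct sketch_rq {
	unsigned int nr_running;
	unsigned long raw_weighted_load;
};

/*
 * Strategy 2: return the index of the heaviest queue among those with
 * more than one running task, or -1 if no queue qualifies.
 */
int sketch_find_busiest_queue(const struct sketch_rq *rqs, size_t n)
{
	unsigned long max_load = 0;
	int busiest = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (rqs[i].nr_running > 1 &&
		    rqs[i].raw_weighted_load > max_load) {
			max_load = rqs[i].raw_weighted_load;
			busiest = (int)i;
		}
	}
	return busiest;
}

/*
 * Strategy 1 helper: count the CPUs in a group with nr_running > 1.
 * A group is a candidate for "busiest" only if this is non-zero.
 */
unsigned int sketch_nr_loaded_cpus(const struct sketch_rq *rqs, size_t n)
{
	unsigned int nr_loaded_cpus = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (rqs[i].nr_running > 1)
			++nr_loaded_cpus;
	}
	return nr_loaded_cpus;
}
```

In the 4-way scenario the heavily weighted single-task queue on P0 is skipped and P1 is chosen; a group whose CPUs each run at most one task yields a count of zero and is never picked as busiest.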
I think that the overhead saved by abandoning earlier those load
balancing attempts that would eventually (most probably -- see next
paragraph) be abandoned anyway will compensate for the extra overhead
introduced in these functions.
I think that the only likely behavioural change for a system where all
tasks have nice==0 is this: without these checks there is a small chance
that a "busiest" queue with only one runnable task when the tests are
made (and for which move_tasks() would otherwise not move any tasks)
acquires extra runnable tasks before the locks are taken in preparation
for calling move_tasks(), so that load balancing actually takes place.
I think that this effect can be safely ignored.
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
[-- Attachment #2: smpnice-modify-busiest-searches --]
[-- Type: text/plain, Size: 1646 bytes --]
Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c 2006-03-25 13:43:06.000000000 +1100
+++ MM-2.6.X/kernel/sched.c 2006-03-25 13:56:37.000000000 +1100
@@ -2115,6 +2115,7 @@ find_busiest_group(struct sched_domain *
int local_group;
int i;
unsigned long sum_nr_running, sum_weighted_load;
+ unsigned int nr_loaded_cpus = 0; /* where nr_running > 1 */
local_group = cpu_isset(this_cpu, group->cpumask);
@@ -2135,6 +2136,8 @@ find_busiest_group(struct sched_domain *
avg_load += load;
sum_nr_running += rq->nr_running;
+ if (rq->nr_running > 1)
+ ++nr_loaded_cpus;
sum_weighted_load += rq->raw_weighted_load;
}
@@ -2149,7 +2152,7 @@ find_busiest_group(struct sched_domain *
this = group;
this_nr_running = sum_nr_running;
this_load_per_task = sum_weighted_load;
- } else if (avg_load > max_load) {
+ } else if (nr_loaded_cpus && avg_load > max_load) {
max_load = avg_load;
busiest = group;
busiest_nr_running = sum_nr_running;
@@ -2258,16 +2261,16 @@ out_balanced:
static runqueue_t *find_busiest_queue(struct sched_group *group,
enum idle_type idle)
{
- unsigned long load, max_load = 0;
- runqueue_t *busiest = NULL;
+ unsigned long max_load = 0;
+ runqueue_t *busiest = NULL, *rqi;
int i;
for_each_cpu_mask(i, group->cpumask) {
- load = weighted_cpuload(i);
+ rqi = cpu_rq(i);
- if (load > max_load) {
- max_load = load;
- busiest = cpu_rq(i);
+ if (rqi->nr_running > 1 && rqi->raw_weighted_load > max_load) {
+ max_load = rqi->raw_weighted_load;
+ busiest = rqi;
}
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: more smpnice patch issues
2006-03-24 23:45 ` more smpnice patch issues Siddha, Suresh B
2006-03-25 0:56 ` Peter Williams
@ 2006-03-26 23:24 ` Peter Williams
2006-03-26 23:43 ` [PATCH] sched: smpnice prevent integer arithmetic wrap problems Peter Williams
1 sibling, 1 reply; 14+ messages in thread
From: Peter Williams @ 2006-03-26 23:24 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Andrew Morton, Ingo Molnar, Nick Piggin, Con Kolivas,
Chen, Kenneth W, Mike Galbraith, linux-kernel
Siddha, Suresh B wrote:
> more issues with smpnice patch...
>
> a) consider a 4-way system (simple SMP system with no HT and cores) scenario
> where a high priority task (nice -20) is running on P0 and two normal
> priority tasks running on P1. load balance with smp nice code
> will never be able to detect an imbalance and hence will never move one of
> the normal priority tasks on P1 to idle cpus P2 or P3.
Fix already sent.
>
> b) smpnice seems to break this patch..
>
> [PATCH] sched: allow the load to grow upto its cpu_power
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c117f1b4d14380baeed9c883f765ee023da8761
>
> example scenario for this case: consider a numa system with two nodes, each
> node containing four processors. if there are two processes in node-0 and with
> node-1 being completely idle, your patch will move one of those processes to
> node-1 whereas the previous behavior will retain those two processes in node-0..
> (in this case, in your code max_load will be less than busiest_load_per_task)
I think that the patch I sent to address a) above will also fix this
problem as find_busiest_group() will no longer find node-0 as the
busiest group unless both of the processes in node-0 are on the same
CPU. This is because it now only considers groups that have at least
one CPU with more than one running task as candidates for being the
busiest group.
Implicit in this is the assumption that it's OK to move one of the tasks
from node-0 to node-1 if they're both on the same CPU within node-0.
Could you confirm this is OK?
Thanks,
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] sched: smpnice prevent integer arithmetic wrap problems
2006-03-26 23:24 ` more smpnice patch issues Peter Williams
@ 2006-03-26 23:43 ` Peter Williams
0 siblings, 0 replies; 14+ messages in thread
From: Peter Williams @ 2006-03-26 23:43 UTC (permalink / raw)
To: Siddha, Suresh B, Andrew Morton
Cc: Ingo Molnar, Nick Piggin, Con Kolivas, Chen, Kenneth W,
Mike Galbraith, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2425 bytes --]
Peter Williams wrote:
> Siddha, Suresh B wrote:
>> more issues with smpnice patch...
>>
>> a) consider a 4-way system (simple SMP system with no HT and cores)
>> scenario
>> where a high priority task (nice -20) is running on P0 and two normal
>> priority tasks running on P1. load balance with smp nice code
>> will never be able to detect an imbalance and hence will never move
>> one of the normal priority tasks on P1 to idle cpus P2 or P3.
>
> Fix already sent.
>
>>
>> b) smpnice seems to break this patch..
>>
>> [PATCH] sched: allow the load to grow upto its cpu_power
>> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c117f1b4d14380baeed9c883f765ee023da8761
>>
>>
>> example scenario for this case: consider a numa system with two nodes,
>> each
>> node containing four processors. if there are two processes in node-0
>> and with
>> node-1 being completely idle, your patch will move one of those
>> processes to
>> node-1 whereas the previous behavior will retain those two processes
>> in node-0..
>> (in this case, in your code max_load will be less than
>> busiest_load_per_task)
>
> I think that the patch I sent to address a) above will also fix this
> problem as find_busiest_queue() will no longer find node-0 as the
> busiest group unless both of the processes in node-0 are on the same
> CPU. This is because it now only considers groups that have at least
> one CPU with more than one running task as candidates for being the
> busiest group.
>
> Implicit in this is the assumption that it's OK to move one of the tasks
> from node-0 to node-1 if they're both on the same CPU within node-0.
>
> Could you confirm this is OK?
It looks like my coffee was slow kicking in this morning :-)
When I looked at the code more carefully I realized that your
suggestion re comparing avg_load and busiest_load_per_task is needed to
protect the calculation of max_pull from integer arithmetic wrapping
problems. There was a big clue to this need in the comment above the
calculation of max_pull that I failed to read :-(
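For readers unfamiliar with the wrap problem, here is a generic sketch of it. The exact max_pull expression is not shown in this thread, so the functions below are hypothetical and only illustrate why an unsigned subtraction of a per-task load needs a guard in front of it.

```c
/*
 * Naive version: subtracts without checking. When load_per_task is
 * larger than load, the unsigned result wraps to a very large value.
 */
unsigned long naive_headroom(unsigned long load, unsigned long load_per_task)
{
	return load - load_per_task;
}

/*
 * Guarded version: a would-be-negative difference is reported as zero,
 * i.e. "nothing to pull", instead of wrapping.
 */
unsigned long guarded_headroom(unsigned long load, unsigned long load_per_task)
{
	if (load <= load_per_task)
		return 0;
	return load - load_per_task;
}
```

With load = 100 and load_per_task = 128 the naive version returns a huge number that would look like an enormous imbalance, while the guarded version returns 0. The patch applies the same idea by bailing out to out_balanced before the subtraction is reached.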
Anyway the attached patch should fix the problem. It should be applied
on top of the other patch.
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
[-- Attachment #2: smpnice-allow-load-up-to-cpu_power --]
[-- Type: text/plain, Size: 890 bytes --]
Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c 2006-03-25 13:56:37.000000000 +1100
+++ MM-2.6.X/kernel/sched.c 2006-03-27 10:15:38.000000000 +1100
@@ -2161,7 +2161,7 @@ find_busiest_group(struct sched_domain *
group = group->next;
} while (group != sd->groups);
- if (!busiest || this_load >= max_load || busiest_nr_running <= 1)
+ if (!busiest || this_load >= max_load)
goto out_balanced;
avg_load = (SCHED_LOAD_SCALE * total_load) / total_pwr;
@@ -2171,6 +2171,9 @@ find_busiest_group(struct sched_domain *
goto out_balanced;
busiest_load_per_task /= busiest_nr_running;
+
+ if (avg_load <= busiest_load_per_task)
+ goto out_balanced;
/*
* We're trying to get all the cpus to the average_load, so we don't
* want to push ourselves above the average load, nor do we wish to
^ permalink raw reply [flat|nested] 14+ messages in thread