* cross-cpu balancing with the new scheduler
@ 2002-01-13 17:01 Manfred Spraul
2002-01-14 2:19 ` Rusty Russell
2002-01-14 6:10 ` Anton Blanchard
0 siblings, 2 replies; 9+ messages in thread
From: Manfred Spraul @ 2002-01-13 17:01 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
Is it possible that the inter-cpu balancing is broken in 2.5.2-pre11?
eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
$ nice -19 ./eatcpu &
<wait>
$ nice -19 ./eatcpu &
<wait>
$ ./eatcpu &
IMHO it should be:
* both niced processes run on one cpu;
* the non-niced process runs with a 100% timeslice.
But it's the other way around: one niced process runs with 100%,
the non-niced process with 50%, and the second niced process with 50%.
--
Manfred
* Re: cross-cpu balancing with the new scheduler
2002-01-13 17:01 cross-cpu balancing with the new scheduler Manfred Spraul
@ 2002-01-14 2:19 ` Rusty Russell
2002-01-14 2:49 ` Davide Libenzi
2002-01-14 6:10 ` Anton Blanchard
1 sibling, 1 reply; 9+ messages in thread
From: Rusty Russell @ 2002-01-14 2:19 UTC (permalink / raw)
To: Manfred Spraul; +Cc: mingo, linux-kernel
On Sun, 13 Jan 2002 18:01:40 +0100
Manfred Spraul <manfred@colorfullife.com> wrote:
> Is it possible that the inter-cpu balancing is broken in 2.5.2-pre11?
>
> eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
>
> $ nice -19 ./eatcpu &
> <wait>
> $ nice -19 ./eatcpu &
> <wait>
> $ ./eatcpu &
>
> IMHO it should be
> * both niced processes run on one cpu;
> * the non-niced process runs with a 100% timeslice.
>
> But it's the other way around:
> One niced process runs with 100%. The non-niced process with 50%, and
> the second niced process with 50%.
This could be fixed by making "nr_running" closer to a "priority sum".
Ingo?
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
* Re: cross-cpu balancing with the new scheduler
2002-01-14 2:19 ` Rusty Russell
@ 2002-01-14 2:49 ` Davide Libenzi
2002-01-14 4:37 ` Rusty Russell
2002-01-14 15:39 ` Manfred Spraul
0 siblings, 2 replies; 9+ messages in thread
From: Davide Libenzi @ 2002-01-14 2:49 UTC (permalink / raw)
To: Rusty Russell; +Cc: Manfred Spraul, mingo, linux-kernel
On Mon, 14 Jan 2002, Rusty Russell wrote:
> On Sun, 13 Jan 2002 18:01:40 +0100
> Manfred Spraul <manfred@colorfullife.com> wrote:
>
> > Is it possible that the inter-cpu balancing is broken in 2.5.2-pre11?
> >
> > eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
> >
> > $ nice -19 ./eatcpu &
> > <wait>
> > $ nice -19 ./eatcpu &
> > <wait>
> > $ ./eatcpu &
> >
> > IMHO it should be
> > * both niced processes run on one cpu;
> > * the non-niced process runs with a 100% timeslice.
> >
> > But it's the other way around:
> > One niced process runs with 100%. The non-niced process with 50%, and
> > the second niced process with 50%.
>
> This could be fixed by making "nr_running" closer to a "priority sum".
I have a very simple phrase for when QA bugs me about these corner cases:
"As Designed"
It's much, much better than adding code and "Return To QA" :-)
I tried priority balancing in BMQS, but I still prefer "As Designed"...
- Davide
* Re: cross-cpu balancing with the new scheduler
2002-01-14 2:49 ` Davide Libenzi
@ 2002-01-14 4:37 ` Rusty Russell
2002-01-14 15:39 ` Manfred Spraul
1 sibling, 0 replies; 9+ messages in thread
From: Rusty Russell @ 2002-01-14 4:37 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Manfred Spraul, mingo, linux-kernel
In message <Pine.LNX.4.40.0201131842570.937-100000@blue1.dev.mcafeelabs.com> you write:
> On Mon, 14 Jan 2002, Rusty Russell wrote:
>
> > This could be fixed by making "nr_running" closer to a "priority sum".
>
> I've a very simple phrase when QA is bugging me with these corner cases :
>
> "As Designed"
My point is: it's just a heuristic number. It currently reflects the
number of tasks on the runqueue, but there's no reason it *has to*
(except the name, of course).
1) The nr_running() function can return rq->active->nr_active +
rq->expired->nr_active; anyway, it's only used as an "am I
idle?" test.
2) The test inside schedule() can be replaced by checking the result
of sched_find_first_zero_bit() (I have a patch which does this
to good effect, but for other reasons).
The other uses of nr_running are all "how long is this runqueue?" for
rebalancing, and Ingo *already* modifies his use of this number,
using the "prev_nr_running" hack.
Hope that clarifies,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
* Re: cross-cpu balancing with the new scheduler
2002-01-13 17:01 cross-cpu balancing with the new scheduler Manfred Spraul
2002-01-14 2:19 ` Rusty Russell
@ 2002-01-14 6:10 ` Anton Blanchard
2002-01-15 16:37 ` Ingo Molnar
1 sibling, 1 reply; 9+ messages in thread
From: Anton Blanchard @ 2002-01-14 6:10 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Ingo Molnar, linux-kernel
> eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
>
> $ nice -19 ./eatcpu &
> <wait>
> $ nice -19 ./eatcpu &
> <wait>
> $ ./eatcpu &
>
> IMHO it should be
> * both niced processes run on one cpu;
> * the non-niced process runs with a 100% timeslice.
>
> But it's the other way around:
> One niced process runs with 100%. The non-niced process with 50%, and
> the second niced process with 50%.
Rusty and I were talking about this recently. Would it make sense for
the load balancer to use a weighted queue length (sum up all priorities
in the queue?) instead of just balancing the queue length?
Anton
* Re: cross-cpu balancing with the new scheduler
2002-01-14 2:49 ` Davide Libenzi
2002-01-14 4:37 ` Rusty Russell
@ 2002-01-14 15:39 ` Manfred Spraul
2002-01-14 15:50 ` Davide Libenzi
2002-01-14 17:44 ` Ingo Molnar
1 sibling, 2 replies; 9+ messages in thread
From: Manfred Spraul @ 2002-01-14 15:39 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Rusty Russell, mingo, linux-kernel
Davide Libenzi wrote:
>
> I've a very simple phrase when QA is bugging me with these corner cases :
>
> "As Designed"
>
> It's much much better than adding code and "Return To QA" :-)
> I tried priority balancing in BMQS but i still prefer "As Designed" ...
>
Another test, now with 4 processes (dual cpu):
#nice -n 19 ./eatcpu&
#nice -n 19 ./eatcpu&
#./eatcpu&
#nice -n -19 ./eatcpu&
And the top output:
<<<<<<
73 processes: 68 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 100.0% user, 0.0% system, 100.0% nice, 0.0% idle
CPU1 states: 98.0% user, 2.0% system, 33.0% nice, 0.0% idle
[snip]
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1163 root      39  19   396  396   324 R N  99.5  0.1   0:28 eatcpu
 1164 root      39  19   396  396   324 R N  33.1  0.1   0:11 eatcpu
 1165 root      39   0   396  396   324 R    33.1  0.1   0:07 eatcpu
 1166 root      39 -19   396  396   324 R <  31.3  0.1   0:06 eatcpu
 1168 manfred    1   0   980  976   768 R     2.7  0.2   0:00 top
[snip]
The niced process still has its own cpu, and the "nice -19" process has
33% of the second cpu.
IMHO that's buggy: 4 running processes, 1 on cpu0, 3 on cpu1.
--
Manfred
* Re: cross-cpu balancing with the new scheduler
2002-01-14 15:39 ` Manfred Spraul
@ 2002-01-14 15:50 ` Davide Libenzi
2002-01-14 17:44 ` Ingo Molnar
1 sibling, 0 replies; 9+ messages in thread
From: Davide Libenzi @ 2002-01-14 15:50 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Rusty Russell, Ingo Molnar, lkml
On Mon, 14 Jan 2002, Manfred Spraul wrote:
> Davide Libenzi wrote:
> >
> > I've a very simple phrase when QA is bugging me with these corner cases :
> >
> > "As Designed"
> >
> > It's much much better than adding code and "Return To QA" :-)
> > I tried priority balancing in BMQS but i still prefer "As Designed" ...
> >
> Another test, now with 4 processes (dual cpu):
> #nice -n 19 ./eatcpu&
> #nice -n 19 ./eatcpu&
> #./eatcpu&
> #nice -n -19 ./eatcpu&
>
> And the top output:
> <<<<<<
> 73 processes: 68 sleeping, 5 running, 0 zombie, 0 stopped
> CPU0 states: 100.0% user, 0.0% system, 100.0% nice, 0.0% idle
> CPU1 states: 98.0% user, 2.0% system, 33.0% nice, 0.0% idle
> [snip]
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 1163 root 39 19 396 396 324 R N 99.5 0.1 0:28 eatcpu
> 1164 root 39 19 396 396 324 R N 33.1 0.1 0:11 eatcpu
> 1165 root 39 0 396 396 324 R 33.1 0.1 0:07 eatcpu
> 1166 root 39 -19 396 396 324 R < 31.3 0.1 0:06 eatcpu
> 1168 manfred 1 0 980 976 768 R 2.7 0.2 0:00 top
> [snip]
>
> The niced process still has its own cpu, and the "nice -19" process has
> 33% of the second cpu.
>
> IMHO that's buggy: 4 running processes, 1 on cpu0, 3 on cpu1.
Yes, a long run at 3:1 is no longer "As Designed" :-)
- Davide
* Re: cross-cpu balancing with the new scheduler
2002-01-14 15:39 ` Manfred Spraul
2002-01-14 15:50 ` Davide Libenzi
@ 2002-01-14 17:44 ` Ingo Molnar
1 sibling, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-01-14 17:44 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Davide Libenzi, Rusty Russell, linux-kernel
(it turns out that Manfred used 2.5.2-pre11-vanilla for this test.)
On Mon, 14 Jan 2002, Manfred Spraul wrote:
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 1163 root 39 19 396 396 324 R N 99.5 0.1 0:28 eatcpu
> 1164 root 39 19 396 396 324 R N 33.1 0.1 0:11 eatcpu
> 1165 root 39 0 396 396 324 R 33.1 0.1 0:07 eatcpu
> 1166 root 39 -19 396 396 324 R < 31.3 0.1 0:06 eatcpu
The load balancer in 2.5.2-pre11 is known to be broken; please try the
-H7 patch to get the latest code.
Ingo
* Re: cross-cpu balancing with the new scheduler
2002-01-14 6:10 ` Anton Blanchard
@ 2002-01-15 16:37 ` Ingo Molnar
0 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-01-15 16:37 UTC (permalink / raw)
To: Anton Blanchard; +Cc: Manfred Spraul, linux-kernel
On Mon, 14 Jan 2002, Anton Blanchard wrote:
> Rusty and I were talking about this recently. Would it make sense for
> the load balancer to use a weighted queue length (sum up all
> priorities in the queue?) instead of just balancing the queue length?
Something like this would work, but it's not an easy task to *truly*
balance priorities (or timeslice lengths instead) between CPUs.
E.g., in the following situation:
CPU#0      CPU#1
prio 1     prio 1
prio 1     prio 1
prio 20    prio 1

if the load balancer only looks at the tail of the runqueue, then it
finds that it cannot balance things any better: moving the prio 20 task
over to CPU#1 would not create a better-balanced situation. If it looked
at other runqueue entries, then it could create the following,
better-balanced situation:

CPU#0      CPU#1
prio 20    prio 1
           prio 1
           prio 1
           prio 1
           prio 1
The solution would be to search the whole runqueue and migrate the task
with the shortest timeslice, but that is a pretty slow and
cache-intensive thing to do.
Ingo