* [PATCH] sched.c: Be a bit more conservative in SMP
@ 2006-09-03 13:41 Vincent Pelletier
2006-09-03 17:10 ` Vincent Pelletier
2006-09-19 13:39 ` Ludovic
0 siblings, 2 replies; 11+ messages in thread
From: Vincent Pelletier @ 2006-09-03 13:41 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2139 bytes --]
I've often seen the following use case happening on the few Linux SMP boxes
I have access to: one process eats one CPU because it has a big
computation to do, all other CPUs are idle, and yet the process keeps hopping
from one CPU to another.
This patch is a quick attempt to make this behaviour disappear without having
to bind all processes manually with taskset.
I don't know if there is any practical performance increase (although I
believe there is locally).
The principle of the patch is simple:
When calculating the load of the "source" CPU (the one the process is on),
subtract one from the number of running processes so we don't count the
process to be balanced.
As I have only known sched.c for 5 minutes, I added a max(..., 0) to make
sure the load can't be negative if the function happens to be called on a
CPU running only idle tasks. No idea if it can actually happen.
I tested its efficiency this way:
Before:
- Start a command eating one full CPU on an idle SMP machine.
  I used dd if=/dev/urandom of=/dev/null.
- Wait ~30 seconds, and see that it switched to another CPU.
After:
- Repeat the same test and see that the process does not switch to another
  CPU (the patch does what it's meant to).
- Start a second dd and bind both to the same CPU with taskset, then free
  one of them (allow it to use 2 CPUs, including the one it can already
  access) and see that the task gets moved to the second CPU (load balancing
  still works).
Disclaimer:
This patch is just the result of a 5-minute hacking rush. Although I think
it technically works, I'm no SMP expert.
--- linux-2.6-2.6.17/kernel/sched.c	2006-06-18 03:49:35.000000000 +0200
+++ linux-2.6-2.6.17-conservative/kernel/sched.c	2006-09-03 13:18:11.000000000 +0200
@@ -952,7 +952,7 @@ void kick_process(task_t *p)
 static inline unsigned long source_load(int cpu, int type)
 {
 	runqueue_t *rq = cpu_rq(cpu);
-	unsigned long load_now = rq->nr_running * SCHED_LOAD_SCALE;
+	unsigned long load_now = (max(rq->nr_running - 1, 0)) * SCHED_LOAD_SCALE;
 	if (type == 0)
 		return load_now;
--
Vincent Pelletier
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-03 13:41 [PATCH] sched.c: Be a bit more conservative in SMP Vincent Pelletier
@ 2006-09-03 17:10 ` Vincent Pelletier
2006-09-06 23:30 ` Vincent Pelletier
2006-09-19 13:39 ` Ludovic
1 sibling, 1 reply; 11+ messages in thread
From: Vincent Pelletier @ 2006-09-03 17:10 UTC (permalink / raw)
To: linux-kernel; +Cc: mingo
Forgot the Signed-off-by line in the previous mail. Reposting the same patch
just in case. CC to maintainer as advised in the FAQ.
Signed-off-by: Vincent Pelletier <vincent.plr@wanadoo.fr>
--- linux-2.6-2.6.17/kernel/sched.c	2006-06-18 03:49:35.000000000 +0200
+++ linux-2.6-2.6.17-conservative/kernel/sched.c	2006-09-03 13:18:11.000000000 +0200
@@ -952,7 +952,7 @@ void kick_process(task_t *p)
 static inline unsigned long source_load(int cpu, int type)
 {
 	runqueue_t *rq = cpu_rq(cpu);
-	unsigned long load_now = rq->nr_running * SCHED_LOAD_SCALE;
+	unsigned long load_now = (max(rq->nr_running - 1, 0)) * SCHED_LOAD_SCALE;
 	if (type == 0)
 		return load_now;
--
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-03 17:10 ` Vincent Pelletier
@ 2006-09-06 23:30 ` Vincent Pelletier
2006-09-19 14:06 ` Ludovic Drolez
0 siblings, 1 reply; 11+ messages in thread
From: Vincent Pelletier @ 2006-09-06 23:30 UTC (permalink / raw)
To: linux-kernel; +Cc: mingo
I found one possible drawback to this change:
When running n+1 processes (n = number of CPUs), one process takes a whole
CPU and the other two share another one. And, because of this patch, all
processes stay on their own CPU, so one always gets 100% of a CPU's power
while the two others get 50% each.
In the current implementation, one of the two processes sharing a CPU would
migrate to the other CPU, and so on, somehow sharing CPU time among them.
Is this a feature or a side effect of the current implementation?
I'll do some tests soon to see which version gives better performance at a
higher level than just process migration cost - if there is a difference at all.
--
Vincent Pelletier
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-03 13:41 [PATCH] sched.c: Be a bit more conservative in SMP Vincent Pelletier
2006-09-03 17:10 ` Vincent Pelletier
@ 2006-09-19 13:39 ` Ludovic
1 sibling, 0 replies; 11+ messages in thread
From: Ludovic @ 2006-09-19 13:39 UTC (permalink / raw)
To: linux-kernel
Vincent Pelletier <subdino2004 <at> yahoo.fr> writes:
> I've often seen the following use case happening on the few Linux SMP boxes
> I have access to: one process eats one CPU because it has a big
> computation to do, all other CPUs are idle, and yet the process keeps hopping
> from one CPU to another.
> This patch is a quick attempt to make this behaviour disappear without having
> to bind all processes manually with taskset.
> I don't know if there is any practical performance increase (although I
> believe there is locally).
Hi!
Do you know if your patch has been included somewhere?
We have the same problem on an HPCC here with 4 CPUs per motherboard, and I
don't like playing with taskset (moreover, performance under Windows is
*much* better without any tuning, shame on us). It would be nice to see less
migration when it's not needed...
Cheers,
Ludovic.
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-06 23:30 ` Vincent Pelletier
@ 2006-09-19 14:06 ` Ludovic Drolez
2006-09-19 17:50 ` Antonio Vargas
0 siblings, 1 reply; 11+ messages in thread
From: Ludovic Drolez @ 2006-09-19 14:06 UTC (permalink / raw)
To: linux-kernel
Vincent Pelletier <vincent.plr <at> wanadoo.fr> writes:
> I'll do some tests soon to see which version gives better performance at a
> higher level than just process migration cost - if different at all.
I think that your patch should improve performance, because process
migrations are expensive (cache misses) and should be avoided when not
really necessary.
Cheers,
Ludovic.
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-19 14:06 ` Ludovic Drolez
@ 2006-09-19 17:50 ` Antonio Vargas
2006-09-20 7:42 ` Ludovic Drolez
0 siblings, 1 reply; 11+ messages in thread
From: Antonio Vargas @ 2006-09-19 17:50 UTC (permalink / raw)
To: Ludovic Drolez; +Cc: linux-kernel
On 9/19/06, Ludovic Drolez <ldrolez@linbox.com> wrote:
> Vincent Pelletier <vincent.plr <at> wanadoo.fr> writes:
> > I'll do some tests soon to see which version gives better performance at a
> > higher level than just process migration cost - if different at all.
>
> I think that your patch should improve the performance because process
> migrations are expensive (cache miss) and should be avoided when not
> really necessary.
>
> Cheers,
>
> Ludovic.
>
A variant on this theme would be (not tested or anything, just a
random idea to consider):
1. find out if the process is a cpu-hog; if not, ignore it
2. find out somehow how much time this process has spent on its current cpu
3. then, instead of always subtracting 1 from the current load on the
current cpu, subtract for example 1...0 when it has been running for 0 to 60
seconds... this way cpu hogs would only rotate slowly?
in code:
number_to_sub_from_queue_load =
	(256 - min(256, time_from_last_change_of_cpu)) >> 8;
somehow managing to get fixed-point load levels on the runqueues would
make this work better....
--
Greetz, Antonio Vargas aka winden of network
http://network.amigascne.org/
windNOenSPAMntw@gmail.com
thesameasabove@amigascne.org
Every day, every year
you have to work
you have to study
you have to scene.
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-19 17:50 ` Antonio Vargas
@ 2006-09-20 7:42 ` Ludovic Drolez
2006-09-20 16:26 ` Poor scheduling when not loaded at 100% (Was: [PATCH] sched.c: Be a bit more conservative in SMP) Ludovic Drolez
2006-09-21 18:36 ` [PATCH] sched.c: Be a bit more conservative in SMP Vincent Pelletier
0 siblings, 2 replies; 11+ messages in thread
From: Ludovic Drolez @ 2006-09-20 7:42 UTC (permalink / raw)
To: Antonio Vargas; +Cc: linux-kernel, subdino2004
Antonio Vargas wrote:
> A variant on this theme would be (not tested or anything, just a
> random idea to consider):
>
> 1. find out if the process is a cpu-hog; if not, ignore it
>
> 2. find out somehow how much time this process has spent on its current cpu
>
> 3. then, instead of always subtracting 1 from the current load on the
> current cpu, subtract for example 1...0 when it has been running for 0 to 60
> seconds... this way cpu hogs would only rotate slowly?
>
> in code:
>
> number_to_sub_from_queue_load =
> 	(256 - min(256, time_from_last_change_of_cpu)) >> 8;
>
> somehow managing to get fixed-point load levels on the runqueues would
> make this work better....
>
>
Yes ! That might be a better idea !
In fact, I tested the 1st patch on our cluster (finite element computation
on 8 CPUs):
- Under Windows: 875 seconds
- Linux 2.6.16 : 1019 s
- Linux 2.6.16 + manual taskset : 842 s
- Linux 2.6.16 + Vincent's patch : 1373 s :-(
If you find time to write a patch, Antonio, I would be pleased to try it !
Cheers,
--
Ludovic DROLEZ
http://lrs.linbox.org - Free asset management software
* Poor scheduling when not loaded at 100% (Was: [PATCH] sched.c: Be a bit more conservative in SMP)
2006-09-20 7:42 ` Ludovic Drolez
@ 2006-09-20 16:26 ` Ludovic Drolez
2006-09-21 18:36 ` [PATCH] sched.c: Be a bit more conservative in SMP Vincent Pelletier
1 sibling, 0 replies; 11+ messages in thread
From: Ludovic Drolez @ 2006-09-20 16:26 UTC (permalink / raw)
To: linux-kernel
Ludovic Drolez <ldrolez <at> linbox.com> writes:
> In fact, I tested the 1st patch on our cluster (Finite elements computing on 8
> CPUs):
> - Under Windows: 875 seconds
> - Linux 2.6.16 : 1019 s
> - Linux 2.6.16 + manual taskset : 842 s
> - Linux 2.6.16 + Vincent's patch : 1373 s
Does anyone have an idea why scheduling is poor when processes don't use the
whole CPU?
In the above example, we have 4 processes on 4 processors, each using about
40% of the CPU (computing and waiting for network packets).
1- If taskset is not used: CPU0 is used at 80%, and the 3 others at 30%. The
tasks are constantly migrated between cores -> poor performance (1019 s);
Windows does better :-(
2- If taskset is used: all CPUs run 1 process and are used at 40%. No
migration -> high performance (842 s), better than Windows :-)
I tried to play with the 'migration_cost' kernel parameter, but it did not
help. By default, on the bi-Xeon dual-core motherboard (Dell 1855),
migration_cost=1600, and trying values up to 200000 did not improve
performance...
Any ideas?
Ludovic Drolez.
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-20 7:42 ` Ludovic Drolez
2006-09-20 16:26 ` Poor scheduling when not loaded at 100% (Was: [PATCH] sched.c: Be a bit more conservative in SMP) Ludovic Drolez
@ 2006-09-21 18:36 ` Vincent Pelletier
2006-09-22 7:24 ` Ludovic Drolez
1 sibling, 1 reply; 11+ messages in thread
From: Vincent Pelletier @ 2006-09-21 18:36 UTC (permalink / raw)
To: Ludovic Drolez; +Cc: Antonio Vargas, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 920 bytes --]
Le mercredi 20 septembre 2006 09:42, Ludovic Drolez a écrit :
> Yes ! That might be a better idea !
> In fact, I tested the 1st patch on our cluster (Finite elements computing
> on 8 CPUs):
> - Under Windows: 875 seconds
> - Linux 2.6.16 : 1019 s
> - Linux 2.6.16 + manual taskset : 842 s
> - Linux 2.6.16 + Vincent's patch : 1373 s :-(
I was afraid of this :/.
I did some quick tests and got non-significant results. I tried building a
kernel with different make -j parameters, and there were only a few seconds
of difference, not always in favour of the same version.
I find it strange that you get such horrible results...
Maybe I was completely wrong in my assumption that one running process
always has an impact of 1, which would have made the scheduler underestimate
the load on one CPU and put too many processes on it, without moving them
afterwards.
--
Vincent Pelletier
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-21 18:36 ` [PATCH] sched.c: Be a bit more conservative in SMP Vincent Pelletier
@ 2006-09-22 7:24 ` Ludovic Drolez
2006-09-22 12:31 ` Antonio Vargas
0 siblings, 1 reply; 11+ messages in thread
From: Ludovic Drolez @ 2006-09-22 7:24 UTC (permalink / raw)
To: Vincent Pelletier; +Cc: linux-kernel
Vincent Pelletier wrote:
> Maybe I was completely wrong in my assumption that one running process
> always has an impact of 1, which would have made the scheduler underestimate
> the load on one CPU and put too many processes on it, without moving them
> afterwards.
Yes, maybe that's the problem, since in my bench, one process takes only 40% of
the CPU.
Cheers,
--
Ludovic DROLEZ Linbox / Free&ALter Soft
www.linbox.com www.linbox.org tel: +33 3 87 50 87 90
152 rue de Grigy - Technopole Metz 2000 57070 METZ
* Re: [PATCH] sched.c: Be a bit more conservative in SMP
2006-09-22 7:24 ` Ludovic Drolez
@ 2006-09-22 12:31 ` Antonio Vargas
0 siblings, 0 replies; 11+ messages in thread
From: Antonio Vargas @ 2006-09-22 12:31 UTC (permalink / raw)
To: Ludovic Drolez; +Cc: Vincent Pelletier, linux-kernel
On 9/22/06, Ludovic Drolez <ludovic.drolez@linbox.com> wrote:
> Vincent Pelletier wrote:
> > Maybe I was completely wrong in my assumption that one running process
> > always has an impact of 1, which would have made the scheduler underestimate
> > the load on one CPU and put too many processes on it, without moving them
> > afterwards.
>
> Yes, maybe that's the problem, since in my bench, one process takes only 40% of
> the CPU.
>
> Cheers,
>
> --
> Ludovic DROLEZ Linbox / Free&ALter Soft
> www.linbox.com www.linbox.org tel: +33 3 87 50 87 90
> 152 rue de Grigy - Technopole Metz 2000 57070 METZ
> -
Provided you have enough memory, a somewhat better way to test this
is to turn off swap, copy the sources to a tmpfs directory and compile
there. Then any disk accesses would only be related to reloading code
pages from the compiler / daemons / shared libs, which having even more
RAM would solve, so that it's all compute-bound. I guess even 1.5 GB of
RAM is plenty for all this, and not so costly nowadays for a
kernel hacker ;)
--
Greetz, Antonio Vargas aka winden of network
http://network.amigascne.org/
windNOenSPAMntw@gmail.com
thesameasabove@amigascne.org
Every day, every year
you have to work
you have to study
you have to scene.