From: Ingo Molnar <mingo@elte.hu>
To: Tong Li <tong.n.li@intel.com>
Cc: linux-kernel@vger.kernel.org, Chris Snook <csnook@redhat.com>
Subject: Re: [RFC] scheduler: improve SMP fairness in CFS
Date: Wed, 25 Jul 2007 14:03:58 +0200
Message-ID: <20070725120358.GA30755@elte.hu>
In-Reply-To: <20070725110159.GA15076@elte.hu>
* Ingo Molnar <mingo@elte.hu> wrote:
> > This patch extends CFS to achieve better fairness for SMPs. For
> > example, with 10 tasks (same priority) on 8 CPUs, it enables each task
> > to receive equal CPU time (80%). [...]
>
> hm, CFS should already offer reasonable long-term SMP fairness. It
> certainly works on a dual-core box, i just started 3 tasks of the same
> priority on 2 CPUs, and on vanilla 2.6.23-rc1 the distribution is
> this:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  7084 mingo     20   0  1576  248  196 R   67  0.0  0:50.13  loop
>  7083 mingo     20   0  1576  244  196 R   66  0.0  0:48.86  loop
>  7085 mingo     20   0  1576  244  196 R   66  0.0  0:49.45  loop
>
> so each task gets a perfect 66% of CPU time.
>
> prior to CFS, we indeed did a 50%/50%/100% split - so for example on
> v2.6.22:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2256 mingo     25   0  1580  248  196 R  100  0.0  1:03.19  loop
>  2255 mingo     25   0  1580  248  196 R   50  0.0  0:31.79  loop
>  2257 mingo     25   0  1580  248  196 R   50  0.0  0:31.69  loop
>
> but CFS has changed that behavior.
>
> I'll check your 10-tasks-on-8-cpus example on an 8-way box too, maybe
> we regressed somewhere ...

ok, i just tried it on an 8-cpu box and indeed, unlike the dual-core
case, the scheduler does not distribute tasks well enough:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2572 mingo     20   0  1576  244  196 R  100  0.0  1:03.61  loop
 2578 mingo     20   0  1576  248  196 R  100  0.0  1:03.59  loop
 2576 mingo     20   0  1576  248  196 R  100  0.0  1:03.52  loop
 2571 mingo     20   0  1576  244  196 R  100  0.0  1:03.46  loop
 2569 mingo     20   0  1576  244  196 R   99  0.0  1:03.36  loop
 2570 mingo     20   0  1576  244  196 R   95  0.0  1:00.55  loop
 2577 mingo     20   0  1576  248  196 R   50  0.0  0:31.88  loop
 2574 mingo     20   0  1576  248  196 R   50  0.0  0:31.87  loop
 2573 mingo     20   0  1576  248  196 R   50  0.0  0:31.86  loop
 2575 mingo     20   0  1576  248  196 R   50  0.0  0:31.86  loop

but this is relatively easy to fix - with the patch below applied, it
looks a lot better:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2681 mingo     20   0  1576  244  196 R   85  0.0  3:51.68  loop
 2688 mingo     20   0  1576  244  196 R   81  0.0  3:46.35  loop
 2682 mingo     20   0  1576  244  196 R   80  0.0  3:43.68  loop
 2685 mingo     20   0  1576  248  196 R   80  0.0  3:45.97  loop
 2683 mingo     20   0  1576  248  196 R   80  0.0  3:40.25  loop
 2679 mingo     20   0  1576  244  196 R   80  0.0  3:33.53  loop
 2680 mingo     20   0  1576  244  196 R   79  0.0  3:43.53  loop
 2686 mingo     20   0  1576  244  196 R   79  0.0  3:39.31  loop
 2687 mingo     20   0  1576  244  196 R   78  0.0  3:33.31  loop
 2684 mingo     20   0  1576  244  196 R   77  0.0  3:27.52  loop

they now nicely converge to the expected 80% long-term CPU usage.
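
as a sanity check on the expected figures: with N equal-priority CPU-bound tasks on M CPUs, the ideal long-term share per task is M/N of a CPU, capped at 100%. a minimal sketch of that arithmetic follows; ideal_share_pct() is a hypothetical helper written for illustration, not a kernel function:

```c
/*
 * Ideal long-term CPU share, in percent, for each of 'ntasks'
 * equal-priority CPU-bound tasks running on 'ncpus' CPUs.
 * Illustrative only; the name and signature are assumptions.
 */
static unsigned int ideal_share_pct(unsigned int ncpus, unsigned int ntasks)
{
	if (ntasks == 0)
		return 0;		/* nothing to run */
	if (ntasks <= ncpus)
		return 100;		/* every task can own a CPU */
	return (100 * ncpus) / ntasks;	/* integer division, rounds down */
}
```

ideal_share_pct(8, 10) gives 80, the target for the 10-tasks-on-8-cpus case, and ideal_share_pct(2, 3) gives 66, matching the dual-core numbers quoted above.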

so, could you please try the patch below, does it work for you too?

	Ingo

--------------------------->
Subject: sched: increase SCHED_LOAD_SCALE_FUZZ
From: Ingo Molnar <mingo@elte.hu>

increase SCHED_LOAD_SCALE_FUZZ, which adds a small amount of
over-balancing, to help distribute CPU-bound tasks more fairly on SMP
systems.

the problem of unfair balancing was noticed and reported by Tong N Li.

10 CPU-bound tasks running on 8 CPUs, v2.6.23-rc1:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2572 mingo     20   0  1576  244  196 R  100  0.0  1:03.61  loop
 2578 mingo     20   0  1576  248  196 R  100  0.0  1:03.59  loop
 2576 mingo     20   0  1576  248  196 R  100  0.0  1:03.52  loop
 2571 mingo     20   0  1576  244  196 R  100  0.0  1:03.46  loop
 2569 mingo     20   0  1576  244  196 R   99  0.0  1:03.36  loop
 2570 mingo     20   0  1576  244  196 R   95  0.0  1:00.55  loop
 2577 mingo     20   0  1576  248  196 R   50  0.0  0:31.88  loop
 2574 mingo     20   0  1576  248  196 R   50  0.0  0:31.87  loop
 2573 mingo     20   0  1576  248  196 R   50  0.0  0:31.86  loop
 2575 mingo     20   0  1576  248  196 R   50  0.0  0:31.86  loop

v2.6.23-rc1 + patch:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2681 mingo     20   0  1576  244  196 R   85  0.0  3:51.68  loop
 2688 mingo     20   0  1576  244  196 R   81  0.0  3:46.35  loop
 2682 mingo     20   0  1576  244  196 R   80  0.0  3:43.68  loop
 2685 mingo     20   0  1576  248  196 R   80  0.0  3:45.97  loop
 2683 mingo     20   0  1576  248  196 R   80  0.0  3:40.25  loop
 2679 mingo     20   0  1576  244  196 R   80  0.0  3:33.53  loop
 2680 mingo     20   0  1576  244  196 R   79  0.0  3:43.53  loop
 2686 mingo     20   0  1576  244  196 R   79  0.0  3:39.31  loop
 2687 mingo     20   0  1576  244  196 R   78  0.0  3:33.31  loop
 2684 mingo     20   0  1576  244  196 R   77  0.0  3:27.52  loop

so they now nicely converge to the expected 80% long-term CPU usage.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
include/linux/sched.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -681,7 +681,7 @@ enum cpu_idle_type {
 #define SCHED_LOAD_SHIFT	10
 #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)
 
-#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 5)
+#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 1)
 
 #ifdef CONFIG_SMP
 #define SD_LOAD_BALANCE		1	/* Do load balancing on this domain. */
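
For reference, the magnitude of the change can be worked out directly from the macros in the hunk above. The sketch below only evaluates the two shift expressions; FUZZ_OLD, FUZZ_NEW and the two accessor functions are hypothetical names introduced so that both definitions can coexist, and nothing here models how the load balancer consumes the value:

```c
/* Values copied from the hunk above. */
#define SCHED_LOAD_SHIFT	10
#define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)	/* 1024 */

/* Hypothetical names: old and new SCHED_LOAD_SCALE_FUZZ side by side. */
#define FUZZ_OLD	(SCHED_LOAD_SCALE >> 5)	/* 1024 / 32 = 32  */
#define FUZZ_NEW	(SCHED_LOAD_SCALE >> 1)	/* 1024 / 2  = 512 */

/* Small accessors so the values can be checked at run time. */
static long fuzz_old(void) { return FUZZ_OLD; }
static long fuzz_new(void) { return FUZZ_NEW; }
```

So the fuzz grows from 1/32 of SCHED_LOAD_SCALE (32) to 1/2 of it (512), a 16-fold increase.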