From: Ingo Molnar <mingo@elte.hu>
To: Tong Li <tong.n.li@intel.com>
Cc: linux-kernel@vger.kernel.org, Chris Snook <csnook@redhat.com>
Subject: Re: [RFC] scheduler: improve SMP fairness in CFS
Date: Wed, 25 Jul 2007 14:03:58 +0200
Message-ID: <20070725120358.GA30755@elte.hu>
In-Reply-To: <20070725110159.GA15076@elte.hu>
* Ingo Molnar <mingo@elte.hu> wrote:
> > This patch extends CFS to achieve better fairness for SMPs. For
> > example, with 10 tasks (same priority) on 8 CPUs, it enables each task
> > to receive equal CPU time (80%). [...]
>
> hm, CFS should already offer reasonable long-term SMP fairness. It
> certainly works on a dual-core box, i just started 3 tasks of the same
> priority on 2 CPUs, and on vanilla 2.6.23-rc1 the distribution is
> this:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>  7084 mingo     20   0  1576  248  196 R   67  0.0   0:50.13 loop
>  7083 mingo     20   0  1576  244  196 R   66  0.0   0:48.86 loop
>  7085 mingo     20   0  1576  244  196 R   66  0.0   0:49.45 loop
>
> so each task gets a perfect 66% of CPU time.
>
> prior CFS, we indeed did a 50%/50%/100% split - so for example on
> v2.6.22:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>  2256 mingo     25   0  1580  248  196 R  100  0.0   1:03.19 loop
>  2255 mingo     25   0  1580  248  196 R   50  0.0   0:31.79 loop
>  2257 mingo     25   0  1580  248  196 R   50  0.0   0:31.69 loop
>
> but CFS has changed that behavior.
>
> I'll check your 10-tasks-on-8-cpus example on an 8-way box too, maybe
> we regressed somewhere ...

ok, i just tried it on an 8-cpu box and indeed, unlike the dual-core
case, the scheduler does not distribute tasks well enough:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2572 mingo     20   0  1576  244  196 R  100  0.0   1:03.61 loop
 2578 mingo     20   0  1576  248  196 R  100  0.0   1:03.59 loop
 2576 mingo     20   0  1576  248  196 R  100  0.0   1:03.52 loop
 2571 mingo     20   0  1576  244  196 R  100  0.0   1:03.46 loop
 2569 mingo     20   0  1576  244  196 R   99  0.0   1:03.36 loop
 2570 mingo     20   0  1576  244  196 R   95  0.0   1:00.55 loop
 2577 mingo     20   0  1576  248  196 R   50  0.0   0:31.88 loop
 2574 mingo     20   0  1576  248  196 R   50  0.0   0:31.87 loop
 2573 mingo     20   0  1576  248  196 R   50  0.0   0:31.86 loop
 2575 mingo     20   0  1576  248  196 R   50  0.0   0:31.86 loop

but this is relatively easy to fix - with the patch below applied, it
looks a lot better:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2681 mingo     20   0  1576  244  196 R   85  0.0   3:51.68 loop
 2688 mingo     20   0  1576  244  196 R   81  0.0   3:46.35 loop
 2682 mingo     20   0  1576  244  196 R   80  0.0   3:43.68 loop
 2685 mingo     20   0  1576  248  196 R   80  0.0   3:45.97 loop
 2683 mingo     20   0  1576  248  196 R   80  0.0   3:40.25 loop
 2679 mingo     20   0  1576  244  196 R   80  0.0   3:33.53 loop
 2680 mingo     20   0  1576  244  196 R   79  0.0   3:43.53 loop
 2686 mingo     20   0  1576  244  196 R   79  0.0   3:39.31 loop
 2687 mingo     20   0  1576  244  196 R   78  0.0   3:33.31 loop
 2684 mingo     20   0  1576  244  196 R   77  0.0   3:27.52 loop

they now nicely converge to the expected 80% long-term CPU usage.

so, could you please try the patch below, does it work for you too?
Ingo
--------------------------->
Subject: sched: increase SCHED_LOAD_SCALE_FUZZ
From: Ingo Molnar <mingo@elte.hu>
increase SCHED_LOAD_SCALE_FUZZ, which adds a small amount of
over-balancing, to help distribute CPU-bound tasks more fairly on SMP
systems.
the problem of unfair balancing was noticed and reported by Tong N Li.
10 CPU-bound tasks running on 8 CPUs, v2.6.23-rc1:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2572 mingo     20   0  1576  244  196 R  100  0.0   1:03.61 loop
 2578 mingo     20   0  1576  248  196 R  100  0.0   1:03.59 loop
 2576 mingo     20   0  1576  248  196 R  100  0.0   1:03.52 loop
 2571 mingo     20   0  1576  244  196 R  100  0.0   1:03.46 loop
 2569 mingo     20   0  1576  244  196 R   99  0.0   1:03.36 loop
 2570 mingo     20   0  1576  244  196 R   95  0.0   1:00.55 loop
 2577 mingo     20   0  1576  248  196 R   50  0.0   0:31.88 loop
 2574 mingo     20   0  1576  248  196 R   50  0.0   0:31.87 loop
 2573 mingo     20   0  1576  248  196 R   50  0.0   0:31.86 loop
 2575 mingo     20   0  1576  248  196 R   50  0.0   0:31.86 loop

v2.6.23-rc1 + patch:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2681 mingo     20   0  1576  244  196 R   85  0.0   3:51.68 loop
 2688 mingo     20   0  1576  244  196 R   81  0.0   3:46.35 loop
 2682 mingo     20   0  1576  244  196 R   80  0.0   3:43.68 loop
 2685 mingo     20   0  1576  248  196 R   80  0.0   3:45.97 loop
 2683 mingo     20   0  1576  248  196 R   80  0.0   3:40.25 loop
 2679 mingo     20   0  1576  244  196 R   80  0.0   3:33.53 loop
 2680 mingo     20   0  1576  244  196 R   79  0.0   3:43.53 loop
 2686 mingo     20   0  1576  244  196 R   79  0.0   3:39.31 loop
 2687 mingo     20   0  1576  244  196 R   78  0.0   3:33.31 loop
 2684 mingo     20   0  1576  244  196 R   77  0.0   3:27.52 loop

so they now nicely converge to the expected 80% long-term CPU usage.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
include/linux/sched.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -681,7 +681,7 @@ enum cpu_idle_type {
 #define SCHED_LOAD_SHIFT	10
 #define SCHED_LOAD_SCALE	(1L << SCHED_LOAD_SHIFT)
 
-#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 5)
+#define SCHED_LOAD_SCALE_FUZZ	(SCHED_LOAD_SCALE >> 1)
 
 #ifdef CONFIG_SMP
 #define SD_LOAD_BALANCE		1	/* Do load balancing on this domain. */