From: Peter Zijlstra <peterz@infradead.org>
To: lkp@lists.01.org
Subject: Re: [sched] 143e1e28cb4: +17.9% aim7.jobs-per-min, -9.7% hackbench.throughput
Date: Mon, 11 Aug 2014 15:33:52 +0200
Message-ID: <20140811133352.GC9918@twins.programming.kicks-ass.net>
In-Reply-To: <20140810105413.GA29451@localhost>
On Sun, Aug 10, 2014 at 06:54:13PM +0800, Fengguang Wu wrote:
> This view may be easier to read, by grouping the metrics by test case.
>
> test case: brickland1/aim7/6000-page_test
OK, I have a similar system to the brickland thing (slightly different
configuration, but should be close enough).
Now; do you have a description of each test-case someplace? In
particular, it might be good to have a small annotation to show which
direction is better.
>
> 128529 ± 1% +17.9% 151594 ± 0% TOTAL aim7.jobs-per-min
jobs per minute, + is better, so no worries there.
> 582269 ±14% -55.6% 258617 ±16% TOTAL softirqs.SCHED
> 993654 ± 2% -19.9% 795962 ± 3% TOTAL softirqs.RCU
> 15865125 ± 1% -15.0% 13485882 ± 1% TOTAL softirqs.TIMER
> 59366697 ± 3% -46.1% 32017187 ± 7% TOTAL cpuidle.C1-IVT.time
> 54543 ±11% -37.2% 34252 ±16% TOTAL cpuidle.C1-IVT.usage
> 19542 ± 9% -38.3% 12057 ± 4% TOTAL cpuidle.C1E-IVT.usage
> 49527464 ± 6% -32.4% 33488833 ± 4% TOTAL cpuidle.C1E-IVT.time
> 76064 ± 3% -32.2% 51572 ± 6% TOTAL cpuidle.C6-IVT.usage
Less idle time; might be good, if the work is cpubound, might be bad if
not; hard to say.
> 2.82 ± 3% +21.9% 3.43 ± 4% TOTAL turbostat.%pc2
> 4.40 ± 2% +22.0% 5.37 ± 4% TOTAL turbostat.%c6
> 15.75 ± 1% -3.4% 15.21 ± 0% TOTAL turbostat.RAM_W
> 3150464 ± 2% -24.2% 2387551 ± 3% TOTAL time.voluntary_context_switches
Typically less ctxsw is better..
> 281 ± 1% -15.1% 238 ± 0% TOTAL time.elapsed_time
> 29294 ± 1% -14.3% 25093 ± 0% TOTAL time.system_time
Less time spent (on presumably the same work) is better
> 4529818 ± 1% -8.8% 4129398 ± 1% TOTAL time.involuntary_context_switches
Fewer preemptions, also generally better
> 10655 ± 0% +1.4% 10802 ± 0% TOTAL time.percent_of_cpu_this_job_got
Seems an improvement; not sure.
Many more stats.. but from the above it looks like it's an overall 'win';
or am I reading the thing wrong?
Now I think I see why this is; we've reduced load balancing frequency
significantly on this machine due to:
-#define SD_SIBLING_INIT (struct sched_domain) { \
- .min_interval = 1, \
- .max_interval = 2, \
-#define SD_MC_INIT (struct sched_domain) { \
- .min_interval = 1, \
- .max_interval = 4, \
-#define SD_CPU_INIT (struct sched_domain) { \
- .min_interval = 1, \
- .max_interval = 4, \
*sd = (struct sched_domain){
.min_interval = sd_weight,
.max_interval = 2*sd_weight,
Which increased both the min and max values significantly for all domains
involved.
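As a rough illustration of the change described above, the sketch below
puts the old fixed bounds next to the weight-scaled ones. It assumes
sd_weight is the number of CPUs the domain spans and that the intervals
are in milliseconds. The names lb_bounds, old_level, old_bounds and
commit_bounds are made up for this example and are not kernel
identifiers.

```c
#include <assert.h>

/* Balance-interval bounds of one sched domain. */
struct lb_bounds {
	unsigned long min_interval;
	unsigned long max_interval;
};

/* Hypothetical tags for the three removed initializer macros. */
enum old_level { OLD_SIBLING, OLD_MC, OLD_CPU };

/* Fixed bounds from the removed SD_*_INIT macros quoted above. */
struct lb_bounds old_bounds(enum old_level level)
{
	struct lb_bounds b = { 1, 4 };	/* SD_MC_INIT and SD_CPU_INIT */

	if (level == OLD_SIBLING)
		b.max_interval = 2;	/* SD_SIBLING_INIT */
	return b;
}

/* Bounds as set in sd_init() by the commit under discussion. */
struct lb_bounds commit_bounds(unsigned long sd_weight)
{
	struct lb_bounds b;

	assert(sd_weight >= 1);	/* assumed: a domain spans at least one CPU */
	b.min_interval = sd_weight;
	b.max_interval = 2 * sd_weight;
	return b;
}
```

For a hypothetical domain of weight 30 (a figure chosen for illustration,
not taken from the report), commit_bounds(30) gives min 30 and max 60,
against the fixed 1 and 4 of the old SD_MC_INIT.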
That said; I think we might want to do something like the below; I can
imagine decreasing load balancing too much will negatively impact other
workloads.
Maybe slightly modified to make sure the first domain has a min_interval
of 1.
---
kernel/sched/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1211575a2208..67ed5d854da1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6049,8 +6049,8 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
sd_flags &= ~TOPOLOGY_SD_FLAGS;
*sd = (struct sched_domain){
- .min_interval = sd_weight,
- .max_interval = 2*sd_weight,
+ .min_interval = max(1, sd_weight/2),
+ .max_interval = sd_weight,
.busy_factor = 32,
.imbalance_pct = 125,
@@ -6076,7 +6076,7 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
,
.last_balance = jiffies,
- .balance_interval = sd_weight,
+ .balance_interval = max(1, sd_weight/2),
.smt_gain = 0,
.max_newidle_lb_cost = 0,
.next_decay_max_lb_cost = jiffies,
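
To see what the proposed values work out to, here is a minimal sketch of
the two expressions from the patch, assuming sd_weight/2 is integer
division and sd_weight is at least 1. The function names are
hypothetical helpers for this example only.

```c
#include <assert.h>

/* Proposed lower bound: max(1, sd_weight/2), with integer division. */
unsigned long proposed_min_interval(unsigned long sd_weight)
{
	unsigned long half = sd_weight / 2;

	assert(sd_weight >= 1);	/* assumed: a domain spans at least one CPU */
	return half > 1 ? half : 1;
}

/* Proposed upper bound: sd_weight itself. */
unsigned long proposed_max_interval(unsigned long sd_weight)
{
	assert(sd_weight >= 1);
	return sd_weight;
}
```

Under these assumptions, weights 1 through 3 all yield a minimum of 1,
while any weight of 4 or more yields a minimum above 1. A hypothetical
weight of 30 gives min 15 and max 30, half of what the unpatched
expressions produce.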