* tbench regression with 2.6.32-rc1
@ 2009-10-09 9:51 Zhang, Yanmin
2009-10-09 10:16 ` Peter Zijlstra
0 siblings, 1 reply; 4+ messages in thread
From: Zhang, Yanmin @ 2009-10-09 9:51 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, LKML
Compared with 2.6.31's results, tbench shows a regression with
2.6.32-rc1.
Command line to start tbench:
#./tbench_srv &
#./tbench -t 600 CPU_NUM*2 127.0.0.1 #Use real cpu num to replace CPU_NUM
This starts 2 client processes per CPU.
1) On 4*4 core tigerton: 30%;
2) On 2*4 core stoakley: 15%;
3) On 2*8 core Nehalem: 6%.
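The thread gives only the percentages, not the raw tbench throughput or the exact formula. A minimal sketch, assuming the figures are the relative drop in throughput against the 2.6.31 baseline (the function name and the MB/s unit are illustrative assumptions, not taken from the report):

```c
/*
 * Relative throughput regression in percent.
 * ASSUMPTION: the percentages in the report are computed this way;
 * the thread does not state the formula or the raw numbers.
 * A positive result means the new kernel is slower than the baseline.
 */
double regression_pct(double baseline_mbps, double current_mbps)
{
    return 100.0 * (baseline_mbps - current_mbps) / baseline_mbps;
}
```

With a hypothetical baseline of 1000 MB/s and a new result of 700 MB/s, this yields a 30% regression.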
As there are a couple of patches which turn on/off some sched domain
flags such as SD_BALANCE_WAKE, I used a workaround to bisect it.
On tigerton, the bisect points to the patch below.
commit 59abf02644c45f1591e1374ee7bb45dc757fcb88
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Sep 16 08:28:30 2009 +0200
sched: Add SD_PREFER_LOCAL
The patch does not revert cleanly, so I did some testing by turning on/off
some domain flags and sched_features manually.
1) On tigerton: if SD_PREFER_LOCAL=0 (disable it), the regression becomes about 2%.
2) On stoakley: if SD_PREFER_LOCAL=0 (disable it), the regression becomes about 4%.
3) On Nehalem: The above method doesn't improve the result. I'm still checking it.
I also tried to turn on/off FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS. It seems they
have limited impact on tbench. I need to double-check these 2 flags.
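The report does not say which interface was used to set SD_PREFER_LOCAL=0 by hand. Whatever the interface, disabling a domain flag amounts to clearing one bit in the domain's flags word. A minimal sketch of that bit arithmetic follows; the `EX_` constants are hypothetical values for illustration and should be checked against include/linux/sched.h of the kernel in question:

```c
/* HYPOTHETICAL bit values, for illustration only. */
#define EX_SD_WAKE_AFFINE  0x0020UL
#define EX_SD_PREFER_LOCAL 0x0040UL

/* Return the flags word with the given flag bit cleared (flag disabled). */
unsigned long sd_flags_clear(unsigned long flags, unsigned long bit)
{
    return flags & ~bit;
}

/* Return the flags word with the given flag bit set (flag enabled). */
unsigned long sd_flags_set(unsigned long flags, unsigned long bit)
{
    return flags | bit;
}

/* Return 1 if the flag bit is set in the flags word, 0 otherwise. */
int sd_flags_test(unsigned long flags, unsigned long bit)
{
    return (flags & bit) != 0;
}
```

Clearing `EX_SD_PREFER_LOCAL` leaves the other bits, such as `EX_SD_WAKE_AFFINE`, untouched, which is what makes flag-by-flag experiments like the ones above possible.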
* Re: tbench regression with 2.6.32-rc1
2009-10-09 9:51 tbench regression with 2.6.32-rc1 Zhang, Yanmin
@ 2009-10-09 10:16 ` Peter Zijlstra
2009-10-12 5:20 ` Zhang, Yanmin
2009-10-14 13:13 ` [tip:sched/urgent] sched: Disable SD_PREFER_LOCAL for MC/CPU domains tip-bot for Peter Zijlstra
0 siblings, 2 replies; 4+ messages in thread
From: Peter Zijlstra @ 2009-10-09 10:16 UTC (permalink / raw)
To: Zhang, Yanmin; +Cc: Ingo Molnar, LKML
On Fri, 2009-10-09 at 17:51 +0800, Zhang, Yanmin wrote:
> Compared with 2.6.31's results, tbench shows a regression with
> 2.6.32-rc1.
> Command line to start tbench:
> #./tbench_srv &
> #./tbench -t 600 CPU_NUM*2 127.0.0.1 #Use real cpu num to replace CPU_NUM
> This starts 2 client processes per CPU.
>
> 1) On 4*4 core tigerton: 30%;
> 2) On 2*4 core stoakley: 15%;
> 3) On 2*8 core Nehalem: 6%.
>
> As there are a couple of patches which turn on/off some sched domain
> flags such as SD_BALANCE_WAKE, I used a workaround to bisect it.
> On tigerton, the bisect points to the patch below.
> commit 59abf02644c45f1591e1374ee7bb45dc757fcb88
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed Sep 16 08:28:30 2009 +0200
>
> sched: Add SD_PREFER_LOCAL
>
>
> The patch does not revert cleanly, so I did some testing by turning on/off
> some domain flags and sched_features manually.
>
> 1) On tigerton: if SD_PREFER_LOCAL=0 (disable it), the regression becomes about 2%.
> 2) On stoakley: if SD_PREFER_LOCAL=0 (disable it), the regression becomes about 4%.
> 3) On Nehalem: The above method doesn't improve the result. I'm still checking it.
>
> I also tried to turn on/off FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS. It seems they
> have limited impact on tbench. I need to double-check these 2 flags.
So the c2q CPUs, and esp. the one with the smaller cache, are hurt by this. I
guess we can turn this off without too many downsides. Maybe turn it on
for NUMA on the Nehalem?
---
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 25a9284..d823c24 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -143,6 +143,7 @@ extern unsigned long node_remap_size[];
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
+ | 1*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 0*SD_POWERSAVINGS_BALANCE \
| 0*SD_SHARE_PKG_RESOURCES \
diff --git a/include/linux/topology.h b/include/linux/topology.h
index fc0bf3e..57e6357 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -129,7 +129,7 @@ int arch_update_cpu_topology(void);
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
- | 1*SD_PREFER_LOCAL \
+ | 0*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 1*SD_SHARE_PKG_RESOURCES \
| 0*SD_SERIALIZE \
@@ -162,7 +162,7 @@ int arch_update_cpu_topology(void);
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
- | 1*SD_PREFER_LOCAL \
+ | 0*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 0*SD_SHARE_PKG_RESOURCES \
| 0*SD_SERIALIZE \
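The hunks above rely on the `1*FLAG | 0*FLAG` idiom: every flag is listed in the initializer, and the multiplier decides whether its bit contributes to the mask. Turning a flag off is then a one-character change. A minimal sketch of how the idiom evaluates, using hypothetical `EX_` flag values rather than the kernel's real definitions:

```c
/* HYPOTHETICAL bit values, for illustration only. */
#define EX_SD_BALANCE_FORK   0x0008UL
#define EX_SD_BALANCE_WAKE   0x0010UL
#define EX_SD_WAKE_AFFINE    0x0020UL
#define EX_SD_PREFER_LOCAL   0x0040UL
#define EX_SD_SHARE_CPUPOWER 0x0080UL

/* '*' binds tighter than '|', so each term is either the flag's bit or 0. */
#define EX_DOMAIN_FLAGS_BEFORE ( 1*EX_SD_BALANCE_FORK   \
                               | 0*EX_SD_BALANCE_WAKE   \
                               | 1*EX_SD_WAKE_AFFINE    \
                               | 1*EX_SD_PREFER_LOCAL   \
                               | 0*EX_SD_SHARE_CPUPOWER )

/* Same list with the SD_PREFER_LOCAL multiplier flipped from 1 to 0. */
#define EX_DOMAIN_FLAGS_AFTER  ( 1*EX_SD_BALANCE_FORK   \
                               | 0*EX_SD_BALANCE_WAKE   \
                               | 1*EX_SD_WAKE_AFFINE    \
                               | 0*EX_SD_PREFER_LOCAL   \
                               | 0*EX_SD_SHARE_CPUPOWER )
```

The two masks differ only in the `EX_SD_PREFER_LOCAL` bit, and the disabled flags stay visible in the source, which keeps the full set of available flags documented next to each domain initializer.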
* Re: tbench regression with 2.6.32-rc1
2009-10-09 10:16 ` Peter Zijlstra
@ 2009-10-12 5:20 ` Zhang, Yanmin
2009-10-14 13:13 ` [tip:sched/urgent] sched: Disable SD_PREFER_LOCAL for MC/CPU domains tip-bot for Peter Zijlstra
1 sibling, 0 replies; 4+ messages in thread
From: Zhang, Yanmin @ 2009-10-12 5:20 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, LKML
On Fri, 2009-10-09 at 12:16 +0200, Peter Zijlstra wrote:
> On Fri, 2009-10-09 at 17:51 +0800, Zhang, Yanmin wrote:
> > Compared with 2.6.31's results, tbench shows a regression with
> > 2.6.32-rc1.
> > Command line to start tbench:
> > #./tbench_srv &
> > #./tbench -t 600 CPU_NUM*2 127.0.0.1 #Use real cpu num to replace CPU_NUM
> > This starts 2 client processes per CPU.
> >
> > 1) On 4*4 core tigerton: 30%;
> > 2) On 2*4 core stoakley: 15%;
> > 3) On 2*8 core Nehalem: 6%.
> >
> > As there are a couple of patches which turn on/off some sched domain
> > flags such as SD_BALANCE_WAKE, I used a workaround to bisect it.
> > On tigerton, the bisect points to the patch below.
> > commit 59abf02644c45f1591e1374ee7bb45dc757fcb88
> > Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Date: Wed Sep 16 08:28:30 2009 +0200
> >
> > sched: Add SD_PREFER_LOCAL
> >
> >
> > The patch does not revert cleanly, so I did some testing by turning on/off
> > some domain flags and sched_features manually.
> >
> > 1) On tigerton: if SD_PREFER_LOCAL=0 (disable it), the regression becomes about 2%.
> > 2) On stoakley: if SD_PREFER_LOCAL=0 (disable it), the regression becomes about 4%.
> > 3) On Nehalem: The above method doesn't improve the result. I'm still checking it.
> >
> > I also tried to turn on/off FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS. It seems they
> > have limited impact on tbench. I need to double-check these 2 flags.
>
> So the c2q CPUs, and esp. the one with the smaller cache, are hurt by this. I
> guess we can turn this off without too many downsides. Maybe turn it on
> for NUMA on the Nehalem?
I tested the patch and it has the same effect as turning the flag off in the domain flags.
So with the patch, stoakley still has a 4% regression and tigerton has 2%.
>
>
> ---
> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> index 25a9284..d823c24 100644
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -143,6 +143,7 @@ extern unsigned long node_remap_size[];
> | 1*SD_BALANCE_FORK \
> | 0*SD_BALANCE_WAKE \
> | 1*SD_WAKE_AFFINE \
> + | 1*SD_PREFER_LOCAL \
> | 0*SD_SHARE_CPUPOWER \
> | 0*SD_POWERSAVINGS_BALANCE \
> | 0*SD_SHARE_PKG_RESOURCES \
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index fc0bf3e..57e6357 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -129,7 +129,7 @@ int arch_update_cpu_topology(void);
> | 1*SD_BALANCE_FORK \
> | 0*SD_BALANCE_WAKE \
> | 1*SD_WAKE_AFFINE \
> - | 1*SD_PREFER_LOCAL \
> + | 0*SD_PREFER_LOCAL \
> | 0*SD_SHARE_CPUPOWER \
> | 1*SD_SHARE_PKG_RESOURCES \
> | 0*SD_SERIALIZE \
> @@ -162,7 +162,7 @@ int arch_update_cpu_topology(void);
> | 1*SD_BALANCE_FORK \
> | 0*SD_BALANCE_WAKE \
> | 1*SD_WAKE_AFFINE \
> - | 1*SD_PREFER_LOCAL \
> + | 0*SD_PREFER_LOCAL \
> | 0*SD_SHARE_CPUPOWER \
> | 0*SD_SHARE_PKG_RESOURCES \
> | 0*SD_SERIALIZE \
>
>
* [tip:sched/urgent] sched: Disable SD_PREFER_LOCAL for MC/CPU domains
2009-10-09 10:16 ` Peter Zijlstra
2009-10-12 5:20 ` Zhang, Yanmin
@ 2009-10-14 13:13 ` tip-bot for Peter Zijlstra
1 sibling, 0 replies; 4+ messages in thread
From: tip-bot for Peter Zijlstra @ 2009-10-14 13:13 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, a.p.zijlstra, efault, yanmin_zhang,
tglx, mingo
Commit-ID: 799e2205ec65e174f752b558c62a92c4752df313
Gitweb: http://git.kernel.org/tip/799e2205ec65e174f752b558c62a92c4752df313
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Fri, 9 Oct 2009 12:16:40 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Wed, 14 Oct 2009 15:02:34 +0200
sched: Disable SD_PREFER_LOCAL for MC/CPU domains
Yanmin reported that both tbench and hackbench were significantly
hurt by trying to keep tasks local on these domains, esp on small
cache machines.
So disable it in order to promote spreading outside of the cache
domains.
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1255083400.8802.15.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/include/asm/topology.h | 1 +
include/linux/topology.h | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 25a9284..d823c24 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -143,6 +143,7 @@ extern unsigned long node_remap_size[];
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
+ | 1*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 0*SD_POWERSAVINGS_BALANCE \
| 0*SD_SHARE_PKG_RESOURCES \
diff --git a/include/linux/topology.h b/include/linux/topology.h
index fc0bf3e..57e6357 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -129,7 +129,7 @@ int arch_update_cpu_topology(void);
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
- | 1*SD_PREFER_LOCAL \
+ | 0*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 1*SD_SHARE_PKG_RESOURCES \
| 0*SD_SERIALIZE \
@@ -162,7 +162,7 @@ int arch_update_cpu_topology(void);
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
- | 1*SD_PREFER_LOCAL \
+ | 0*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 0*SD_SHARE_PKG_RESOURCES \
| 0*SD_SERIALIZE \