public inbox for linux-kernel@vger.kernel.org
* BUG: sched_mc_powersavings broken on pre-Nehalem x86 platforms
@ 2010-02-08 10:05 Vaidyanathan Srinivasan
  2010-02-08 11:35 ` Peter Zijlstra
  2010-02-16 14:15 ` [tip:sched/urgent] sched: Fix sched_mv_power_savings for !SMT tip-bot for Vaidyanathan Srinivasan
  0 siblings, 2 replies; 5+ messages in thread
From: Vaidyanathan Srinivasan @ 2010-02-08 10:05 UTC
  To: Suresh B Siddha, Venkatesh Pallipadi, Peter Zijlstra
  Cc: Ingo Molnar, Gautham R Shenoy, Arun Bharadwaj, Linux Kernel

Hi Peter,

sched_mc_powersavings is broken in pre-Nehalem x86 platforms due to
contradictory SD flags at MC level and CPU level.  SD_PREFER_SIBLING being set
at MC level is expected to do the following:

a) Disable consolidation of tasks into a single group in the parent sched
domain (generally a single CPU package).

b) Spread tasks equally across groups at the parent sched domain.

SD_POWERSAVINGS_BALANCE set at a sched domain, on the other hand, enables the
logic that consolidates tasks into the minimum number of groups at that sched
domain.

So SD_POWERSAVINGS_BALANCE at one sched domain contradicts SD_PREFER_SIBLING
in its child domain, and the combination disables the SD_POWERSAVINGS_BALANCE
logic in

        if (local_group && (sds->this_nr_running >= sgs->group_capacity ||
                                !sds->this_nr_running))
                sds->power_savings_balance = 0;

This happens because sgs.group_capacity is set to '1' by SD_PREFER_SIBLING in
the child sched domain.
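
As an illustration only (not kernel code), the effect of the clamp on the
check quoted above can be sketched in C. The helper names
effective_capacity() and power_savings_allowed() are hypothetical, and
treating the unclamped group capacity as the number of cores in the package
is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: capacity of the local group as the load balancer
 * would see it. Assumption: unclamped capacity equals the number of cores;
 * with SD_PREFER_SIBLING in the child domain it is limited to 1.
 */
unsigned long effective_capacity(unsigned long nr_cores,
                                 bool child_prefers_sibling)
{
    assert(nr_cores > 0);
    return child_prefers_sibling ? 1UL : nr_cores;
}

/*
 * Hypothetical helper applying the same condition as the quoted check for
 * the local group: power savings balance is dropped when the group is
 * idle or already at (or above) its capacity.
 */
bool power_savings_allowed(unsigned long this_nr_running,
                           unsigned long group_capacity)
{
    if (this_nr_running >= group_capacity || !this_nr_running)
        return false;
    return true;
}
```

With a capacity of 1, every possible value of this_nr_running is either 0 or
at least 1, so under this model the check always clears power savings
balance, which matches the behavior described above.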

The attached patch restores the expected behavior for
sched_mc_powersavings > 0, while objective (b) remains an open issue.

The following condition in find_busiest_group()
	sds.max_load <= sds.busiest_load_per_task

	treats unequally loaded groups as balanced as long as they are below
	capacity.
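
A small worked model of that condition, as a sketch under stated
assumptions: every task has the same weight (LOAD_SCALE, a placeholder
value), each core contributes LOAD_SCALE of CPU power, and a group's average
load is its total task weight scaled by its CPU power. The function names
are hypothetical and the arithmetic is a simplification, not the kernel's
exact computation.

```c
#include <assert.h>
#include <stdbool.h>

#define LOAD_SCALE 1024UL /* assumed weight of one task and power of one core */

/* Average load of a group: total task weight scaled by the group's power. */
unsigned long group_avg_load(unsigned long nr_tasks, unsigned long nr_cores)
{
    assert(nr_cores > 0);
    unsigned long group_load = nr_tasks * LOAD_SCALE;
    unsigned long cpu_power = nr_cores * LOAD_SCALE;
    return group_load * LOAD_SCALE / cpu_power;
}

/* Models the condition max_load <= busiest_load_per_task. */
bool treated_as_balanced(unsigned long busiest_tasks, unsigned long nr_cores)
{
    assert(busiest_tasks > 0);
    unsigned long max_load = group_avg_load(busiest_tasks, nr_cores);
    unsigned long busiest_load_per_task = LOAD_SCALE; /* equal-weight tasks */
    return max_load <= busiest_load_per_task;
}
```

Under these assumptions a quad-core package running 3 tasks has an average
load of 768 against a per-task load of 1024, so a 1 + 3 split across two
packages is reported as balanced even though the groups are unequally loaded.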

Test Results:

The following patch was tested on a dual-socket, quad-core, non-threaded Xeon:

Running 4 while(1) loops in shell:

echo 1 > /sys/devices/system/cpu/sched_mc_power_savings

Without Patch:
        Running 1 task in one quad core package and 3 in another.
        This is effectively the baseline behavior with sched_mc=0

With patch:
        All 4 tasks running in one quad core package.
        Expected behavior for sched_mc_powersavings>0

--Vaidy

    Fix for sched_mc_powersavings for pre-Nehalem platforms.
    Child sched domain should clear SD_PREFER_SIBLING if parent will have 
    SD_POWERSAVINGS_BALANCE because they are contradicting.

    Sets the flags correctly based on sched_mc_power_savings.
    
    Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6550415..ef6b7cd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -866,7 +866,10 @@ static inline int sd_balance_for_mc_power(void)
 	if (sched_smt_power_savings)
 		return SD_POWERSAVINGS_BALANCE;
 
-	return SD_PREFER_SIBLING;
+	if (!sched_mc_power_savings)
+		return SD_PREFER_SIBLING;
+
+	return 0;
 }
 
 static inline int sd_balance_for_package_power(void)
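
For readers who want to compare the two versions side by side, here is a
standalone sketch of the return path shown in the hunk above. The flag
values are placeholders (the real definitions live in the kernel headers),
the tunables are passed as parameters instead of being read from globals,
and any code above the hunk's context is not modelled. The function names
mc_flags_before() and mc_flags_after() are hypothetical.

```c
#include <assert.h>

/* Placeholder flag values for illustration; not the kernel's definitions. */
enum {
    SD_POWERSAVINGS_BALANCE = 0x1,
    SD_PREFER_SIBLING       = 0x2
};

/* Behavior before the patch: SD_PREFER_SIBLING whenever SMT savings is off. */
int mc_flags_before(int smt_power_savings, int mc_power_savings)
{
    assert(smt_power_savings >= 0 && mc_power_savings >= 0);
    if (smt_power_savings)
        return SD_POWERSAVINGS_BALANCE;

    return SD_PREFER_SIBLING;
}

/* Behavior after the patch: SD_PREFER_SIBLING only when MC savings is off. */
int mc_flags_after(int smt_power_savings, int mc_power_savings)
{
    assert(smt_power_savings >= 0 && mc_power_savings >= 0);
    if (smt_power_savings)
        return SD_POWERSAVINGS_BALANCE;

    if (!mc_power_savings)
        return SD_PREFER_SIBLING;

    return 0;
}
```

The only input for which the two versions differ is SMT savings off with MC
savings on: the old code returns SD_PREFER_SIBLING there, the new code
returns 0, which leaves the parent domain's consolidation logic undisturbed.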



* Re: BUG: sched_mc_powersavings broken on pre-Nehalem x86 platforms
  2010-02-08 10:05 BUG: sched_mc_powersavings broken on pre-Nehalem x86 platforms Vaidyanathan Srinivasan
@ 2010-02-08 11:35 ` Peter Zijlstra
  2010-02-08 12:46   ` Vaidyanathan Srinivasan
  2010-02-16 14:15 ` [tip:sched/urgent] sched: Fix sched_mv_power_savings for !SMT tip-bot for Vaidyanathan Srinivasan
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2010-02-08 11:35 UTC
  To: svaidy
  Cc: Suresh B Siddha, Venkatesh Pallipadi, Ingo Molnar,
	Gautham R Shenoy, Arun Bharadwaj, Linux Kernel

On Mon, 2010-02-08 at 15:35 +0530, Vaidyanathan Srinivasan wrote:

>     Fix for sched_mc_powersavings for pre-Nehalem platforms.
>     Child sched domain should clear SD_PREFER_SIBLING if parent will have 
>     SD_POWERSAVINGS_BALANCE because they are contradicting.
> 
>     Sets the flags correctly based on sched_mc_power_savings.
>     
>     Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 6550415..ef6b7cd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -866,7 +866,10 @@ static inline int sd_balance_for_mc_power(void)
>  	if (sched_smt_power_savings)
>  		return SD_POWERSAVINGS_BALANCE;
>  
> -	return SD_PREFER_SIBLING;
> +	if (!sched_mc_power_savings)
> +		return SD_PREFER_SIBLING;
> +
> +	return 0;
>  }
>  
>  static inline int sd_balance_for_package_power(void)
> 

Looks good, thanks!

What's the status of getting rid of sched_{mc,smt}_power_savings?



* Re: BUG: sched_mc_powersavings broken on pre-Nehalem x86 platforms
  2010-02-08 11:35 ` Peter Zijlstra
@ 2010-02-08 12:46   ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 5+ messages in thread
From: Vaidyanathan Srinivasan @ 2010-02-08 12:46 UTC
  To: Peter Zijlstra
  Cc: Suresh B Siddha, Venkatesh Pallipadi, Ingo Molnar,
	Gautham R Shenoy, Arun Bharadwaj, Linux Kernel

* Peter Zijlstra <peterz@infradead.org> [2010-02-08 12:35:48]:

> On Mon, 2010-02-08 at 15:35 +0530, Vaidyanathan Srinivasan wrote:
> 
> >     Fix for sched_mc_powersavings for pre-Nehalem platforms.
> >     Child sched domain should clear SD_PREFER_SIBLING if parent will have 
> >     SD_POWERSAVINGS_BALANCE because they are contradicting.
> > 
> >     Sets the flags correctly based on sched_mc_power_savings.
> >     
> >     Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > 
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 6550415..ef6b7cd 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -866,7 +866,10 @@ static inline int sd_balance_for_mc_power(void)
> >  	if (sched_smt_power_savings)
> >  		return SD_POWERSAVINGS_BALANCE;
> >  
> > -	return SD_PREFER_SIBLING;
> > +	if (!sched_mc_power_savings)
> > +		return SD_PREFER_SIBLING;
> > +
> > +	return 0;
> >  }
> >  
> >  static inline int sd_balance_for_package_power(void)
> > 
> 
> Looks good, thanks!
> 
> What's the status of getting rid of sched_{mc,smt}_power_savings?

Hi Peter,

With the current rearrangement of the code, a unified
sched_power_savings tunable seems more doable.

However, I have a few more fixes for sched_smt_powersavings on Nehalem
to get through before I revisit the unified tunable.

--Vaidy



* [tip:sched/urgent] sched: Fix sched_mv_power_savings for !SMT
  2010-02-08 10:05 BUG: sched_mc_powersavings broken on pre-Nehalem x86 platforms Vaidyanathan Srinivasan
  2010-02-08 11:35 ` Peter Zijlstra
@ 2010-02-16 14:15 ` tip-bot for Vaidyanathan Srinivasan
  2010-02-16 16:01   ` Vaidyanathan Srinivasan
  1 sibling, 1 reply; 5+ messages in thread
From: tip-bot for Vaidyanathan Srinivasan @ 2010-02-16 14:15 UTC
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, a.p.zijlstra, tglx, svaidy

Commit-ID:  28f5318167adf23b16c844b9c2253f355cb21796
Gitweb:     http://git.kernel.org/tip/28f5318167adf23b16c844b9c2253f355cb21796
Author:     Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
AuthorDate: Mon, 8 Feb 2010 15:35:55 +0530
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 16 Feb 2010 15:13:59 +0100

sched: Fix sched_mv_power_savings for !SMT

Fix for sched_mc_powersavings for pre-Nehalem platforms.
Child sched domain should clear SD_PREFER_SIBLING if parent will have
SD_POWERSAVINGS_BALANCE because they are contradicting.

Sets the flags correctly based on sched_mc_power_savings.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100208100555.GD2931@dirshya.in.ibm.com>
Cc: stable@kernel.org [2.6.32.x]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/sched.h |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 78efe7c..1f5fa53 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -878,7 +878,10 @@ static inline int sd_balance_for_mc_power(void)
 	if (sched_smt_power_savings)
 		return SD_POWERSAVINGS_BALANCE;
 
-	return SD_PREFER_SIBLING;
+	if (!sched_mc_power_savings)
+		return SD_PREFER_SIBLING;
+
+	return 0;
 }
 
 static inline int sd_balance_for_package_power(void)


* Re: [tip:sched/urgent] sched: Fix sched_mv_power_savings for !SMT
  2010-02-16 14:15 ` [tip:sched/urgent] sched: Fix sched_mv_power_savings for !SMT tip-bot for Vaidyanathan Srinivasan
@ 2010-02-16 16:01   ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 5+ messages in thread
From: Vaidyanathan Srinivasan @ 2010-02-16 16:01 UTC
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, a.p.zijlstra, tglx

* tip-bot for Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [2010-02-16 14:15:43]:

> Commit-ID:  28f5318167adf23b16c844b9c2253f355cb21796
> Gitweb:     http://git.kernel.org/tip/28f5318167adf23b16c844b9c2253f355cb21796
> Author:     Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> AuthorDate: Mon, 8 Feb 2010 15:35:55 +0530
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Tue, 16 Feb 2010 15:13:59 +0100
> 
> sched: Fix sched_mv_power_savings for !SMT
                ^^^^^ _mc_

Minor typo, the summary should be 
"sched: Fix sched_mc_power_savings for !SMT cases"

Thanks,
Vaidy

> Fix for sched_mc_powersavings for pre-Nehalem platforms.
> Child sched domain should clear SD_PREFER_SIBLING if parent will have
> SD_POWERSAVINGS_BALANCE because they are contradicting.
> 
> Sets the flags correctly based on sched_mc_power_savings.
> 
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> LKML-Reference: <20100208100555.GD2931@dirshya.in.ibm.com>
> Cc: stable@kernel.org [2.6.32.x]
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/sched.h |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 78efe7c..1f5fa53 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -878,7 +878,10 @@ static inline int sd_balance_for_mc_power(void)
>  	if (sched_smt_power_savings)
>  		return SD_POWERSAVINGS_BALANCE;
> 
> -	return SD_PREFER_SIBLING;
> +	if (!sched_mc_power_savings)
> +		return SD_PREFER_SIBLING;
> +
> +	return 0;
>  }
> 
>  static inline int sd_balance_for_package_power(void)

