public inbox for linux-kernel@vger.kernel.org
* [RFC PATCH v1 0/3] Saving power by cpu evacuation using sched_mc=n
@ 2009-04-26 20:46 Vaidyanathan Srinivasan
  2009-04-26 20:46 ` [RFC PATCH v1 1/3] sched: add more levels of sched_mc Vaidyanathan Srinivasan
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Vaidyanathan Srinivasan @ 2009-04-26 20:46 UTC
  To: Linux Kernel, Suresh B Siddha, Venkatesh Pallipadi,
	Peter Zijlstra, Arjan van de Ven
  Cc: Ingo Molnar, Dipankar Sarma, Balbir Singh, Vatsa,
	Gautham R Shenoy, Andi Kleen, Gregory Haskins, Mike Galbraith,
	Thomas Gleixner, Arun Bharadwaj, Vaidyanathan Srinivasan

Hi,

The sched_mc_power_savings tunable can be set to {0,1,2} to enable
aggressive task consolidation onto fewer cpu packages and save
power.  Under certain conditions, sched_mc=2 may provide better
performance in an underutilised system by keeping the group of tasks on
a single cpu package, facilitating cache sharing and reduced off-chip
traffic.

Extending this concept further, the following patch series tries to
implement sched_mc={3,4,5}, where CPUs/cores are forced to be idle and
thereby save power at the cost of performance.  As a result, some of
the cpu packages in the system may be overloaded with tasks while other
packages have free cpus.  This patch series is a hack to discuss the
idea and requirements.
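
For anyone who wants to experiment with the levels from a script, a
minimal sketch follows.  It is not part of the patch series.  It assumes
the tunable is exposed at /sys/devices/system/cpu/sched_mc_power_savings
(as on mainline kernels built with CONFIG_SCHED_MC) and that values 3 to
5 are accepted only with this series applied.  The helper name
set_sched_mc and its path parameter are made up for illustration.

```python
# Assumed location of the tunable; adjust if your kernel exposes it
# elsewhere.  This path is an assumption, not something the patches add.
SCHED_MC_PATH = "/sys/devices/system/cpu/sched_mc_power_savings"

# Levels 0-2 exist in mainline; 3-5 are the levels proposed by this RFC.
MAX_LEVEL = 5


def set_sched_mc(level, path=SCHED_MC_PATH):
    """Write a sched_mc level to the tunable and return the value read back."""
    if not isinstance(level, int) or not 0 <= level <= MAX_LEVEL:
        raise ValueError("sched_mc level must be an integer in 0..%d" % MAX_LEVEL)
    with open(path, "w") as f:
        f.write("%d\n" % level)
    # Read back so the caller can see what the kernel actually accepted.
    with open(path) as f:
        return int(f.read().strip())
```

Writing the real sysfs file needs root, for example set_sched_mc(4).
The path argument allows the helper to be tried against an ordinary
file first.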

Objective:
----------

* Framework to evacuate tasks from cpus in order to force the cpu
  cores to stay idle

* Interrupts can be moved using user space irqbalance daemons, while
  a timer migration framework is being discussed:
  http://lkml.org/lkml/2009/4/16/45

* Forcefully idling cpu cores in a system will reduce the power
  consumption of the system and also cool cpu packages for thermal 
  management

Requirements:
-------------

* Fast response time and low OS overhead to move tasks away from
  selected cpu packages.  CPU hotplug is too heavyweight for this
  purpose

Use cases:
----------

* Enabling the right number of cpus to run the given workload can
  provide good power vs performance tradeoffs.

* Ability to throttle the number of cores used in the system, along
  with other power saving controls like cpufreq governors, can enable
  the system to operate at a more power efficient operating point and
  still meet the design objectives.
 
* Facilitate thermal management by evacuating cores from hot cpu packages

Alternatives:
-------------

* CPU hotplug: Heavyweight and slow.  Setup and teardown of data
  structures is involved.  May need new fast or lightweight
  notifications

* CPUSets: Exclusive CPU sets and partitioned sched domains involve
  rebuilding sched domains and are relatively heavyweight for the
  purpose

The following patch is against 2.6.30-rc3 and will work only in
an under utilised system (Tasks <= number of cores).
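
The "Tasks <= number of cores" condition can be checked roughly from
user space before trying the series.  The sketch below is only an
illustration: it takes the runnable count from the fourth field of
/proc/loadavg ("running/total"), which is an instantaneous snapshot and
includes the reading process itself.  The function names are
hypothetical.

```python
def runnable_tasks(loadavg_line):
    """Return the runnable-entity count from a /proc/loadavg line.

    The fourth whitespace-separated field has the form "running/total".
    """
    fields = loadavg_line.split()
    running, _total = fields[3].split("/")
    return int(running)


def is_underutilised(runnable, cores):
    """Condition stated in the cover letter: tasks <= number of cores."""
    return runnable <= cores
```

In practice the line would come from reading /proc/loadavg and the core
count from os.cpu_count(); sampling several times gives a more useful
answer than a single snapshot.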

Test results for ebizzy with 8 threads at various sched_mc settings have
been summarised with relative values below.  The test platform is a dual
socket quad core x86 system (pre-Nehalem).

--------------------------------------------------------
sched_mc   Cores   Performance      AvgPower
           used    (records/sec)    (Watts)
--------------------------------------------------------
0          8       1.00x            1.00y
1          8       1.02x            1.01y
2          8       0.83x            1.01y
3          7       0.86x            0.97y
4          6       0.76x            0.92y
5          4       0.72x            0.82y
--------------------------------------------------------
		
There was wide run-to-run variation with ebizzy.  The purpose of the
above data is to justify the use of core evacuation for power vs
performance trade-offs.
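
To make the trade-off easier to compare across levels, the relative
figures in the table above can be combined into records/sec per watt,
again relative to sched_mc=0.  This is plain arithmetic on the posted
numbers, not an additional measurement, and the function name is
illustrative.

```python
# (sched_mc, cores used, relative records/sec, relative average power),
# copied from the table above; all values are relative to sched_mc=0.
RESULTS = [
    (0, 8, 1.00, 1.00),
    (1, 8, 1.02, 1.01),
    (2, 8, 0.83, 1.01),
    (3, 7, 0.86, 0.97),
    (4, 6, 0.76, 0.92),
    (5, 4, 0.72, 0.82),
]


def relative_efficiency(results):
    """Map each sched_mc level to relative performance per relative watt."""
    return {mc: round(perf / power, 2) for mc, _cores, perf, power in results}
```

Given the run variation noted above, small differences between levels
in this ratio should not be over-interpreted.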

ToDo:
-----

* Make the core evacuation predictable under different system load
  conditions and workload characteristics
* Enhance the framework to control which packages/cores will be
  evacuated; this is needed for thermal management

I can experiment with different benchmarks/platforms and post results
while the framework is being discussed.

Please let me know your comments and suggestions.

Thanks,
Vaidy

---

Vaidyanathan Srinivasan (3):
      sched: loadbalancer hacks for forced packing of tasks
      sched: threshold helper functions
      sched: add more levels of sched_mc


 include/linux/sched.h |    4 ++++
 kernel/sched.c        |   35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 1 deletions(-)

-- 


Thread overview: 16+ messages
2009-04-26 20:46 [RFC PATCH v1 0/3] Saving power by cpu evacuation using sched_mc=n Vaidyanathan Srinivasan
2009-04-26 20:46 ` [RFC PATCH v1 1/3] sched: add more levels of sched_mc Vaidyanathan Srinivasan
2009-04-26 20:46 ` [RFC PATCH v1 2/3] sched: threshold helper functions Vaidyanathan Srinivasan
2009-04-26 20:47 ` [RFC PATCH v1 3/3] sched: loadbalancer hacks for forced packing of tasks Vaidyanathan Srinivasan
2009-04-27  3:52 ` [RFC PATCH v1 0/3] Saving power by cpu evacuation using sched_mc=n Ingo Molnar
2009-04-27  5:43   ` Vaidyanathan Srinivasan
2009-04-27  5:53     ` Ingo Molnar
2009-04-27  6:39       ` Vaidyanathan Srinivasan
2009-04-27  7:01         ` Balbir Singh
2009-04-27  5:54   ` Dipankar Sarma
2009-04-27 10:09 ` Peter Zijlstra
2009-04-27 14:20   ` Vaidyanathan Srinivasan
2009-04-28  8:33     ` Peter Zijlstra
2009-04-28  8:52       ` Ingo Molnar
2009-04-28 16:15         ` Vaidyanathan Srinivasan
2009-04-28 16:11       ` Vaidyanathan Srinivasan
