From: Gautham R Shenoy <ego@in.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, Balbir Singh <balbir@in.ibm.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Andi Kleen <andi@firstfloor.org>,
	Randy Dunlap <randy.dunlap@oracle.com>
Subject: Re: [RFC PATCH v2 0/2] sched: Nominate a power-efficient ILB
Date: Tue, 14 Apr 2009 15:28:03 +0530
Message-ID: <20090414095803.GA11553@in.ibm.com>
In-Reply-To: <1239702484.21985.6857.camel@twins>

On Tue, Apr 14, 2009 at 11:48:04AM +0200, Peter Zijlstra wrote:
> On Tue, 2009-04-14 at 10:25 +0530, Gautham R Shenoy wrote:
> > Hi,
> > 
> > This is the second iteration of the patchset which aims at improving
> > the idle-load balancer nomination logic, by taking the system topology
> > into consideration.
> > 
> > Changes from v1 (found here: http://lkml.org/lkml/2009/4/2/246)
> > o Fixed the kernel-doc style comments.
> > o Renamed a variable to better reflect its usage.
> > 
> > Background
> > ======================================
> > An idle-load balancer is an idle CPU which does not turn off its sched_ticks
> > and performs load-balancing on behalf of the other idle CPUs. Currently,
> > this idle load balancer is nominated as the first_cpu(nohz.cpu_mask).
> > 
> > The drawback of the current method is that the CPU numbering in the
> > cores/packages need not be sequential. For example, on a
> > two-socket, quad-core system, the CPU numbering can be as follows:
> > 
> > |-------------------------------|  |-------------------------------|
> > |               |               |  |               |               |
> > |     0         |      2        |  |     1         |      3        |
> > |-------------------------------|  |-------------------------------|
> > |               |               |  |               |               |
> > |     4         |      6        |  |     5         |      7        |
> > |-------------------------------|  |-------------------------------|
> > 
> > Now, the other power-savings settings such as the sched_mc/smt_power_savings
> > and the power-aware IRQ balancer try to balance tasks/IRQs by taking
> > the system topology into consideration, with the intention of keeping
> > as many "power-domains" (cores/packages) as possible in the low-power state.
> > 
> > The current idle-load-balancer nomination does not necessarily align with
> > this policy. For example, we could have tasks and interrupts largely running
> > on the first package with the intention of keeping the second package idle.
> > Hence, CPU 0 may be busy. The first_cpu in the nohz.cpu_mask then happens to
> > be CPU1, which in turn becomes nominated as the idle-load balancer. CPU1,
> > being from the 2nd package, would in turn prevent the 2nd package from going
> > into a deeper sleep state.
> > 
> > Instead the role of the idle-load balancer could have been assumed by an
> > idle CPU from the first package, thereby helping the second package go
> > completely idle.
> > 
> > This patchset has been tested with 2.6.30-rc1 on a two-socket,
> > quad-core system with the topology as mentioned above.
> > 
> > |----------------------------------------------------------------------------|
> > |       With Patchset + sched_mc_power_savings = 1                           |
> > |----------------------------------------------------------------------------|
> > |make -j2 options| time taken |  LOC timer interrupts |  LOC timer interrupts|
> > |                |            |  on Package 0         |  on Package 1        |
> > |----------------------------------------------------------------------------|
> > |taskset -c 0,2  |            |  CPU0     | CPU2      | CPU1      |  CPU3    |
> > |                | 227.234s   |  56969    | 57080     | 1003      |  588     |
> > |                |            |----------------------------------------------|
> > |                |            |  CPU4     | CPU6      | CPU5      |  CPU7    |
> > |                |            |  55995    | 703       | 583       |  600     |
> > |----------------------------------------------------------------------------|
> > |taskset -c 1,3  |            |  CPU0     | CPU2      | CPU1      |  CPU3    |
> > |                | 227.136s   |  1109     | 611       | 57074     |  57091   |
> > |                |            |----------------------------------------------|
> > |                |            |  CPU4     | CPU6      | CPU5      |  CPU7    |
> > |                |            |  709      | 637       | 56133     |  587     |
> > |----------------------------------------------------------------------------|
> > 
> > We see here that the idle load balancer is chosen from the package which is
> > busy. In the first case, it's CPU4 and in the second case it's CPU5.
> > 
> > |----------------------------------------------------------------------------|
> > |       With Patchset + sched_mc_power_savings = 1                           |
            ^^^^
	    Without
> > |----------------------------------------------------------------------------|
> > |make -j2 options| time taken |  LOC timer interrupts |  LOC timer interrupts|
> > |                |            |  on Package 0         |  on Package 1        |
> > |----------------------------------------------------------------------------|
> > |taskset -c 0,2  |            |  CPU0     | CPU2      | CPU1      |  CPU3    |
> > |                | 228.786s   |  59094    | 61994     | 13984     |  43652   |
> > |                |            |----------------------------------------------|
> > |                |            |  CPU4     | CPU6      | CPU5      |  CPU7    |
> > |                |            |  1827     | 734       | 748       |  760     |
> > |----------------------------------------------------------------------------|
> > |taskset -c 1,3  |            |  CPU0     | CPU2      | CPU1      |  CPU3    |
> > |                | 228.435s   |  57013    | 876       | 58596     |  61633   |
> > |                |            |----------------------------------------------|
> > |                |            |  CPU4     | CPU6      | CPU5      |  CPU7    |
> > |                |            |  772      | 1133      | 850       |  910     |
> > |----------------------------------------------------------------------------|
> > 
> > Here, we see that the idle load balancer is chosen from the other package,
> > despite setting sched_mc_power_savings = 1. In the first case, we have
> > CPU1 and CPU3 sharing the responsibility among themselves. In the second case,
> > it's CPU0 and CPU6 which assume that role.
> 
> Both tables above claim to be _with_ the patches :-), from the
> accompanying text one can deduce it's the bottom one that is without.

Sorry, I copy-pasted the 2nd table from the first and updated only the
values. The title of the 2nd table should read "Without Patchset".
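
For anyone skimming the thread, here is a rough userspace sketch of the
nomination difference described in the Background section. It is not the
kernel code from the patches: the names package_mask, first_idle_cpu() and
pick_ilb_topology_aware() are hypothetical, the idle set is modelled as a
plain bitmask instead of nohz.cpu_mask, and the package layout is hard-coded
from the diagram above (package 0 = CPUs 0,2,4,6; package 1 = CPUs 1,3,5,7).

```c
#define NR_CPUS     8
#define NR_PACKAGES 2

/* Assumed layout, taken from the diagram: bit n set means CPU n is a member. */
static const unsigned int package_mask[NR_PACKAGES] = {
	0x55u,	/* package 0: CPUs 0, 2, 4, 6 */
	0xAAu,	/* package 1: CPUs 1, 3, 5, 7 */
};

/* Current behaviour in spirit: lowest-numbered CPU in the idle mask, or -1. */
static int first_idle_cpu(unsigned int idle_mask)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (idle_mask & (1u << cpu))
			return cpu;
	return -1;
}

/*
 * Hypothetical topology-aware pick: prefer an idle CPU from a package that
 * is "semi-idle" (some CPUs idle, at least one busy), so that a fully idle
 * package is left alone. Fall back to the first idle CPU otherwise.
 */
static int pick_ilb_topology_aware(unsigned int idle_mask)
{
	int pkg;

	for (pkg = 0; pkg < NR_PACKAGES; pkg++) {
		unsigned int idle_in_pkg = idle_mask & package_mask[pkg];

		if (idle_in_pkg != 0 && idle_in_pkg != package_mask[pkg])
			return first_idle_cpu(idle_in_pkg);
	}
	return first_idle_cpu(idle_mask);
}
```

With CPUs 0 and 2 busy (idle mask 0xFA), first_idle_cpu() returns CPU1 from
the idle package, while pick_ilb_topology_aware() returns CPU4 from the busy
package. With CPUs 1 and 3 busy (idle mask 0xF5), the two return CPU0 and
CPU5 respectively, which lines up with the CPUs named for the "with
patchset" runs.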


> 
> Patches look straightforward enough, seems good stuff.

Thanks for the review!
> 
> Thanks!

-- 
Thanks and Regards
gautham
