From: Gautham R Shenoy <ego@in.ibm.com>
To: "Ingo Molnar" <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org,
Suresh Siddha <suresh.b.siddha@intel.com>,
"Balbir Singh" <balbir@in.ibm.com>
Subject: [PATCH v4 0/5] sched: Extend sched_mc/smt_power_savings framework.
Date: Tue, 31 Mar 2009 16:20:12 +0530
Message-ID: <20090331104829.16414.11385.stgit@sofia.in.ibm.com>
Hi,
This is version 4 of the patchset which extends the
sched_mc_power_savings enhancements to benefit sched_smt_power_savings
as well. It is intended to work on platforms that have on-chip
memory controllers, which make each cpu-package a 'node'.
The patch-series is against linux-2.6-tip master, as of March 30th.
In addition to providing power savings on such platforms,
this patchset fixes the inconsistent behavior of
sched_smt_power_savings when running an odd number of pairs of tasks.
Ideally, when sched_smt_power_savings is enabled, we would like to see
the tasks running on sibling threads to take advantage of
cache-sharing. However, when there
are only 2 threads running and sched_smt_power_savings is enabled,
the current load balancer doesn't pull them from across packages
onto a single core.
Changes from V3: (Found here: --> http://lkml.org/lkml/2009/3/6/23)
- Rebased and retested against linux-2.6-tip master as of Mar 30th.
- Dropped the patch which added comments for find_busiest_group(). That
has been sent as a separate series.
Changes from V2: (Found here: --> http://lkml.org/lkml/2009/3/3/109)
- Patches have been split up in an incremental manner for easy review.
- Fixed comments for some variables.
- Renamed some variables to better reflect their usage.
Changes from V1: (Found here: --> http://lkml.org/lkml/2009/2/16/221)
- Added comments to explain power-saving part in find_busiest_group()
- Added comments for the different sched_domain levels.
Background
------------------------------------------------------------------
On machines with an on-chip memory controller, each physical CPU
package forms a NUMA node and the CPU level sched_domain will have
only one group. This prevents any form of power saving balance across
these nodes. Enabling the sched_mc_power_savings tunable to work as
designed on these new single CPU NUMA node machines will help task
consolidation and save power, as it does on other multi-core,
multi-socket platforms.
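To make the single-group problem concrete, here is a toy model in Python.
It is not kernel code, and the names ToyDomain and can_balance are made up
for illustration. It assumes only what the paragraph above says: balancing
at a sched_domain level moves tasks between that domain's groups, so a
domain with a single group has nothing to balance. The CPU ids are
illustrative and do not reflect any real topology.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyDomain:
    """Hypothetical stand-in for a sched_domain: a level name plus its groups."""
    level: str
    groups: List[List[int]]  # each group is a list of (illustrative) CPU ids


def can_balance(domain: ToyDomain) -> bool:
    """Balancing moves tasks between groups, so it needs at least two of them."""
    return len(domain.groups) >= 2


# Two packages sharing one node: the CPU-level domain spans both packages,
# with one group per package.
shared_node = ToyDomain("CPU", [[0, 1, 2, 3], [4, 5, 6, 7]])

# Package == NUMA node: the CPU-level domain covers a single package,
# so it ends up with a single group.
per_package_node = ToyDomain("CPU", [[0, 1, 2, 3]])

if __name__ == "__main__":
    for dom in (shared_node, per_package_node):
        print(dom.level, len(dom.groups), "group(s), can balance:", can_balance(dom))
```

In this sketch the second domain reports that it cannot balance, which is
the situation described above for single-package NUMA nodes.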
Consolidation across nodes has implications for cross-node memory
access and other NUMA locality issues. Even under such constraints
there could be scope for power savings vs performance tradeoffs, and
hence making sched_mc_power_savings work as expected on these
platforms is justified.
sched_mc/smt_power_savings is still a tunable, and the power savings
and performance will vary depending on the workload, the system
topology and hardware features.
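For anyone who wants to script test runs, the following is a small Python
sketch for reading and setting the two tunables. It assumes they are
exposed as files under /sys/devices/system/cpu and accept the levels
0, 1 and 2; please check this against your kernel before relying on it.
The helper names read_levels and set_level are made up for illustration,
and the base directory is a parameter so the helpers can be tried against
a scratch directory first.

```python
from pathlib import Path

# Assumed sysfs location of the tunables; verify on the kernel under test.
SYSFS_CPU = Path("/sys/devices/system/cpu")
TUNABLES = ("sched_mc_power_savings", "sched_smt_power_savings")


def read_levels(base: Path = SYSFS_CPU) -> dict:
    """Return {tunable name: current level, or None if the file is absent}."""
    levels = {}
    for name in TUNABLES:
        path = base / name
        levels[name] = int(path.read_text().strip()) if path.exists() else None
    return levels


def set_level(name: str, level: int, base: Path = SYSFS_CPU) -> None:
    """Write a level to one tunable; on a real system this needs root."""
    if name not in TUNABLES:
        raise ValueError(f"unknown tunable: {name}")
    if level not in (0, 1, 2):
        raise ValueError("level must be 0, 1 or 2")
    (base / name).write_text(f"{level}\n")


if __name__ == "__main__":
    print(read_levels())
```

A run for one row of the results tables would call
set_level("sched_mc_power_savings", 2) before starting the benchmark.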
The results of this patch series, tested with kernbench on a
2-socket, quad-core, dual-threaded box while varying the number of
make jobs (-j), are as follows:
+------------------------------------------------------------------------+
|Test: make -j4                                                          |
+-----------+----------+--------+---------+-------------+----------------+
| sched_smt | sched_mc | %Power |  Time   | % Package 0 | % Package 1    |
|           |          |        |   (s)   |    idle     |     idle       |
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 37.44 |Core4: 70.50    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 59.89 |Core5: 20.08    |
|     0     |    0     |  100   | 107.45  +-------------+----------------+
|           |          |        |         |Core2: 57.63 |Core6: 62.80    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 64.78 |Core7: 65.07    |
+-----------+----------+--------+---------+-------------+----------------+
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 13.41 |Core4: 98.06    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 28.56 |Core5: 60.64    |
|     0     |    2     | 99.89  | 109.95  +-------------+----------------+
|           |          |        |         |Core2: 28.49 |Core6: 98.30    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 31.49 |Core7: 99.77    |
+-----------+----------+--------+---------+-------------+----------------+
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 35.05 |Core4: 41.78    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 78.28 |Core5: 32.15    |
|     2     |    2     | 95.84  | 137.73  +-------------+----------------+
|           |          |        |         |Core2: 93.45 |Core6: 87.49    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 97.70 |Core7: 90.47    |
+-----------+----------+--------+---------+-------------+----------------+
+------------------------------------------------------------------------+
|Test: make -j6                                                          |
+-----------+----------+--------+---------+-------------+----------------+
| sched_smt | sched_mc | %Power |  Time   | % Package 0 | % Package 1    |
|           |          |        |   (s)   |    idle     |     idle       |
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 25.50 |Core4: 39.22    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 46.47 |Core5: 20.71    |
|     0     |    0     |  100   |  76.06  +-------------+----------------+
|           |          |        |         |Core2: 45.20 |Core6: 42.30    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 46.50 |Core7: 47.29    |
+-----------+----------+--------+---------+-------------+----------------+
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 17.00 |Core4: 52.38    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 42.08 |Core5: 26.21    |
|     0     |    2     | 98.99  |  79.53  +-------------+----------------+
|           |          |        |         |Core2: 46.47 |Core6: 58.16    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 43.39 |Core7: 54.36    |
+-----------+----------+--------+---------+-------------+----------------+
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 63.85 |Core4: 21.55    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 93.35 |Core5: 18.76    |
|     2     |    2     | 92.16  | 100.22  +-------------+----------------+
|           |          |        |         |Core2: 96.02 |Core6: 36.76    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 99.01 |Core7: 64.32    |
+-----------+----------+--------+---------+-------------+----------------+
+------------------------------------------------------------------------+
|Test: make -j8                                                          |
+-----------+----------+--------+---------+-------------+----------------+
| sched_smt | sched_mc | %Power |  Time   | % Package 0 | % Package 1    |
|           |          |        |   (s)   |    idle     |     idle       |
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 19.34 |Core4: 34.01    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 36.06 |Core5: 21.02    |
|     0     |    0     |  100   |  62.67  +-------------+----------------+
|           |          |        |         |Core2: 31.60 |Core6: 32.32    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 34.89 |Core7: 36.48    |
+-----------+----------+--------+---------+-------------+----------------+
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 17.53 |Core4: 35.30    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 37.05 |Core5: 22.93    |
|     0     |    2     | 99.20  |  64.08  +-------------+----------------+
|           |          |        |         |Core2: 36.96 |Core6: 35.07    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 36.09 |Core7: 37.38    |
+-----------+----------+--------+---------+-------------+----------------+
+-----------+----------+--------+---------+-------------+----------------+
|           |          |        |         |Core0: 11.58 |Core4: 91.99    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core1: 18.51 |Core5: 58.37    |
|     2     |    2     | 90.20  |  80.87  +-------------+----------------+
|           |          |        |         |Core2: 22.62 |Core6: 97.68    |
|           |          |        |         +-------------+----------------+
|           |          |        |         |Core3: 21.83 |Core7: 99.80    |
+-----------+----------+--------+---------+-------------+----------------+
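One way to weigh the power vs performance tradeoff in the tables above is
to combine the %Power and Time columns into a rough energy-per-run figure.
The Python sketch below does this under one loud assumption: that %Power
is the average power draw relative to the (0, 0) baseline run, so that
relative energy is power times time. The tables do not state this, so
treat the output as indicative only. The function name relative_energy is
made up for illustration; the numbers are copied from the tables.

```python
# (sched_smt, sched_mc, %Power, time in seconds), copied from the tables above.
RESULTS = {
    "make -j4": [(0, 0, 100.0, 107.45), (0, 2, 99.89, 109.95), (2, 2, 95.84, 137.73)],
    "make -j6": [(0, 0, 100.0, 76.06), (0, 2, 98.99, 79.53), (2, 2, 92.16, 100.22)],
    "make -j8": [(0, 0, 100.0, 62.67), (0, 2, 99.20, 64.08), (2, 2, 90.20, 80.87)],
}


def relative_energy(rows):
    """Energy per run relative to the (0, 0) baseline, computed as power * time.

    Assumption: %Power is average power relative to the baseline run.
    """
    base = next(r for r in rows if r[0] == 0 and r[1] == 0)
    base_energy = base[2] * base[3]
    return {(smt, mc): (power * time) / base_energy for smt, mc, power, time in rows}


if __name__ == "__main__":
    for test, rows in RESULTS.items():
        for (smt, mc), ratio in relative_energy(rows).items():
            print(f"{test}: sched_smt={smt} sched_mc={mc} relative energy {ratio:.3f}")
```

Under that reading, a ratio above 1.0 means the run consumed more energy
than the baseline even though its average power was lower, because it
took longer to finish.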
---
Gautham R Shenoy (5):
sched: Fix sd_parent_degenerate for SD_POWERSAVINGS_BALANCE.
sched: Arbitrate the nomination of preferred_wakeup_cpu
sched: Rename the variable sched_mc_preferred_wakeup_cpu
sched: Record the current active power savings level
sched: code cleanup - sd_power_saving_flags(), sd_balance_for_*_power()
include/linux/sched.h | 66 ++++++++++++++++++-----------------
include/linux/topology.h | 6 +--
kernel/sched.c | 86 ++++++++++++++++++++++++++++++++++++++++------
kernel/sched_fair.c | 4 +-
4 files changed, 112 insertions(+), 50 deletions(-)
--
Thanks and Regards
gautham.