From: Bharata B Rao <bharata@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
Gautham R Shenoy <ego@in.ibm.com>,
Srivatsa Vaddagiri <vatsa@in.ibm.com>,
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Pavel Emelyanov <xemul@openvz.org>,
Herbert Poetzl <herbert@13thfloor.at>,
Avi Kivity <avi@redhat.com>, Chris Friesen <cfriesen@nortel.com>,
Paul Menage <menage@google.com>,
Mike Waychison <mikew@google.com>
Subject: [RFC v4 PATCH 0/7] CFS Hard limits - v4
Date: Tue, 17 Nov 2009 20:03:06 +0530
Message-ID: <20091117143306.GK17335@in.ibm.com>
Hi,
Here is the v4 post of the hard limits feature for the CFS group scheduler.
This version mainly adds cpu hotplug support for CFS runtime balancing.
Changes
-------
RFC v4:
- Reclaim runtimes lent to other cpus when a cpu goes
offline. (Kamalesh Babulal)
- Fixed a few bugs.
- Some cleanups.
RFC v3:
- http://lkml.org/lkml/2009/11/9/65
- Until v2, rq->nr_running was updated whenever tasks left and rejoined the
runqueue during throttling and unthrottling. This is no longer done.
- The above change simplifies the code considerably. The runtime related
fields of cfs_rq are now protected by a per cfs_rq lock instead of the
per rq lock. This makes the scheme look more similar to rt.
- Remove the control file cpu.cfs_hard_limit which enabled/disabled hard limits
for groups. Hard limits are now enabled by setting a non-zero runtime.
- Don't explicitly prevent movement of tasks into throttled groups during
load balancing, since throttled entities are already prevented from being
enqueued in enqueue_task_fair().
- Moved to 2.6.32-rc6
RFC v2:
- http://lkml.org/lkml/2009/9/30/115
- Upgraded to 2.6.31.
- Added CFS runtime borrowing.
- New locking scheme
The hard limit specific fields of cfs_rq (cfs_runtime, cfs_time and
cfs_throttled) were being protected by rq->lock. This simple scheme will
not work once runtime rebalancing is introduced, because rebalancing needs
to look at these fields on other CPUs, which would require acquiring the
rq->lock of those CPUs. That is not feasible from update_curr().
Hence introduce a separate lock (rq->runtime_lock) to protect these
fields of all cfs_rq under it.
- Handle the task wakeup in a throttled group correctly.
- Make CFS_HARD_LIMITS dependent on CGROUP_SCHED (Thanks to Andrea Righi)
RFC v1:
- First version of the patches with minimal features was posted at
http://lkml.org/lkml/2009/8/25/128
RFC v0:
- The CFS hard limits proposal was first posted at
http://lkml.org/lkml/2009/6/4/24
Testing and Benchmark numbers
-----------------------------
Some numbers from simple benchmarks to sanity-check that the hard limits
patches are not causing any major regressions.
- hackbench (hackbench -pipe N)
(hackbench was run as part of a group under root group)
-----------------------------------------------------------------------
                                 Time
        ---------------------------------------------------------------
N       CFS_HARD_LIMITS=n     CFS_HARD_LIMITS=y     CFS_HARD_LIMITS=y
                              (infinite runtime)    (BW=450000/500000)
-----------------------------------------------------------------------
10      0.574                 0.614                 0.674
20      1.086                 1.154                 1.232
50      2.689                 2.487                 2.714
100     4.897                 4.771                 5.439
-----------------------------------------------------------------------
- BW = Bandwidth = runtime/period
- Infinite runtime means no hard limiting
- lmbench (lat_ctx -N 5 -s <size_in_kb> N)
(i) size_in_kb = 1024
-----------------------------------------------------------------------
                       Context switch time (us)
        ---------------------------------------------------------------
N       CFS_HARD_LIMITS=n     CFS_HARD_LIMITS=y     CFS_HARD_LIMITS=y
                              (infinite runtime)    (BW=450000/500000)
-----------------------------------------------------------------------
10      237.14                248.83                69.71
100     251.97                234.74                254.73
500     248.39                252.73                252.66
-----------------------------------------------------------------------
(ii) size_in_kb = 2048
-----------------------------------------------------------------------
                       Context switch time (us)
        ---------------------------------------------------------------
N       CFS_HARD_LIMITS=n     CFS_HARD_LIMITS=y     CFS_HARD_LIMITS=y
                              (infinite runtime)    (BW=450000/500000)
-----------------------------------------------------------------------
10      541.39                538.68                419.03
100     504.52                504.22                491.20
500     495.26                494.11                497.12
-----------------------------------------------------------------------
- kernbench
Average Optimal load -j 96 Run (std deviation):
------------------------------------------------------------------------------
        CFS_HARD_LIMITS=n       CFS_HARD_LIMITS=y       CFS_HARD_LIMITS=y
                                (infinite runtime)      (BW=450000/500000)
------------------------------------------------------------------------------
Elapsed 234.965 (10.1328)       235.93 (8.0893)         270.74 (5.11945)
User    796.605 (62.1617)       787.105 (80.3486)       880.54 (9.33381)
System  802.715 (7.62968)       838.565 (14.5593)       868.23 (10.8894)
% CPU   680 (0)                 688.5 (16.2635)         645.5 (4.94975)
CtxSwt  535452 (23273.7)        536321 (27946.3)        567430 (9579.88)
Sleeps  614784 (19538.8)        610256 (17570.2)        626286 (2390.73)
------------------------------------------------------------------------------
Patches description
-------------------
This post has the following patches:
1/7 sched: Rename sched_rt_period_mask() and use it in CFS also
2/7 sched: Bandwidth initialization for fair task groups
3/7 sched: Enforce hard limits by throttling
4/7 sched: Unthrottle the throttled tasks
5/7 sched: Add throttle time statistics to /proc/sched_debug
6/7 sched: CFS runtime borrowing
7/7 sched: Hard limits documentation
 Documentation/scheduler/sched-cfs-hard-limits.txt |   48 ++
 include/linux/sched.h                             |    6
 init/Kconfig                                      |   13
 kernel/sched.c                                    |  339 ++++++++++++++
 kernel/sched_debug.c                              |   17
 kernel/sched_fair.c                               |  464 +++++++++++++++++++-
 kernel/sched_rt.c                                 |   45 -
 7 files changed, 869 insertions(+), 63 deletions(-)
Regards,
Bharata.