[RFC v4 PATCH 0/7] CFS Hard limits - v4

From: Bharata B Rao <bharata@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Gautham R Shenoy <ego@in.ibm.com>,
	Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Pavel Emelyanov <xemul@openvz.org>,
	Herbert Poetzl <herbert@13thfloor.at>,
	Avi Kivity <avi@redhat.com>, Chris Friesen <cfriesen@nortel.com>,
	Paul Menage <menage@google.com>,
	Mike Waychison <mikew@google.com>
Subject: [RFC v4 PATCH 0/7] CFS Hard limits - v4
Date: Tue, 17 Nov 2009 20:03:06 +0530
Message-ID: <20091117143306.GK17335@in.ibm.com>

Hi,

Here is the v4 posting of the hard limits feature for the CFS group
scheduler. This version mainly adds CPU hotplug support for CFS runtime
balancing.

Changes
-------
RFC v4:
- Reclaim runtimes lent to other cpus when a cpu goes
  offline. (Kamalesh Babulal)
- Fixed a few bugs.
- Some cleanups.
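
To illustrate the reclaim step, here is a small userspace model. It is only
a sketch under assumptions, not the code in this series: the names
(model_reclaim, NR_CPUS_MODEL, the runtime[] array) are hypothetical, and
the logic is loosely modelled on how the RT scheduler recovers lent runtime
on hotplug. The CPU going offline works out how much of its originally
assigned runtime it has lent out and takes that amount back from the others.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this model */

/*
 * Reclaim the runtime that 'cpu' lent out, before it goes offline.
 * runtime[i] is the share CPU i currently holds; 'assigned' is the share
 * every CPU started with. Returns the amount that could not be reclaimed.
 */
static uint64_t model_reclaim(uint64_t runtime[NR_CPUS_MODEL], int cpu,
			      uint64_t assigned)
{
	uint64_t want;
	int i;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	if (runtime[cpu] >= assigned)
		return 0;	/* nothing was lent out */

	/* What we started with minus what we hold now is what we lent. */
	want = assigned - runtime[cpu];

	for (i = 0; i < NR_CPUS_MODEL && want > 0; i++) {
		uint64_t take;

		if (i == cpu)
			continue;

		/* Take as much as this CPU holds, up to what is still owed. */
		take = runtime[i] < want ? runtime[i] : want;
		runtime[i] -= take;
		runtime[cpu] += take;
		want -= take;
	}

	return want;
}
```

With shares of {100, 300, 250, 150} and an assigned share of 200, reclaiming
for CPU 0 takes 100 back from CPU 1 and leaves {200, 200, 250, 150}.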

RFC v3:
- http://lkml.org/lkml/2009/11/9/65
- Up to v2, rq->nr_running was updated whenever tasks left and rejoined the
  runqueue during throttling and unthrottling. This is no longer done.
- With the above change, the code is considerably simpler. The
  runtime-related fields of cfs_rq are now protected by a per-cfs_rq lock
  instead of the per-rq lock, which makes the code look more like RT.
- Remove the control file cpu.cfs_hard_limit, which enabled/disabled hard
  limits for groups. Hard limits are now enabled by setting a non-zero runtime.
- Don't explicitly prevent movement of tasks into throttled groups during
  load balancing as throttled entities are anyway prevented from being
  enqueued in enqueue_task_fair().
- Moved to 2.6.32-rc6
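
The throttling behaviour described in the v3 notes can be pictured with a
small userspace model. This is a sketch under assumptions, not the patch
code: the field names cfs_runtime, cfs_time and cfs_throttled are the ones
named in the RFC v2 notes, while struct cfs_rq_model and the model_*()
helpers are hypothetical. It treats a zero runtime as "no hard limit",
following the note that a non-zero runtime enables hard limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cfs_rq_model {
	uint64_t cfs_runtime;	/* runtime allowed per period; 0 = no limit (assumed) */
	uint64_t cfs_time;	/* runtime consumed in the current period */
	bool cfs_throttled;	/* set once the limit has been reached */
};

/* Charge 'delta' units of execution and throttle once the limit is hit. */
static void model_account(struct cfs_rq_model *rq, uint64_t delta)
{
	assert(rq);
	rq->cfs_time += delta;
	if (rq->cfs_runtime != 0 && rq->cfs_time >= rq->cfs_runtime)
		rq->cfs_throttled = true;
}

/* A throttled entity is refused at enqueue time. */
static bool model_can_enqueue(const struct cfs_rq_model *rq)
{
	assert(rq);
	return !rq->cfs_throttled;
}

/* At the start of each period the consumed time is reset and the
 * entity is unthrottled. */
static void model_period_refresh(struct cfs_rq_model *rq)
{
	assert(rq);
	rq->cfs_time = 0;
	rq->cfs_throttled = false;
}
```

Because the enqueue check alone keeps throttled entities off the runqueue in
this model, no separate check is needed on the load-balancing path.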

RFC v2:
- http://lkml.org/lkml/2009/9/30/115
- Upgraded to 2.6.31.
- Added CFS runtime borrowing.
- New locking scheme
    The hard-limit-specific fields of cfs_rq (cfs_runtime, cfs_time and
    cfs_throttled) were being protected by rq->lock. This simple scheme will
    not work once runtime rebalancing is introduced, because rebalancing has
    to look at these fields on other CPUs, which requires acquiring the
    rq->lock of those CPUs. That is not feasible from update_curr().
    Hence introduce a separate lock (rq->runtime_lock) to protect these
    fields of all cfs_rq under it.
- Handle the task wakeup in a throttled group correctly.
- Make CFS_HARD_LIMITS dependent on CGROUP_SCHED (Thanks to Andrea Righi)

RFC v1:
- First version of the patches with minimal features was posted at
  http://lkml.org/lkml/2009/8/25/128

RFC v0:
- The CFS hard limits proposal was first posted at
  http://lkml.org/lkml/2009/6/4/24

Testing and Benchmark numbers
-----------------------------
Some numbers from simple benchmarks to sanity-check that the hard limits
patches are not causing any major regressions.

- hackbench (hackbench -pipe N)
  (hackbench was run as part of a group under root group)
  -----------------------------------------------------------------------
				Time
	-----------------------------------------------------------------
  N	CFS_HARD_LIMITS=n	CFS_HARD_LIMITS=y	CFS_HARD_LIMITS=y
				(infinite runtime)	(BW=450000/500000)
  -----------------------------------------------------------------------
  10	0.574			0.614			0.674
  20	1.086			1.154			1.232
  50	2.689			2.487			2.714
  100	4.897			4.771			5.439
  -----------------------------------------------------------------------
  - BW = Bandwidth = runtime/period
  - Infinite runtime means no hard limiting
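
As a worked example of the bandwidth figure used in these tables, the sketch
below computes the share of each period a group may run. The helper names
are hypothetical and the units are whatever runtime and period are expressed
in; only the ratio matters here.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage of each period the group may run (integer, rounded down). */
static inline uint64_t bw_percent(uint64_t runtime, uint64_t period)
{
	assert(period > 0 && runtime <= period);
	return runtime * 100 / period;
}

/*
 * Remainder of the period left after the runtime is used up. For a group
 * that is always runnable on one CPU, and ignoring runtime borrowing, this
 * is how long it would stay throttled in each period.
 */
static inline uint64_t throttled_per_period(uint64_t runtime, uint64_t period)
{
	assert(period > 0 && runtime <= period);
	return period - runtime;
}
```

For BW=450000/500000, bw_percent() gives 90 and throttled_per_period() gives
50000.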

- lmbench (lat_ctx -N 5 -s <size_in_kb> N)

  (i) size_in_kb = 1024
  -----------------------------------------------------------------------
				Context switch time (us)
	-----------------------------------------------------------------
  N	CFS_HARD_LIMITS=n	CFS_HARD_LIMITS=y	CFS_HARD_LIMITS=y
				(infinite runtime)	(BW=450000/500000)
  -----------------------------------------------------------------------
  10	237.14			248.83			69.71
  100	251.97			234.74			254.73
  500	248.39			252.73			252.66
  -----------------------------------------------------------------------

  (ii) size_in_kb = 2048
  -----------------------------------------------------------------------
				Context switch time (us)
	-----------------------------------------------------------------
  N	CFS_HARD_LIMITS=n	CFS_HARD_LIMITS=y	CFS_HARD_LIMITS=y
				(infinite runtime)	(BW=450000/500000)
  -----------------------------------------------------------------------
  10	541.39			538.68			419.03
  100	504.52			504.22			491.20
  500	495.26			494.11			497.12
  -----------------------------------------------------------------------

- kernbench

Average Optimal load -j 96 Run (std deviation):
------------------------------------------------------------------------------
	CFS_HARD_LIMITS=n	CFS_HARD_LIMITS=y	CFS_HARD_LIMITS=y
				(infinite runtime)	(BW=450000/500000)
------------------------------------------------------------------------------
Elapsed	234.965 (10.1328)	235.93 (8.0893)		270.74 (5.11945)
User	796.605 (62.1617)	787.105 (80.3486)	880.54 (9.33381)
System	802.715 (7.62968)	838.565 (14.5593)	868.23 (10.8894)
% CPU	680 (0)			688.5 (16.2635)		645.5 (4.94975)
CtxSwt	535452 (23273.7)	536321 (27946.3)	567430 (9579.88)
Sleeps	614784 (19538.8)	610256 (17570.2)	626286 (2390.73)
------------------------------------------------------------------------------

Patches description
-------------------
This post has the following patches:

1/7 sched: Rename sched_rt_period_mask() and use it in CFS also
2/7 sched: Bandwidth initialization for fair task groups
3/7 sched: Enforce hard limits by throttling
4/7 sched: Unthrottle the throttled tasks
5/7 sched: Add throttle time statistics to /proc/sched_debug
6/7 sched: CFS runtime borrowing
7/7 sched: Hard limits documentation

 Documentation/scheduler/sched-cfs-hard-limits.txt |   48 ++
 include/linux/sched.h                               |    6 
 init/Kconfig                                        |   13 
 kernel/sched.c                                      |  339 ++++++++++++++
 kernel/sched_debug.c                                |   17 
 kernel/sched_fair.c                                 |  464 +++++++++++++++++++-
 kernel/sched_rt.c                                   |   45 -
 7 files changed, 869 insertions(+), 63 deletions(-)

Regards,
Bharata.
