public inbox for linux-kernel@vger.kernel.org
* [patch/backport] CFS scheduler, -v22, for v2.6.23-rc8, v2.6.22.8, v2.6.21.7, v2.6.20.20
@ 2007-09-26 11:13 Ingo Molnar
  2007-09-26 13:33 ` S.Çağlar Onur
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Ingo Molnar @ 2007-09-26 11:13 UTC (permalink / raw)
  To: linux-kernel

By popular demand, here is release -v22 of the CFS scheduler. It is a 
full backport of the latest & greatest sched-devel.git code to 
v2.6.23-rc8, v2.6.22.8, v2.6.21.7 and v2.6.20.20. The patches can be 
downloaded from the usual place:

    http://people.redhat.com/mingo/cfs-scheduler/

This is the first time the development version of the scheduler has been 
fed back into the stable backport series, so there are many changes since 
v20.5:

 15 files changed, 1103 insertions(+), 840 deletions(-)

Even if CFS v20.5 worked well for you, please try this release too, with 
a good focus on interactivity testing - because, unless some major 
showstopper is found, this codebase is intended for a v2.6.24 upstream 
merge.

( Even a quick, subjective report of: "checked this patch, it didn't
  crash and it feels like v20.5" or "laggier than v20.5" or "feels 
  better than v20.5" is useful to us and enables us to judge the general 
  direction of interactivity. )

The changes in v22 consist of lots of mostly small enhancements, 
speedups, interactivity improvements, debug enhancements and tidy-ups - 
many of which can be user-visible. (These enhancements have been 
contributed by many people - see the changelog below and the git tree 
for detailed credits.)

The biggest individual new feature is per UID group scheduling, written 
by Srivatsa Vaddagiri, which can be enabled via the 
CONFIG_FAIR_USER_SCHED=y .config option. With this feature enabled, each 
user gets a fair share of the CPU time, regardless of how many tasks 
each user is running.

For example, it took me 0.1 seconds to log in over ssh as root on a 
testbox that was running a kernel with per UID group scheduling enabled:

  $ time ssh root@testbox /bin/true

  real    0m0.125s
  user    0m0.013s
  sys     0m0.011s

At the time, that testbox had a system load of 1000.17, due to a rogue 
runaway workload of one thousand (!) non-reniced infinite loops:

  top - 14:34:05 up 30 min,  3 users,  load average: 1000.17, 839.23, 444.57
  Tasks: 1131 total, 1002 running, 129 sleeping,   0 stopped,   0 zombie
  Cpu(s): 30.8%us,  0.2%sy,  0.0%ni, 68.2%id,  0.8%wa,  0.0%hi,  0.0%si
  Mem:   2048992k total,   157688k used,  1891304k free,    18308k buffers
  Swap:  4096564k total,        0k used,  4096564k free,    25464k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  3633 root      20   0  2892 1576  724 R    7  0.1   0:00.06 top
  2427 mingo     20   0  1576  244  196 R    2  0.0   0:01.14 loop
  2429 mingo     20   0  1576  244  196 R    2  0.0   0:01.14 loop

To the root user, the box was fully usable and interactivity was 
excellent - i was easily able to kill off those runaway tasks.
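
The arithmetic behind that result can be sketched roughly as follows. This
is an idealized single-CPU model in Python, not kernel code: the function
names are made up for illustration, and every user is assumed to have the
same weight of 1024 unless stated otherwise.

```python
def per_task_share(tasks_per_user, weight_per_user=None):
    """Idealized CPU fraction one task of each user gets with per-UID groups.

    CPU time is first split between users in proportion to their weight,
    then split evenly between that user's runnable tasks.
    """
    if weight_per_user is None:
        # Assumption for this sketch: all users carry equal weight.
        weight_per_user = {user: 1024 for user in tasks_per_user}
    active = {u: n for u, n in tasks_per_user.items() if n > 0}
    total_weight = sum(weight_per_user[u] for u in active)
    return {u: weight_per_user[u] / total_weight / n for u, n in active.items()}


def flat_task_share(tasks_per_user):
    """CPU fraction per task when all tasks compete in one flat pool."""
    return 1.0 / sum(tasks_per_user.values())


workload = {"root": 1, "mingo": 1000}
grouped = per_task_share(workload)
print("root task, grouped :", grouped["root"])
print("loop task, grouped :", grouped["mingo"])
print("any task, flat     :", flat_task_share(workload))
```

In this model the single root task keeps about half of the CPU no matter
how many loops the other user starts, whereas in a flat pool it would be
one task among 1001.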

( The /proc/root_user_cpu_share tunable also allows the root uid to have
  higher weight than other uids. The unit of the tunable is roughly 0.1%:
  a weight of 100% is 1024, and the default weight of the root uid is 200%. )
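
To make the units concrete, here is a small conversion sketch in Python.
It only uses the two facts stated above (1024 means 100%, the root default
is 200%); the helper names are hypothetical and nothing here reads or
writes the /proc file.

```python
FULL_WEIGHT = 1024  # "a weight of 100% is 1024"


def weight_to_percent(weight):
    """Convert a raw tunable value into a percentage."""
    return 100.0 * weight / FULL_WEIGHT


def percent_to_weight(percent):
    """Convert a percentage into the nearest raw tunable value."""
    return round(percent * FULL_WEIGHT / 100.0)


def share_against(weight, other_weights):
    """Idealized CPU fraction of one busy uid competing with other busy uids."""
    return weight / (weight + sum(other_weights))


root_default = percent_to_weight(200)
print(root_default)                         # raw value for 200%
print(weight_to_percent(1))                 # size of one unit, in percent
print(share_against(root_default, [1024]))  # root vs. one ordinary busy user
```

One unit works out to just under 0.1%, and with the default weight a busy
root uid would get about two thirds of the CPU against one other busy
user in this simplified model.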

See the detailed shortlog below for a description of the other changes, 
or pull the sched-devel.git tree for all the 83 commits:

  git-pull git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git

Also, as usual, any sort of feedback, bugreport, fix and suggestion is 
more than welcome!

	Ingo

------------------>
Dmitry Adamushko (9):
      sched: clean up struct load_stat
      sched: clean up schedstat block in dequeue_entity()
      sched: sched_setscheduler() fix
      sched: add set_curr_task() calls
      sched: do not keep current in the tree and get rid of sched_entity::fair_key
      sched: optimize task_new_fair()
      sched: simplify sched_class::yield_task()
      sched: rework enqueue/dequeue_entity() to get rid of set_curr_task()
      sched: yield fix

Hiroshi Shimamoto (1):
      sched: clean up sched_fork()

Matthias Kaehlcke (1):
      sched: use list_for_each_entry_safe() in __wake_up_common()

Mike Galbraith (2):
      sched: fix SMP migration latencies
      sched: fix formatting of /proc/sched_debug

Peter Zijlstra (12):
      sched: simplify SCHED_FEAT_* code
      sched: new task placement for vruntime
      sched: simplify adaptive latency
      sched: clean up new task placement
      sched: add tree based averages
      sched: handle vruntime overflow
      sched: better min_vruntime tracking
      sched: add vslice
      sched debug: check spread
      sched: max_vruntime() simplification
      sched: clean up min_vruntime use
      sched: speed up and simplify vslice calculations

S.Caglar Onur (1):
      sched debug: BKL usage statistics, fix

Srivatsa Vaddagiri (12):
      sched: group-scheduler core
      sched: revert recent removal of set_curr_task()
      sched: fix minor bug in yield
      sched: print nr_running and load in /proc/sched_debug
      sched: print &rq->cfs stats
      sched: clean up code under CONFIG_FAIR_GROUP_SCHED
      sched: add fair-user scheduler
      sched: group scheduler wakeup latency fix
      sched: group scheduler SMP migration fix
      sched: group scheduler, fix coding style issues
      sched: group scheduler, fix bloat
      sched: group scheduler, fix latency

Ingo Molnar (44):
      sched: fix new-task method
      sched: resched task in task_new_fair()
      sched: small sched_debug cleanup
      sched: debug: track maximum 'slice'
      sched: uniform tunings
      sched: use constants if !CONFIG_SCHED_DEBUG
      sched: remove stat_gran
      sched: remove precise CPU load
      sched: remove precise CPU load calculations #2
      sched: track cfs_rq->curr on !group-scheduling too
      sched: cleanup: simplify cfs_rq_curr() methods
      sched: uninline __enqueue_entity()/__dequeue_entity()
      sched: speed up update_load_add/_sub()
      sched: clean up calc_weighted()
      sched: introduce se->vruntime
      sched: move sched_feat() definitions
      sched: optimize vruntime based scheduling
      sched: simplify check_preempt() methods
      sched: wakeup granularity fix
      sched: add se->vruntime debugging
      sched: add more vruntime statistics
      sched: debug: update exec_clock only when SCHED_DEBUG
      sched: remove wait_runtime limit
      sched: remove wait_runtime fields and features
      sched: x86: allow single-depth wchan output
      sched: fix delay accounting performance regression
      sched: prettify /proc/sched_debug output
      sched: enhance debug output
      sched: kernel/sched_fair.c whitespace cleanups
      sched: fair-group sched, cleanups
      sched: enable CONFIG_FAIR_GROUP_SCHED=y by default
      sched debug: BKL usage statistics
      sched: remove unneeded tunables
      sched debug: print settings
      sched debug: more width for parameter printouts
      sched: entity_key() fix
      sched: remove condition from set_task_cpu()
      sched: remove last_min_vruntime effect
      sched: undo some of the recent changes
      sched: fix place_entity()
      sched: fix sched_fork()
      sched: remove set_leftmost()
      sched: clean up schedstats, cnt -> count
      sched: cleanup, remove stale comment

 arch/i386/Kconfig       |   11 
 fs/proc/base.c          |    2 
 include/linux/sched.h   |   56 ++-
 init/Kconfig            |   21 +
 kernel/delayacct.c      |    2 
 kernel/sched.c          |  577 ++++++++++++++++++++++++-------------
 kernel/sched_debug.c    |  246 ++++++++++------
 kernel/sched_fair.c     |  733 ++++++++++++++++++------------------------------
 kernel/sched_idletask.c |    5 
 kernel/sched_rt.c       |   12 
 kernel/sched_stats.h    |   28 -
 kernel/sysctl.c         |   31 --
 kernel/user.c           |   43 ++
 13 files changed, 963 insertions(+), 804 deletions(-)

* Re: [patch/backport] CFS scheduler, -v22, for v2.6.23-rc8, v2.6.22.8, v2.6.21.7, v2.6.20.20
@ 2007-09-29 11:11 Matthew
  2007-09-30 15:43 ` Ingo Molnar
  2007-09-30 23:54 ` Bill Davidsen
  0 siblings, 2 replies; 14+ messages in thread
From: Matthew @ 2007-09-29 11:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: mingo

Hi Ingo & everybody on the list,

first of all: many thanks for developing this great scheduler (also:
kudos to Con Kolivas for having developed SD & CK-patchset)

(this is my second mail to this list and I hope I'm doing everything right)

I'm running a backup while working right now: rsyncing my home
partition (nearly 180 GB) to another local hard drive. Since I'm also
running compiz-fusion, openoffice and gnome, and am therefore in a
real "working environment", I thought I'd give Ingo's new scheduler
a test-ride under heavy load ;)

first some impressions:
cpu load balancing looks great again (pretty symmetrical load on
both cores - similar to 19.1 if not better, if I recall correctly);
v20 wasn't that "good-looking" ;) (as seen in gnome-system-monitor)

both cpus have a continuous load of ~70% right now, so I'll be
starting up 9 instances of glxgears; below are some output & details
of my system
(cpu frequency switching is disabled since it doesn't work right now
with the current bios version)

short summary: unfortunately, after starting glxgears everything
stuttered a lot. I don't know whether that is to be expected under such
heavy load - just wanted to let you know; after I had closed all
instances of glxgears, everything was fine again ...

cat /proc/sched_debug
Sched Debug Version: v0.05-v22, 2.6.23-rc8-cfs-v22 #1
now at 3890590.670323 msecs
  .sysctl_sched_latency                    : 20.000000
  .sysctl_sched_nr_latency                 : 0.000020
  .sysctl_sched_wakeup_granularity         : 2.000000
  .sysctl_sched_batch_wakeup_granularity   : 25.000000
  .sysctl_sched_child_runs_first           : 0.000001
  .sysctl_sched_features                   : 3

cpu#0, 2404.249 MHz
  .nr_running                    : 4
  .load                          : 4096
  .nr_switches                   : 7648325
  .nr_load_updates               : 2103023
  .nr_uninterruptible            : 58007
  .jiffies                       : 3590591
  .next_balance                  : 3.590615
  .curr->pid                     : 4942
  .clock                         : 2102704.853484
  .idle_clock                    : 0.000000
  .prev_clock_raw                : 3939505.166968
  .clock_warps                   : 0
  .clock_overflows               : 1525057
  .clock_deep_idle_events        : 0
  .clock_max_delta               : 0.999846
  .cpu_load[0]                   : 3072
  .cpu_load[1]                   : 3148
  .cpu_load[2]                   : 3448
  .cpu_load[3]                   : 3598
  .cpu_load[4]                   : 3612

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 705800.821444
  .min_vruntime                  : 705800.818396
  .max_vruntime                  : 705800.821444
  .spread                        : 0.000000
  .spread0                       : 0.000000
  .nr_running                    : 2
  .load                          : 3072
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 114142.324354
  .min_vruntime                  : 705800.818396
  .max_vruntime                  : 114142.460206
  .spread                        : 0.135852
  .spread0                       : 0.000000
  .nr_running                    : 3
  .load                          : 3072
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 705800.818396
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : 0.000000
  .nr_running                    : 0
  .load                          : 0
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 705800.818396
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : 0.000000
  .nr_running                    : 0
  .load                          : 0
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 705800.818396
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : 0.000000
  .nr_running                    : 1
  .load                          : 1024
  .nr_spread_over                : 0

runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------
               X  4043    114142.410694   1413252   120               0               0               0.000000               0.000000               0.000000
        glxgears  4938    114142.460206    251121   120               0               0               0.000000               0.000000               0.000000
        glxgears  4939    114142.324354    418180   120               0               0               0.000000               0.000000               0.000000
R            cat  4942    373113.531317        11   120               0               0               0.000000               0.000000               0.000000

cpu#1, 2404.249 MHz
  .nr_running                    : 4
  .load                          : 4096
  .nr_switches                   : 9227086
  .nr_load_updates               : 2014314
  .nr_uninterruptible            : 4294909290
  .jiffies                       : 3590591
  .next_balance                  : 3.590623
  .curr->pid                     : 4932
  .clock                         : 2014009.406851
  .idle_clock                    : 0.000000
  .prev_clock_raw                : 3939505.462830
  .clock_warps                   : 0
  .clock_overflows               : 1490105
  .clock_deep_idle_events        : 0
  .clock_max_delta               : 0.999845
  .cpu_load[0]                   : 4096
  .cpu_load[1]                   : 4096
  .cpu_load[2]                   : 4102
  .cpu_load[3]                   : 4148
  .cpu_load[4]                   : 4210

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 582740.765855
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -123060.052541
  .nr_running                    : 1
  .load                          : 1024
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 127569.587650
  .min_vruntime                  : 582740.768735
  .max_vruntime                  : 127573.669809
  .spread                        : 4.082159
  .spread0                       : -123060.049661
  .nr_running                    : 4
  .load                          : 4096
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 582740.771821
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -123060.046575
  .nr_running                    : 0
  .load                          : 0
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 582740.774902
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -123060.043494
  .nr_running                    : 0
  .load                          : 0
  .nr_spread_over                : 0

cfs_rq
  .exec_clock                    : 0.000000
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 582740.777696
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -123060.040700
  .nr_running                    : 0
  .load                          : 0
  .nr_spread_over                : 0

runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------
        glxgears  4932    127569.592544    440228   120               0               0               0.000000               0.000000               0.000000
        glxgears  4933    127569.593334    555188   120               0               0               0.000000               0.000000               0.000000
        glxgears  4934    127569.593355    713648   120               0               0               0.000000               0.000000               0.000000
        glxgears  4935    127573.669809    419649   120               0               0               0.000000               0.000000               0.000000
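
Two fields in the dump above look odd at first sight, and a short Python
sketch may help when reading them. Both interpretations are assumptions
drawn from the numbers themselves, not from the scheduler source: the huge
.nr_uninterruptible on cpu#1 is treated as a negative per-CPU count printed
as an unsigned 32-bit value, and .spread0 is treated as a runqueue's
.min_vruntime minus the first .min_vruntime listed under cpu#0. The helper
names are made up for illustration.

```python
from decimal import Decimal


def as_signed32(value):
    """Reinterpret an unsigned 32-bit counter value as a signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def spread0(min_vruntime, cpu0_min_vruntime):
    """Offset of a runqueue's min_vruntime from cpu#0's (assumed meaning).

    Decimal keeps the six printed decimal places exact.
    """
    return Decimal(min_vruntime) - Decimal(cpu0_min_vruntime)


cpu0 = as_signed32(58007)        # .nr_uninterruptible on cpu#0
cpu1 = as_signed32(4294909290)   # .nr_uninterruptible on cpu#1
print(cpu0, cpu1, cpu0 + cpu1)

# First cfs_rq of cpu#1 against the first cfs_rq of cpu#0.
print(spread0("582740.765855", "705800.818396"))
```

Read this way, the two per-CPU counters nearly cancel (their sum is 1), and
the computed offset matches the .spread0 value printed for that runqueue.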


cat /proc/meminfo
MemTotal:      2074264 kB
MemFree:         52624 kB
Buffers:         38704 kB
Cached:        1468288 kB
SwapCached:          0 kB
Active:         565544 kB
Inactive:      1378496 kB
HighTotal:     1179136 kB
HighFree:         1976 kB
LowTotal:       895128 kB
LowFree:         50648 kB
SwapTotal:     2698912 kB
SwapFree:      2698516 kB
Dirty:           38508 kB
Writeback:           0 kB
AnonPages:      437052 kB
Mapped:         116140 kB
Slab:            33768 kB
SReclaimable:    16652 kB
SUnreclaim:      17116 kB
PageTables:       3704 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   3736044 kB
Committed_AS:   817652 kB
VmallocTotal:   118776 kB
VmallocUsed:     56224 kB
VmallocChunk:    60916 kB
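
For reading the meminfo numbers above, a minimal Python sketch follows. It
parses a few of the pasted lines and derives how much memory is in use
outside the page cache and how much swap is occupied. The parser and
variable names are illustrative, and "used" is taken here in the simple
sense of MemTotal - MemFree - Buffers - Cached.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers (in kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


# Subset of the values reported above.
sample = """\
MemTotal:      2074264 kB
MemFree:         52624 kB
Buffers:         38704 kB
Cached:        1468288 kB
SwapTotal:     2698912 kB
SwapFree:      2698516 kB
"""

m = parse_meminfo(sample)
used_excl_cache = m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"]
swap_used = m["SwapTotal"] - m["SwapFree"]
cache_pct = 100.0 * m["Cached"] / m["MemTotal"]
print(used_excl_cache, "kB used outside buffers/cache")
print(swap_used, "kB of swap in use")
print(round(cache_pct, 1), "% of RAM is page cache")
```

With these figures most of the RAM is page cache, which fits a running
rsync, and swap is practically untouched.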

hardware: NVIDIA 7600GT, P5W DH Deluxe, Core2 Duo 6600 "Conroe", 2048
MB Ram (DDR2-800, 6400)
GNU/Gentoo hardened x86 2.6 profile, gcc-4.2.1 hardened; glibc 2.6.1
kernel: 2.6.23-rc8-git3 (+ cfs-devel v22)

9 instances of glxgears, 1 instance of openoffice, 1 instance of
screen, 1 window firefox, gnome-2.20 running
compiz-fusion running (via nvidia GLX_EXT_texture_from_pixmap) ,
nvidia-drivers-100.14.19-r10

/dev/dm-1 reiserfs    213G  175G   38G  83% /home
/dev/dm-2 reiserfs    213G   56G  157G  27% /bak    <== rsync -aur
--delete /home/ /bak/
(1st hdd on Intel ICH7R, 2nd hdd on Jmicron; both S-ATA2, ahci)

readahead-cache for both harddrives: 4 MB (via blockdev --setra), 16
MB internal harddisk cache (seagate 7200.10)
I/O-scheduler: deadline scheduler; partition-type: reiserfs v3.6;
mount-options: noatime,nodiratime,data=writeback,commit=120
SLAB: slub
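
As a side note on the readahead setting: blockdev --setra takes its value
in 512-byte sectors, so the "4 MB" above corresponds to a sector count
rather than a byte count. A tiny Python sketch of the conversion, assuming
"MB" here means 1024 * 1024 bytes (the function name is made up):

```python
SECTOR_BYTES = 512  # unit used by blockdev --setra / --getra


def readahead_sectors(megabytes):
    """Number of 512-byte sectors in the given readahead size."""
    return megabytes * 1024 * 1024 // SECTOR_BYTES


print(readahead_sectors(4))
```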

result: most of the time only 2-3 of the glxgears windows were running, the
rest were stuttering / halting (no motion)
mouse movement was pretty discontinuous, keyboard input was also delayed

Please keep up the good work !

Cheers
Mat

Thread overview: 14+ messages
2007-09-26 11:13 [patch/backport] CFS scheduler, -v22, for v2.6.23-rc8, v2.6.22.8, v2.6.21.7, v2.6.20.20 Ingo Molnar
2007-09-26 13:33 ` S.Çağlar Onur
2007-09-26 13:48   ` Ingo Molnar
2007-09-28 20:26 ` Alejandro Riveira Fernández
2007-09-29  2:20   ` Henrique de Moraes Holschuh
2007-09-29 16:51     ` Alejandro Riveira Fernández
2007-09-30 15:26   ` Ingo Molnar
2007-10-02 18:12     ` Alejandro Riveira Fernández
2007-10-26 18:56 ` [patch/backport] CFS scheduler, -v22, for v2.6.23-rc8, v2.6.22.8,v2.6.21.7, v2.6.20.20 Fortier,Vincent [Montreal]
  -- strict thread matches above, loose matches on Subject: below --
2007-09-29 11:11 [patch/backport] CFS scheduler, -v22, for v2.6.23-rc8, v2.6.22.8, v2.6.21.7, v2.6.20.20 Matthew
2007-09-30 15:43 ` Ingo Molnar
     [not found]   ` <e85b9d30709300943rded9801xddc6ca0a4773ff53@mail.gmail.com>
2007-09-30 17:27     ` Ingo Molnar
2007-09-30 23:54 ` Bill Davidsen
2007-10-02  9:06   ` Matthew
