Re: [git pull request] scheduler updates

From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [git pull request] scheduler updates
Date: Sat, 25 Aug 2007 19:23:19 +0200
Message-ID: <20070825172319.GA2080@elte.hu>
In-Reply-To: <20070824193758.GA6440@elte.hu>

* Ingo Molnar <mingo@elte.hu> wrote:

> We'll see - people can still readily tweak these values under 
> CONFIG_SCHED_DEBUG=y and give us feedback, and if there's enough 
> demand for very-finegrained scheduling (throughput be damned), we 
> could introduce yet another config option or enable the runtime 
> tunable unconditionally. (Or maybe the best option is Peter Zijlstra's 
> adaptive granularity idea, that gives the best of both worlds.)

hm, glxgears smoothness regresses with the de-HZ-ification change: with 
an increasing background load the turning of the wheels quickly becomes 
visually ugly - with small ruckles instead of smooth rotation.

The reason for that is that the 20 msec granularity on my testbox (which 
is a dual-core box, so the default 10msec turns into 20msec) turns into 
40, 60, 80, 100 msec 'observed latency' for glxgears as load increases 
to 2x, 3x, 4x etc - and a 100 msec pause in rotation is easily 
perceivable to the human eye (and brain). Before that the delay curve 
with increasing load was 4msec/8msec/12msec etc.

Due to the removal of the HZ dependency we have now upset the 
granularity picture anyway, so i believe we should do the adaptive 
granularity thing right now. That will aim for a 40msec task-observable 
latency, in a load-independent manner. (!) (This is an approach we 
couldn't even dream of with the previous, fixed-timeslice scheduler.)

The code is simple (and it is all in the slowpath); in essence it boils 
down to this new function:

 +static long
 +sched_granularity(struct cfs_rq *cfs_rq)
 +{
 +       unsigned int gran = sysctl_sched_latency;
 +       unsigned int nr = cfs_rq->nr_running;
 +
 +       if (nr > 1) {
 +               gran = gran/nr - gran/nr/nr;
 +               gran = max(gran, sysctl_sched_granularity);
 +       }
 +
 +       return gran;
 +}

IMO it is a good compromise between long slicing and short slicing: 
there are two values, one is the "CPU-bound task latency the scheduler 
aims for", the second one is a minimum granularity (to not do too many 
context-switches).

Peter and I tested this all day with various workloads, and extreme-load 
behavior has improved across the board, while the server benchmarks 
(which want less preemption) are still fine too. The glxgears ruckles 
are all gone.

If you do not disagree with this (it's pretty late in the game, with 
more than a month of the kernel cycle already spent), please pull the 
latest scheduler tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git

Find the shortlog further below. There are 3 commits in it: adaptive 
granularity, a subsequent cleanup, and a lockdep sysctl bug Peter 
noticed while hacking on this. (the bug was introduced with the initial 
CFS commits but nobody noticed because the lockdep sysctls are rarely 
used.)

The linecount increase is mostly due to the comments added to explain 
the "gran = lat/nr - lat/nr/nr" magic formula and due to the extra 
parameter.

Tested on 32-bit and 64-bit x86, and with a few make randconfig build 
tests too.

	Ingo

------------------>
Ingo Molnar (1):
      sched: cleanup, sched_granularity -> sched_min_granularity

Peter Zijlstra (2):
      sched: fix CONFIG_SCHED_DEBUG dependency of lockdep sysctls
      sched: adaptive scheduler granularity

 include/linux/sched.h |    3 +
 kernel/sched.c        |   16 ++++++----
 kernel/sched_fair.c   |   77 ++++++++++++++++++++++++++++++++++++++++++--------
 kernel/sysctl.c       |   33 ++++++++++++++-------
 4 files changed, 99 insertions(+), 30 deletions(-)
