public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <greg@kroah.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Jarkko Nikula <jhnikula@gmail.com>,
	Tony Lindgren <tony@atomide.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC patch] CFS fix place entity spread issue (v2)
Date: Mon, 19 Apr 2010 16:43:45 +0200
Message-ID: <1271688225.1488.237.camel@laptop>
In-Reply-To: <20100418131302.GA3614@Krystal>

On Sun, 2010-04-18 at 09:13 -0400, Mathieu Desnoyers wrote:

OK, so looking purely at the patch:

> Index: linux-2.6-lttng.git/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6-lttng.git.orig/kernel/sched_fair.c        2010-04-18 01:48:04.000000000 -0400
> +++ linux-2.6-lttng.git/kernel/sched_fair.c     2010-04-18 08:58:30.000000000 -0400
> @@ -738,6 +738,14 @@
>                 unsigned long thresh = sysctl_sched_latency;
>  
>                 /*
> +                * Place the woken up task relative to
> +                * min_vruntime + sysctl_sched_latency.
> +                * We must _never_ decrement min_vruntime, because the effect is

Nobody I could find decrements min_vruntime, and certainly
place_entity() doesn't change min_vruntime. So this is a totally
misguided comment.

> +                * that spread increases progressively under the Xorg workload.
> +                */
> +               vruntime += sysctl_sched_latency;

So in effect you change: 
  vruntime = max(vruntime, min_vruntime - thresh/2)
into
  vruntime = max(vruntime, min_vruntime + thresh/2)

in a non-obvious way and for an unclear reason.
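
To make the shift concrete, here is a rough userspace sketch of the
sleeper-placement arithmetic before and after the patch. It is not the
kernel code: place_old() and place_new() are made-up names, the latency
parameter stands in for sysctl_sched_latency, the halving assumes the
path where thresh >>= 1 is taken, and the later max against the task's
own vruntime is left out.

```c
#include <stdint.h>

/* Hypothetical helper: wakeup placement before the patch.
 * Starts at min_vruntime and subtracts the (halved) threshold. */
static uint64_t place_old(uint64_t min_vruntime, uint64_t latency)
{
	uint64_t thresh = latency >> 1;	/* assumes thresh >>= 1 is taken */

	return min_vruntime - thresh;
}

/* Hypothetical helper: wakeup placement with the patch applied. */
static uint64_t place_new(uint64_t min_vruntime, uint64_t latency)
{
	uint64_t vruntime = min_vruntime + latency;	/* the added line */
	uint64_t thresh = latency >> 1;

	vruntime -= thresh;

	/* The added clamp, written as a plain unsigned compare like
	 * max_t(u64, ...). It has no effect for small, unwrapped values. */
	if (vruntime < min_vruntime)
		vruntime = min_vruntime;

	return vruntime;
}
```

With min_vruntime = 1000 and latency = 20, place_old() yields 990 and
place_new() yields 1010, i.e. min_vruntime - thresh/2 versus
min_vruntime + thresh/2.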

> +               /*
>                  * Convert the sleeper threshold into virtual time.
>                  * SCHED_IDLE is a special sub-class.  We care about
>                  * fairness only relative to other SCHED_IDLE tasks,
> @@ -755,6 +763,9 @@
>                         thresh >>= 1;
>  
>                 vruntime -= thresh;
> +
> +               /* ensure min_vruntime never go backwards. */
> +               vruntime = max_t(u64, vruntime, cfs_rq->min_vruntime);

So the comment doesn't match the code, nor is it correct.

The code tries to implement clipping vruntime to min_vruntime, not
clipping min_vruntime, but then botches it by not taking wrap-around
into account.
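
For reference, the usual way to compare two vruntime values so that
wrap-around is handled is to look at the sign of their difference
instead of comparing the raw u64 values, which is what max_vruntime()
in sched_fair.c does. A userspace sketch, with uint64_t/int64_t
standing in for u64/s64; naive_max() and wrap_safe_max() are
hypothetical names used only for this illustration:

```c
#include <stdint.h>

/* Plain unsigned max, equivalent in effect to max_t(u64, a, b).
 * Picks the wrong value once one of the two has wrapped past zero. */
static uint64_t naive_max(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Wrap-aware max: treat the difference as signed, so a value that is
 * slightly ahead still wins even if it has wrapped around. */
static uint64_t wrap_safe_max(uint64_t min_vruntime, uint64_t vruntime)
{
	int64_t delta = (int64_t)(vruntime - min_vruntime);

	if (delta > 0)
		min_vruntime = vruntime;

	return min_vruntime;
}
```

Take min_vruntime = UINT64_MAX - 5 and a vruntime of 10 that has wrapped
and is really 16 units ahead: naive_max() returns UINT64_MAX - 5, while
wrap_safe_max() returns 10.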

Now, I know why your patch helps you (it's in effect similar to what
START_DEBIT does for fork()), but getting the wakeup-preemption to do
something nice along with it is the hard part.

The whole perfectly fair scheduling thing is more-or-less doable
(dealing with tasks dying with !0-lag gets interesting, you'd have to
start adjusting global-timeline like things for that). But the thing is
that it makes for rather poor interactive behaviour.

Letting a task that sleeps long and runs short preempt heavier tasks
generally works well. Also, there are a number of apps that get a nice
boost from getting preempted before they can actually block on a
(read-like) system call. That saves a whole scheduler round-trip on the
wakeup side, so ping-pong like tasks love this too.

And then there is the whole signal delivery muck..

