public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <greg@kroah.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Jarkko Nikula <jhnikula@gmail.com>,
	Tony Lindgren <tony@atomide.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC patch] CFS fix place entity spread issue (v2)
Date: Mon, 19 Apr 2010 11:25:35 +0200	[thread overview]
Message-ID: <1271669135.1674.615.camel@laptop> (raw)
In-Reply-To: <20100418131302.GA3614@Krystal>

On Sun, 2010-04-18 at 09:13 -0400, Mathieu Desnoyers wrote:

> Detailed explanation in my ELC2010 presentation at:
> 
> http://www.efficios.com/elc2010

So what happened to comprehensive changelogs?

> I've torn the CFS scheduler apart in the past days to figure out what is causing
> this weird behavior, and the culprit seems to be place_entity(). The problem
> appears to be the cumulative effect of letting the min_vruntime go backward when
> putting sleepers back on the runqueue.

Right, that should not happen, ever, min_vruntime should be monotonic.
Your explanation fails to mention how exactly this happens.

>  It lets the vruntime spread grow to
> "entertaining" values (it is supposed to be in the 5ms range, not 18 minutes!).
> 
> In the original code, a max between the sched entity vruntime and the calculated
> vruntime was supposed to "ensure that the thread time never go backward". But I
> don't see why we even care about that.

Fairness, a task's vruntime going backwards means it can obtain more
cputime than a while(1) loop would get, not a good thing.

>  The key point is that the min_vruntime
> of the runqueue should not go backward.

That is certainly important too.

> I propose to fix this by calculating the relative offset from
> min_vruntime + sysctl_sched_latency rather than directly from min_vruntime. I
> also ensure that the value never goes below min_vruntime.

I'm not sure that's a good idea, that'll end up being a 'fairer'
scheduler in the strict sense, but I suspect it will break the wakeup
preemption model we have.

> Under the Xorg workload, moving a few windows around and starting firefox while
> executing the wakeup-latency.c program (program waking up every 10ms and
> reporting wakeup latency), this patch brings worse latency from 60ms down to
> 12ms. Even doing a kernel compilation at the same time, the worse latency stays
> around 20ms now.

That seems nice enough. But really, this changelog doesn't explain the
cause at all.

/me goes try and figure out where this min_vruntime goes backwards.

So update_min_vruntime() is the sole place where we change min_vruntime,
it is called from update_curr() where we add runtime to the task we're
currently servicing, and dequeue_entity() where we remove someone from
the tree.

These two sites are the only places where min_vruntime can change
because only adding service to the leftmost task, or removal thereof can
make the min_vruntime go forward.

So update_min_vruntime():

 - takes the smaller vruntime between the current task and the leftmost
tree entry (current is not in the tree, and we can pass beyond the
leftmost waiting task due to limited preemption and minimum service
windows).

 - set min_vruntime to the larger between the existing min_vruntime and
the smallest vruntime.

So colour me confused if I'm not seeing min_vruntime going backwards...




  parent reply	other threads:[~2010-04-19  9:25 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-18 13:13 [RFC patch] CFS fix place entity spread issue (v2) Mathieu Desnoyers
2010-04-18 20:21 ` Linus Torvalds
2010-04-18 22:59   ` Mathieu Desnoyers
2010-04-18 23:23     ` Linus Torvalds
2010-04-19  9:25 ` Peter Zijlstra [this message]
2010-04-19 14:06   ` Mathieu Desnoyers
2010-04-19 14:43     ` Peter Zijlstra
2010-04-19 14:43 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1271669135.1674.615.camel@laptop \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=efault@gmx.de \
    --cc=greg@kroah.com \
    --cc=jhnikula@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@elte.hu \
    --cc=rostedt@goodmis.org \
    --cc=tony@atomide.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox