public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>,
	Antoine Martin <antoine@nagafix.co.uk>,
	Satyam Sharma <satyam.sharma@gmail.com>,
	Linux Kernel Development <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: CFS: some bad numbers with Java/database threading [FIXED]
Date: Wed, 19 Sep 2007 22:26:32 +0200	[thread overview]
Message-ID: <20070919202632.GA30474@elte.hu> (raw)
In-Reply-To: <20070919195633.GA23595@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> ok, we can do that.
> 
> the O(1) implementation of yield() was pretty arbitrary: it did not 
> move it last on the same priority level - it only did it within the 
> active array. So expired tasks (such as CPU hogs) would come _after_ a 
> yield()-ing task.
> 
> so the yield() implementation was so much tied to the data structures 
> of the O(1) scheduler that it was impossible to fully emulate it in 
> CFS.
> 
> in CFS we dont have a per-nice-level rbtree, so we cannot move it dead 
> last within the same priority group - but we can move it dead last in 
> the whole tree. (then they'd be put even after nice +19 tasks.) People 
> might complain about _that_.
> 
> another practical problem is that this will break certain desktop apps 
> that do calls to yield() [some firefox plugins do that, some 3D apps 
> do that, etc.] but they dont expect to be moved 'very late' into the 
> queue - they expect the O(1) scheduler's behavior of being delayed "a 
> bit". (That's why i added the yield-granularity tunable.)
> 
> we can make yield super-agressive, that is pretty much the only sane 
> (because well-defined) thing to do (besides turning yield into a NOP), 
> but there will be lots of regression reports about lost interactivity 
> during load. sched_yield() is a mortally broken API. "fix the app" 
> would be the answer, but still there will be lots of complaints.

find below the fix that puts yielding tasks to the rightmost position in 
the rbtree. I have not tested it extensively yet, but it appears to work 
on x86. (i tested yield using interactive tasks and they get hurt now 
under load - but this would be expected.)

	Ingo

---------------------->
Subject: sched: make yield more agressive
From: Ingo Molnar <mingo@elte.hu>

make sys_sched_yield() more agressive, by moving the yielding task
to the last position in the rbtree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c      |    5 +----
 kernel/sched_fair.c |   39 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 35 insertions(+), 9 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4550,10 +4550,7 @@ asmlinkage long sys_sched_yield(void)
 	struct rq *rq = this_rq_lock();
 
 	schedstat_inc(rq, yld_cnt);
-	if (unlikely(rq->nr_running == 1))
-		schedstat_inc(rq, yld_act_empty);
-	else
-		current->sched_class->yield_task(rq, current);
+	current->sched_class->yield_task(rq, current);
 
 	/*
 	 * Since we are going to call schedule() anyway, there's
Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -902,14 +902,43 @@ static void dequeue_task_fair(struct rq 
 static void yield_task_fair(struct rq *rq, struct task_struct *p)
 {
 	struct cfs_rq *cfs_rq = task_cfs_rq(p);
+	struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
+	struct sched_entity *rightmost, *se = &p->se;
+	struct rb_node *parent;
 
-	__update_rq_clock(rq);
 	/*
-	 * Dequeue and enqueue the task to update its
-	 * position within the tree:
+	 * Are we the only task in the tree?
 	 */
-	dequeue_entity(cfs_rq, &p->se, 0);
-	enqueue_entity(cfs_rq, &p->se, 0);
+	if (unlikely(cfs_rq->nr_running == 1))
+		return;
+	/*
+	 * Find the rightmost entry in the rbtree:
+	 */
+	do {
+		parent = *link;
+		link = &parent->rb_right;
+	} while (*link);
+
+	rightmost = rb_entry(parent, struct sched_entity, run_node);
+	/*
+	 * Already in the rightmost position?
+	 */
+	if (unlikely(rightmost == se))
+		return;
+
+	/*
+	 * Minimally necessary key value to be last in the tree:
+	 */
+	se->fair_key = rightmost->fair_key + 1;
+
+	if (cfs_rq->rb_leftmost == &se->run_node)
+		cfs_rq->rb_leftmost = rb_next(&se->run_node);
+	/*
+	 * Relink the task to the rightmost position:
+	 */
+	rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
+	rb_link_node(&se->run_node, parent, link);
+	rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
 }
 
 /*


  reply	other threads:[~2007-09-19 20:27 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-12 23:10 CFS: some bad numbers with Java/database threading Antoine Martin
2007-09-13  7:18 ` David Schwartz
2007-09-12 23:33   ` Nick Piggin
2007-09-13 19:02     ` Antoine Martin
2007-09-13 21:47       ` David Schwartz
2007-09-13 11:24 ` CFS: " Ingo Molnar
2007-09-14  8:32   ` Ingo Molnar
2007-09-14 10:06     ` Satyam Sharma
2007-09-14 15:25       ` CFS: some bad numbers with Java/database threading [FIXED] Antoine Martin
2007-09-14 15:32         ` Ingo Molnar
2007-09-18 17:00           ` Chuck Ebbert
2007-09-18 22:46             ` Ingo Molnar
2007-09-18 23:02               ` Chuck Ebbert
2007-09-19 18:45                 ` David Schwartz
2007-09-19 19:48                   ` Chris Friesen
2007-09-19 22:56                     ` David Schwartz
2007-09-19 23:05                       ` David Schwartz
2007-09-19 23:52                         ` David Schwartz
2007-09-19 19:18                 ` Ingo Molnar
2007-09-19 19:39                   ` Linus Torvalds
2007-09-19 19:56                     ` Ingo Molnar
2007-09-19 20:26                       ` Ingo Molnar [this message]
2007-09-19 20:28                       ` Linus Torvalds
2007-09-19 21:41                         ` Ingo Molnar
2007-09-19 21:49                           ` Ingo Molnar
2007-09-19 21:58                           ` Peter Zijlstra
2007-09-26  1:46                           ` CFS: new java yield graphs Antoine Martin
2007-09-27  8:35                             ` Ingo Molnar
2007-09-19 20:00                   ` CFS: some bad numbers with Java/database threading [FIXED] Chris Friesen
2007-09-14 16:01       ` CFS: some bad numbers with Java/database threading Satyam Sharma
2007-09-14 16:08         ` Satyam Sharma
2007-09-17 12:17         ` Antoine Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070919202632.GA30474@elte.hu \
    --to=mingo@elte.hu \
    --cc=a.p.zijlstra@chello.nl \
    --cc=antoine@nagafix.co.uk \
    --cc=cebbert@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=satyam.sharma@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox