All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>,
	Antoine Martin <antoine@nagafix.co.uk>,
	Satyam Sharma <satyam.sharma@gmail.com>,
	Linux Kernel Development <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: CFS: some bad numbers with Java/database threading [FIXED]
Date: Wed, 19 Sep 2007 22:26:32 +0200	[thread overview]
Message-ID: <20070919202632.GA30474@elte.hu> (raw)
In-Reply-To: <20070919195633.GA23595@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> ok, we can do that.
> 
> the O(1) implementation of yield() was pretty arbitrary: it did not 
> move it last on the same priority level - it only did it within the 
> active array. So expired tasks (such as CPU hogs) would come _after_ a 
> yield()-ing task.
> 
> so the yield() implementation was so much tied to the data structures 
> of the O(1) scheduler that it was impossible to fully emulate it in 
> CFS.
> 
> in CFS we dont have a per-nice-level rbtree, so we cannot move it dead 
> last within the same priority group - but we can move it dead last in 
> the whole tree. (then they'd be put even after nice +19 tasks.) People 
> might complain about _that_.
> 
> another practical problem is that this will break certain desktop apps 
> that do calls to yield() [some firefox plugins do that, some 3D apps 
> do that, etc.] but they dont expect to be moved 'very late' into the 
> queue - they expect the O(1) scheduler's behavior of being delayed "a 
> bit". (That's why i added the yield-granularity tunable.)
> 
> we can make yield super-agressive, that is pretty much the only sane 
> (because well-defined) thing to do (besides turning yield into a NOP), 
> but there will be lots of regression reports about lost interactivity 
> during load. sched_yield() is a mortally broken API. "fix the app" 
> would be the answer, but still there will be lots of complaints.

find below the fix that puts yielding tasks to the rightmost position in 
the rbtree. I have not tested it extensively yet, but it appears to work 
on x86. (i tested yield using interactive tasks and they get hurt now 
under load - but this would be expected.)

	Ingo

---------------------->
Subject: sched: make yield more agressive
From: Ingo Molnar <mingo@elte.hu>

make sys_sched_yield() more agressive, by moving the yielding task
to the last position in the rbtree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c      |    5 +----
 kernel/sched_fair.c |   39 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 35 insertions(+), 9 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4550,10 +4550,7 @@ asmlinkage long sys_sched_yield(void)
 	struct rq *rq = this_rq_lock();
 
 	schedstat_inc(rq, yld_cnt);
-	if (unlikely(rq->nr_running == 1))
-		schedstat_inc(rq, yld_act_empty);
-	else
-		current->sched_class->yield_task(rq, current);
+	current->sched_class->yield_task(rq, current);
 
 	/*
 	 * Since we are going to call schedule() anyway, there's
Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -902,14 +902,43 @@ static void dequeue_task_fair(struct rq 
 static void yield_task_fair(struct rq *rq, struct task_struct *p)
 {
 	struct cfs_rq *cfs_rq = task_cfs_rq(p);
+	struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
+	struct sched_entity *rightmost, *se = &p->se;
+	struct rb_node *parent;
 
-	__update_rq_clock(rq);
 	/*
-	 * Dequeue and enqueue the task to update its
-	 * position within the tree:
+	 * Are we the only task in the tree?
 	 */
-	dequeue_entity(cfs_rq, &p->se, 0);
-	enqueue_entity(cfs_rq, &p->se, 0);
+	if (unlikely(cfs_rq->nr_running == 1))
+		return;
+	/*
+	 * Find the rightmost entry in the rbtree:
+	 */
+	do {
+		parent = *link;
+		link = &parent->rb_right;
+	} while (*link);
+
+	rightmost = rb_entry(parent, struct sched_entity, run_node);
+	/*
+	 * Already in the rightmost position?
+	 */
+	if (unlikely(rightmost == se))
+		return;
+
+	/*
+	 * Minimally necessary key value to be last in the tree:
+	 */
+	se->fair_key = rightmost->fair_key + 1;
+
+	if (cfs_rq->rb_leftmost == &se->run_node)
+		cfs_rq->rb_leftmost = rb_next(&se->run_node);
+	/*
+	 * Relink the task to the rightmost position:
+	 */
+	rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
+	rb_link_node(&se->run_node, parent, link);
+	rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
 }
 
 /*


  reply	other threads:[~2007-09-19 20:27 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-12 23:10 CFS: some bad numbers with Java/database threading Antoine Martin
2007-09-13  7:18 ` David Schwartz
2007-09-12 23:33   ` Nick Piggin
2007-09-13 19:02     ` Antoine Martin
2007-09-13 21:47       ` David Schwartz
2007-09-13 11:24 ` CFS: " Ingo Molnar
2007-09-14  8:32   ` Ingo Molnar
2007-09-14 10:06     ` Satyam Sharma
2007-09-14 15:25       ` CFS: some bad numbers with Java/database threading [FIXED] Antoine Martin
2007-09-14 15:32         ` Ingo Molnar
2007-09-18 17:00           ` Chuck Ebbert
2007-09-18 22:46             ` Ingo Molnar
2007-09-18 23:02               ` Chuck Ebbert
2007-09-19 18:45                 ` David Schwartz
2007-09-19 19:48                   ` Chris Friesen
2007-09-19 22:56                     ` David Schwartz
2007-09-19 23:05                       ` David Schwartz
2007-09-19 23:52                         ` David Schwartz
2007-09-19 19:18                 ` Ingo Molnar
2007-09-19 19:39                   ` Linus Torvalds
2007-09-19 19:56                     ` Ingo Molnar
2007-09-19 20:26                       ` Ingo Molnar [this message]
2007-09-19 20:28                       ` Linus Torvalds
2007-09-19 21:41                         ` Ingo Molnar
2007-09-19 21:49                           ` Ingo Molnar
2007-09-19 21:58                           ` Peter Zijlstra
2007-09-26  1:46                           ` CFS: new java yield graphs Antoine Martin
2007-09-27  8:35                             ` Ingo Molnar
2007-09-19 20:00                   ` CFS: some bad numbers with Java/database threading [FIXED] Chris Friesen
2007-09-14 16:01       ` CFS: some bad numbers with Java/database threading Satyam Sharma
2007-09-14 16:08         ` Satyam Sharma
2007-09-17 12:17         ` Antoine Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070919202632.GA30474@elte.hu \
    --to=mingo@elte.hu \
    --cc=a.p.zijlstra@chello.nl \
    --cc=antoine@nagafix.co.uk \
    --cc=cebbert@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=satyam.sharma@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.