public inbox for linux-kernel@vger.kernel.org
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>, Linus Torvalds <torvalds@osdl.org>,
	Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [patch, 2.6.10-rc2] sched: fix ->nr_uninterruptible handling bugs
Date: Wed, 17 Nov 2004 10:03:12 +1100
Message-ID: <419A8730.8030108@yahoo.com.au>
In-Reply-To: <419A83FB.2080308@yahoo.com.au>

[-- Attachment #1: Type: text/plain, Size: 1086 bytes --]

Nick Piggin wrote:
> Ingo Molnar wrote:
> 
>> PREEMPT_RT on SMP systems triggered weird (very high) load average
>> values rather easily, which turned out to be a mainline kernel
>> ->nr_uninterruptible handling bug in try_to_wake_up().
>>
>> the following code:
>>
>>         if (old_state == TASK_UNINTERRUPTIBLE) {
>>                 old_rq->nr_uninterruptible--;
>>
>> can execute with old_rq != rq, and hence update
>> ->nr_uninterruptible without old_rq's lock held. Given a
>> sufficiently concurrent preemption workload the count can get out of
>> whack and updates might get lost, permanently skewing the global
>> count. Nothing except the load-average uses nr_uninterruptible() so this
>> condition can go unnoticed quite easily.
>>
> 
> Hi Ingo,
> Yes you're right.
> 
> I have another idea. Revert to the old code, then just transfer
> the nr_uninterruptible count when migrating a task. That way, the
> rq's nr_uninterruptible field is always a measure of the number of
> uninterruptible tasks on it. What do you think?

Something like this:

[-- Attachment #2: sched-nr_unint-fix.patch --]
[-- Type: text/x-patch, Size: 1499 bytes --]




---

 linux-2.6-npiggin/kernel/sched.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff -puN kernel/sched.c~sched-nr_unint-fix kernel/sched.c
--- linux-2.6/kernel/sched.c~sched-nr_unint-fix	2004-11-17 09:54:36.000000000 +1100
+++ linux-2.6-npiggin/kernel/sched.c	2004-11-17 10:01:49.000000000 +1100
@@ -981,14 +981,14 @@ static int try_to_wake_up(task_t * p, un
 	int cpu, this_cpu, success = 0;
 	unsigned long flags;
 	long old_state;
-	runqueue_t *rq, *old_rq;
+	runqueue_t *rq;
 #ifdef CONFIG_SMP
 	unsigned long load, this_load;
 	struct sched_domain *sd;
 	int new_cpu;
 #endif
 
-	old_rq = rq = task_rq_lock(p, &flags);
+	rq = task_rq_lock(p, &flags);
 	schedstat_inc(rq, ttwu_cnt);
 	old_state = p->state;
 	if (!(old_state & state))
@@ -1083,7 +1083,7 @@ out_set_cpu:
 out_activate:
 #endif /* CONFIG_SMP */
 	if (old_state == TASK_UNINTERRUPTIBLE) {
-		old_rq->nr_uninterruptible--;
+		rq->nr_uninterruptible--;
 		/*
 		 * Tasks on involuntary sleep don't earn
 		 * sleep_avg beyond just interactive state.
@@ -1608,8 +1608,12 @@ void pull_task(runqueue_t *src_rq, prio_
 {
 	dequeue_task(p, src_array);
 	src_rq->nr_running--;
-	set_task_cpu(p, this_cpu);
 	this_rq->nr_running++;
+	if (p->state == TASK_UNINTERRUPTIBLE) {
+		src_rq->nr_uninterruptible--;
+		this_rq->nr_uninterruptible++;
+	}
+	set_task_cpu(p, this_cpu);
 	enqueue_task(p, this_array);
 	p->timestamp = (p->timestamp - src_rq->timestamp_last_tick)
 				+ this_rq->timestamp_last_tick;

_
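
The accounting idea behind the patch can be modelled outside the kernel. The
following is only a user-space sketch, not scheduler code: the model_rq and
model_task types and the model_sleep(), model_migrate() and model_wake()
helpers are hypothetical names invented for illustration, and locking is
left out on the assumption that the caller already holds whatever locks
protect both queues. It shows the invariant being aimed for: if the count
moves with the task at migration time, the wakeup path only ever touches
the counter of the queue the task is currently on.

```c
#include <assert.h>

/* Hypothetical user-space model of per-runqueue accounting; not kernel code. */
enum model_state { MODEL_RUNNING, MODEL_UNINTERRUPTIBLE };

struct model_rq {
	long nr_running;
	long nr_uninterruptible;
};

struct model_task {
	enum model_state state;
	struct model_rq *rq;	/* queue the task is currently accounted to */
};

/* The task blocks in uninterruptible sleep on its current queue. */
void model_sleep(struct model_task *t)
{
	assert(t->state == MODEL_RUNNING);
	t->state = MODEL_UNINTERRUPTIBLE;
	t->rq->nr_running--;
	t->rq->nr_uninterruptible++;
}

/*
 * Move the task to dst and carry its contribution along with it.
 * Assumption: the caller holds the locks for both queues.
 */
void model_migrate(struct model_task *t, struct model_rq *dst)
{
	struct model_rq *src = t->rq;

	if (t->state == MODEL_UNINTERRUPTIBLE) {
		src->nr_uninterruptible--;
		dst->nr_uninterruptible++;
	} else {
		src->nr_running--;
		dst->nr_running++;
	}
	t->rq = dst;
}

/* Wake the task; only the counters of its current queue are updated. */
void model_wake(struct model_task *t)
{
	assert(t->state == MODEL_UNINTERRUPTIBLE);
	t->state = MODEL_RUNNING;
	t->rq->nr_uninterruptible--;
	t->rq->nr_running++;
}
```

In this model, a task that sleeps on one queue, is migrated, and is then
woken leaves both queues' nr_uninterruptible at zero, and each per-queue
value stays non-negative throughout.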

