[PATCH] sched: smpnice prevent integer arithmetic wrap problems

From: Peter Williams <pwil3058@bigpond.net.au>
To: "Siddha, Suresh B" <suresh.b.siddha@intel.com>,
	Andrew Morton <akpm@osdl.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Con Kolivas <kernel@kolivas.org>,
	"Chen, Kenneth W" <kenneth.w.chen@intel.com>,
	Mike Galbraith <efault@gmx.de>,
	linux-kernel@vger.kernel.org
Subject: [PATCH] sched: smpnice prevent integer arithmetic wrap problems
Date: Mon, 27 Mar 2006 10:43:51 +1100
Message-ID: <44272737.50309@bigpond.net.au>
In-Reply-To: <442722C4.4030409@bigpond.net.au>

[-- Attachment #1: Type: text/plain, Size: 2425 bytes --]

Peter Williams wrote:
> Siddha, Suresh B wrote:
>> more issues with smpnice patch...
>>
>> a) consider a 4-way system (simple SMP system with no HT and cores)
>> scenario where a high priority task (nice -20) is running on P0 and two
>> normal priority tasks running on P1. load balance with smp nice code
>> will never be able to detect an imbalance and hence will never move
>> one of the normal priority tasks on P1 to idle cpus P2 or P3.
> 
> Fix already sent.
> 
>>
>> b) smpnice seems to break this patch..
>>
>> [PATCH] sched: allow the load to grow upto its cpu_power
>> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c117f1b4d14380baeed9c883f765ee023da8761 
>>
>>
>> example scenario for this case: consider a numa system with two nodes,
>> each node containing four processors. if there are two processes in
>> node-0 and with node-1 being completely idle, your patch will move one
>> of those processes to node-1 whereas the previous behavior will retain
>> those two processes in node-0..
>> (in this case, in your code max_load will be less than
>> busiest_load_per_task)
> 
> I think that the patch I sent to address a) above will also fix this 
> problem as find_busiest_queue() will no longer find node-0 as the 
> busiest group unless both of the processes in node-0 are on the same 
> CPU.  This is because it now only considers groups that have at least 
> one CPU with more than one running task as candidates for being the 
> busiest group.
> 
> Implicit in this is the assumption that it's OK to move one of the tasks 
> from node-0 to node-1 if they're both on the same CPU within node-0.
> 
> Could you confirm this is OK?

It looks like my coffee was slow kicking in this morning :-)

When I looked at the code more carefully I realized that your 
suggestion re comparing avg_load and busiest_load_per_task is needed to 
protect the calculation of max_pull from integer arithmetic wrapping 
problems.  There was a big clue to this need in the comment above the 
calculation of max_pull that I failed to read :-(
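
The wrapping problem mentioned above can be illustrated with a minimal
standalone sketch.  This is not the kernel code: the helper names
unguarded_diff(), guarded_diff() and check_wrap_examples() are
hypothetical, the load values are made up, and it is only assumed here
that the max_pull calculation involves a difference of two unsigned long
loads.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not kernel code): plain difference of two unsigned loads. */
static unsigned long unguarded_diff(unsigned long a, unsigned long b)
{
	/* Unsigned arithmetic is modular: if b > a this wraps to a huge value. */
	return a - b;
}

/* Hypothetical helper: bail out (return 0) before the subtraction can wrap. */
static unsigned long guarded_diff(unsigned long a, unsigned long b)
{
	if (a <= b)
		return 0;
	return a - b;
}

/* Small self-check using made-up load values. */
static void check_wrap_examples(void)
{
	unsigned long avg_load = 64, busiest_load_per_task = 128;

	/* 64 - 128 wraps to ULONG_MAX - 63 instead of going negative. */
	assert(unguarded_diff(avg_load, busiest_load_per_task) == ULONG_MAX - 63);
	/* With the guard, the "would be negative" case is treated as nothing to pull. */
	assert(guarded_diff(avg_load, busiest_load_per_task) == 0);
	/* When the first operand is larger, both versions agree. */
	assert(guarded_diff(192, busiest_load_per_task) == 64);
}
```

In the attached patch, the early "goto out_balanced" when avg_load <=
busiest_load_per_task plays the role of the guard in this sketch.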

Anyway the attached patch should fix the problem.  It should be applied 
on top of the other patch.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

[-- Attachment #2: smpnice-allow-load-up-to-cpu_power --]
[-- Type: text/plain, Size: 890 bytes --]

Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c	2006-03-25 13:56:37.000000000 +1100
+++ MM-2.6.X/kernel/sched.c	2006-03-27 10:15:38.000000000 +1100
@@ -2161,7 +2161,7 @@ find_busiest_group(struct sched_domain *
 		group = group->next;
 	} while (group != sd->groups);
 
-	if (!busiest || this_load >= max_load || busiest_nr_running <= 1)
+	if (!busiest || this_load >= max_load)
 		goto out_balanced;
 
 	avg_load = (SCHED_LOAD_SCALE * total_load) / total_pwr;
@@ -2171,6 +2171,9 @@ find_busiest_group(struct sched_domain *
 		goto out_balanced;
 
 	busiest_load_per_task /= busiest_nr_running;
+
+	if (avg_load <= busiest_load_per_task)
+		goto out_balanced;
 	/*
 	 * We're trying to get all the cpus to the average_load, so we don't
 	 * want to push ourselves above the average load, nor do we wish to
