public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <piggin@cyberone.com.au>
To: colpatch@us.ibm.com
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Michael Hohnbaum <hohnbaum@us.ibm.com>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [rfc][patch] kernel/sched.c oddness?
Date: Thu, 03 Oct 2002 10:06:21 +1000	[thread overview]
Message-ID: <3D9B89FD.1060400@cyberone.com.au> (raw)
In-Reply-To: 3D9B3DC4.4030209@us.ibm.com

[-- Attachment #1: Type: text/plain, Size: 3007 bytes --]

Hi Matt,
I think your patch is good except that it would be fine to use a /= 2 as the
compiler will turn this into a shift, and imbalance is not really valid (or
used) if busiest == NULL. Actually there are a number of things (I think)
still wrong with this! I posted a patch to Ingo - I guess I should have
cc'ed the list - we should combine our efforts! Here is my explanation and
a new patch including both our fixes.

Regards,
Nick

Nick Piggin wrote:

> Just a quick follow up
> * using imbalance (not divided by 2) instead of nr_running in 
> is_unbalanced
>  would save a multiply by 3
> * it is a more accurate with a small number of processes due to no 
> divisions
> * perhaps the last test should actually be
>  if(busiest->nr_running <= nr_running)
>
> Regards,
> Nick
>
> Nick Piggin wrote:
>
>> Dear Ingo,
>> I apologise in advance if the current behaviour of the scheduler is 
>> the result
>> of tuning / special casing you did in the scheduler I didn't follow its
>> development closely. However, I noticed on my 2xSMP system that quite
>> unbalanced loads weren't getting even CPU time best example - 3 
>> processes in
>> busywait loops - one would get 100% of one cpu while two would get 
>> 50% each of
>> the other.
>>
>> I think find_busiest_queue might have a few problems:
>> * the "imbalance" value divided by 2 loses meaning and precision esp. 
>> with
>>  small number of processes
>>
>> * in the 25% imbalance calculation presumably the + should be a *
>>
>> * the 25% calculation uses the divided imbalance value which is 
>> inaccurate
>>
>> * i think the 25% imbalance calculation should use a quarter of 
>> max_load, not
>>  three quarters.
>>
>> * the second test (after the lock) isn't very good, its cheaper than 
>> the mul
>>  but having battled contention and claimed the lock it would be a 
>> shame to
>>  not do anything with it!
>>
>> Now top says the patch has fixed up the balancing but perhaps some of 
>> these
>> aren't bugs - perhaps they need a comment because they are a bit 
>> subtle! If
>> they are, please recommend the patch to Linus.
>>
>> Regards,
>> Nick
>>
>>
>


Matthew Dobson wrote:

> This snippet of code appears wrong...  Either that, or the 
> accompanying comment is wrong?
>
> from kernel/sched.c::find_busiest_queue():
>
> | *imbalance = (max_load - nr_running) / 2;
> |
> | /* It needs an at least ~25% imbalance to trigger balancing. */
> | if (!idle && (*imbalance < (max_load + 3)/4)) {
> |     busiest = NULL;
> |     goto out;
> | }
>
> The comment says 25% imbalance, but the code is really checking for a 
> ~50% imbalance.  The attatched patch moves the division by two to the 
> pull_task function where the imbalance number is actually used.  This 
> patch makes the code match the comment, and divides the imbalance by 
> two where it is needed.
>
> Please let me know if I've misinterpreted what this code is supposed 
> to be doing, -or- if we really want the comment to match the current 
> code.
>
> Cheers!
>
> -Matt



[-- Attachment #2: sched-balance2.patch --]
[-- Type: text/plain, Size: 1172 bytes --]

--- linux-2.5/kernel/sched.c.old	2002-10-02 23:14:12.000000000 +1000
+++ linux-2.5/kernel/sched.c	2002-10-03 10:03:41.000000000 +1000
@@ -689,10 +689,10 @@
 	if (likely(!busiest))
 		goto out;
 
-	*imbalance = (max_load - nr_running) / 2;
+	*imbalance = max_load - nr_running;
 
 	/* It needs an at least ~25% imbalance to trigger balancing. */
-	if (!idle && (*imbalance < (max_load + 3)/4)) {
+	if (!idle && (*imbalance*4 < max_load)) {
 		busiest = NULL;
 		goto out;
 	}
@@ -702,10 +702,11 @@
 	 * Make sure nothing changed since we checked the
 	 * runqueue length.
 	 */
-	if (busiest->nr_running <= nr_running + 1) {
+	if (busiest->nr_running <= nr_running) {
 		spin_unlock(&busiest->lock);
 		busiest = NULL;
 	}
+
 out:
 	return busiest;
 }
@@ -748,7 +749,12 @@
 	busiest = find_busiest_queue(this_rq, this_cpu, idle, &imbalance);
 	if (!busiest)
 		goto out;
-
+	/*
+	 * We only want to steal a number of tasks equal to 1/2 the imbalance,
+	 * otherwise, we'll just shift the imbalance to the new queue.
+	 */
+	imbalance /= 2;
+	
 	/*
 	 * We first consider expired tasks. Those will likely not be
 	 * executed in the near future, and they are most likely to

  reply	other threads:[~2002-10-03  0:05 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-10-02 18:41 [rfc][patch] kernel/sched.c oddness? Matthew Dobson
2002-10-03  0:06 ` Nick Piggin [this message]
2002-10-03  0:30   ` Matthew Dobson
2002-10-03  6:54   ` Ingo Molnar
2002-10-03  8:32     ` Nick Piggin
2002-10-03 21:15     ` Matthew Dobson
2002-10-04  7:46       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3D9B89FD.1060400@cyberone.com.au \
    --to=piggin@cyberone.com.au \
    --cc=colpatch@us.ibm.com \
    --cc=hohnbaum@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox