From: Nick Piggin <piggin@cyberone.com.au>
To: Anton Blanchard <anton@samba.org>
Cc: Rick Lindsley <ricklind@us.ibm.com>,
"Martin J. Bligh" <mbligh@aracnet.com>,
akpm@osdl.org, linux-kernel@vger.kernel.org, dvhltc@us.ibm.com
Subject: Re: [PATCH] Load balancing problem in 2.6.2-mm1
Date: Sun, 08 Feb 2004 15:05:06 +1100
Message-ID: <4025B572.9040904@cyberone.com.au>
In-Reply-To: <20040208035721.GY19011@krispykreme>
[-- Attachment #1: Type: text/plain, Size: 346 bytes --]
Anton Blanchard wrote:
>Hi,
>
>>Yeah, it's because you have a lot of cpus, so the average is still
>>small. You also need something like
>>
>>if (*imbalance == 0 && max_load - this_load > SCHED_LOAD_SCALE)
>>	*imbalance = 1;
>
>OK I'll give that a try.

Can you try this patch instead pretty please ;)
Thanks
[-- Attachment #2: rollup.patch --]
[-- Type: text/plain, Size: 2941 bytes --]
linux-2.6-npiggin/kernel/sched.c | 52 ++++++++++++++++++++++++---------------
1 files changed, 32 insertions(+), 20 deletions(-)
diff -puN kernel/sched.c~rollup kernel/sched.c
--- linux-2.6/kernel/sched.c~rollup 2004-02-08 15:03:53.000000000 +1100
+++ linux-2.6-npiggin/kernel/sched.c 2004-02-08 15:03:53.000000000 +1100
@@ -1405,16 +1405,28 @@ find_busiest_group(struct sched_domain *
total_load += avg_load;
total_nr_cpus += nr_cpus;
- avg_load /= nr_cpus;
+
+ /*
+ * Load is cumulative over SD_FLAG_IDLE domains, but
+ * spread over !SD_FLAG_IDLE domains. For example, 2
+ * processes running on an SMT CPU puts a load of 2 on
+ * that CPU, however 2 processes running on 2 CPUs puts
+ * a load of 1 on that domain.
+ *
+ * This should be configurable so as SMT siblings become
+ * more powerful, they can "spread" more load - for example,
+ * the above case might only count as a load of 1.7.
+ */
+ if (!(domain->flags & SD_FLAG_IDLE))
+ avg_load /= nr_cpus;
+
+ if (avg_load > max_load)
+ max_load = avg_load;
if (local_group) {
this_load = avg_load;
- goto nextgroup;
- }
-
- if (avg_load >= max_load) {
+ } else if (avg_load >= max_load) {
busiest = group;
- max_load = avg_load;
busiest_nr_cpus = nr_cpus;
}
nextgroup:
@@ -1424,8 +1436,10 @@ nextgroup:
if (!busiest)
goto out_balanced;
- avg_load = total_load / total_nr_cpus;
- if (idle == NOT_IDLE && this_load >= avg_load)
+ if (!(domain->flags & SD_FLAG_IDLE))
+ avg_load = total_load / total_nr_cpus;
+
+ if (this_load >= avg_load)
goto out_balanced;
if (idle == NOT_IDLE && 100*max_load <= domain->imbalance_pct*this_load)
@@ -1437,20 +1451,18 @@ nextgroup:
* reduce the max loaded cpu below the average load, as either of these
* actions would just result in more rebalancing later, and ping-pong
* tasks around. Thus we look for the minimum possible imbalance.
+ * Negative imbalances (*we* are more loaded than anyone else) will
+ * be counted as no imbalance for these purposes -- we can't fix that
+ * by pulling tasks to us. Be careful of negative numbers as they'll
+ * appear as very large values with unsigned longs.
*/
- *imbalance = min(max_load - avg_load, avg_load - this_load);
-
- /* Get rid of the scaling factor now, rounding *up* as we divide */
- *imbalance = (*imbalance + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;
-
- if (*imbalance == 0) {
- if (package_idle != NOT_IDLE && domain->flags & SD_FLAG_IDLE
- && max_load * busiest_nr_cpus > (3*SCHED_LOAD_SCALE/2))
- *imbalance = 1;
- else
- busiest = NULL;
- }
+ *imbalance = min(max_load - avg_load, avg_load - this_load) / 2;
+ /* Get rid of the scaling factor, rounding *up* as we divide */
+ *imbalance = (*imbalance + SCHED_LOAD_SCALE-1)
+ >> SCHED_LOAD_SHIFT;
+ if (*imbalance == 0 && (max_load - this_load) > SCHED_LOAD_SCALE)
+ *imbalance = 1;
return busiest;
out_balanced:
_
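
For readers following the arithmetic, here is a minimal userspace sketch of the two calculations the patch touches: the per-group load and the final imbalance. It is not kernel code. The helper names group_load() and calc_imbalance() are hypothetical, the value SCHED_LOAD_SHIFT = 7 is an assumption made for illustration, and the explicit guards against unsigned wraparound are added here, whereas the patch relies on the earlier out_balanced checks.

```c
/* Assumed values for illustration only; the real definitions live in the kernel tree. */
#define SCHED_LOAD_SHIFT 7
#define SCHED_LOAD_SCALE (1UL << SCHED_LOAD_SHIFT)

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: load of one group.
 * cumulative != 0 models a domain with SD_FLAG_IDLE set (e.g. SMT siblings),
 * where the load is not divided by the number of CPUs in the group.
 */
static unsigned long group_load(unsigned long total, unsigned int nr_cpus,
				int cumulative)
{
	return cumulative ? total : total / nr_cpus;
}

/*
 * Hypothetical helper: mirrors the arithmetic at the end of the patched
 * find_busiest_group(), with explicit guards so that a "negative"
 * difference never wraps around to a huge unsigned value.
 */
static unsigned long calc_imbalance(unsigned long max_load,
				    unsigned long avg_load,
				    unsigned long this_load)
{
	unsigned long imbalance = 0;

	if (max_load > avg_load && avg_load > this_load)
		imbalance = min_ul(max_load - avg_load,
				   avg_load - this_load) / 2;

	/* Drop the scaling factor, rounding up as we divide. */
	imbalance = (imbalance + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;

	/* Many CPUs keep the average tiny; still move one task if the gap exceeds one task. */
	if (imbalance == 0 && max_load > this_load &&
	    (max_load - this_load) > SCHED_LOAD_SCALE)
		imbalance = 1;

	return imbalance;
}
```

Under these assumptions, a busiest group with two tasks (load 256), a domain-wide average of 1 and an idle local group gives min(255, 1) / 2 = 0, so only the final check raises the imbalance to 1. With a single task on the busiest group (load 128) the gap is not larger than SCHED_LOAD_SCALE and nothing is pulled.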