From: Erich Focht <efocht@ess.nec.de>
To: "Martin J. Bligh" <mbligh@aracnet.com>,
Andrew Theurer <habanero@us.ibm.com>,
Michael Hohnbaum <hohnbaum@us.ibm.com>,
Christoph Hellwig <hch@infradead.org>
Cc: Robert Love <rml@tech9.net>, Ingo Molnar <mingo@elte.hu>,
linux-kernel <linux-kernel@vger.kernel.org>,
lse-tech <lse-tech@lists.sourceforge.net>
Subject: [PATCH 2.5.58] new NUMA scheduler: fix
Date: Tue, 14 Jan 2003 17:43:25 +0100
Message-ID: <200301141743.25513.efocht@ess.nec.de>
In-Reply-To: <200301141723.29613.efocht@ess.nec.de>
[-- Attachment #1: Type: text/plain, Size: 581 bytes --]
Aargh, I should have gone home earlier...
For those who really care about patch 05, it's attached. It's all
untested as I don't have an ia32 NUMA machine running 2.5.58...
Erich
On Tuesday 14 January 2003 17:23, Erich Focht wrote:
> In the previous email the patch 02-initial-lb-2.5.58.patch had a bug
> and this was present in the numa-sched-2.5.58.patch and
> numa-sched-add-2.5.58.patch, too. Please use the patches attached to
> this email! Sorry for the silly mistake...
>
> Christoph, I used your way of coding nr_running_inc/dec now.
>
> Regards,
> Erich
[-- Attachment #2: 05-var-intnode-lb-2.5.58.patch --]
[-- Type: text/x-diff, Size: 2949 bytes --]
diff -urNp 2.5.58-ms-ilb-nb-sm/kernel/sched.c 2.5.58-ms-ilb-nb-sm-var/kernel/sched.c
--- 2.5.58-ms-ilb-nb-sm/kernel/sched.c 2003-01-14 17:11:48.000000000 +0100
+++ 2.5.58-ms-ilb-nb-sm-var/kernel/sched.c 2003-01-14 17:36:26.000000000 +0100
@@ -67,8 +67,9 @@
#define INTERACTIVE_DELTA 2
#define MAX_SLEEP_AVG (2*HZ)
#define STARVATION_LIMIT (2*HZ)
-#define NODE_BALANCE_RATIO 10
#define NODE_THRESHOLD 125
+#define NODE_BALANCE_MIN 10
+#define NODE_BALANCE_MAX 40
/*
* If a task is 'interactive' then we reinsert it in the active
@@ -186,6 +187,8 @@ static struct runqueue runqueues[NR_CPUS
#if CONFIG_NUMA
static atomic_t node_nr_running[MAX_NUMNODES] ____cacheline_maxaligned_in_smp =
{[0 ...MAX_NUMNODES-1] = ATOMIC_INIT(0)};
+static int internode_lb[MAX_NUMNODES] ____cacheline_maxaligned_in_smp =
+ {[0 ...MAX_NUMNODES-1] = NODE_BALANCE_MAX};
static inline void nr_running_inc(runqueue_t *rq)
{
@@ -735,15 +738,18 @@ void sched_balance_exec(void)
static int find_busiest_node(int this_node)
{
int i, node = -1, load, this_load, maxload;
+ int avg_load;
this_load = maxload = (this_rq()->prev_node_load[this_node] >> 1)
+ atomic_read(&node_nr_running[this_node]);
this_rq()->prev_node_load[this_node] = this_load;
+ avg_load = this_load;
for (i = 0; i < numnodes; i++) {
if (i == this_node)
continue;
load = (this_rq()->prev_node_load[i] >> 1)
+ atomic_read(&node_nr_running[i]);
+ avg_load += load;
this_rq()->prev_node_load[i] = load;
if (load > maxload &&
(100*load > ((NODE_THRESHOLD*100*this_load)/100))) {
@@ -751,10 +757,26 @@ static int find_busiest_node(int this_no
node = i;
}
}
- if (maxload <= 4+2+1)
+ avg_load = avg_load / (numnodes ? numnodes : 1);
+ if (this_load < (avg_load / 2)) {
+ if (internode_lb[this_node] == NODE_BALANCE_MAX)
+ internode_lb[this_node] = NODE_BALANCE_MIN;
+ } else
+ if (internode_lb[this_node] == NODE_BALANCE_MIN)
+ internode_lb[this_node] = NODE_BALANCE_MAX;
+ if (maxload <= 4+2+1 || this_load >= avg_load)
node = -1;
return node;
}
+
+static inline int remote_steal_factor(runqueue_t *rq)
+{
+ int cpu = __cpu_to_node(task_cpu(rq->curr));
+
+ return (cpu == __cpu_to_node(smp_processor_id())) ? 1 : 2;
+}
+#else
+#define remote_steal_factor(rq) 1
#endif /* CONFIG_NUMA */
#if CONFIG_SMP
@@ -903,7 +925,7 @@ static void load_balance(runqueue_t *thi
/*
* Avoid rebalancing between nodes too often.
*/
- if (!(++(this_rq->nr_balanced) % NODE_BALANCE_RATIO)) {
+ if (!(++(this_rq->nr_balanced) % internode_lb[this_node])) {
int node = find_busiest_node(this_node);
if (node >= 0)
cpumask = __node_to_cpu_mask(node) | (1UL << this_cpu);
@@ -967,7 +989,7 @@ skip_queue:
goto skip_bitmap;
}
pull_task(busiest, array, tmp, this_rq, this_cpu);
- if (!idle && --imbalance) {
+ if (!idle && ((--imbalance)/remote_steal_factor(busiest))) {
if (curr != head)
goto skip_queue;
idx++;
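
The decision logic that patch 05 adds can be tried outside the kernel. The following is only a minimal userspace sketch, not the kernel code: `struct node_model`, `model_init()`, `model_find_busiest_node()`, `model_pull_count()` and `MAX_NODES` are hypothetical names made up for illustration. It assumes the per-runqueue `prev_node_load[]` and the global `node_nr_running[]` / `internode_lb[]` arrays can be folded into one plain struct without atomics, and that the pull loop is entered non-idle with a source queue that never runs empty.

```c
#include <assert.h>

#define NODE_THRESHOLD   125  /* percent, as in the patch */
#define NODE_BALANCE_MIN 10   /* short interval for underloaded nodes */
#define NODE_BALANCE_MAX 40   /* default interval */
#define MAX_NODES        8    /* hypothetical bound for this model */

/* Hypothetical stand-in for the scheduler state touched by the patch. */
struct node_model {
    int nnodes;
    int nr_running[MAX_NODES]; /* models node_nr_running[] (no atomics) */
    int prev_load[MAX_NODES];  /* models this_rq()->prev_node_load[] */
    int interval[MAX_NODES];   /* models internode_lb[] */
};

static void model_init(struct node_model *m, int nnodes)
{
    assert(nnodes > 0 && nnodes <= MAX_NODES);
    m->nnodes = nnodes;
    for (int i = 0; i < MAX_NODES; i++) {
        m->nr_running[i] = 0;
        m->prev_load[i] = 0;
        m->interval[i] = NODE_BALANCE_MAX;
    }
}

/* Returns the node to pull from, or -1 if no cross-node balancing is due. */
static int model_find_busiest_node(struct node_model *m, int this_node)
{
    int node = -1, sum, avg, this_load, maxload;

    /* Decayed history plus current count, as in the patch. */
    this_load = maxload = (m->prev_load[this_node] >> 1)
                          + m->nr_running[this_node];
    m->prev_load[this_node] = this_load;
    sum = this_load;

    for (int i = 0; i < m->nnodes; i++) {
        int load;

        if (i == this_node)
            continue;
        load = (m->prev_load[i] >> 1) + m->nr_running[i];
        sum += load;
        m->prev_load[i] = load;
        /* Same test as the patch, with the redundant *100/100 dropped. */
        if (load > maxload && 100 * load > NODE_THRESHOLD * this_load) {
            maxload = load;
            node = i;
        }
    }

    avg = sum / m->nnodes;
    /*
     * The patch toggles between the two values; since the interval is
     * only ever MIN or MAX, a plain assignment has the same effect.
     */
    m->interval[this_node] = (this_load < avg / 2) ? NODE_BALANCE_MIN
                                                   : NODE_BALANCE_MAX;

    if (maxload <= 4 + 2 + 1 || this_load >= avg)
        node = -1;
    return node;
}

/*
 * Models the changed loop condition (--imbalance)/remote_steal_factor():
 * counts pulls for a starting imbalance >= 1 and a factor of 1 or 2.
 */
static int model_pull_count(int imbalance, int factor)
{
    int pulled = 0;

    assert(imbalance >= 1 && factor >= 1);
    do {
        pulled++;
    } while ((--imbalance) / factor);
    return pulled;
}
```

In this model a node with no load next to a node running 20 tasks drops to the short interval and selects the loaded node, while a starting imbalance of 4 gives 4 pulls with factor 1 and 3 pulls with factor 2.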