From: jbarnes@sgi.com (Jesse Barnes)
To: linux-ia64@vger.kernel.org
Subject: Re: fix zonelist ordering for NUMA
Date: Tue, 24 Feb 2004 17:13:34 +0000
Message-ID: <20040224171334.GA13504@sgi.com>
In-Reply-To: <20040224.182028.884032071.nomura@linux.bs1.fc.nec.co.jp>
On Tue, Feb 24, 2004 at 06:20:28PM +0900, j-nomura@ce.jp.nec.com wrote:
> The attached patch makes use of arch-dependent info for building zonelist.
> The patch uses ACPI SLIT for ia64.
> Other arch may have their own method to determine the order.
>
> This kind of ordering is very important for the NUMA system in which
> the distance between nodes is not uniform.
>
> The patch doing this was posted by Jesse Barnes in linux-ia64:
> http://marc.theaimsgroup.com/?t=106383477500001&r=1&w=2
> however, I couldn't find it in current tree...
Yeah, I haven't pushed it (I didn't think it was ready, and I haven't
done a good version for 2.6 yet).
> The sorting can be extended to, for example, more fine grained round-robin
> like Erich suggested. But let's start from the simple one.
>
> Any comments?
Yeah, it looks ok.  What I was hoping to do in the patch that ultimately
gets in:
  1) make it arch independent
     this means having arch code populate a SLIT-like table for use by
     the generic zonelist building code
  2) handle the cases that Erich talked about a bit better
  3) some systems have pgdats w/o any CPUs associated with them, they
     need to be dealt with differently than regular nodes, maybe as
     extensions to an existing node
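To make point 1 concrete, here is a minimal userspace sketch (not kernel
code) of a SLIT-like table that arch code fills in and generic code reads.
The names numa_distance, nr_nodes, arch_fill_distance_table() and
MAX_NUMNODES_SKETCH are assumptions made up for this illustration; only the
row-major byte matrix layout and the local distance of 10 follow the ACPI
SLIT convention.

```c
#include <assert.h>

#define MAX_NUMNODES_SKETCH 8	/* hypothetical limit for this sketch */
#define LOCAL_DISTANCE 10	/* ACPI SLIT: distance from a node to itself */

/* Generic table read by the zonelist code (hypothetical name). */
static unsigned char numa_distance[MAX_NUMNODES_SKETCH][MAX_NUMNODES_SKETCH];
static int nr_nodes;

/* Generic accessor: relative distance from node 'from' to node 'to'. */
static int node_distance(int from, int to)
{
	return numa_distance[from][to];
}

/*
 * Arch hook (hypothetical): copy a flat, row-major nodes x nodes matrix,
 * such as the one firmware hands over in a SLIT, into the generic table.
 */
static void arch_fill_distance_table(const unsigned char *matrix, int nodes)
{
	int i, j;

	assert(nodes > 0 && nodes <= MAX_NUMNODES_SKETCH);
	nr_nodes = nodes;
	for (i = 0; i < nodes; i++)
		for (j = 0; j < nodes; j++)
			numa_distance[i][j] = matrix[i * nodes + j];
}
```

An arch without a SLIT could call the same hook with a matrix derived from
its own topology information, so the generic side never needs to know
where the numbers came from.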
The final routine might look something like (many thanks to pj for
hitting me with a cluebat about this):
/**
 * find_next_best_node - find the next node that should appear in a given
 * node's fallback list
 * @node: node whose fallback list we're appending
 *
 * We use a number of factors to determine which is the next node that should
 * appear on a given node's fallback list.  The node should not have appeared
 * already in @node's fallback list, and it should be the next closest node
 * according to the distance array (which contains arbitrary distance values
 * from each node to each node in the system), and should also prefer nodes
 * with no CPUs, since presumably they'll have very little allocation pressure
 * on them otherwise.
 *
 * Returns the chosen node, or -1 if every node is already on the list.
 */
int find_next_best_node(int node)
{
	int i, val;
	int min_val = INT_MAX;	/* start high so the first candidate wins */
	int best_node = -1;	/* -1 means no candidate is left */

	for (i = 0; i < numnodes; i++) {
		/* Don't want a node to appear more than once */
		if (node_present(node, i))
			continue;

		/* Use the distance array to find the distance */
		val = node_distance(node, i);

		/* Give preference to headless and unused nodes */
		val += nid_enabled_cpu_count[i] * 255;
		val += node_load[i];

		if (val < min_val) {
			min_val = val;
			best_node = i;
		}
	}

	return best_node;
}
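
For illustration, the routine above could be driven in a loop to produce a
complete fallback list.  The following is a minimal userspace sketch under
stated assumptions, not the eventual kernel code: node_dist, nid_cpu_count,
the used[] array (standing in for node_present()) and build_fallback_list()
are all hypothetical names, and putting the local node first is an
assumption of this sketch rather than something the routine above does by
itself.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 8	/* hypothetical limit for this sketch */

static int numnodes;
static int node_dist[MAX_NODES][MAX_NODES];	/* SLIT-like distances */
static int nid_cpu_count[MAX_NODES];		/* CPUs per node */
static int node_load[MAX_NODES];		/* extra per-node penalty */

/* Same scoring as the routine above; used[] replaces node_present(). */
static int find_next_best_node(int node, const int *used)
{
	int i, val, min_val = INT_MAX, best_node = -1;

	for (i = 0; i < numnodes; i++) {
		if (used[i])
			continue;
		val = node_dist[node][i];
		val += nid_cpu_count[i] * 255;	/* penalize nodes with CPUs */
		val += node_load[i];
		if (val < min_val) {
			min_val = val;
			best_node = i;
		}
	}
	return best_node;
}

/* Fill list[] with node's fallback order; returns the number of entries. */
static int build_fallback_list(int node, int *list)
{
	int used[MAX_NODES] = { 0 };
	int n = 0, next;

	assert(node >= 0 && node < numnodes && numnodes <= MAX_NODES);

	/* Assumption of this sketch: the local node always comes first. */
	list[n++] = node;
	used[node] = 1;

	while ((next = find_next_best_node(node, used)) >= 0) {
		list[n++] = next;
		used[next] = 1;
	}
	return n;
}
```

With three nodes at distances 10/20/40 where nodes 0 and 1 have two CPUs
each and node 2 is headless, node 0's list comes out as 0, 2, 1: the CPU
penalty (2 * 255) outweighs the extra distance to the headless node.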
Jesse