* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
@ 2004-02-26 22:21 Martin J. Bligh
2004-02-26 23:09 ` Matthew Dobson
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Martin J. Bligh @ 2004-02-26 22:21 UTC (permalink / raw)
To: linux-ia64
--On Wednesday, February 25, 2004 08:54:09 -0800 Jesse Barnes <jbarnes@sgi.com> wrote:
> On Wed, Feb 25, 2004 at 02:01:16PM +0900, j-nomura@ce.jp.nec.com wrote:
>> > 1) make it arch independent
>> > this means having arch code populate a SLIT-like table for use by
>> > the generic zonelist building code
>>
>> I would like to hear the comments from people on other arch.
>> If the same ordering rule can be applicable for others, it's nice.
>
> Martin, does a scheme like this sound ok with you? Arch specific code
> would populate a node distance table, which would be used to build each
> pgdat->zonelist in a smarter way than we do currently.
Yeah, looks sensible to me. We probably ought to do this:

+#ifndef node_distance
+#define node_distance(from,to) (1)
+#endif

in the generic fallback topology headers, not in the mm/ .c files. Matt?

Also, I seem to recall those build_zonelists functions are used for both
NUMA and UMA ... now they're getting complex enough that it's probably
worth making a specific non-NUMA version, if only for the sanity of
99% of the poor souls trying to work out how a UMA machine lays it out ;-)

It looks like it won't change ordering for existing boxes with single
layer flat NUMA topologies (round-robin), but we probably ought to check
that carefully ;-)

M.
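
The idea in the message above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel's build_zonelists code: the helper name order_nodes_by_distance is hypothetical, and the only piece taken from the thread is the fallback node_distance macro, which an arch header could override by defining node_distance first.

```c
/*
 * Generic fallback, as proposed in the thread. An arch header included
 * earlier may define node_distance itself, in which case this is skipped.
 */
#ifndef node_distance
#define node_distance(from, to) (1)
#endif

/*
 * Hypothetical helper (not a kernel function): fill order[] with the node
 * ids 0..nr_nodes-1, sorted by increasing node_distance() from 'from'.
 * The insertion sort is stable, so nodes at equal distance stay in
 * ascending id order.
 */
static void order_nodes_by_distance(int from, int nr_nodes, int *order)
{
	int i, j;

	(void)from;	/* unused when the constant fallback macro is active */

	for (i = 0; i < nr_nodes; i++) {
		int d = node_distance(from, i);

		/* Shift strictly farther nodes one slot to the right. */
		for (j = i; j > 0 && node_distance(from, order[j - 1]) > d; j--)
			order[j] = order[j - 1];
		order[j] = i;
	}
}
```

With the constant fallback every node ties, so this sketch simply returns ascending node ids. It does not try to reproduce the round-robin ordering that flat NUMA boxes get today, which is the behaviour the message suggests checking carefully.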
* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
2004-02-26 22:21 [Lse-tech] Re: fix zonelist ordering for NUMA Martin J. Bligh
@ 2004-02-26 23:09 ` Matthew Dobson
2004-02-26 23:40 ` Dave Hansen
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Matthew Dobson @ 2004-02-26 23:09 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 1580 bytes --]
On Thu, 2004-02-26 at 14:21, Martin J. Bligh wrote:
> --On Wednesday, February 25, 2004 08:54:09 -0800 Jesse Barnes <jbarnes@sgi.com> wrote:
>
> > On Wed, Feb 25, 2004 at 02:01:16PM +0900, j-nomura@ce.jp.nec.com wrote:
> >> > 1) make it arch independent
> >> > this means having arch code populate a SLIT-like table for use by
> >> > the generic zonelist building code
> >>
> >> I would like to hear the comments from people on other arch.
> >> If the same ordering rule can be applicable for others, it's nice.
> >
> > Martin, does a scheme like this sound ok with you? Arch specific code
> > would populate a node distance table, which would be used to build each
> > pgdat->zonelist in a smarter way than we do currently.
>
> Yeah, looks sensible to me. We probably ought to do this:
>
> +#ifndef node_distance
> +#define node_distance(from,to) (1)
> +#endif
>
> in the generic fallback topology headers, not in the mm/ .c files. Matt?
>
> Also, I seem to recall those build_zonelists functions are used for both
> NUMA and UMA ... now they're getting complex enough that it's probably
> worth making a specific non-NUMA version, if only for the sanity of
> 99% of the poor souls trying to work out how a UMA machine lays it out ;-)
>
> It looks like it won't change ordering for existing boxes with single
> layer flat NUMA topologies (round-robin), but we probably ought to check
> that carefully ;-)
>
> M.
Yep... Here's a quickie for i386 and the generic header. All other
arches would look pretty similar to the asm/i386/topology.h change.
-Matt
[-- Attachment #2: node_distance.patch --]
[-- Type: text/x-patch, Size: 1161 bytes --]
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.3-vanilla/include/asm-generic/topology.h linux-2.6.3-topo_distance/include/asm-generic/topology.h
--- linux-2.6.3-vanilla/include/asm-generic/topology.h Tue Feb 17 19:57:15 2004
+++ linux-2.6.3-topo_distance/include/asm-generic/topology.h Thu Feb 26 15:02:08 2004
@@ -44,6 +44,9 @@
 #ifndef pcibus_to_cpumask
 #define pcibus_to_cpumask(bus) (cpu_online_map)
 #endif
+#ifndef node_distance
+#define node_distance(from, to) (1)
+#endif
 
 /* Cross-node load balancing interval. */
 #ifndef NODE_BALANCE_RATE
diff -Nurp --exclude-from=/home/mcd/.dontdiff linux-2.6.3-vanilla/include/asm-i386/topology.h linux-2.6.3-topo_distance/include/asm-i386/topology.h
--- linux-2.6.3-vanilla/include/asm-i386/topology.h Tue Feb 17 19:57:17 2004
+++ linux-2.6.3-topo_distance/include/asm-i386/topology.h Thu Feb 26 15:00:12 2004
@@ -66,6 +66,12 @@ static inline cpumask_t pcibus_to_cpumas
 	return node_to_cpumask(mp_bus_id_to_node[bus]);
 }
 
+/* Node-to-Node distance */
+static inline int node_distance(int from, int to)
+{
+	return 1;
+}
+
 /* Cross-node load balancing interval. */
 #define NODE_BALANCE_RATE 100
 
* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
2004-02-26 22:21 [Lse-tech] Re: fix zonelist ordering for NUMA Martin J. Bligh
2004-02-26 23:09 ` Matthew Dobson
@ 2004-02-26 23:40 ` Dave Hansen
2004-02-26 23:54 ` Martin J. Bligh
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Dave Hansen @ 2004-02-26 23:40 UTC (permalink / raw)
To: linux-ia64
On Thu, 2004-02-26 at 14:21, Martin J. Bligh wrote:
> Also, I seem to recall those build_zonelists functions are used for both
> NUMA and UMA ... now they're getting complex enough that it's probably
> worth making a specific non-NUMA version, if only for the sanity of
> 99% of the poor souls trying to work out how a UMA machine lays it out ;-)
>
> It looks like it won't change ordering for existing boxes with single
> layer flat NUMA topologies (round-robin), but we probably ought to check
> that carefully ;-)
More ifdefs just make the code messier. How about adding a nice comment
explaining that UMA will fall right through the loop, and giving the
poor souls a temporary reprieve?
-- dave
* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
2004-02-26 22:21 [Lse-tech] Re: fix zonelist ordering for NUMA Martin J. Bligh
2004-02-26 23:09 ` Matthew Dobson
2004-02-26 23:40 ` Dave Hansen
@ 2004-02-26 23:54 ` Martin J. Bligh
2004-02-27 0:47 ` Chris Wedgwood
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Martin J. Bligh @ 2004-02-26 23:54 UTC (permalink / raw)
To: linux-ia64
> On Thu, 2004-02-26 at 14:21, Martin J. Bligh wrote:
>> Also, I seem to recall those build_zonelists functions are used for both
>> NUMA and UMA ... now they're getting complex enough that it's probably
>> worth making a specific non-NUMA version, if only for the sanity of
>> 99% of the poor souls trying to work out how a UMA machine lays it out ;-)
>>
>> It looks like it won't change ordering for existing boxes with single
>> layer flat NUMA topologies (round-robin), but we probably ought to check
>> that carefully ;-)
>
> More ifdefs just make the code messier. How about adding a nice comment
> explaining that UMA will fall right through the loop, and giving the
> poor souls a temporary reprieve?
In this case, I disagree - it's better to just have the ifdef.
M.
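
The two positions in this exchange (a dedicated non-NUMA version behind an ifdef versus one shared loop with a comment) can be pictured with a small sketch. It is an assumption-laden illustration only: CONFIG_NUMA is the kernel's configuration symbol, but build_node_order and its round-robin body are hypothetical stand-ins, not the actual build_zonelists implementation.

```c
#ifdef CONFIG_NUMA
/*
 * Hypothetical NUMA variant: visit nodes round-robin, starting at the
 * local node. Returns the number of entries written to order[].
 */
static int build_node_order(int local, int nr_nodes, int *order)
{
	int i;

	for (i = 0; i < nr_nodes; i++)
		order[i] = (local + i) % nr_nodes;
	return nr_nodes;
}
#else
/*
 * Hypothetical UMA variant: there is only node 0, so there is nothing
 * to order and no loop for the reader to puzzle over.
 */
static int build_node_order(int local, int nr_nodes, int *order)
{
	(void)local;
	(void)nr_nodes;
	order[0] = 0;
	return 1;
}
#endif
```

Built without CONFIG_NUMA, only the short variant is compiled, which is the readability argument made for the ifdef. The cost is a second function that has to be kept in step with the first, which is the objection raised against it.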
* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
2004-02-26 22:21 [Lse-tech] Re: fix zonelist ordering for NUMA Martin J. Bligh
` (2 preceding siblings ...)
2004-02-26 23:54 ` Martin J. Bligh
@ 2004-02-27 0:47 ` Chris Wedgwood
2004-02-27 0:56 ` Martin J. Bligh
2004-02-27 5:38 ` j-nomura
5 siblings, 0 replies; 7+ messages in thread
From: Chris Wedgwood @ 2004-02-27 0:47 UTC (permalink / raw)
To: linux-ia64
On Thu, Feb 26, 2004 at 02:21:02PM -0800, Martin J. Bligh wrote:
> +#ifndef node_distance
> +#define node_distance(from,to) (1)
> +#endif
1 or 0 here?
--cw
* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
2004-02-26 22:21 [Lse-tech] Re: fix zonelist ordering for NUMA Martin J. Bligh
` (3 preceding siblings ...)
2004-02-27 0:47 ` Chris Wedgwood
@ 2004-02-27 0:56 ` Martin J. Bligh
2004-02-27 5:38 ` j-nomura
5 siblings, 0 replies; 7+ messages in thread
From: Martin J. Bligh @ 2004-02-27 0:56 UTC (permalink / raw)
To: linux-ia64
--On Thursday, February 26, 2004 16:47:10 -0800 Chris Wedgwood <cw@f00f.org> wrote:
> On Thu, Feb 26, 2004 at 02:21:02PM -0800, Martin J. Bligh wrote:
>
>> +#ifndef node_distance
>> +#define node_distance(from,to) (1)
>> +#endif
>
> 1 or 0 here?
Dunno ... was from j-nomura@ce.jp.nec.com, but I'd guess 1 makes more
sense semantically - we're saying the distance between nodes on a flat
NUMA architecture is equal (I think ;-))
M.
* Re: [Lse-tech] Re: fix zonelist ordering for NUMA
2004-02-26 22:21 [Lse-tech] Re: fix zonelist ordering for NUMA Martin J. Bligh
` (4 preceding siblings ...)
2004-02-27 0:56 ` Martin J. Bligh
@ 2004-02-27 5:38 ` j-nomura
5 siblings, 0 replies; 7+ messages in thread
From: j-nomura @ 2004-02-27 5:38 UTC (permalink / raw)
To: linux-ia64
Hi,
> >> +#ifndef node_distance
> >> +#define node_distance(from,to) (1)
> >> +#endif
> >
> > 1 or 0 here?
>
> Dunno ... was from j-nomura@ce.jp.nec.com, but I'd guess 1 makes more
> sense semantically - we're saying the distance between nodes on a flat
> NUMA architecture is equal (I think ;-))
Umm. Even on a flat NUMA architecture, I should have distinguished between
local and remote nodes. It's NUMA.
#define node_distance(from,to) ((from) != (to))
Best regards.
--
NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com>
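
A small sketch of what the local/remote distinction buys. It assumes the macro from the message above, with its arguments parenthesized so that expressions can be passed safely. The helper nearest_node is hypothetical and exists only for illustration.

```c
/* Fallback distance from the message above: 0 for local, 1 for remote. */
#define node_distance(from, to) ((from) != (to))

/*
 * Hypothetical helper: return the id of the node closest to 'from' among
 * nodes 0..nr_nodes-1. Ties go to the lowest id.
 */
static int nearest_node(int from, int nr_nodes)
{
	int best = 0;
	int i;

	for (i = 1; i < nr_nodes; i++)
		if (node_distance(from, i) < node_distance(from, best))
			best = i;
	return best;
}
```

With this definition nearest_node(2, 4) picks node 2, the local node. With a constant distance of 1, every node would tie and the helper would always return node 0, so the local node could not be told apart from remote ones.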