From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ryan Harper
Subject: Re: [PATCH 0/6] xen,xend,tools: Add NUMA support to Xen
Date: Wed, 2 Aug 2006 17:29:47 -0500
Message-ID: <20060802222947.GR1694@us.ibm.com>
References: <20060731190958.GI1694@us.ibm.com>
 <200608010946.43556.Tristan.Gingold@bull.net>
 <20060801154053.GQ1694@us.ibm.com>
 <200608020759.47453.Tristan.Gingold@bull.net>
In-Reply-To: <200608020759.47453.Tristan.Gingold@bull.net>
Sender: xen-devel-bounces@lists.xensource.com
To: Tristan Gingold
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

* Tristan Gingold [2006-08-02 00:56]:
> On Tuesday 01 August 2006 17:40, Ryan Harper wrote:
> > * Tristan Gingold [2006-08-01 02:43]:
> > > On Monday 31 July 2006 21:09, Ryan Harper wrote:
> > > > I've respun the NUMA patches against 10874 and I'm re-submitting them
> > > > with the optimizations mentioned previously on the list.  There was
> > > > a request to see the overhead on non-numa/single-node machines.  I've
> > > > re-run those benchmarks (ballooning up from small mem to multi-gig) as
> > > > well as timing the initial domain increase_reservation time to gauge
> > > > the overhead when allocating from the heap.
> > >
> > > Hi,
> > >
> > > I am trying to use your patch on ia64.
> >
> > Thanks for testing these out on ia64.
> >
> > > In asm-x86/topology.h, you wrote:
> > >
> > > extern unsigned int cpu_to_node[];
> > > extern cpumask_t node_to_cpumask[];
> > >
> > > #define cpu_to_node(cpu)         (cpu_to_node[cpu])
> > > #define parent_node(node)        (node)
> > > #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
> > > #define node_to_cpumask(node)    (node_to_cpumask[node])
> > >
> > > I think cpu_to_node and node_to_cpumask must be either a variable or a
> > > macro, but not both!  (ia64 defines cpu_to_node as a macro).
> >
> > I'm not sure about this, but the definition of both the variable and
> > macro come from Linux, for example in
> [...]
> > AFAIK, this isn't an issue.
> Except you are using both versions: mainly the macro, but at least once
> the variable, in page_alloc.c:
>
> /* Allocate 2^@order contiguous pages. */
> struct page_info *alloc_heap_pages(unsigned int zone, unsigned int cpu,
>                                    unsigned int order)
> {
>     unsigned int i,j, node = cpu_to_node[cpu], num_nodes = num_online_nodes();
>     unsigned int request = (1UL << order);
>
> This was a problem for ia64.
> Furthermore you define the variable in xen/numa.h:
> extern unsigned int cpu_to_node[];
> #include
> extern cpumask_t node_to_cpumask[];
>
> Which one is the API?

If ia64 is already using macros, then we should use the macros everywhere.
I should be able to remove those externs from xen/numa.h, include
asm/topology.h directly in the .c files that need it, like page_alloc.c
(not via xen/numa.h, since that would amount to the same thing), and use
the macros instead.
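For anyone wondering why the x86 build never tripped over this: a function-like macro is only expanded when its name is followed by '(', so an array and a macro can share the name cpu_to_node and both spellings compile. That stops working as soon as an architecture backs the macro with something else. Below is a minimal standalone sketch of the two cases, not Xen code. SKETCH_SAME_NAME, NR_CPUS_SKETCH, cpu_to_node_table and node_of_cpu are made-up names for illustration, and the second branch is only an assumption about what a macro-only architecture might look like, not the actual ia64 header.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4    /* hypothetical CPU count, for illustration only */

#ifdef SKETCH_SAME_NAME
/*
 * Same-name style: the array and the function-like macro share a name.
 * The preprocessor only expands cpu_to_node when it is followed by '(',
 * so cpu_to_node[cpu] still reaches the array directly.
 */
static unsigned int cpu_to_node[NR_CPUS_SKETCH] = { 0, 0, 1, 1 };
#define cpu_to_node(cpu) (cpu_to_node[cpu])
#else
/*
 * Macro-only style (assumed): the macro is backed by a differently named
 * table.  Writing cpu_to_node[cpu] here would fail to compile, because no
 * object called cpu_to_node exists.
 */
static unsigned int cpu_to_node_table[NR_CPUS_SKETCH] = { 0, 0, 1, 1 };
#define cpu_to_node(cpu) (cpu_to_node_table[cpu])
#endif

/* Portable caller: uses only the macro form, so it builds in both styles. */
static unsigned int node_of_cpu(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SKETCH);
    return cpu_to_node(cpu);
}
```

Building the sketch with and without -DSKETCH_SAME_NAME should give the same result from node_of_cpu(), which is the property common code such as page_alloc.c needs.
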
The patch would look something like this (compiled on x86_64):

diff -r 083e69a85080 xen/arch/x86/dom0_ops.c
--- a/xen/arch/x86/dom0_ops.c Mon Jul 31 10:53:59 2006 -0500
+++ b/xen/arch/x86/dom0_ops.c Wed Aug 02 10:36:28 2006 -0500
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include "cpu/mtrr/mtrr.h"
@@ -220,7 +221,7 @@ long arch_do_dom0_op(struct dom0_op *op,
         memset(node_to_cpu_64, 0, sizeof(node_to_cpu_64));
         for ( i=0; inr_nodes; i++) {
             for ( j=0; ju.physinfo.node_to_cpu,
diff -r 083e69a85080 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c Mon Jul 31 10:53:59 2006 -0500
+++ b/xen/common/page_alloc.c Wed Aug 02 10:32:27 2006 -0500
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /*
@@ -317,7 +318,7 @@ struct page_info *alloc_heap_pages(unsig
 struct page_info *alloc_heap_pages(unsigned int zone, unsigned int cpu,
                                    unsigned int order)
 {
-    unsigned int i,j, node = cpu_to_node[cpu], num_nodes = num_online_nodes();
+    unsigned int i,j, node = cpu_to_node(cpu), num_nodes = num_online_nodes();
     unsigned int request = (1UL << order);
     struct page_info *pg;
 
diff -r 083e69a85080 xen/include/xen/numa.h
--- a/xen/include/xen/numa.h Mon Jul 31 10:53:59 2006 -0500
+++ b/xen/include/xen/numa.h Wed Aug 02 17:24:39 2006 -0500
@@ -23,8 +23,4 @@
 /* needed for drivers/acpi/numa.c */
 #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
-extern unsigned int cpu_to_node[];
-#include
-extern cpumask_t node_to_cpumask[];
-
 #endif /* _XEN_NUMA_H */

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com