From: Andre Przywara <andre.przywara@amd.com>
To: Keir Fraser <keir.fraser@eu.citrix.com>
Cc: xen-devel@lists.xensource.com, "Edwin.Zhai" <edwin.zhai@intel.com>
Subject: Re: [RFC][PATCH] domheap optimization for NUMA
Date: Thu, 03 Apr 2008 12:39:14 +0200
Message-ID: <47F4B3D2.1030909@amd.com>
In-Reply-To: <C419D386.15E31%keir.fraser@eu.citrix.com>

[-- Attachment #1: Type: text/plain, Size: 3673 bytes --]

Keir,

>>> Yes, but it's a bad interface, particularly when the function is called
>>> alloc_domheap_pages_on_node(). Pass in a nodeid. Write a helper function to
>>> work out the nodeid from the domain*.
>> I was just looking at this code, too, so I fixed this. Eventually
>> alloc_heap_pages is called, which deals with nodes only, so I replaced
>> cpu with node everywhere else, too. Now __alloc_domheap_pages and
>> alloc_domheap_pages_on_node are almost the same (except parameter
>> ordering), so I removed the first one, since the naming of the latter is
>> better. Passing node numbers instead of cpu numbers needs cpu_to_node
>> and asm/numa.h, if you think there is a better way, I am all ears.
> 
> That's fine. If you reference numa stuff then you need numa.h.
> 
> But vcpu_to_node and domain_to_node as well as cpu_to_node, please. There's
> no need to be open-coding v->processor everywhere. Also in future we might
> care to pick node based on v's affinity map rather than just current
> processor value. And usage of d->vcpu[0] without checking for != NULL is
> asking to introduce edge-case bugs. We can easily do that NULL check in one
> place if we implement domain_to_node().
OK, I did this. domain_to_node() returns NUMA_NO_NODE when d->vcpu[0] is
NULL; alloc_heap_pages resolves this to the current node (at least for
now).
By the way, can we solve the DMA_BITSIZE problem (your mail from 28th
Feb) with this? If no node is specified, keep the current behaviour of
preferring non-DMA zones; otherwise stick to the given node.
If you agree, I will implement this.
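
To make the NULL handling concrete, here is a minimal standalone sketch of
the domain_to_node()/vcpu_to_node() fallback written as inline functions
rather than macros. It is not the code from the attached patch: the
*_sketch types, the cpu_to_node_tbl table and MAX_CPUS are assumptions made
up for illustration; only NUMA_NO_NODE (0xff) and the "NULL means no node
preference" rule come from this thread.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE 0xff   /* value used by the patch */
#define MAX_CPUS     4      /* hypothetical, for this sketch only */

/* Hypothetical stand-ins for Xen's struct vcpu and struct domain. */
struct vcpu_sketch   { unsigned int processor; };
struct domain_sketch { struct vcpu_sketch *vcpu[1]; };

/* Hypothetical topology: CPUs 0-1 sit on node 0, CPUs 2-3 on node 1. */
static const unsigned char cpu_to_node_tbl[MAX_CPUS] = { 0, 0, 1, 1 };

/* Node of the physical CPU the vcpu currently runs on. */
static inline unsigned int vcpu_to_node_sketch(const struct vcpu_sketch *v)
{
    assert(v->processor < MAX_CPUS);
    return cpu_to_node_tbl[v->processor];
}

/* The NULL checks live in one place; callers never touch d->vcpu[0]. */
static inline unsigned int domain_to_node_sketch(const struct domain_sketch *d)
{
    if ( d == NULL || d->vcpu[0] == NULL )
        return NUMA_NO_NODE;
    return vcpu_to_node_sketch(d->vcpu[0]);
}
```

A function of this shape would also leave room for choosing the node from
the vcpu's affinity map later without touching any caller.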

> And, while I'm thinking about the interfaces, let's just stick to
> alloc_domheap_page() and alloc_domheap_pages(). Let's add a flags parameter
> to the former (so it matches the latter in that respect) and let's add a
> MEMF_node() flag subtype (similar to MEMF_bits). Semantics will be that if
> MEMF_node(node) is provided then we try to allocate memory from node; else
> we try to allocate memory from a node local to specified domain; else if
> domain is NULL then we ignore locality.
Sounds reasonable. I changed this, too. If domain is NULL,
domain_to_node returns NUMA_NO_NODE, which eventually causes locality
to be ignored (in alloc_heap_pages).
> 
> Since zero is probably a valid numa nodeid we can define MEMF_node() as
> something like ((((node)+1)&0xff)<<8). Then since NUMA_NO_NODE==0xff
> everything works nicely: MEMF_node(NUMA_NO_NODE) is equivalent to not
> specifying MEMF_node() at all, which is what we would logically expect.
Good idea.
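
For reference, a small standalone sketch of the encoding as I understand
it. MEMF_node() and the decode expression match what the attached patch
uses; memf_to_node() and memf_set_node() are hypothetical helper names
introduced only for this example.

```c
#include <assert.h>

#define NUMA_NO_NODE 0xff
#define _MEMF_node   8
#define MEMF_node(n) ((((n) + 1) & 0xff) << _MEMF_node)

/* Decode: a zero field (no node requested) wraps around to NUMA_NO_NODE. */
static inline unsigned int memf_to_node(unsigned int memflags)
{
    return (((memflags >> _MEMF_node) & 0xff) - 1) & 0xff;
}

/* Hypothetical helper: replace any node request already in memflags. */
static inline unsigned int memf_set_node(unsigned int memflags,
                                         unsigned int node)
{
    assert(node <= NUMA_NO_NODE);
    memflags &= ~(0xffu << _MEMF_node);   /* clear the old node field */
    return memflags | MEMF_node(node);
}
```

With these definitions MEMF_node(0) is 0x100, so node 0 stays
distinguishable from "no node given", and MEMF_node(NUMA_NO_NODE) is 0.
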
> NUMA_NO_NODE probably needs to be pulled out of asm-x86/numa.h and made the
> official arch-neutral way to specify 'don't care' for numa nodes.
Is this really needed? I passed memflags=0 in all don't-care cases;
this should work and is more compatible. But beware that
alloc_domheap_pages in page_alloc.c silently assumes that NUMA_NO_NODE
is 0xFF; otherwise this trick will not work.
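
One possible way to make that assumption explicit instead of silent is a
compile-time check next to the flag definitions. This is only a sketch and
not part of the attached patch: the two *_MASK macros are hypothetical, and
static_assert requires a C11 compiler, which is an assumption about the
build environment.

```c
#include <assert.h>   /* static_assert (C11) */

#define NUMA_NO_NODE 0xff
#define _MEMF_node   8
#define MEMF_node(n) ((((n) + 1) & 0xff) << _MEMF_node)
#define _MEMF_bits   24
#define MEMF_bits(n) ((n) << _MEMF_bits)

/* Hypothetical masks covering the two 8-bit fields. */
#define MEMF_NODE_MASK (0xffu << _MEMF_node)
#define MEMF_BITS_MASK (0xffu << _MEMF_bits)

/* memflags == 0 must mean "no node preference". */
static_assert(MEMF_node(NUMA_NO_NODE) == 0,
              "MEMF_node() encoding relies on NUMA_NO_NODE == 0xff");

/* The node field must not overlap the address-width field. */
static_assert((MEMF_NODE_MASK & MEMF_BITS_MASK) == 0,
              "MEMF_node and MEMF_bits fields overlap");
```

If NUMA_NO_NODE ever changed, the build would fail here rather than the
allocator quietly picking the wrong node.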

Attached again are a diff against my last version and the full patch (a
missing bracket slipped through in my last one, sorry about that).

This is only quick-tested (booted and created a guest on each node).

Signed-off-by: Andre Przywara <andre.przywara@amd.com>

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 277-84917
----to satisfy European Law for business letters:
AMD Saxony Limited Liability Company & Co. KG,
Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register Court Dresden: HRA 4896, General Partner authorized
to represent: AMD Saxony LLC (Wilmington, Delaware, US)
General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

[-- Attachment #2: alloc_domheap_node_v2_diff.patch --]
[-- Type: text/plain, Size: 18897 bytes --]

diff -r 848a36114bb9 xen/arch/ia64/xen/mm.c
--- a/xen/arch/ia64/xen/mm.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/ia64/xen/mm.c	Thu Apr 03 12:25:31 2008 +0200
@@ -822,7 +822,7 @@ __assign_new_domain_page(struct domain *
 
     BUG_ON(!pte_none(*pte));
 
-    p = alloc_domheap_page(d);
+    p = alloc_domheap_page(d, 0);
     if (unlikely(!p)) {
         printk("assign_new_domain_page: Can't alloc!!!! Aaaargh!\n");
         return(p);
@@ -2316,7 +2316,7 @@ steal_page(struct domain *d, struct page
         unsigned long new_mfn;
         int ret;
 
-        new = alloc_domheap_page(d);
+        new = alloc_domheap_page(d, 0);
         if (new == NULL) {
             gdprintk(XENLOG_INFO, "alloc_domheap_page() failed\n");
             return -1;
@@ -2603,7 +2603,7 @@ void *pgtable_quicklist_alloc(void)
 
     BUG_ON(dom_p2m == NULL);
     if (!opt_p2m_xenheap) {
-        struct page_info *page = alloc_domheap_page(dom_p2m);
+        struct page_info *page = alloc_domheap_page(dom_p2m, 0);
         if (page == NULL)
             return NULL;
         p = page_to_virt(page);
diff -r 848a36114bb9 xen/arch/ia64/xen/tlb_track.c
--- a/xen/arch/ia64/xen/tlb_track.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/ia64/xen/tlb_track.c	Thu Apr 03 12:25:31 2008 +0200
@@ -48,7 +48,7 @@ tlb_track_allocate_entries(struct tlb_tr
                 __func__, tlb_track->num_entries, tlb_track->limit);
         return -ENOMEM;
     }
-    entry_page = alloc_domheap_page(NULL);
+    entry_page = alloc_domheap_page(NULL, 0);
     if (entry_page == NULL) {
         dprintk(XENLOG_WARNING,
                 "%s: domheap page failed. num_entries %d limit %d\n",
@@ -84,7 +84,7 @@ tlb_track_create(struct domain* d)
     if (tlb_track == NULL)
         goto out;
 
-    hash_page = alloc_domheap_page(NULL);
+    hash_page = alloc_domheap_page(NULL, 0);
     if (hash_page == NULL)
         goto out;
 
diff -r 848a36114bb9 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/domain.c	Thu Apr 03 12:25:31 2008 +0200
@@ -172,7 +172,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
     if ( !d->arch.mm_arg_xlat_l3 )
     {
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         d->arch.mm_arg_xlat_l3 = page_to_virt(pg);
@@ -190,7 +190,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
         if ( !l3e_get_intpte(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -199,7 +199,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         l2tab = l3e_to_l2e(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]);
         if ( !l2e_get_intpte(l2tab[l2_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -207,7 +207,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         }
         l1tab = l2e_to_l1e(l2tab[l2_table_offset(va)]);
         BUG_ON(l1e_get_intpte(l1tab[l1_table_offset(va)]));
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         l1tab[l1_table_offset(va)] = l1e_from_page(pg, PAGE_HYPERVISOR);
@@ -253,7 +253,7 @@ static void release_arg_xlat_area(struct
 
 static int setup_compat_l4(struct vcpu *v)
 {
-    struct page_info *pg = alloc_domheap_page(NULL);
+    struct page_info *pg = alloc_domheap_page(NULL, 0);
     l4_pgentry_t *l4tab;
     int rc;
 
@@ -478,8 +478,7 @@ int arch_domain_create(struct domain *d,
 
 #else /* __x86_64__ */
 
-    if ( (pg = alloc_domheap_page_on_node(NULL,
-        cpu_to_node(d->vcpu[0]->processor))) == NULL )
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
             goto fail;
     d->arch.mm_perdomain_l2 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l2);
@@ -488,9 +487,8 @@ int arch_domain_create(struct domain *d,
             l2e_from_page(virt_to_page(d->arch.mm_perdomain_pt)+i,
                           __PAGE_HYPERVISOR);
 
-    if ( (pg = alloc_domheap_page_on_node(NULL,
-        cpu_to_node(d->vcpu[0]->processor))) == NULL )
-            goto fail;
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
+        goto fail;
     d->arch.mm_perdomain_l3 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l3);
     d->arch.mm_perdomain_l3[l3_table_offset(PERDOMAIN_VIRT_START)] =
diff -r 848a36114bb9 xen/arch/x86/domain_build.c
--- a/xen/arch/x86/domain_build.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/domain_build.c	Thu Apr 03 12:25:31 2008 +0200
@@ -630,7 +630,7 @@ int __init construct_dom0(
     }
     else
     {
-        page = alloc_domheap_page(NULL);
+        page = alloc_domheap_page(NULL, 0);
         if ( !page )
             panic("Not enough RAM for domain 0 PML4.\n");
         l4start = l4tab = page_to_virt(page);
diff -r 848a36114bb9 xen/arch/x86/hvm/stdvga.c
--- a/xen/arch/x86/hvm/stdvga.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/hvm/stdvga.c	Thu Apr 03 12:25:31 2008 +0200
@@ -514,8 +514,8 @@ void stdvga_init(struct domain *d)
     
     for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ )
     {
-        if ( (pg = alloc_domheap_page_on_node(NULL,
-            cpu_to_node(d->vcpu[0]->processor))) == NULL )
+        if ( (pg = alloc_domheap_page(NULL,
+            MEMF_node(domain_to_node(d)))) == NULL )
                 break;
         s->vram_page[i] = pg;
         p = map_domain_page(page_to_mfn(pg));
diff -r 848a36114bb9 xen/arch/x86/hvm/vlapic.c
--- a/xen/arch/x86/hvm/vlapic.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/hvm/vlapic.c	Thu Apr 03 12:25:31 2008 +0200
@@ -917,7 +917,7 @@ int vlapic_init(struct vcpu *v)
 int vlapic_init(struct vcpu *v)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
-    unsigned int memflags = 0;
+    unsigned int memflags = MEMF_node (vcpu_to_node(v));
 
     HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "%d", v->vcpu_id);
 
@@ -926,11 +926,10 @@ int vlapic_init(struct vcpu *v)
 #ifdef __i386__
     /* 32-bit VMX may be limited to 32-bit physical addresses. */
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        memflags = MEMF_bits(32);
+        memflags |= MEMF_bits(32);
 #endif
 
-    vlapic->regs_page = alloc_domheap_pages_on_node(NULL, 0, memflags,
-        cpu_to_node(v->processor));
+    vlapic->regs_page = alloc_domheap_page(NULL, memflags);
     if ( vlapic->regs_page == NULL )
     {
         dprintk(XENLOG_ERR, "alloc vlapic regs error: %d/%d\n",
diff -r 848a36114bb9 xen/arch/x86/mm/hap/hap.c
--- a/xen/arch/x86/mm/hap/hap.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/mm/hap/hap.c	Thu Apr 03 12:25:31 2008 +0200
@@ -136,8 +136,8 @@ static struct page_info *hap_alloc_p2m_p
          && mfn_x(page_to_mfn(pg)) >= (1UL << (32 - PAGE_SHIFT)) )
     {
         free_domheap_page(pg);
-        pg = alloc_domheap_pages_on_node(NULL, 0, MEMF_bits(32),
-            cpu_to_node(d->vcpu[0]->processor));
+        pg = alloc_domheap_page(NULL, MEMF_bits(32) |
+            MEMF_node(domain_to_node(d)));
         if ( likely(pg != NULL) )
         {
             void *p = hap_map_domain_page(page_to_mfn(pg));
@@ -201,8 +201,7 @@ hap_set_allocation(struct domain *d, uns
         if ( d->arch.paging.hap.total_pages < pages )
         {
             /* Need to allocate more memory from domheap */
-            pg = alloc_domheap_page_on_node(NULL,
-                cpu_to_node(d->vcpu[0]->processor));
+            pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)));
             if ( pg == NULL )
             {
                 HAP_PRINTK("failed to allocate hap pages.\n");
diff -r 848a36114bb9 xen/arch/x86/mm/paging.c
--- a/xen/arch/x86/mm/paging.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/mm/paging.c	Thu Apr 03 12:25:31 2008 +0200
@@ -100,8 +100,8 @@ static mfn_t paging_new_log_dirty_page(s
 static mfn_t paging_new_log_dirty_page(struct domain *d, void **mapping_p)
 {
     mfn_t mfn;
-    struct page_info *page = alloc_domheap_page_on_node(NULL,
-        cpu_to_node(d->vcpu[0]->processor));
+    struct page_info *page = alloc_domheap_page(NULL,
+        MEMF_node(domain_to_node(d)));
 
     if ( unlikely(page == NULL) )
     {
diff -r 848a36114bb9 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/mm/shadow/common.c	Thu Apr 03 12:25:31 2008 +0200
@@ -1250,8 +1250,7 @@ static unsigned int sh_set_allocation(st
         {
             /* Need to allocate more memory from domheap */
             sp = (struct shadow_page_info *)
-                alloc_domheap_pages_on_node(NULL, order, 0,
-                    cpu_to_node(d->vcpu[0]->processor));
+                alloc_domheap_pages(NULL, order, MEMF_node(domain_to_node(d)));
             if ( sp == NULL ) 
             { 
                 SHADOW_PRINTK("failed to allocate shadow pages.\n");
diff -r 848a36114bb9 xen/arch/x86/x86_64/mm.c
--- a/xen/arch/x86/x86_64/mm.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/x86_64/mm.c	Thu Apr 03 12:25:31 2008 +0200
@@ -59,7 +59,7 @@ void *alloc_xen_pagetable(void)
 
     if ( !early_boot )
     {
-        struct page_info *pg = alloc_domheap_page(NULL);
+        struct page_info *pg = alloc_domheap_page(NULL, 0);
         BUG_ON(pg == NULL);
         return page_to_virt(pg);
     }
@@ -108,7 +108,7 @@ void __init paging_init(void)
     struct page_info *l1_pg, *l2_pg, *l3_pg;
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    if ( (l3_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l3_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     l3_ro_mpt = page_to_virt(l3_pg);
     clear_page(l3_ro_mpt);
@@ -134,7 +134,7 @@ void __init paging_init(void)
                1UL << L2_PAGETABLE_SHIFT);
         if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
         {
-            if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+            if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
                 goto nomem;
             va = RO_MPT_VIRT_START + (i << L2_PAGETABLE_SHIFT);
             l2_ro_mpt = page_to_virt(l2_pg);
@@ -154,7 +154,7 @@ void __init paging_init(void)
                  l4_table_offset(HIRO_COMPAT_MPT_VIRT_START));
     l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(
         HIRO_COMPAT_MPT_VIRT_START)]);
-    if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     compat_idle_pg_table_l2 = l2_ro_mpt = page_to_virt(l2_pg);
     clear_page(l2_ro_mpt);
diff -r 848a36114bb9 xen/common/grant_table.c
--- a/xen/common/grant_table.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/common/grant_table.c	Thu Apr 03 12:25:31 2008 +0200
@@ -1102,7 +1102,7 @@ gnttab_transfer(
             struct page_info *new_page;
             void *sp, *dp;
 
-            new_page = alloc_domheap_pages(NULL, 0, MEMF_bits(max_bitsize));
+            new_page = alloc_domheap_page(NULL, MEMF_bits(max_bitsize));
             if ( new_page == NULL )
             {
                 gop.status = GNTST_address_too_big;
diff -r 848a36114bb9 xen/common/memory.c
--- a/xen/common/memory.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/common/memory.c	Thu Apr 03 12:25:32 2008 +0200
@@ -38,19 +38,13 @@ struct memop_args {
     int          preempted;  /* Was the hypercall preempted? */
 };
 
-static unsigned int select_local_node(struct domain *d)
-{
-    struct vcpu *v = d->vcpu[0];
-    return (v ? cpu_to_node(v->processor) : 0);
-}
-
 static void increase_reservation(struct memop_args *a)
 {
     struct page_info *page;
     unsigned long i;
     xen_pfn_t mfn;
     struct domain *d = a->domain;
-    unsigned int node = select_local_node(d);
+    unsigned int node = domain_to_node (d);
 
     if ( !guest_handle_is_null(a->extent_list) &&
          !guest_handle_okay(a->extent_list, a->nr_extents) )
@@ -68,8 +62,8 @@ static void increase_reservation(struct 
             goto out;
         }
 
-        page = alloc_domheap_pages_on_node (
-            d, a->extent_order, a->memflags, node);
+        page = alloc_domheap_pages (
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -98,7 +92,7 @@ static void populate_physmap(struct memo
     unsigned long i, j;
     xen_pfn_t gpfn, mfn;
     struct domain *d = a->domain;
-    unsigned int node = select_local_node(d);
+    unsigned int node = domain_to_node(d);
 
     if ( !guest_handle_okay(a->extent_list, a->nr_extents) )
         return;
@@ -118,8 +112,8 @@ static void populate_physmap(struct memo
         if ( unlikely(__copy_from_guest_offset(&gpfn, a->extent_list, i, 1)) )
             goto out;
 
-        page = alloc_domheap_pages_on_node (
-            d, a->extent_order, a->memflags, node);
+        page = alloc_domheap_pages (
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -299,7 +293,7 @@ static long memory_exchange(XEN_GUEST_HA
     unsigned long in_chunk_order, out_chunk_order;
     xen_pfn_t     gpfn, gmfn, mfn;
     unsigned long i, j, k;
-    unsigned int  memflags = 0, node;
+    unsigned int  memflags = 0;
     long          rc = 0;
     struct domain *d;
     struct page_info *page;
@@ -355,7 +349,7 @@ static long memory_exchange(XEN_GUEST_HA
     memflags |= MEMF_bits(domain_clamp_alloc_bitsize(
         d, exch.out.address_bits ? : (BITS_PER_LONG+PAGE_SHIFT)));
 
-    node = select_local_node(d);
+    memflags |= MEMF_node (domain_to_node(d));
 
     for ( i = (exch.nr_exchanged >> in_chunk_order);
           i < (exch.in.nr_extents >> in_chunk_order);
@@ -404,8 +398,8 @@ static long memory_exchange(XEN_GUEST_HA
         /* Allocate a chunk's worth of anonymous output pages. */
         for ( j = 0; j < (1UL << out_chunk_order); j++ )
         {
-            page = alloc_domheap_pages_on_node(
-                NULL, exch.out.extent_order, memflags, node);
+            page = alloc_domheap_pages(
+                NULL, exch.out.extent_order, memflags);
             if ( unlikely(page == NULL) )
             {
                 rc = -ENOMEM;
diff -r 848a36114bb9 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/common/page_alloc.c	Thu Apr 03 12:25:32 2008 +0200
@@ -337,6 +337,7 @@ static struct page_info *alloc_heap_page
     cpumask_t extra_cpus_mask, mask;
     struct page_info *pg;
 
+    if ( node == NUMA_NO_NODE ) node = cpu_to_node(smp_processor_id());
     ASSERT(node >= 0);
     ASSERT(node < num_nodes);
     ASSERT(zone_lo <= zone_hi);
@@ -780,12 +781,12 @@ int assign_pages(
 }
 
 
-struct page_info *alloc_domheap_pages_on_node(
-    struct domain *d, unsigned int order, unsigned int memflags,
-    unsigned int node)
+struct page_info *alloc_domheap_pages(
+    struct domain *d, unsigned int order, unsigned int memflags)
 {
     struct page_info *pg = NULL;
     unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
+    unsigned int node = (((memflags >> _MEMF_node)&0xFF) - 1 ) &0xFF;
 
     ASSERT(!in_irq());
 
@@ -823,13 +824,6 @@ struct page_info *alloc_domheap_pages_on
     }
     
     return pg;
-}
-
-struct page_info *alloc_domheap_pages(
-    struct domain *d, unsigned int order, unsigned int flags)
-{
-    return alloc_domheap_pages_on_node (d, order, flags,
-        cpu_to_node (smp_processor_id());
 }
 
 void free_domheap_pages(struct page_info *pg, unsigned int order)
diff -r 848a36114bb9 xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/drivers/passthrough/vtd/iommu.c	Thu Apr 03 12:25:32 2008 +0200
@@ -270,8 +270,8 @@ static struct page_info *addr_to_dma_pag
 
         if ( dma_pte_addr(*pte) == 0 )
         {
-            pg = alloc_domheap_page_on_node(NULL,
-                cpu_to_node(domain->vcpu[0]->processor));
+            pg = alloc_domheap_page(NULL,
+                MEMF_node(domain_to_node(domain)));
             vaddr = map_domain_page(page_to_mfn(pg));
             if ( !vaddr )
             {
diff -r 848a36114bb9 xen/include/asm-x86/numa.h
--- a/xen/include/asm-x86/numa.h	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/include/asm-x86/numa.h	Thu Apr 03 12:25:32 2008 +0200
@@ -4,11 +4,16 @@
 #include <xen/cpumask.h>
 
 #define NODES_SHIFT 6
+#define NUMA_NO_NODE 0xff
 
 extern unsigned char cpu_to_node[];
 extern cpumask_t     node_to_cpumask[];
 
 #define cpu_to_node(cpu)		(cpu_to_node[cpu])
+#define domain_to_node(domain)  ((domain!=NULL && domain->vcpu[0]!=NULL)?\
+                                  cpu_to_node[domain->vcpu[0]->processor]:\
+                                  NUMA_NO_NODE)
+#define vcpu_to_node(vcpu)		(cpu_to_node[(vcpu)->processor])
 #define parent_node(node)		(node)
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
@@ -73,6 +78,5 @@ static inline __attribute__((pure)) int 
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif
 
-#define NUMA_NO_NODE 0xff
 
 #endif
diff -r 848a36114bb9 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h	Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/include/xen/mm.h	Thu Apr 03 12:25:32 2008 +0200
@@ -54,15 +54,11 @@ void init_domheap_pages(paddr_t ps, padd
 void init_domheap_pages(paddr_t ps, paddr_t pe);
 struct page_info *alloc_domheap_pages(
     struct domain *d, unsigned int order, unsigned int memflags);
-struct page_info *alloc_domheap_pages_on_node(
-    struct domain *d, unsigned int order, unsigned int memflags,
-    unsigned int node_id);
 void free_domheap_pages(struct page_info *pg, unsigned int order);
 unsigned long avail_domheap_pages_region(
     unsigned int node, unsigned int min_width, unsigned int max_width);
 unsigned long avail_domheap_pages(void);
-#define alloc_domheap_page(d) (alloc_domheap_pages(d,0,0))
-#define alloc_domheap_page_on_node(d, n) (alloc_domheap_pages_on_node(d,0,0,n))
+#define alloc_domheap_page(d,f) (alloc_domheap_pages(d,0,f))
 #define free_domheap_page(p)  (free_domheap_pages(p,0))
 
 void scrub_heap_pages(void);
@@ -76,6 +72,8 @@ int assign_pages(
 /* memflags: */
 #define _MEMF_no_refcount 0
 #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
+#define _MEMF_node        8
+#define  MEMF_node(n)     ((((n)+1)&0xff)<<_MEMF_node)
 #define _MEMF_bits        24
 #define  MEMF_bits(n)     ((n)<<_MEMF_bits)
 

[-- Attachment #3: alloc_domheap_node_v2.patch --]
[-- Type: text/plain, Size: 21271 bytes --]

diff -r db943e8d1051 xen/arch/ia64/xen/mm.c
--- a/xen/arch/ia64/xen/mm.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/ia64/xen/mm.c	Thu Apr 03 12:25:50 2008 +0200
@@ -822,7 +822,7 @@ __assign_new_domain_page(struct domain *
 
     BUG_ON(!pte_none(*pte));
 
-    p = alloc_domheap_page(d);
+    p = alloc_domheap_page(d, 0);
     if (unlikely(!p)) {
         printk("assign_new_domain_page: Can't alloc!!!! Aaaargh!\n");
         return(p);
@@ -2316,7 +2316,7 @@ steal_page(struct domain *d, struct page
         unsigned long new_mfn;
         int ret;
 
-        new = alloc_domheap_page(d);
+        new = alloc_domheap_page(d, 0);
         if (new == NULL) {
             gdprintk(XENLOG_INFO, "alloc_domheap_page() failed\n");
             return -1;
@@ -2603,7 +2603,7 @@ void *pgtable_quicklist_alloc(void)
 
     BUG_ON(dom_p2m == NULL);
     if (!opt_p2m_xenheap) {
-        struct page_info *page = alloc_domheap_page(dom_p2m);
+        struct page_info *page = alloc_domheap_page(dom_p2m, 0);
         if (page == NULL)
             return NULL;
         p = page_to_virt(page);
diff -r db943e8d1051 xen/arch/ia64/xen/tlb_track.c
--- a/xen/arch/ia64/xen/tlb_track.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/ia64/xen/tlb_track.c	Thu Apr 03 12:25:50 2008 +0200
@@ -48,7 +48,7 @@ tlb_track_allocate_entries(struct tlb_tr
                 __func__, tlb_track->num_entries, tlb_track->limit);
         return -ENOMEM;
     }
-    entry_page = alloc_domheap_page(NULL);
+    entry_page = alloc_domheap_page(NULL, 0);
     if (entry_page == NULL) {
         dprintk(XENLOG_WARNING,
                 "%s: domheap page failed. num_entries %d limit %d\n",
@@ -84,7 +84,7 @@ tlb_track_create(struct domain* d)
     if (tlb_track == NULL)
         goto out;
 
-    hash_page = alloc_domheap_page(NULL);
+    hash_page = alloc_domheap_page(NULL, 0);
     if (hash_page == NULL)
         goto out;
 
diff -r db943e8d1051 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/domain.c	Thu Apr 03 12:25:50 2008 +0200
@@ -46,6 +46,7 @@
 #include <asm/debugreg.h>
 #include <asm/msr.h>
 #include <asm/nmi.h>
+#include <asm/numa.h>
 #include <xen/iommu.h>
 #ifdef CONFIG_COMPAT
 #include <compat/vcpu.h>
@@ -171,7 +172,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
     if ( !d->arch.mm_arg_xlat_l3 )
     {
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         d->arch.mm_arg_xlat_l3 = page_to_virt(pg);
@@ -189,7 +190,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
         if ( !l3e_get_intpte(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -198,7 +199,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         l2tab = l3e_to_l2e(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]);
         if ( !l2e_get_intpte(l2tab[l2_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -206,7 +207,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         }
         l1tab = l2e_to_l1e(l2tab[l2_table_offset(va)]);
         BUG_ON(l1e_get_intpte(l1tab[l1_table_offset(va)]));
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         l1tab[l1_table_offset(va)] = l1e_from_page(pg, PAGE_HYPERVISOR);
@@ -252,7 +253,7 @@ static void release_arg_xlat_area(struct
 
 static int setup_compat_l4(struct vcpu *v)
 {
-    struct page_info *pg = alloc_domheap_page(NULL);
+    struct page_info *pg = alloc_domheap_page(NULL, 0);
     l4_pgentry_t *l4tab;
     int rc;
 
@@ -477,8 +478,8 @@ int arch_domain_create(struct domain *d,
 
 #else /* __x86_64__ */
 
-    if ( (pg = alloc_domheap_page(NULL)) == NULL )
-        goto fail;
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
+            goto fail;
     d->arch.mm_perdomain_l2 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l2);
     for ( i = 0; i < (1 << pdpt_order); i++ )
@@ -486,7 +487,7 @@ int arch_domain_create(struct domain *d,
             l2e_from_page(virt_to_page(d->arch.mm_perdomain_pt)+i,
                           __PAGE_HYPERVISOR);
 
-    if ( (pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
         goto fail;
     d->arch.mm_perdomain_l3 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l3);
diff -r db943e8d1051 xen/arch/x86/domain_build.c
--- a/xen/arch/x86/domain_build.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/domain_build.c	Thu Apr 03 12:25:50 2008 +0200
@@ -630,7 +630,7 @@ int __init construct_dom0(
     }
     else
     {
-        page = alloc_domheap_page(NULL);
+        page = alloc_domheap_page(NULL, 0);
         if ( !page )
             panic("Not enough RAM for domain 0 PML4.\n");
         l4start = l4tab = page_to_virt(page);
diff -r db943e8d1051 xen/arch/x86/hvm/stdvga.c
--- a/xen/arch/x86/hvm/stdvga.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/hvm/stdvga.c	Thu Apr 03 12:25:50 2008 +0200
@@ -32,6 +32,7 @@
 #include <xen/sched.h>
 #include <xen/domain_page.h>
 #include <asm/hvm/support.h>
+#include <asm/numa.h>
 
 #define PAT(x) (x)
 static const uint32_t mask16[16] = {
@@ -513,8 +514,9 @@ void stdvga_init(struct domain *d)
     
     for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ )
     {
-        if ( (pg = alloc_domheap_page(NULL)) == NULL )
-            break;
+        if ( (pg = alloc_domheap_page(NULL,
+            MEMF_node(domain_to_node(d)))) == NULL )
+                break;
         s->vram_page[i] = pg;
         p = map_domain_page(page_to_mfn(pg));
         clear_page(p);
diff -r db943e8d1051 xen/arch/x86/hvm/vlapic.c
--- a/xen/arch/x86/hvm/vlapic.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/hvm/vlapic.c	Thu Apr 03 12:25:50 2008 +0200
@@ -33,6 +33,7 @@
 #include <xen/sched.h>
 #include <asm/current.h>
 #include <asm/hvm/vmx/vmx.h>
+#include <asm/numa.h>
 #include <public/hvm/ioreq.h>
 #include <public/hvm/params.h>
 
@@ -916,7 +917,7 @@ int vlapic_init(struct vcpu *v)
 int vlapic_init(struct vcpu *v)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
-    unsigned int memflags = 0;
+    unsigned int memflags = MEMF_node (vcpu_to_node(v));
 
     HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "%d", v->vcpu_id);
 
@@ -925,10 +926,10 @@ int vlapic_init(struct vcpu *v)
 #ifdef __i386__
     /* 32-bit VMX may be limited to 32-bit physical addresses. */
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        memflags = MEMF_bits(32);
+        memflags |= MEMF_bits(32);
 #endif
 
-    vlapic->regs_page = alloc_domheap_pages(NULL, 0, memflags);
+    vlapic->regs_page = alloc_domheap_page(NULL, memflags);
     if ( vlapic->regs_page == NULL )
     {
         dprintk(XENLOG_ERR, "alloc vlapic regs error: %d/%d\n",
diff -r db943e8d1051 xen/arch/x86/mm/hap/hap.c
--- a/xen/arch/x86/mm/hap/hap.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/mm/hap/hap.c	Thu Apr 03 12:25:50 2008 +0200
@@ -38,6 +38,7 @@
 #include <asm/hap.h>
 #include <asm/paging.h>
 #include <asm/domain.h>
+#include <asm/numa.h>
 
 #include "private.h"
 
@@ -135,7 +136,8 @@ static struct page_info *hap_alloc_p2m_p
          && mfn_x(page_to_mfn(pg)) >= (1UL << (32 - PAGE_SHIFT)) )
     {
         free_domheap_page(pg);
-        pg = alloc_domheap_pages(NULL, 0, MEMF_bits(32));
+        pg = alloc_domheap_page(NULL, MEMF_bits(32) |
+            MEMF_node(domain_to_node(d)));
         if ( likely(pg != NULL) )
         {
             void *p = hap_map_domain_page(page_to_mfn(pg));
@@ -199,7 +201,7 @@ hap_set_allocation(struct domain *d, uns
         if ( d->arch.paging.hap.total_pages < pages )
         {
             /* Need to allocate more memory from domheap */
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)));
             if ( pg == NULL )
             {
                 HAP_PRINTK("failed to allocate hap pages.\n");
diff -r db943e8d1051 xen/arch/x86/mm/paging.c
--- a/xen/arch/x86/mm/paging.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/mm/paging.c	Thu Apr 03 12:25:50 2008 +0200
@@ -26,6 +26,7 @@
 #include <asm/p2m.h>
 #include <asm/hap.h>
 #include <asm/guest_access.h>
+#include <asm/numa.h>
 #include <xsm/xsm.h>
 
 #define hap_enabled(d) (is_hvm_domain(d) && (d)->arch.hvm_domain.hap_enabled)
@@ -99,7 +100,8 @@ static mfn_t paging_new_log_dirty_page(s
 static mfn_t paging_new_log_dirty_page(struct domain *d, void **mapping_p)
 {
     mfn_t mfn;
-    struct page_info *page = alloc_domheap_page(NULL);
+    struct page_info *page = alloc_domheap_page(NULL,
+        MEMF_node(domain_to_node(d)));
 
     if ( unlikely(page == NULL) )
     {
diff -r db943e8d1051 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/mm/shadow/common.c	Thu Apr 03 12:25:50 2008 +0200
@@ -36,6 +36,7 @@
 #include <asm/current.h>
 #include <asm/flushtlb.h>
 #include <asm/shadow.h>
+#include <asm/numa.h>
 #include "private.h"
 
 
@@ -1249,7 +1250,7 @@ static unsigned int sh_set_allocation(st
         {
             /* Need to allocate more memory from domheap */
             sp = (struct shadow_page_info *)
-                alloc_domheap_pages(NULL, order, 0);
+                alloc_domheap_pages(NULL, order, MEMF_node(domain_to_node(d)));
             if ( sp == NULL ) 
             { 
                 SHADOW_PRINTK("failed to allocate shadow pages.\n");
diff -r db943e8d1051 xen/arch/x86/x86_64/mm.c
--- a/xen/arch/x86/x86_64/mm.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/x86_64/mm.c	Thu Apr 03 12:25:50 2008 +0200
@@ -59,7 +59,7 @@ void *alloc_xen_pagetable(void)
 
     if ( !early_boot )
     {
-        struct page_info *pg = alloc_domheap_page(NULL);
+        struct page_info *pg = alloc_domheap_page(NULL, 0);
         BUG_ON(pg == NULL);
         return page_to_virt(pg);
     }
@@ -108,7 +108,7 @@ void __init paging_init(void)
     struct page_info *l1_pg, *l2_pg, *l3_pg;
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    if ( (l3_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l3_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     l3_ro_mpt = page_to_virt(l3_pg);
     clear_page(l3_ro_mpt);
@@ -134,7 +134,7 @@ void __init paging_init(void)
                1UL << L2_PAGETABLE_SHIFT);
         if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
         {
-            if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+            if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
                 goto nomem;
             va = RO_MPT_VIRT_START + (i << L2_PAGETABLE_SHIFT);
             l2_ro_mpt = page_to_virt(l2_pg);
@@ -154,7 +154,7 @@ void __init paging_init(void)
                  l4_table_offset(HIRO_COMPAT_MPT_VIRT_START));
     l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(
         HIRO_COMPAT_MPT_VIRT_START)]);
-    if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     compat_idle_pg_table_l2 = l2_ro_mpt = page_to_virt(l2_pg);
     clear_page(l2_ro_mpt);
diff -r db943e8d1051 xen/common/grant_table.c
--- a/xen/common/grant_table.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/common/grant_table.c	Thu Apr 03 12:25:50 2008 +0200
@@ -1102,7 +1102,7 @@ gnttab_transfer(
             struct page_info *new_page;
             void *sp, *dp;
 
-            new_page = alloc_domheap_pages(NULL, 0, MEMF_bits(max_bitsize));
+            new_page = alloc_domheap_page(NULL, MEMF_bits(max_bitsize));
             if ( new_page == NULL )
             {
                 gop.status = GNTST_address_too_big;
diff -r db943e8d1051 xen/common/memory.c
--- a/xen/common/memory.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/common/memory.c	Thu Apr 03 12:25:50 2008 +0200
@@ -21,6 +21,7 @@
 #include <xen/errno.h>
 #include <asm/current.h>
 #include <asm/hardirq.h>
+#include <asm/numa.h>
 #include <public/memory.h>
 #include <xsm/xsm.h>
 
@@ -37,19 +38,13 @@ struct memop_args {
     int          preempted;  /* Was the hypercall preempted? */
 };
 
-static unsigned int select_local_cpu(struct domain *d)
-{
-    struct vcpu *v = d->vcpu[0];
-    return (v ? v->processor : 0);
-}
-
 static void increase_reservation(struct memop_args *a)
 {
     struct page_info *page;
     unsigned long i;
     xen_pfn_t mfn;
     struct domain *d = a->domain;
-    unsigned int cpu = select_local_cpu(d);
+    unsigned int node = domain_to_node(d);
 
     if ( !guest_handle_is_null(a->extent_list) &&
          !guest_handle_okay(a->extent_list, a->nr_extents) )
@@ -67,7 +62,8 @@ static void increase_reservation(struct 
             goto out;
         }
 
-        page = __alloc_domheap_pages(d, cpu, a->extent_order, a->memflags);
+        page = alloc_domheap_pages(
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -96,7 +92,7 @@ static void populate_physmap(struct memo
     unsigned long i, j;
     xen_pfn_t gpfn, mfn;
     struct domain *d = a->domain;
-    unsigned int cpu = select_local_cpu(d);
+    unsigned int node = domain_to_node(d);
 
     if ( !guest_handle_okay(a->extent_list, a->nr_extents) )
         return;
@@ -116,7 +112,8 @@ static void populate_physmap(struct memo
         if ( unlikely(__copy_from_guest_offset(&gpfn, a->extent_list, i, 1)) )
             goto out;
 
-        page = __alloc_domheap_pages(d, cpu, a->extent_order, a->memflags);
+        page = alloc_domheap_pages(
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -296,7 +293,7 @@ static long memory_exchange(XEN_GUEST_HA
     unsigned long in_chunk_order, out_chunk_order;
     xen_pfn_t     gpfn, gmfn, mfn;
     unsigned long i, j, k;
-    unsigned int  memflags = 0, cpu;
+    unsigned int  memflags = 0;
     long          rc = 0;
     struct domain *d;
     struct page_info *page;
@@ -352,7 +349,7 @@ static long memory_exchange(XEN_GUEST_HA
     memflags |= MEMF_bits(domain_clamp_alloc_bitsize(
         d, exch.out.address_bits ? : (BITS_PER_LONG+PAGE_SHIFT)));
 
-    cpu = select_local_cpu(d);
+    memflags |= MEMF_node(domain_to_node(d));
 
     for ( i = (exch.nr_exchanged >> in_chunk_order);
           i < (exch.in.nr_extents >> in_chunk_order);
@@ -401,8 +398,8 @@ static long memory_exchange(XEN_GUEST_HA
         /* Allocate a chunk's worth of anonymous output pages. */
         for ( j = 0; j < (1UL << out_chunk_order); j++ )
         {
-            page = __alloc_domheap_pages(
-                NULL, cpu, exch.out.extent_order, memflags);
+            page = alloc_domheap_pages(
+                NULL, exch.out.extent_order, memflags);
             if ( unlikely(page == NULL) )
             {
                 rc = -ENOMEM;
diff -r db943e8d1051 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/common/page_alloc.c	Thu Apr 03 12:25:50 2008 +0200
@@ -36,6 +36,7 @@
 #include <xen/numa.h>
 #include <xen/nodemask.h>
 #include <asm/page.h>
+#include <asm/numa.h>
 #include <asm/flushtlb.h>
 
 /*
@@ -328,14 +329,15 @@ static void init_node_heap(int node)
 /* Allocate 2^@order contiguous pages. */
 static struct page_info *alloc_heap_pages(
     unsigned int zone_lo, unsigned int zone_hi,
-    unsigned int cpu, unsigned int order)
+    unsigned int node, unsigned int order)
 {
     unsigned int i, j, zone;
-    unsigned int node = cpu_to_node(cpu), num_nodes = num_online_nodes();
+    unsigned int num_nodes = num_online_nodes();
     unsigned long request = 1UL << order;
     cpumask_t extra_cpus_mask, mask;
     struct page_info *pg;
 
+    if ( node == NUMA_NO_NODE ) node = cpu_to_node(smp_processor_id());
     ASSERT(node >= 0);
     ASSERT(node < num_nodes);
     ASSERT(zone_lo <= zone_hi);
@@ -670,7 +672,8 @@ void *alloc_xenheap_pages(unsigned int o
 
     ASSERT(!in_irq());
 
-    pg = alloc_heap_pages(MEMZONE_XEN, MEMZONE_XEN, smp_processor_id(), order);
+    pg = alloc_heap_pages(MEMZONE_XEN, MEMZONE_XEN, 
+        cpu_to_node(smp_processor_id()), order);
     if ( unlikely(pg == NULL) )
         goto no_memory;
 
@@ -778,12 +781,12 @@ int assign_pages(
 }
 
 
-struct page_info *__alloc_domheap_pages(
-    struct domain *d, unsigned int cpu, unsigned int order, 
-    unsigned int memflags)
+struct page_info *alloc_domheap_pages(
+    struct domain *d, unsigned int order, unsigned int memflags)
 {
     struct page_info *pg = NULL;
     unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
+    unsigned int node = (((memflags >> _MEMF_node) & 0xFF) - 1) & 0xFF;
 
     ASSERT(!in_irq());
 
@@ -797,7 +800,7 @@ struct page_info *__alloc_domheap_pages(
 
     if ( (zone_hi + PAGE_SHIFT) >= dma_bitsize )
     {
-        pg = alloc_heap_pages(dma_bitsize - PAGE_SHIFT, zone_hi, cpu, order);
+        pg = alloc_heap_pages(dma_bitsize - PAGE_SHIFT, zone_hi, node, order);
 
         /* Failure? Then check if we can fall back to the DMA pool. */
         if ( unlikely(pg == NULL) &&
@@ -811,7 +814,7 @@ struct page_info *__alloc_domheap_pages(
 
     if ( (pg == NULL) &&
          ((pg = alloc_heap_pages(MEMZONE_XEN + 1, zone_hi,
-                                 cpu, order)) == NULL) )
+                                 node, order)) == NULL) )
          return NULL;
 
     if ( (d != NULL) && assign_pages(d, pg, order, memflags) )
@@ -821,12 +824,6 @@ struct page_info *__alloc_domheap_pages(
     }
     
     return pg;
-}
-
-struct page_info *alloc_domheap_pages(
-    struct domain *d, unsigned int order, unsigned int flags)
-{
-    return __alloc_domheap_pages(d, smp_processor_id(), order, flags);
 }
 
 void free_domheap_pages(struct page_info *pg, unsigned int order)
diff -r db943e8d1051 xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/drivers/passthrough/vtd/iommu.c	Thu Apr 03 12:25:50 2008 +0200
@@ -24,6 +24,7 @@
 #include <xen/xmalloc.h>
 #include <xen/domain_page.h>
 #include <xen/iommu.h>
+#include <asm/numa.h>
 #include "iommu.h"
 #include "dmar.h"
 #include "../pci-direct.h"
@@ -269,7 +270,8 @@ static struct page_info *addr_to_dma_pag
 
         if ( dma_pte_addr(*pte) == 0 )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL,
+                MEMF_node(domain_to_node(domain)));
             vaddr = map_domain_page(page_to_mfn(pg));
             if ( !vaddr )
             {
diff -r db943e8d1051 xen/include/asm-x86/numa.h
--- a/xen/include/asm-x86/numa.h	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/include/asm-x86/numa.h	Thu Apr 03 12:25:50 2008 +0200
@@ -4,11 +4,16 @@
 #include <xen/cpumask.h>
 
 #define NODES_SHIFT 6
+#define NUMA_NO_NODE 0xff
 
 extern unsigned char cpu_to_node[];
 extern cpumask_t     node_to_cpumask[];
 
 #define cpu_to_node(cpu)		(cpu_to_node[cpu])
+#define domain_to_node(domain)  (((domain) != NULL && (domain)->vcpu[0] != NULL) ? \
+                                  cpu_to_node[(domain)->vcpu[0]->processor] : \
+                                  NUMA_NO_NODE)
+#define vcpu_to_node(vcpu)		(cpu_to_node[(vcpu)->processor])
 #define parent_node(node)		(node)
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
@@ -73,6 +78,5 @@ static inline __attribute__((pure)) int 
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif
 
-#define NUMA_NO_NODE 0xff
 
 #endif
diff -r db943e8d1051 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h	Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/include/xen/mm.h	Thu Apr 03 12:25:50 2008 +0200
@@ -54,14 +54,11 @@ void init_domheap_pages(paddr_t ps, padd
 void init_domheap_pages(paddr_t ps, paddr_t pe);
 struct page_info *alloc_domheap_pages(
     struct domain *d, unsigned int order, unsigned int memflags);
-struct page_info *__alloc_domheap_pages(
-    struct domain *d, unsigned int cpu, unsigned int order, 
-    unsigned int memflags);
 void free_domheap_pages(struct page_info *pg, unsigned int order);
 unsigned long avail_domheap_pages_region(
     unsigned int node, unsigned int min_width, unsigned int max_width);
 unsigned long avail_domheap_pages(void);
-#define alloc_domheap_page(d) (alloc_domheap_pages(d,0,0))
+#define alloc_domheap_page(d,f) (alloc_domheap_pages(d,0,f))
 #define free_domheap_page(p)  (free_domheap_pages(p,0))
 
 void scrub_heap_pages(void);
@@ -75,6 +72,8 @@ int assign_pages(
 /* memflags: */
 #define _MEMF_no_refcount 0
 #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
+#define _MEMF_node        8
+#define  MEMF_node(n)     ((((n)+1)&0xff)<<_MEMF_node)
 #define _MEMF_bits        24
 #define  MEMF_bits(n)     ((n)<<_MEMF_bits)
 

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-31 10:40 [RFC][PATCH] domheap optimization for NUMA Zhai, Edwin
2008-03-31 10:47 ` Keir Fraser
2008-04-02 13:06   ` Zhai, Edwin
2008-04-02 13:27     ` Keir Fraser
2008-04-02 22:49       ` Andre Przywara
2008-04-02 23:21         ` Keir Fraser
2008-04-03 10:39           ` Andre Przywara [this message]
2008-04-03 10:58             ` Keir Fraser
2008-04-03 13:57               ` Andre Przywara
2008-04-03 14:49                 ` Keir Fraser
2008-04-04  8:37                 ` Isaku Yamahata
2008-04-04 13:22                 ` Aron Griffis
2008-04-07 13:25           ` Zhai, Edwin
2008-04-07 13:51             ` Keir Fraser
