All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Fix HVM vNUMA
@ 2015-05-29 11:37 Wei Liu
  2015-05-29 11:37 ` [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl Wei Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Wei Liu @ 2015-05-29 11:37 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Boris Ostrovsky, Ian Jackson, Dario Faggioli,
	Ian Campbell

Boris discovered that HVM vNUMA didn't actually work. This patch series fixes
that.

The first patch is a prerequisite patch for the actual fixes. The second patch
is to help debugging. The fixes are in the last two patches, which can be
squashed into one if necessary.

Patch 3 is updated in this series. Other patches remain the same as last
version.

Wei.

Wei Liu (4):
  libxc/libxl: fill xc_hvm_build_args in libxl
  libxc: print more error messages when failed
  libxc: rework vnuma bits in setup_guest
  libxl: fix HVM vNUMA

 tools/libxc/xc_hvm_build_x86.c | 125 ++++++++++++++++++++++++++---------------
 tools/libxl/libxl_dom.c        |  48 ++++++++--------
 tools/libxl/libxl_vnuma.c      |  15 ++++-
 3 files changed, 119 insertions(+), 69 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl
  2015-05-29 11:37 [PATCH v2 0/4] Fix HVM vNUMA Wei Liu
@ 2015-05-29 11:37 ` Wei Liu
  2015-05-29 12:29   ` Andrew Cooper
  2015-05-29 11:37 ` [PATCH v2 2/4] libxc: print more error messages when failed Wei Liu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Wei Liu @ 2015-05-29 11:37 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Ian Campbell, Dario Faggioli, Ian Jackson, Chen, Tiejun,
	Boris Ostrovsky

When building HVM guests, originally some fields of xc_hvm_build_args
are filled in xc_hvm_build (and buried in the wrong function), some are
set in libxl__build_hvm before passing xc_hvm_build_args to
xc_hvm_build. This is fragile.

After examining the code in xc_hvm_build that sets those fields, we can
in fact move setting of mmio_start etc in libxl. This way we consolidate
memory layout setting in libxl.

The setting of firmware data related fields is left in xc_hvm_build
because it depends on parsing ELF image. Those fields only point to
scratch data that doesn't affect memory layout.

There should be no change in the generated guest memory layout.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Cc: "Chen, Tiejun" <tiejun.chen@intel.com>

This might affect your RMRR patch series.

I once said xc_hvm_build would touch various xc_hvm_build_args fields
that would affect guest memory layout. It won't be that case anymore
with this patch.
---
 tools/libxc/xc_hvm_build_x86.c | 37 +++++++------------------------------
 tools/libxl/libxl_dom.c        | 16 ++++++++++++++++
 2 files changed, 23 insertions(+), 30 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index e45ae4a..92422bf 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -88,22 +88,14 @@ static int modules_init(struct xc_hvm_build_args *args,
     return 0;
 }
 
-static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-                           uint64_t mmio_start, uint64_t mmio_size,
+static void build_hvm_info(void *hvm_info_page,
                            struct xc_hvm_build_args *args)
 {
     struct hvm_info_table *hvm_info = (struct hvm_info_table *)
         (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
-    uint64_t lowmem_end = mem_size, highmem_end = 0;
     uint8_t sum;
     int i;
 
-    if ( lowmem_end > mmio_start )
-    {
-        highmem_end = (1ull<<32) + (lowmem_end - mmio_start);
-        lowmem_end = mmio_start;
-    }
-
     memset(hvm_info_page, 0, PAGE_SIZE);
 
     /* Fill in the header. */
@@ -116,14 +108,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
     memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));
 
     /* Memory parameters. */
-    hvm_info->low_mem_pgend = lowmem_end >> PAGE_SHIFT;
-    hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
+    hvm_info->low_mem_pgend = args->lowmem_end >> PAGE_SHIFT;
+    hvm_info->high_mem_pgend = args->highmem_end >> PAGE_SHIFT;
     hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
 
-    args->lowmem_end = lowmem_end;
-    args->highmem_end = highmem_end;
-    args->mmio_start = mmio_start;
-
     /* Finish with the checksum. */
     for ( i = 0, sum = 0; i < hvm_info->length; i++ )
         sum += ((uint8_t *)hvm_info)[i];
@@ -251,8 +239,6 @@ static int setup_guest(xc_interface *xch,
     xen_pfn_t *page_array = NULL;
     unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
-    uint64_t mmio_start = (1ull << 32) - args->mmio_size;
-    uint64_t mmio_size = args->mmio_size;
     unsigned long entry_eip, cur_pages, cur_pfn;
     void *hvm_info_page;
     uint32_t *ident_pt;
@@ -344,8 +330,8 @@ static int setup_guest(xc_interface *xch,
 
     for ( i = 0; i < nr_pages; i++ )
         page_array[i] = i;
-    for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
-        page_array[i] += mmio_size >> PAGE_SHIFT;
+    for ( i = args->mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
+        page_array[i] += args->mmio_size >> PAGE_SHIFT;
 
     /*
      * Try to claim pages for early warning of insufficient memory available.
@@ -446,7 +432,7 @@ static int setup_guest(xc_interface *xch,
                   * range */
                  !check_mmio_hole(cur_pfn << PAGE_SHIFT,
                                   SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
-                                  mmio_start, mmio_size) )
+                                  args->mmio_start, args->mmio_size) )
             {
                 long done;
                 unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
@@ -545,7 +531,7 @@ static int setup_guest(xc_interface *xch,
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
         goto error_out;
-    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
+    build_hvm_info(hvm_info_page, args);
     munmap(hvm_info_page, PAGE_SIZE);
 
     /* Allocate and clear special pages. */
@@ -661,12 +647,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
     if ( args.image_file_name == NULL )
         return -1;
 
-    if ( args.mem_target == 0 )
-        args.mem_target = args.mem_size;
-
-    if ( args.mmio_size == 0 )
-        args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
-
     /* An HVM guest must be initialised with at least 2MB memory. */
     if ( args.mem_size < (2ull << 20) || args.mem_target < (2ull << 20) )
         return -1;
@@ -684,9 +664,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
             args.acpi_module.guest_addr_out;
         hvm_args->smbios_module.guest_addr_out = 
             args.smbios_module.guest_addr_out;
-        hvm_args->lowmem_end = args.lowmem_end;
-        hvm_args->highmem_end = args.highmem_end;
-        hvm_args->mmio_start = args.mmio_start;
     }
 
     free(image);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a0c9850..dccc9ac 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -920,6 +920,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     libxl_ctx *ctx = libxl__gc_owner(gc);
     struct xc_hvm_build_args args = {};
     int ret, rc = ERROR_FAIL;
+    uint64_t mmio_start, lowmem_end, highmem_end;
 
     memset(&args, 0, sizeof(struct xc_hvm_build_args));
     /* The params from the configuration file are in Mb, which are then
@@ -941,6 +942,21 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         LOG(ERROR, "initializing domain firmware failed");
         goto out;
     }
+    if (args.mem_target == 0)
+        args.mem_target = args.mem_size;
+    if (args.mmio_size == 0)
+        args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
+    lowmem_end = args.mem_size;
+    highmem_end = 0;
+    mmio_start = (1ull << 32) - args.mmio_size;
+    if (lowmem_end > mmio_start)
+    {
+        highmem_end = (1ull << 32) + (lowmem_end - mmio_start);
+        lowmem_end = mmio_start;
+    }
+    args.lowmem_end = lowmem_end;
+    args.highmem_end = highmem_end;
+    args.mmio_start = mmio_start;
 
     if (info->num_vnuma_nodes != 0) {
         int i;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/4] libxc: print more error messages when failed
  2015-05-29 11:37 [PATCH v2 0/4] Fix HVM vNUMA Wei Liu
  2015-05-29 11:37 ` [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl Wei Liu
@ 2015-05-29 11:37 ` Wei Liu
  2015-05-29 12:42   ` Andrew Cooper
  2015-05-29 11:37 ` [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest Wei Liu
  2015-05-29 11:37 ` [PATCH v2 4/4] libxl: fix HVM vNUMA Wei Liu
  3 siblings, 1 reply; 11+ messages in thread
From: Wei Liu @ 2015-05-29 11:37 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Boris Ostrovsky, Ian Jackson, Dario Faggioli,
	Ian Campbell

No functional changes introduced.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxc/xc_hvm_build_x86.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index 92422bf..df4b7ed 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -259,7 +259,10 @@ static int setup_guest(xc_interface *xch,
 
     memset(&elf, 0, sizeof(elf));
     if ( elf_init(&elf, image, image_size) != 0 )
+    {
+        PERROR("Could not initialise ELF image");
         goto error_out;
+    }
 
     xc_elf_set_logfile(xch, &elf, 1);
 
@@ -522,15 +525,24 @@ static int setup_guest(xc_interface *xch,
     DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
     
     if ( loadelfimage(xch, &elf, dom, page_array) != 0 )
+    {
+        PERROR("Could not load ELF image");
         goto error_out;
+    }
 
     if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
-        goto error_out;    
+    {
+        PERROR("Could not load ACPI modules");
+        goto error_out;
+    }
 
     if ( (hvm_info_page = xc_map_foreign_range(
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
+    {
+        PERROR("Could not map hvm info page");
         goto error_out;
+    }
     build_hvm_info(hvm_info_page, args);
     munmap(hvm_info_page, PAGE_SIZE);
 
@@ -547,7 +559,10 @@ static int setup_guest(xc_interface *xch,
     }
 
     if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) )
-            goto error_out;
+    {
+        PERROR("Could not clear special pages");
+        goto error_out;
+    }
 
     xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN,
                      special_pfn(SPECIALPAGE_XENSTORE));
@@ -580,7 +595,10 @@ static int setup_guest(xc_interface *xch,
     }
 
     if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), NR_IOREQ_SERVER_PAGES) )
-            goto error_out;
+    {
+        PERROR("Could not clear ioreq page");
+        goto error_out;
+    }
 
     /* Tell the domain where the pages are and how many there are */
     xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN,
@@ -595,7 +613,10 @@ static int setup_guest(xc_interface *xch,
     if ( (ident_pt = xc_map_foreign_range(
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
+    {
+        PERROR("Could not map special page ident_pt");
         goto error_out;
+    }
     for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
         ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
                        _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
@@ -610,7 +631,10 @@ static int setup_guest(xc_interface *xch,
         char *page0 = xc_map_foreign_range(
             xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
         if ( page0 == NULL )
+        {
+            PERROR("Could not map page0");
             goto error_out;
+        }
         page0[0] = 0xe9;
         *(uint32_t *)&page0[1] = entry_eip - 5;
         munmap(page0, PAGE_SIZE);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest
  2015-05-29 11:37 [PATCH v2 0/4] Fix HVM vNUMA Wei Liu
  2015-05-29 11:37 ` [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl Wei Liu
  2015-05-29 11:37 ` [PATCH v2 2/4] libxc: print more error messages when failed Wei Liu
@ 2015-05-29 11:37 ` Wei Liu
  2015-05-29 15:02   ` Boris Ostrovsky
  2015-05-29 11:37 ` [PATCH v2 4/4] libxl: fix HVM vNUMA Wei Liu
  3 siblings, 1 reply; 11+ messages in thread
From: Wei Liu @ 2015-05-29 11:37 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Boris Ostrovsky, Ian Jackson, Dario Faggioli,
	Ian Campbell

Make the setup process similar to PV counterpart. That is, to allocate a
P2M array that covers the whole memory range and start from there. This
is clearer than using an array with no holes in it.

Also the dummy layout should take MMIO hole into consideration. We might
end up having two vmemranges in the dummy layout.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: only generate 1 vnode in dummy layout
---
 tools/libxc/xc_hvm_build_x86.c | 62 ++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index df4b7ed..b3855e0 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch,
 {
     xen_pfn_t *page_array = NULL;
     unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
+    unsigned long p2m_size;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
     unsigned long entry_eip, cur_pages, cur_pfn;
     void *hvm_info_page;
@@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch,
     xen_pfn_t special_array[NR_SPECIAL_PAGES];
     xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
     uint64_t total_pages;
-    xen_vmemrange_t dummy_vmemrange;
-    unsigned int dummy_vnode_to_pnode;
+    xen_vmemrange_t dummy_vmemrange[2];
+    unsigned int dummy_vnode_to_pnode[1];
 
     memset(&elf, 0, sizeof(elf));
     if ( elf_init(&elf, image, image_size) != 0 )
@@ -275,17 +276,35 @@ static int setup_guest(xc_interface *xch,
 
     if ( args->nr_vmemranges == 0 )
     {
-        /* Build dummy vnode information */
-        dummy_vmemrange.start = 0;
-        dummy_vmemrange.end   = args->mem_size;
-        dummy_vmemrange.flags = 0;
-        dummy_vmemrange.nid   = 0;
+        /* Build dummy vnode information
+         *
+         * Guest physical address space layout:
+         * [0, hole_start) [hole_start, 4G) [4G, highmem_end)
+         *
+         * Of course if there is no high memory, the second vmemrange
+         * has no effect on the actual result.
+         */
+
+        dummy_vmemrange[0].start = 0;
+        dummy_vmemrange[0].end   = args->lowmem_end;
+        dummy_vmemrange[0].flags = 0;
+        dummy_vmemrange[0].nid   = 0;
         args->nr_vmemranges = 1;
-        args->vmemranges = &dummy_vmemrange;
 
-        dummy_vnode_to_pnode = XC_NUMA_NO_NODE;
+        if ( args->highmem_end > (1ULL << 32) )
+        {
+            dummy_vmemrange[1].start = 1ULL << 32;
+            dummy_vmemrange[1].end   = args->highmem_end;
+            dummy_vmemrange[1].flags = 0;
+            dummy_vmemrange[1].nid   = 0;
+
+            args->nr_vmemranges++;
+        }
+
+        dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE;
         args->nr_vnodes = 1;
-        args->vnode_to_pnode = &dummy_vnode_to_pnode;
+        args->vmemranges = dummy_vmemrange;
+        args->vnode_to_pnode = dummy_vnode_to_pnode;
     }
     else
     {
@@ -297,9 +316,15 @@ static int setup_guest(xc_interface *xch,
     }
 
     total_pages = 0;
+    p2m_size = 0;
     for ( i = 0; i < args->nr_vmemranges; i++ )
+    {
         total_pages += ((args->vmemranges[i].end - args->vmemranges[i].start)
                         >> PAGE_SHIFT);
+        p2m_size = p2m_size > (args->vmemranges[i].end >> PAGE_SHIFT) ?
+            p2m_size : (args->vmemranges[i].end >> PAGE_SHIFT);
+    }
+
     if ( total_pages != (args->mem_size >> PAGE_SHIFT) )
     {
         PERROR("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%"PRIx64")",
@@ -325,16 +350,23 @@ static int setup_guest(xc_interface *xch,
     DPRINTF("  TOTAL:    %016"PRIx64"->%016"PRIx64"\n", v_start, v_end);
     DPRINTF("  ENTRY:    %016"PRIx64"\n", elf_uval(&elf, elf.ehdr, e_entry));
 
-    if ( (page_array = malloc(nr_pages * sizeof(xen_pfn_t))) == NULL )
+    if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL )
     {
         PERROR("Could not allocate memory.");
         goto error_out;
     }
 
-    for ( i = 0; i < nr_pages; i++ )
-        page_array[i] = i;
-    for ( i = args->mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
-        page_array[i] += args->mmio_size >> PAGE_SHIFT;
+    for ( i = 0; i < p2m_size; i++ )
+        page_array[i] = ((xen_pfn_t)-1);
+    for ( vmemid = 0; vmemid < args->nr_vmemranges; vmemid++ )
+    {
+        uint64_t pfn;
+
+        for ( pfn = args->vmemranges[vmemid].start >> PAGE_SHIFT;
+              pfn < args->vmemranges[vmemid].end >> PAGE_SHIFT;
+              pfn++ )
+            page_array[pfn] = pfn;
+    }
 
     /*
      * Try to claim pages for early warning of insufficient memory available.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 4/4] libxl: fix HVM vNUMA
  2015-05-29 11:37 [PATCH v2 0/4] Fix HVM vNUMA Wei Liu
                   ` (2 preceding siblings ...)
  2015-05-29 11:37 ` [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest Wei Liu
@ 2015-05-29 11:37 ` Wei Liu
  2015-05-29 15:50   ` Boris Ostrovsky
  3 siblings, 1 reply; 11+ messages in thread
From: Wei Liu @ 2015-05-29 11:37 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Boris Ostrovsky, Ian Jackson, Dario Faggioli,
	Ian Campbell

This patch does two thing:

The original code erroneously fills in xc_hvm_build_args before
generating vmemranges. The effect is that guest memory is populated
without vNUMA information. Move the hunk to right place to fix this.

Move the subtraction of video ram to libxl__vnuma_build_vmemrange_hvm
because it's the central place for generating vmemranges.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
---
 tools/libxl/libxl_dom.c   | 32 ++++++++++----------------------
 tools/libxl/libxl_vnuma.c | 15 ++++++++++++++-
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index dccc9ac..867172a 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -961,6 +961,16 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     if (info->num_vnuma_nodes != 0) {
         int i;
 
+        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
+        if (ret) {
+            LOGEV(ERROR, ret, "hvm build vmemranges failed");
+            goto out;
+        }
+        ret = libxl__vnuma_config_check(gc, info, state);
+        if (ret) goto out;
+        ret = set_vnuma_info(gc, domid, info, state);
+        if (ret) goto out;
+
         args.nr_vmemranges = state->num_vmemranges;
         args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
                                         args.nr_vmemranges);
@@ -972,17 +982,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
             args.vmemranges[i].nid   = state->vmemranges[i].nid;
         }
 
-        /* Consider video ram belongs to vmemrange 0 -- just shrink it
-         * by the size of video ram.
-         */
-        if (((args.vmemranges[0].end - args.vmemranges[0].start) >> 10)
-            < info->video_memkb) {
-            LOG(ERROR, "vmemrange 0 too small to contain video ram");
-            goto out;
-        }
-
-        args.vmemranges[0].end -= (info->video_memkb << 10);
-
         args.nr_vnodes = info->num_vnuma_nodes;
         args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) *
                                             args.nr_vnodes);
@@ -996,17 +995,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
-    if (info->num_vnuma_nodes != 0) {
-        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
-        if (ret) {
-            LOGEV(ERROR, ret, "hvm build vmemranges failed");
-            goto out;
-        }
-        ret = libxl__vnuma_config_check(gc, info, state);
-        if (ret) goto out;
-        ret = set_vnuma_info(gc, domid, info, state);
-        if (ret) goto out;
-    }
     ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
                                &state->store_mfn, state->console_port,
                                &state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index cac78d7..56856d2 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -257,6 +257,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
     uint64_t hole_start, hole_end, next;
     int nid, nr_vmemrange;
     xen_vmemrange_t *vmemranges;
+    int rc;
 
     /* Derive vmemranges from vnode size and memory hole.
      *
@@ -277,6 +278,16 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
         libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
         uint64_t remaining_bytes = p->memkb << 10;
 
+        /* Consider video ram belongs to vnode 0 */
+        if (nid == 0) {
+            if (p->memkb < b_info->video_memkb) {
+                LOG(ERROR, "vnode 0 too small to contain video ram");
+                rc = ERROR_INVAL;
+                goto out;
+            }
+            remaining_bytes -= (b_info->video_memkb << 10);
+        }
+
         while (remaining_bytes > 0) {
             uint64_t count = remaining_bytes;
 
@@ -300,7 +311,9 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
     state->vmemranges = vmemranges;
     state->num_vmemranges = nr_vmemrange;
 
-    return 0;
+    rc = 0;
+out:
+    return rc;
 }
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl
  2015-05-29 11:37 ` [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl Wei Liu
@ 2015-05-29 12:29   ` Andrew Cooper
  2015-05-29 12:54     ` Wei Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2015-05-29 12:29 UTC (permalink / raw)
  To: Wei Liu, Xen-devel
  Cc: Boris Ostrovsky, Chen, Tiejun, Dario Faggioli, Ian Jackson,
	Ian Campbell

On 29/05/15 12:37, Wei Liu wrote:
> When building HVM guests, originally some fields of xc_hvm_build_args
> are filled in xc_hvm_build (and buried in the wrong function), some are
> set in libxl__build_hvm before passing xc_hvm_build_args to
> xc_hvm_build. This is fragile.
>
> After examining the code in xc_hvm_build that sets those fields, we can
> in fact move setting of mmio_start etc in libxl. This way we consolidate
> memory layout setting in libxl.
>
> The setting of firmware data related fields is left in xc_hvm_build
> because it depends on parsing ELF image. Those fields only point to
> scratch data that doesn't affect memory layout.
>
> There should be no change in the generated guest memory layout.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> ---
> Cc: "Chen, Tiejun" <tiejun.chen@intel.com>
>
> This might affect your RMRR patch series.

It should at least me noted that this is a semantic change in domain
construction for all other toolstacks out there, an aid to the unlucky
person forward porting something like Xapi and finding that a chunk of
work is no longer performed by libxc.

>
> I once said xc_hvm_build would touch various xc_hvm_build_args fields
> that would affect guest memory layout. It won't be that case anymore
> with this patch.
> ---
>  tools/libxc/xc_hvm_build_x86.c | 37 +++++++------------------------------
>  tools/libxl/libxl_dom.c        | 16 ++++++++++++++++
>  2 files changed, 23 insertions(+), 30 deletions(-)
>
> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> index e45ae4a..92422bf 100644
> --- a/tools/libxc/xc_hvm_build_x86.c
> +++ b/tools/libxc/xc_hvm_build_x86.c
> @@ -88,22 +88,14 @@ static int modules_init(struct xc_hvm_build_args *args,
>      return 0;
>  }
>  
> -static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
> -                           uint64_t mmio_start, uint64_t mmio_size,
> +static void build_hvm_info(void *hvm_info_page,
>                             struct xc_hvm_build_args *args)
>  {
>      struct hvm_info_table *hvm_info = (struct hvm_info_table *)
>          (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
> -    uint64_t lowmem_end = mem_size, highmem_end = 0;
>      uint8_t sum;
>      int i;
>  
> -    if ( lowmem_end > mmio_start )
> -    {
> -        highmem_end = (1ull<<32) + (lowmem_end - mmio_start);
> -        lowmem_end = mmio_start;
> -    }
> -
>      memset(hvm_info_page, 0, PAGE_SIZE);
>  
>      /* Fill in the header. */
> @@ -116,14 +108,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
>      memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));
>  
>      /* Memory parameters. */
> -    hvm_info->low_mem_pgend = lowmem_end >> PAGE_SHIFT;
> -    hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
> +    hvm_info->low_mem_pgend = args->lowmem_end >> PAGE_SHIFT;
> +    hvm_info->high_mem_pgend = args->highmem_end >> PAGE_SHIFT;
>      hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
>  
> -    args->lowmem_end = lowmem_end;
> -    args->highmem_end = highmem_end;
> -    args->mmio_start = mmio_start;
> -
>      /* Finish with the checksum. */
>      for ( i = 0, sum = 0; i < hvm_info->length; i++ )
>          sum += ((uint8_t *)hvm_info)[i];
> @@ -251,8 +239,6 @@ static int setup_guest(xc_interface *xch,
>      xen_pfn_t *page_array = NULL;
>      unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
>      unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
> -    uint64_t mmio_start = (1ull << 32) - args->mmio_size;
> -    uint64_t mmio_size = args->mmio_size;
>      unsigned long entry_eip, cur_pages, cur_pfn;
>      void *hvm_info_page;
>      uint32_t *ident_pt;
> @@ -344,8 +330,8 @@ static int setup_guest(xc_interface *xch,
>  
>      for ( i = 0; i < nr_pages; i++ )
>          page_array[i] = i;
> -    for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
> -        page_array[i] += mmio_size >> PAGE_SHIFT;
> +    for ( i = args->mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
> +        page_array[i] += args->mmio_size >> PAGE_SHIFT;
>  
>      /*
>       * Try to claim pages for early warning of insufficient memory available.
> @@ -446,7 +432,7 @@ static int setup_guest(xc_interface *xch,
>                    * range */
>                   !check_mmio_hole(cur_pfn << PAGE_SHIFT,
>                                    SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
> -                                  mmio_start, mmio_size) )
> +                                  args->mmio_start, args->mmio_size) )
>              {
>                  long done;
>                  unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
> @@ -545,7 +531,7 @@ static int setup_guest(xc_interface *xch,
>                xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
>                HVM_INFO_PFN)) == NULL )
>          goto error_out;
> -    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
> +    build_hvm_info(hvm_info_page, args);
>      munmap(hvm_info_page, PAGE_SIZE);
>  
>      /* Allocate and clear special pages. */
> @@ -661,12 +647,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
>      if ( args.image_file_name == NULL )
>          return -1;
>  
> -    if ( args.mem_target == 0 )
> -        args.mem_target = args.mem_size;
> -
> -    if ( args.mmio_size == 0 )
> -        args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
> -
>      /* An HVM guest must be initialised with at least 2MB memory. */
>      if ( args.mem_size < (2ull << 20) || args.mem_target < (2ull << 20) )
>          return -1;
> @@ -684,9 +664,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
>              args.acpi_module.guest_addr_out;
>          hvm_args->smbios_module.guest_addr_out = 
>              args.smbios_module.guest_addr_out;
> -        hvm_args->lowmem_end = args.lowmem_end;
> -        hvm_args->highmem_end = args.highmem_end;
> -        hvm_args->mmio_start = args.mmio_start;
>      }
>  
>      free(image);
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index a0c9850..dccc9ac 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -920,6 +920,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
>      libxl_ctx *ctx = libxl__gc_owner(gc);
>      struct xc_hvm_build_args args = {};
>      int ret, rc = ERROR_FAIL;
> +    uint64_t mmio_start, lowmem_end, highmem_end;
>  
>      memset(&args, 0, sizeof(struct xc_hvm_build_args));
>      /* The params from the configuration file are in Mb, which are then
> @@ -941,6 +942,21 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
>          LOG(ERROR, "initializing domain firmware failed");
>          goto out;
>      }
> +    if (args.mem_target == 0)
> +        args.mem_target = args.mem_size;
> +    if (args.mmio_size == 0)
> +        args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
> +    lowmem_end = args.mem_size;
> +    highmem_end = 0;
> +    mmio_start = (1ull << 32) - args.mmio_size;
> +    if (lowmem_end > mmio_start)
> +    {
> +        highmem_end = (1ull << 32) + (lowmem_end - mmio_start);
> +        lowmem_end = mmio_start;
> +    }
> +    args.lowmem_end = lowmem_end;
> +    args.highmem_end = highmem_end;
> +    args.mmio_start = mmio_start;
>  
>      if (info->num_vnuma_nodes != 0) {
>          int i;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/4] libxc: print more error messages when failed
  2015-05-29 11:37 ` [PATCH v2 2/4] libxc: print more error messages when failed Wei Liu
@ 2015-05-29 12:42   ` Andrew Cooper
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2015-05-29 12:42 UTC (permalink / raw)
  To: Wei Liu, Xen-devel
  Cc: Boris Ostrovsky, Ian Jackson, Dario Faggioli, Ian Campbell

On 29/05/15 12:37, Wei Liu wrote:
> No functional changes introduced.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>

This patch reminds me of a todo item I have wanted to do for a while
using setvcpucontext rather than mapping gfn 0 and manually inserting
`jmp $0x100000`

However, as for the change itself, a definite improvement.

> ---
>  tools/libxc/xc_hvm_build_x86.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> index 92422bf..df4b7ed 100644
> --- a/tools/libxc/xc_hvm_build_x86.c
> +++ b/tools/libxc/xc_hvm_build_x86.c
> @@ -259,7 +259,10 @@ static int setup_guest(xc_interface *xch,
>  
>      memset(&elf, 0, sizeof(elf));
>      if ( elf_init(&elf, image, image_size) != 0 )
> +    {
> +        PERROR("Could not initialise ELF image");
>          goto error_out;
> +    }
>  
>      xc_elf_set_logfile(xch, &elf, 1);
>  
> @@ -522,15 +525,24 @@ static int setup_guest(xc_interface *xch,
>      DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
>      
>      if ( loadelfimage(xch, &elf, dom, page_array) != 0 )
> +    {
> +        PERROR("Could not load ELF image");
>          goto error_out;
> +    }
>  
>      if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
> -        goto error_out;    
> +    {
> +        PERROR("Could not load ACPI modules");
> +        goto error_out;
> +    }
>  
>      if ( (hvm_info_page = xc_map_foreign_range(
>                xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
>                HVM_INFO_PFN)) == NULL )
> +    {
> +        PERROR("Could not map hvm info page");
>          goto error_out;
> +    }
>      build_hvm_info(hvm_info_page, args);
>      munmap(hvm_info_page, PAGE_SIZE);
>  
> @@ -547,7 +559,10 @@ static int setup_guest(xc_interface *xch,
>      }
>  
>      if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) )
> -            goto error_out;
> +    {
> +        PERROR("Could not clear special pages");
> +        goto error_out;
> +    }
>  
>      xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN,
>                       special_pfn(SPECIALPAGE_XENSTORE));
> @@ -580,7 +595,10 @@ static int setup_guest(xc_interface *xch,
>      }
>  
>      if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), NR_IOREQ_SERVER_PAGES) )
> -            goto error_out;
> +    {
> +        PERROR("Could not clear ioreq page");
> +        goto error_out;
> +    }
>  
>      /* Tell the domain where the pages are and how many there are */
>      xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN,
> @@ -595,7 +613,10 @@ static int setup_guest(xc_interface *xch,
>      if ( (ident_pt = xc_map_foreign_range(
>                xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
>                special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
> +    {
> +        PERROR("Could not map special page ident_pt");
>          goto error_out;
> +    }
>      for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
>          ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
>                         _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
> @@ -610,7 +631,10 @@ static int setup_guest(xc_interface *xch,
>          char *page0 = xc_map_foreign_range(
>              xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
>          if ( page0 == NULL )
> +        {
> +            PERROR("Could not map page0");
>              goto error_out;
> +        }
>          page0[0] = 0xe9;
>          *(uint32_t *)&page0[1] = entry_eip - 5;
>          munmap(page0, PAGE_SIZE);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl
  2015-05-29 12:29   ` Andrew Cooper
@ 2015-05-29 12:54     ` Wei Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Wei Liu @ 2015-05-29 12:54 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Dario Faggioli, Ian Jackson, Chen, Tiejun,
	Xen-devel, Boris Ostrovsky

On Fri, May 29, 2015 at 01:29:18PM +0100, Andrew Cooper wrote:
> On 29/05/15 12:37, Wei Liu wrote:
> > When building HVM guests, originally some fields of xc_hvm_build_args
> > are filled in xc_hvm_build (and buried in the wrong function), some are
> > set in libxl__build_hvm before passing xc_hvm_build_args to
> > xc_hvm_build. This is fragile.
> >
> > After examining the code in xc_hvm_build that sets those fields, we can
> > in fact move setting of mmio_start etc in libxl. This way we consolidate
> > memory layout setting in libxl.
> >
> > The setting of firmware data related fields is left in xc_hvm_build
> > because it depends on parsing ELF image. Those fields only point to
> > scratch data that doesn't affect memory layout.
> >
> > There should be no change in the generated guest memory layout.
> >
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > ---
> > Cc: "Chen, Tiejun" <tiejun.chen@intel.com>
> >
> > This might affect your RMRR patch series.
> 
> It should at least me noted that this is a semantic change in domain
> construction for all other toolstacks out there, an aid to the unlucky
> person forward porting something like Xapi and finding that a chunk of
> work is no longer performed by libxc.
> 

Right. I will write this in the commit message if I end up sending v3.

Wei.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest
  2015-05-29 11:37 ` [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest Wei Liu
@ 2015-05-29 15:02   ` Boris Ostrovsky
  2015-05-29 15:38     ` Wei Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2015-05-29 15:02 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Dario Faggioli, Ian Jackson, Ian Campbell

On 05/29/2015 07:37 AM, Wei Liu wrote:
> Make the setup process similar to PV counterpart. That is, to allocate a
> P2M array that covers the whole memory range and start from there. This
> is clearer than using an array with no holes in it.
>
> Also the dummy layout should take MMIO hole into consideration. We might
> end up having two vmemranges in the dummy layout.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> ---
> v2: only generate 1 vnode in dummy layout
> ---
>   tools/libxc/xc_hvm_build_x86.c | 62 ++++++++++++++++++++++++++++++++----------
>   1 file changed, 47 insertions(+), 15 deletions(-)
>
> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> index df4b7ed..b3855e0 100644
> --- a/tools/libxc/xc_hvm_build_x86.c
> +++ b/tools/libxc/xc_hvm_build_x86.c
> @@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch,
>   {
>       xen_pfn_t *page_array = NULL;
>       unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
> +    unsigned long p2m_size;
>       unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
>       unsigned long entry_eip, cur_pages, cur_pfn;
>       void *hvm_info_page;
> @@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch,
>       xen_pfn_t special_array[NR_SPECIAL_PAGES];
>       xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
>       uint64_t total_pages;
> -    xen_vmemrange_t dummy_vmemrange;
> -    unsigned int dummy_vnode_to_pnode;
> +    xen_vmemrange_t dummy_vmemrange[2];
> +    unsigned int dummy_vnode_to_pnode[1];
>   
>       memset(&elf, 0, sizeof(elf));
>       if ( elf_init(&elf, image, image_size) != 0 )
> @@ -275,17 +276,35 @@ static int setup_guest(xc_interface *xch,
>   
>       if ( args->nr_vmemranges == 0 )
>       {
> -        /* Build dummy vnode information */
> -        dummy_vmemrange.start = 0;
> -        dummy_vmemrange.end   = args->mem_size;
> -        dummy_vmemrange.flags = 0;
> -        dummy_vmemrange.nid   = 0;
> +        /* Build dummy vnode information
> +         *
> +         * Guest physical address space layout:
> +         * [0, hole_start) [hole_start, 4G) [4G, highmem_end)
> +         *
> +         * Of course if there is no high memory, the second vmemrange
> +         * has no effect on the actual result.
> +         */
> +
> +        dummy_vmemrange[0].start = 0;
> +        dummy_vmemrange[0].end   = args->lowmem_end;
> +        dummy_vmemrange[0].flags = 0;
> +        dummy_vmemrange[0].nid   = 0;
>           args->nr_vmemranges = 1;
> -        args->vmemranges = &dummy_vmemrange;
>   
> -        dummy_vnode_to_pnode = XC_NUMA_NO_NODE;
> +        if ( args->highmem_end > (1ULL << 32) )
> +        {
> +            dummy_vmemrange[1].start = 1ULL << 32;
> +            dummy_vmemrange[1].end   = args->highmem_end;
> +            dummy_vmemrange[1].flags = 0;
> +            dummy_vmemrange[1].nid   = 0;
> +
> +            args->nr_vmemranges++;
> +        }
> +
> +        dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE;
>           args->nr_vnodes = 1;
> -        args->vnode_to_pnode = &dummy_vnode_to_pnode;
> +        args->vmemranges = dummy_vmemrange;
> +        args->vnode_to_pnode = dummy_vnode_to_pnode;


I should have asked last time: we are setting args->vmemranges and 
args->vnode_to_pnode to point to something on the stack. Is it safe to 
assume that these pointers will never be used outside of setup_guest()? 
(and if it is, should they be set to NULL before returning, just in case?)

I also had a very cursory look at their usage and I think I see in some 
cases memory is allocated (and presumably then freed), but I am not sure 
whether this on some other code path.


-boris

>       }
>       else
>       {
> @@ -297,9 +316,15 @@ static int setup_guest(xc_interface *xch,
>       }
>   
>       total_pages = 0;
> +    p2m_size = 0;
>       for ( i = 0; i < args->nr_vmemranges; i++ )
> +    {
>           total_pages += ((args->vmemranges[i].end - args->vmemranges[i].start)
>                           >> PAGE_SHIFT);
> +        p2m_size = p2m_size > (args->vmemranges[i].end >> PAGE_SHIFT) ?
> +            p2m_size : (args->vmemranges[i].end >> PAGE_SHIFT);
> +    }
> +
>       if ( total_pages != (args->mem_size >> PAGE_SHIFT) )
>       {
>           PERROR("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%"PRIx64")",
> @@ -325,16 +350,23 @@ static int setup_guest(xc_interface *xch,
>       DPRINTF("  TOTAL:    %016"PRIx64"->%016"PRIx64"\n", v_start, v_end);
>       DPRINTF("  ENTRY:    %016"PRIx64"\n", elf_uval(&elf, elf.ehdr, e_entry));
>   
> -    if ( (page_array = malloc(nr_pages * sizeof(xen_pfn_t))) == NULL )
> +    if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL )
>       {
>           PERROR("Could not allocate memory.");
>           goto error_out;
>       }
>   
> -    for ( i = 0; i < nr_pages; i++ )
> -        page_array[i] = i;
> -    for ( i = args->mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
> -        page_array[i] += args->mmio_size >> PAGE_SHIFT;
> +    for ( i = 0; i < p2m_size; i++ )
> +        page_array[i] = ((xen_pfn_t)-1);
> +    for ( vmemid = 0; vmemid < args->nr_vmemranges; vmemid++ )
> +    {
> +        uint64_t pfn;
> +
> +        for ( pfn = args->vmemranges[vmemid].start >> PAGE_SHIFT;
> +              pfn < args->vmemranges[vmemid].end >> PAGE_SHIFT;
> +              pfn++ )
> +            page_array[pfn] = pfn;
> +    }
>   
>       /*
>        * Try to claim pages for early warning of insufficient memory available.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest
  2015-05-29 15:02   ` Boris Ostrovsky
@ 2015-05-29 15:38     ` Wei Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Wei Liu @ 2015-05-29 15:38 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Ian Jackson, Xen-devel, Dario Faggioli, Wei Liu, Ian Campbell

On Fri, May 29, 2015 at 11:02:16AM -0400, Boris Ostrovsky wrote:
> On 05/29/2015 07:37 AM, Wei Liu wrote:
> >Make the setup process similar to PV counterpart. That is, to allocate a
> >P2M array that covers the whole memory range and start from there. This
> >is clearer than using an array with no holes in it.
> >
> >Also the dummy layout should take MMIO hole into consideration. We might
> >end up having two vmemranges in the dummy layout.
> >
> >Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> >Cc: Ian Campbell <ian.campbell@citrix.com>
> >Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> >---
> >v2: only generate 1 vnode in dummy layout
> >---
> >  tools/libxc/xc_hvm_build_x86.c | 62 ++++++++++++++++++++++++++++++++----------
> >  1 file changed, 47 insertions(+), 15 deletions(-)
> >
> >diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
> >index df4b7ed..b3855e0 100644
> >--- a/tools/libxc/xc_hvm_build_x86.c
> >+++ b/tools/libxc/xc_hvm_build_x86.c
> >@@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch,
> >  {
> >      xen_pfn_t *page_array = NULL;
> >      unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
> >+    unsigned long p2m_size;
> >      unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
> >      unsigned long entry_eip, cur_pages, cur_pfn;
> >      void *hvm_info_page;
> >@@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch,
> >      xen_pfn_t special_array[NR_SPECIAL_PAGES];
> >      xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
> >      uint64_t total_pages;
> >-    xen_vmemrange_t dummy_vmemrange;
> >-    unsigned int dummy_vnode_to_pnode;
> >+    xen_vmemrange_t dummy_vmemrange[2];
> >+    unsigned int dummy_vnode_to_pnode[1];
> >      memset(&elf, 0, sizeof(elf));
> >      if ( elf_init(&elf, image, image_size) != 0 )
> >@@ -275,17 +276,35 @@ static int setup_guest(xc_interface *xch,
> >      if ( args->nr_vmemranges == 0 )
> >      {
> >-        /* Build dummy vnode information */
> >-        dummy_vmemrange.start = 0;
> >-        dummy_vmemrange.end   = args->mem_size;
> >-        dummy_vmemrange.flags = 0;
> >-        dummy_vmemrange.nid   = 0;
> >+        /* Build dummy vnode information
> >+         *
> >+         * Guest physical address space layout:
> >+         * [0, hole_start) [hole_start, 4G) [4G, highmem_end)
> >+         *
> >+         * Of course if there is no high memory, the second vmemrange
> >+         * has no effect on the actual result.
> >+         */
> >+
> >+        dummy_vmemrange[0].start = 0;
> >+        dummy_vmemrange[0].end   = args->lowmem_end;
> >+        dummy_vmemrange[0].flags = 0;
> >+        dummy_vmemrange[0].nid   = 0;
> >          args->nr_vmemranges = 1;
> >-        args->vmemranges = &dummy_vmemrange;
> >-        dummy_vnode_to_pnode = XC_NUMA_NO_NODE;
> >+        if ( args->highmem_end > (1ULL << 32) )
> >+        {
> >+            dummy_vmemrange[1].start = 1ULL << 32;
> >+            dummy_vmemrange[1].end   = args->highmem_end;
> >+            dummy_vmemrange[1].flags = 0;
> >+            dummy_vmemrange[1].nid   = 0;
> >+
> >+            args->nr_vmemranges++;
> >+        }
> >+
> >+        dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE;
> >          args->nr_vnodes = 1;
> >-        args->vnode_to_pnode = &dummy_vnode_to_pnode;
> >+        args->vmemranges = dummy_vmemrange;
> >+        args->vnode_to_pnode = dummy_vnode_to_pnode;
> 
> 
> I should have asked last time: we are setting args->vmemranges and
> args->vnode_to_pnode to point to something on the stack. Is it safe to
> assume that these pointers will never be used outside of setup_guest()? (and
> if it is, should they be set to NULL before returning, just in case?)
> 

I think this is a good idea. In libxl we won't read those fields back
because it's one way communication. Cannot speak for other toolstack
that builds directly on top of libxc, so let's err on the safe side.

> I also had a very cursory look at their usage and I think I see in some
> cases memory is allocated (and presumably then freed), but I am not sure
> whether this on some other code path.
> 

The memory allocated to those fields are not managed here in libxc.
The only memory allocated here is page_array and it's subsequently
freed.

Wei.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/4] libxl: fix HVM vNUMA
  2015-05-29 11:37 ` [PATCH v2 4/4] libxl: fix HVM vNUMA Wei Liu
@ 2015-05-29 15:50   ` Boris Ostrovsky
  0 siblings, 0 replies; 11+ messages in thread
From: Boris Ostrovsky @ 2015-05-29 15:50 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Dario Faggioli, Ian Jackson, Ian Campbell

On 05/29/2015 07:37 AM, Wei Liu wrote:
> This patch does two thing:
>
> The original code erroneously fills in xc_hvm_build_args before
> generating vmemranges. The effect is that guest memory is populated
> without vNUMA information. Move the hunk to right place to fix this.
>
> Move the subtraction of video ram to libxl__vnuma_build_vmemrange_hvm
> because it's the central place for generating vmemranges.
>
> Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Dario Faggioli <dario.faggioli@citrix.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-05-29 15:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-29 11:37 [PATCH v2 0/4] Fix HVM vNUMA Wei Liu
2015-05-29 11:37 ` [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl Wei Liu
2015-05-29 12:29   ` Andrew Cooper
2015-05-29 12:54     ` Wei Liu
2015-05-29 11:37 ` [PATCH v2 2/4] libxc: print more error messages when failed Wei Liu
2015-05-29 12:42   ` Andrew Cooper
2015-05-29 11:37 ` [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest Wei Liu
2015-05-29 15:02   ` Boris Ostrovsky
2015-05-29 15:38     ` Wei Liu
2015-05-29 11:37 ` [PATCH v2 4/4] libxl: fix HVM vNUMA Wei Liu
2015-05-29 15:50   ` Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.