All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 00/21] Virtual NUMA for PV and HVM
@ 2015-03-09 12:51 Wei Liu
  2015-03-09 12:51 ` [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware Wei Liu
                   ` (20 more replies)
  0 siblings, 21 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu

Hi all

This is version 7 of this series rebased on top of master.

This patch series implements virtual NUMA support for both PV and HVM guest.
That is, admin can configure via libxl what virtual NUMA topology the guest
sees.

This is the stage 1 (basic vNUMA support) and part of stage 2 (vNUMA-ware
ballooning, hypervisor side) described in my previous email to xen-devel [0].

This series is broken into several parts:

1. xen patches: vNUMA debug output and vNUMA-aware memory hypercall support.
2. libxc/libxl support for PV vNUMA.
3. libxc/libxl/hypervisor support for HVM vNUMA.
4. xl vNUMA configuration documentation and parser.

One significant difference from Elena's work is that this patch series makes
use of multiple vmemranges should there be a memory hole, instead of shrinking
ram. This matches the behaviour of real hardware.

The vNUMA auto placement algorithm is missing at the moment and Dario is
working on it.

This series can be found at:
 git://xenbits.xen.org/people/liuw/xen.git wip.vnuma-v5

With this series, the following configuration can be used to enabled virtual
NUMA support, and it works for both PV and HVM guests.

vnuma = [ [ "pnode=0","size=3000","vcpus=0-3","vdistances=10,20"  ],
          [ "pnode=0","size=3000","vcpus=4-7","vdistances=20,10"  ],
        ]

For example output of guest NUMA information, please look at [1].

In terms of libxl / libxc internal, things are broken into several
parts:

1. libxl interface

Users of libxl can only specify how many vnodes a guest can have, but
currently they have no control over the actual memory layout. Note that
it's fairly easy to export the interface to control memory layout in the
future.

2. libxl internal

It generates some internal vNUMA configurations when building domain,
then transform them into libxc representations. It also validates vNUMA
configuration along the line.

3. libxc internal

Libxc does what it's told to do. It doesn't do anything smart (in fact,
I delibrately didn't put any smart logic inside it). Libxc will also
report back some information in HVM case to libxl but that's it.

Wei.

[0] <20141111173606.GC21312@zion.uk.xensource.com>
[1] <1416582421-10789-1-git-send-email-wei.liu2@citrix.com>

Patches marked with "A" have already been acked. Hypervisor side patch and
toolstack side patches can be applied independently.

Wei Liu (21):
    xen: make two memory hypercalls vNUMA-aware

    libxc: duplicate snippet to allocate p2m_host array
 A  libxc: add p2m_size to xc_dom_image
 A  libxc: allocate memory with vNUMA information for PV guest
 A  libxl: introduce vNUMA types
 A  libxl: add vmemrange to libxl__domain_build_state
 A  libxl: introduce libxl__vnuma_config_check
 A  libxl: x86: factor out e820_host_sanitize
 A  libxl: functions to build vmemranges for PV guest
 A  libxl: build, check and pass vNUMA info to Xen for PV guest
 A  libxc: indentation change to xc_hvm_build_x86.c
 A  libxc: allocate memory with vNUMA information for HVM guest
 A  libxl: build, check and pass vNUMA info to Xen for HVM guest
 A  libxl: disallow memory relocation when vNUMA is enabled
    libxl: define LIBXL_HAVE_VNUMA

 A  libxlu: rework internal representation of setting
 A  libxlu: nested list support
    libxlu: record location when parsing values
 A  libxlu: introduce new APIs
 A  xl: introduce xcalloc
    xl: vNUMA support

 docs/man/xl.cfg.pod.5          |  59 +++++++++-
 tools/libxc/include/xc_dom.h   |  13 ++-
 tools/libxc/include/xenguest.h |  13 +++
 tools/libxc/xc_dom_arm.c       |   1 +
 tools/libxc/xc_dom_core.c      |   8 +-
 tools/libxc/xc_dom_x86.c       | 129 +++++++++++++++++----
 tools/libxc/xc_hvm_build_x86.c | 237 +++++++++++++++++++++++++-------------
 tools/libxl/Makefile           |   2 +-
 tools/libxl/libxl.h            |   7 ++
 tools/libxl/libxl_arch.h       |   6 +
 tools/libxl/libxl_arm.c        |   8 ++
 tools/libxl/libxl_create.c     |   9 ++
 tools/libxl/libxl_dm.c         |   6 +-
 tools/libxl/libxl_dom.c        | 120 ++++++++++++++++++++
 tools/libxl/libxl_internal.h   |  24 ++++
 tools/libxl/libxl_types.idl    |  10 ++
 tools/libxl/libxl_vnuma.c      | 252 +++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_x86.c        | 110 ++++++++++++++++--
 tools/libxl/libxlu_cfg.c       | 211 +++++++++++++++++++++++++---------
 tools/libxl/libxlu_cfg_i.h     |  14 ++-
 tools/libxl/libxlu_cfg_y.c     |  46 ++++----
 tools/libxl/libxlu_cfg_y.h     |   2 +-
 tools/libxl/libxlu_cfg_y.y     |  14 +--
 tools/libxl/libxlu_internal.h  |  25 +++-
 tools/libxl/libxlutil.h        |  13 +++
 tools/libxl/xl_cmdimpl.c       | 168 ++++++++++++++++++++++++++-
 xen/common/kernel.c            |   2 +-
 xen/common/memory.c            |  55 +++++++--
 xen/include/public/features.h  |   3 +
 xen/include/public/memory.h    |   2 +
 30 files changed, 1341 insertions(+), 228 deletions(-)
 create mode 100644 tools/libxl/libxl_vnuma.c

-- 
1.9.1

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 13:22   ` Jan Beulich
  2015-03-09 12:51 ` [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array Wei Liu
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Make XENMEM_increase_reservation and XENMEM_populate_physmap
vNUMA-aware.

That is, if guest requests Xen to allocate memory for specific vnode,
Xen can translate vnode to pnode using vNUMA information of that guest.

XENMEMF_vnode is introduced for the guest to mark the node number is in
fact virtual node number and should be translated by Xen.

XENFEAT_memory_op_vnode_supported is introduced to indicate that Xen is
able to translate virtual node to physical node.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes in v7:
1. Remove XEN_NUMA_NO_NODE.
2. Use nodeid_t for vnode and pnode variables.

Changes in v6:
1. Add logic in construct_memop_from_reservation.
---
 xen/common/kernel.c           |  2 +-
 xen/common/memory.c           | 55 ++++++++++++++++++++++++++++++++++---------
 xen/include/public/features.h |  3 +++
 xen/include/public/memory.h   |  2 ++
 4 files changed, 50 insertions(+), 12 deletions(-)

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 0d9e519..e5e0050 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -301,7 +301,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         switch ( fi.submap_idx )
         {
         case 0:
-            fi.submap = 0;
+            fi.submap = (1U << XENFEAT_memory_op_vnode_supported);
             if ( VM_ASSIST(d, VMASST_TYPE_pae_extended_cr3) )
                 fi.submap |= (1U << XENFEAT_pae_pgdir_above_4gb);
             if ( paging_mode_translate(current->domain) )
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 9d9d43c..09b9045 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -692,11 +692,12 @@ out:
     return rc;
 }
 
-static int construct_memop_from_reservation(
+static int construct_memop_from_reservation(struct domain *d,
                const struct xen_memory_reservation *r,
                struct memop_args *a)
 {
     unsigned int address_bits;
+    int rc;
 
     a->extent_list  = r->extent_start;
     a->nr_extents   = r->nr_extents;
@@ -712,11 +713,41 @@ static int construct_memop_from_reservation(
         a->memflags = MEMF_bits(address_bits);
     }
 
-    a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
-    if ( r->mem_flags & XENMEMF_exact_node_request )
-        a->memflags |= MEMF_exact_node;
+    if ( r->mem_flags & XENMEMF_vnode )
+    {
+        nodeid_t vnode, pnode;
 
-    return 0;
+        read_lock(&d->vnuma_rwlock);
+        if ( d->vnuma )
+        {
+            vnode = XENMEMF_get_node(r->mem_flags);
+            if ( vnode >= d->vnuma->nr_vnodes )
+            {
+                rc = -EINVAL;
+                read_unlock(&d->vnuma_rwlock);
+                goto out;
+            }
+
+            pnode = d->vnuma->vnode_to_pnode[vnode];
+            if ( pnode != NUMA_NO_NODE )
+            {
+                a->memflags |= MEMF_node(pnode);
+                if ( r->mem_flags & XENMEMF_exact_node_request )
+                    a->memflags |= MEMF_exact_node;
+            }
+        }
+        read_unlock(&d->vnuma_rwlock);
+    }
+    else
+    {
+        a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
+        if ( r->mem_flags & XENMEMF_exact_node_request )
+            a->memflags |= MEMF_exact_node;
+    }
+
+    rc = 0;
+out:
+    return rc;
 }
 
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
@@ -744,12 +775,6 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( unlikely(start_extent >= reservation.nr_extents) )
             return start_extent;
 
-        if ( construct_memop_from_reservation(&reservation, &args) )
-            return start_extent;
-        args.nr_done      = start_extent;
-        args.preempted    = 0;
-
-
         if ( op == XENMEM_populate_physmap
              && (reservation.mem_flags & XENMEMF_populate_on_demand) )
             args.memflags |= MEMF_populate_on_demand;
@@ -759,6 +784,14 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
             return start_extent;
         args.domain = d;
 
+        if ( construct_memop_from_reservation(d, &reservation, &args) )
+        {
+            rcu_unlock_domain(d);
+            return start_extent;
+        }
+        args.nr_done      = start_extent;
+        args.preempted    = 0;
+
         if ( xsm_memory_adjust_reservation(XSM_TARGET, current->domain, d) )
         {
             rcu_unlock_domain(d);
diff --git a/xen/include/public/features.h b/xen/include/public/features.h
index 16d92aa..2110b04 100644
--- a/xen/include/public/features.h
+++ b/xen/include/public/features.h
@@ -99,6 +99,9 @@
 #define XENFEAT_grant_map_identity        12
  */
 
+/* Guest can use XENMEMF_vnode to specify virtual node for memory op. */
+#define XENFEAT_memory_op_vnode_supported 13
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 595f953..2b5206b 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -55,6 +55,8 @@
 /* Flag to request allocation only from the node specified */
 #define XENMEMF_exact_node_request  (1<<17)
 #define XENMEMF_exact_node(n) (XENMEMF_node(n) | XENMEMF_exact_node_request)
+/* Flag to indicate the node specified is virtual node */
+#define XENMEMF_vnode  (1<<18)
 #endif
 
 struct xen_memory_reservation {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
  2015-03-09 12:51 ` [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 15:06   ` Konrad Rzeszutek Wilk
  2015-03-11 15:02   ` Ian Campbell
  2015-03-09 12:51 ` [PATCH v7 03/21] libxc: add p2m_size to xc_dom_image Wei Liu
                   ` (18 subsequent siblings)
  20 siblings, 2 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Currently all in tree code doesn't set the superpage flag, I would just
remove superpage support if I can, but Konrad wants it retained for the
moment.

As I'm going to change the p2m_host array allocation, duplicate the code
snippet to allocate p2m_host array in this patch, so that we retain the
behaviour in superpage case.

This patch introduces no functional change and it will make future patch
easier to review. Also removed one stray tab while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Konrad Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_dom_x86.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bf06fe4..9dbaedb 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -772,15 +772,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
             return rc;
     }
 
-    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages);
-    if ( dom->p2m_host == NULL )
-        return -EINVAL;
-
     if ( dom->superpages )
     {
         int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
         xen_pfn_t extents[count];
 
+        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+                                      dom->total_pages);
+        if ( dom->p2m_host == NULL )
+            return -EINVAL;
+
         DOMPRINTF("Populating memory with %d superpages", count);
         for ( pfn = 0; pfn < count; pfn++ )
             extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
@@ -809,9 +810,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 return rc;
         }
         /* setup initial p2m */
+        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+                                      dom->total_pages);
+        if ( dom->p2m_host == NULL )
+            return -EINVAL;
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
             dom->p2m_host[pfn] = pfn;
-        
+
         /* allocate guest memory */
         for ( i = rc = allocsz = 0;
               (i < dom->total_pages) && !rc;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 03/21] libxc: add p2m_size to xc_dom_image
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
  2015-03-09 12:51 ` [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware Wei Liu
  2015-03-09 12:51 ` [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 04/21] libxc: allocate memory with vNUMA information for PV guest Wei Liu
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Add a new field p2m_size to keep track of the number of pages covered by
p2m.  Change total_pages to p2m_size in functions which in fact need
the size of p2m.

This is needed because we are going to ditch the assumption that PV x86
has only one contiguous ram region. Originally the p2m size was always
equal to total_pages, but we will soon change that in later patch.

This patch doesn't change the behaviour of libxc.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_arm.c     |  1 +
 tools/libxc/xc_dom_core.c    |  8 ++++----
 tools/libxc/xc_dom_x86.c     | 19 +++++++++++--------
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 07d7224..6b8ddf4 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -129,6 +129,7 @@ struct xc_dom_image {
      */
     xen_pfn_t rambase_pfn;
     xen_pfn_t total_pages;
+    xen_pfn_t p2m_size;         /* number of pfns covered by p2m */
     struct xc_dom_phys *phys_pages;
     int realmodearea_log;
 #if defined (__arm__) || defined(__aarch64__)
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index c7feca7..b9fa66d 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -449,6 +449,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
     assert(dom->rambank_size[0] != 0);
     assert(ramsize == 0); /* Too much RAM is rejected above */
 
+    dom->p2m_size = p2m_size;
     dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size);
     if ( dom->p2m_host == NULL )
         return -EINVAL;
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index ecbf981..b100ce1 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -931,9 +931,9 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
     {
     case 4:
         DOMPRINTF("%s: dst 32bit, pages 0x%" PRIpfn "",
-                  __FUNCTION__, dom->total_pages);
+                  __FUNCTION__, dom->p2m_size);
         p2m_32 = dom->p2m_guest;
-        for ( i = 0; i < dom->total_pages; i++ )
+        for ( i = 0; i < dom->p2m_size; i++ )
             if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
                 p2m_32[i] = dom->p2m_host[i];
             else
@@ -941,9 +941,9 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
         break;
     case 8:
         DOMPRINTF("%s: dst 64bit, pages 0x%" PRIpfn "",
-                  __FUNCTION__, dom->total_pages);
+                  __FUNCTION__, dom->p2m_size);
         p2m_64 = dom->p2m_guest;
-        for ( i = 0; i < dom->total_pages; i++ )
+        for ( i = 0; i < dom->p2m_size; i++ )
             if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
                 p2m_64[i] = dom->p2m_host[i];
             else
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 9dbaedb..bea54f2 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -122,11 +122,11 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
 
         try_pfn_end = (try_virt_end - dom->parms.virt_base) >> PAGE_SHIFT_X86;
 
-        if ( try_pfn_end > dom->total_pages )
+        if ( try_pfn_end > dom->p2m_size )
         {
             xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
                          "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
-                         __FUNCTION__, try_pfn_end, dom->total_pages);
+                         __FUNCTION__, try_pfn_end, dom->p2m_size);
             return -ENOMEM;
         }
 
@@ -440,10 +440,11 @@ pfn_error:
 
 static int alloc_magic_pages(struct xc_dom_image *dom)
 {
-    size_t p2m_size = dom->total_pages * dom->arch_hooks->sizeof_pfn;
+    size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
 
     /* allocate phys2mach table */
-    if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach", 0, p2m_size) )
+    if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach",
+                              0, p2m_alloc_size) )
         return -1;
     dom->p2m_guest = xc_dom_seg_to_ptr(dom, &dom->p2m_seg);
     if ( dom->p2m_guest == NULL )
@@ -777,8 +778,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
         int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
         xen_pfn_t extents[count];
 
+        dom->p2m_size = dom->total_pages;
         dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-                                      dom->total_pages);
+                                      dom->p2m_size);
         if ( dom->p2m_host == NULL )
             return -EINVAL;
 
@@ -810,8 +812,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 return rc;
         }
         /* setup initial p2m */
+        dom->p2m_size = dom->total_pages;
         dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-                                      dom->total_pages);
+                                      dom->p2m_size);
         if ( dom->p2m_host == NULL )
             return -EINVAL;
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
@@ -860,7 +863,7 @@ static int map_grant_table_frames(struct xc_dom_image *dom)
     {
         rc = xc_domain_add_to_physmap(dom->xch, dom->guest_domid,
                                       XENMAPSPACE_grant_table,
-                                      i, dom->total_pages + i);
+                                      i, dom->p2m_size + i);
         if ( rc != 0 )
         {
             if ( (i > 0) && (errno == EINVAL) )
@@ -870,7 +873,7 @@ static int map_grant_table_frames(struct xc_dom_image *dom)
             }
             xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
                          "%s: mapping grant tables failed " "(pfn=0x%" PRIpfn
-                         ", rc=%d)", __FUNCTION__, dom->total_pages + i, rc);
+                         ", rc=%d)", __FUNCTION__, dom->p2m_size + i, rc);
             return rc;
         }
     }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 04/21] libxc: allocate memory with vNUMA information for PV guest
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (2 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 03/21] libxc: add p2m_size to xc_dom_image Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 05/21] libxl: introduce vNUMA types Wei Liu
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

>From libxc's point of view, it only needs to know vnode to pnode mapping
and size of each vnode to allocate memory accordingly. Add these fields
to xc_dom structure.

The caller might not pass in vNUMA information. In that case, a dummy
layout is generated for the convenience of libxc's allocation code. The
upper layer (libxl etc) still sees the domain has no vNUMA
configuration.

Note that for this patch on PV x86 guest can have multiple regions of
ram allocated.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
1. Fall back to use our own XC_NUMA_NO_NODE.

Changes in v6:
1. Ditch XC_VNUMA_NO_NODE and use XEN_NUMA_NO_NODE.
2. Update comment in xc_dom.h.

Changes in v5:
1. Ditch xc_vnuma_info.

Changes in v4:
1. Pack fields into a struct.
2. Use "page" as unit.
3. __FUNCTION__ -> __func__.
4. Don't print total_pages.
5. Improve comment.

Changes in v3:
1. Rewrite commit log.
2. Shorten some error messages.
---
 tools/libxc/include/xc_dom.h   |  12 ++++-
 tools/libxc/include/xenguest.h |   2 +
 tools/libxc/xc_dom_x86.c       | 101 +++++++++++++++++++++++++++++++++++------
 3 files changed, 99 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 6b8ddf4..a7d059a 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -119,8 +119,10 @@ struct xc_dom_image {
 
     /* physical memory
      *
-     * An x86 PV guest has a single contiguous block of physical RAM,
-     * consisting of total_pages starting at rambase_pfn.
+     * An x86 PV guest has one or more blocks of physical RAM,
+     * consisting of total_pages starting at rambase_pfn. The start
+     * address and size of each block is controlled by vNUMA
+     * structures.
      *
      * An ARM guest has GUEST_RAM_BANKS regions of RAM, with
      * rambank_size[i] pages in each. The lowest RAM address
@@ -168,6 +170,12 @@ struct xc_dom_image {
     struct xc_dom_loader *kernel_loader;
     void *private_loader;
 
+    /* vNUMA information */
+    xen_vmemrange_t *vmemranges;
+    unsigned int nr_vmemranges;
+    unsigned int *vnode_to_pnode;
+    unsigned int nr_vnodes;
+
     /* kernel loader */
     struct xc_dom_arch *arch_hooks;
     /* allocate up to virt_alloc_end */
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 40bbac8..b7a924f 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -23,6 +23,8 @@
 #ifndef XENGUEST_H
 #define XENGUEST_H
 
+#define XC_NUMA_NO_NODE   (~0U)
+
 #define XCFLAGS_LIVE      (1 << 0)
 #define XCFLAGS_DEBUG     (1 << 1)
 #define XCFLAGS_HVM       (1 << 2)
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bea54f2..af0c9f4 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -760,7 +760,8 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
 int arch_setup_meminit(struct xc_dom_image *dom)
 {
     int rc;
-    xen_pfn_t pfn, allocsz, i, j, mfn;
+    xen_pfn_t pfn, allocsz, mfn, total, pfn_base;
+    int i, j;
 
     rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
     if ( rc )
@@ -811,26 +812,98 @@ int arch_setup_meminit(struct xc_dom_image *dom)
             if ( rc )
                 return rc;
         }
-        /* setup initial p2m */
-        dom->p2m_size = dom->total_pages;
+
+        /* Setup dummy vNUMA information if it's not provided. Note
+         * that this is a valid state if libxl doesn't provide any
+         * vNUMA information.
+         *
+         * The dummy values make libxc allocate all pages from
+         * arbitrary physical nodes. This is the expected behaviour if
+         * no vNUMA configuration is provided to libxc.
+         *
+         * Note that the following hunk is just for the convenience of
+         * allocation code. No defaulting happens in libxc.
+         */
+        if ( dom->nr_vmemranges == 0 )
+        {
+            dom->nr_vmemranges = 1;
+            dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges));
+            dom->vmemranges[0].start = 0;
+            dom->vmemranges[0].end   = dom->total_pages << PAGE_SHIFT;
+            dom->vmemranges[0].flags = 0;
+            dom->vmemranges[0].nid   = 0;
+
+            dom->nr_vnodes = 1;
+            dom->vnode_to_pnode = xc_dom_malloc(dom,
+                                      sizeof(*dom->vnode_to_pnode));
+            dom->vnode_to_pnode[0] = XC_NUMA_NO_NODE;
+        }
+
+        total = dom->p2m_size = 0;
+        for ( i = 0; i < dom->nr_vmemranges; i++ )
+        {
+            total += ((dom->vmemranges[i].end - dom->vmemranges[i].start)
+                      >> PAGE_SHIFT);
+            dom->p2m_size =
+                dom->p2m_size > (dom->vmemranges[i].end >> PAGE_SHIFT) ?
+                dom->p2m_size : (dom->vmemranges[i].end >> PAGE_SHIFT);
+        }
+        if ( total != dom->total_pages )
+        {
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                         "%s: vNUMA page count mismatch (0x%"PRIpfn" != 0x%"PRIpfn")\n",
+                         __func__, total, dom->total_pages);
+            return -EINVAL;
+        }
+
         dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
                                       dom->p2m_size);
         if ( dom->p2m_host == NULL )
             return -EINVAL;
-        for ( pfn = 0; pfn < dom->total_pages; pfn++ )
-            dom->p2m_host[pfn] = pfn;
+        for ( pfn = 0; pfn < dom->p2m_size; pfn++ )
+            dom->p2m_host[pfn] = INVALID_P2M_ENTRY;
 
         /* allocate guest memory */
-        for ( i = rc = allocsz = 0;
-              (i < dom->total_pages) && !rc;
-              i += allocsz )
+        for ( i = 0; i < dom->nr_vmemranges; i++ )
         {
-            allocsz = dom->total_pages - i;
-            if ( allocsz > 1024*1024 )
-                allocsz = 1024*1024;
-            rc = xc_domain_populate_physmap_exact(
-                dom->xch, dom->guest_domid, allocsz,
-                0, 0, &dom->p2m_host[i]);
+            unsigned int memflags;
+            uint64_t pages;
+            unsigned int pnode = dom->vnode_to_pnode[dom->vmemranges[i].nid];
+
+            memflags = 0;
+            if ( pnode != XC_NUMA_NO_NODE )
+                memflags |= XENMEMF_exact_node(pnode);
+
+            pages = (dom->vmemranges[i].end - dom->vmemranges[i].start)
+                >> PAGE_SHIFT;
+            pfn_base = dom->vmemranges[i].start >> PAGE_SHIFT;
+
+            for ( pfn = pfn_base; pfn < pfn_base+pages; pfn++ )
+                dom->p2m_host[pfn] = pfn;
+
+            for ( j = 0; j < pages; j += allocsz )
+            {
+                allocsz = pages - j;
+                if ( allocsz > 1024*1024 )
+                    allocsz = 1024*1024;
+
+                rc = xc_domain_populate_physmap_exact(dom->xch,
+                         dom->guest_domid, allocsz, 0, memflags,
+                         &dom->p2m_host[pfn_base+j]);
+
+                if ( rc )
+                {
+                    if ( pnode != XC_NUMA_NO_NODE )
+                        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                                     "%s: failed to allocate 0x%"PRIx64" pages (v=%d, p=%d)\n",
+                                     __func__, pages, i, pnode);
+                    else
+                        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                                     "%s: failed to allocate 0x%"PRIx64" pages\n",
+                                     __func__, pages);
+                    return rc;
+                }
+            }
         }
 
         /* Ensure no unclaimed pages are left unused.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 05/21] libxl: introduce vNUMA types
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (3 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 04/21] libxc: allocate memory with vNUMA information for PV guest Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 06/21] libxl: add vmemrange to libxl__domain_build_state Wei Liu
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

A domain can contain several virtual NUMA nodes, hence we introduce an
array in libxl_domain_build_info.

libxl_vnode_info contains the size of memory in that node, the distance
from that node to every nodes, the underlying pnode and a bitmap of
vcpus.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v4:
1. Use MemKB.

Changes in v3:
1. Add commit message.
---
 tools/libxl/libxl_types.idl | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 02be466..14c7e7c 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -356,6 +356,13 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
     ("budget",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
     ])
 
+libxl_vnode_info = Struct("vnode_info", [
+    ("memkb", MemKB),
+    ("distances", Array(uint32, "num_distances")), # distances from this node to other nodes
+    ("pnode", uint32), # physical node of this node
+    ("vcpus", libxl_bitmap), # vcpus in this node
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),
     ("avail_vcpus",     libxl_bitmap),
@@ -376,6 +383,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("disable_migrate", libxl_defbool),
     ("cpuid",           libxl_cpuid_policy_list),
     ("blkdev_start",    string),
+
+    ("vnuma_nodes", Array(libxl_vnode_info, "num_vnuma_nodes")),
     
     ("device_model_version", libxl_device_model_version),
     ("device_model_stubdomain", libxl_defbool),
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 06/21] libxl: add vmemrange to libxl__domain_build_state
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (4 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 05/21] libxl: introduce vNUMA types Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 07/21] libxl: introduce libxl__vnuma_config_check Wei Liu
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

A vnode consists of one or more vmemranges (virtual memory range).  One
example of multiple vmemranges is that there is a hole in one vnode.

Currently we haven't exported vmemrange interface to libxl user.
Vmemranges are generated during domain build, so we have relevant
structures in domain build state.

Later if we discover we need to export the interface, those structures
can be moved to libxl_domain_build_info as well.

These new fields (along with other fields in that struct) are set to 0
at start of day so we don't need to explicitly initialise them. A
following patch which introduces an independent checking function will
need to access these fields. I don't feel very comfortable squashing
this change into that one so I didn't use a single commit.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v5:
1. Fix commit message.

Changes in v4:
1. Improve commit message.

Changes in v3:
1. Rewrite commit message.
---
 tools/libxl/libxl_internal.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..6d3ac58 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -973,6 +973,9 @@ typedef struct {
     libxl__file_reference pv_ramdisk;
     const char * pv_cmdline;
     bool pvh_enabled;
+
+    xen_vmemrange_t *vmemranges;
+    uint32_t num_vmemranges;
 } libxl__domain_build_state;
 
 _hidden int libxl__build_pre(libxl__gc *gc, uint32_t domid,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 07/21] libxl: introduce libxl__vnuma_config_check
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (5 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 06/21] libxl: add vmemrange to libxl__domain_build_state Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 08/21] libxl: x86: factor out e820_host_sanitize Wei Liu
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

This function is used to check whether vNUMA configuration (be it
auto-generated or supplied by user) is valid.

Define a new error code ERROR_VNUMA_CONFIG_INVALID.

The checks performed can be found in the comment of the function.

This vNUMA function (and future ones) is placed in a new file called
libxl_vnuma.c

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
1. Use PRIu32 for uint32_t.

Changes in v6:
1. Address comments from Andrew.
2. Check vdistances.
3. use libxl_numainfo_list_free.
4. Change p to v.

Changes in v5:
1. Define and use new error code.
2. Use LOG macro.
3. Fix hard tabs.

Changes in v4:
1. Adapt to new interface.

Changes in v3:
1. Rewrite commit log.
2. Shorten two error messages.
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl_internal.h |   7 ++
 tools/libxl/libxl_types.idl  |   1 +
 tools/libxl/libxl_vnuma.c    | 151 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxl/libxl_vnuma.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 7329521..1b16598 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -93,7 +93,7 @@ LIBXL_LIBS += -lyajl
 LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
 			libxl_internal.o libxl_utils.o libxl_uuid.o \
-			libxl_json.o libxl_aoutils.o libxl_numa.o \
+			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6d3ac58..258be0d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3394,6 +3394,13 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
     libxl_bitmap_copy(CTX, &cndt->nodemap, nodemap);
 }
 
+/* Check if vNUMA config is valid. Returns 0 if valid,
+ * ERROR_VNUMA_CONFIG_INVALID otherwise.
+ */
+int libxl__vnuma_config_check(libxl__gc *gc,
+                              const libxl_domain_build_info *b_info,
+                              const libxl__domain_build_state *state);
+
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 14c7e7c..23951fc 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -63,6 +63,7 @@ libxl_error = Enumeration("error", [
     (-17, "DEVICE_EXISTS"),
     (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
     (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+    (-20, "VNUMA_CONFIG_INVALID"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
new file mode 100644
index 0000000..89532ef
--- /dev/null
+++ b/tools/libxl/libxl_vnuma.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright (C) 2014      Citrix Ltd.
+ * Author Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+#include "libxl_osdeps.h" /* must come before any other headers */
+#include "libxl_internal.h"
+#include <stdlib.h>
+
+/* Sort vmemranges in ascending order with "start" */
+static int compare_vmemrange(const void *a, const void *b)
+{
+    const xen_vmemrange_t *x = a, *y = b;
+    if (x->start < y->start)
+        return -1;
+    if (x->start > y->start)
+        return 1;
+    return 0;
+}
+
+/* Check if vNUMA configuration is valid:
+ *  1. all pnodes inside vnode_to_pnode array are valid
+ *  2. each vcpu belongs to one and only one vnode
+ *  3. each vmemrange is valid and doesn't overlap with any other
+ *  4. local distance cannot be larger than remote distance
+ */
+int libxl__vnuma_config_check(libxl__gc *gc,
+                              const libxl_domain_build_info *b_info,
+                              const libxl__domain_build_state *state)
+{
+    int nr_nodes = 0, rc = ERROR_VNUMA_CONFIG_INVALID;
+    unsigned int i, j;
+    libxl_numainfo *ninfo = NULL;
+    uint64_t total_memkb = 0;
+    libxl_bitmap cpumap;
+    libxl_vnode_info *v;
+
+    libxl_bitmap_init(&cpumap);
+
+    /* Check pnode specified is valid */
+    ninfo = libxl_get_numainfo(CTX, &nr_nodes);
+    if (!ninfo) {
+        LOG(ERROR, "libxl_get_numainfo failed");
+        goto out;
+    }
+
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        uint32_t pnode;
+
+        v = &b_info->vnuma_nodes[i];
+        pnode = v->pnode;
+
+        /* The pnode specified is not valid? */
+        if (pnode >= nr_nodes) {
+            LOG(ERROR, "Invalid pnode %"PRIu32" specified", pnode);
+            goto out;
+        }
+
+        total_memkb += v->memkb;
+    }
+
+    if (total_memkb != b_info->max_memkb) {
+        LOG(ERROR, "Amount of memory mismatch (0x%"PRIx64" != 0x%"PRIx64")",
+            total_memkb, b_info->max_memkb);
+        goto out;
+    }
+
+    /* Check vcpu mapping */
+    libxl_cpu_bitmap_alloc(CTX, &cpumap, b_info->max_vcpus);
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        v = &b_info->vnuma_nodes[i];
+        libxl_for_each_set_bit(j, v->vcpus) {
+            if (!libxl_bitmap_test(&cpumap, j))
+                libxl_bitmap_set(&cpumap, j);
+            else {
+                LOG(ERROR, "Vcpu %d assigned more than once", j);
+                goto out;
+            }
+        }
+    }
+
+    for (i = 0; i < b_info->max_vcpus; i++) {
+        if (!libxl_bitmap_test(&cpumap, i)) {
+            LOG(ERROR, "Vcpu %d is not assigned to any vnode", i);
+            goto out;
+        }
+    }
+
+    /* Check vmemranges */
+    qsort(state->vmemranges, state->num_vmemranges, sizeof(xen_vmemrange_t),
+          compare_vmemrange);
+
+    for (i = 0; i < state->num_vmemranges; i++) {
+        if (state->vmemranges[i].end < state->vmemranges[i].start) {
+                LOG(ERROR, "Vmemrange end < start");
+                goto out;
+        }
+    }
+
+    for (i = 0; i < state->num_vmemranges - 1; i++) {
+        if (state->vmemranges[i].end > state->vmemranges[i+1].start) {
+            LOG(ERROR,
+                "Vmemranges overlapped, 0x%"PRIx64"-0x%"PRIx64", 0x%"PRIx64"-0x%"PRIx64,
+                state->vmemranges[i].start, state->vmemranges[i].end,
+                state->vmemranges[i+1].start, state->vmemranges[i+1].end);
+            goto out;
+        }
+    }
+
+    /* Check vdistances */
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        uint32_t local_distance, remote_distance;
+
+        v = &b_info->vnuma_nodes[i];
+        local_distance = v->distances[i];
+
+        for (j = 0; j < v->num_distances; j++) {
+            if (i == j) continue;
+            remote_distance = v->distances[j];
+            if (local_distance > remote_distance) {
+                LOG(ERROR,
+                    "Distance from %u to %u smaller than %u's local distance",
+                    i, j, i);
+                goto out;
+            }
+        }
+    }
+
+    rc = 0;
+out:
+    libxl_numainfo_list_free(ninfo, nr_nodes);
+    libxl_bitmap_dispose(&cpumap);
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 08/21] libxl: x86: factor out e820_host_sanitize
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (6 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 07/21] libxl: introduce libxl__vnuma_config_check Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 09/21] libxl: functions to build vmemranges for PV guest Wei Liu
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell, Elena Ufimtseva

This function gets the machine E820 map and sanitize it according to PV
guest configuration.

This will be used in later patch. No functional change introduced in
this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v4:
1. Use actual size of the map instead of using E820MAX.
---
 tools/libxl/libxl_x86.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 9ceb373..d012b4d 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -207,6 +207,27 @@ static int e820_sanitize(libxl_ctx *ctx, struct e820entry src[],
     return 0;
 }
 
+static int e820_host_sanitize(libxl__gc *gc,
+                              libxl_domain_build_info *b_info,
+                              struct e820entry map[],
+                              uint32_t *nr)
+{
+    int rc;
+
+    rc = xc_get_machine_memory_map(CTX->xch, map, *nr);
+    if (rc < 0) {
+        errno = rc;
+        return ERROR_FAIL;
+    }
+
+    *nr = rc;
+
+    rc = e820_sanitize(CTX, map, nr, b_info->target_memkb,
+                       (b_info->max_memkb - b_info->target_memkb) +
+                       b_info->u.pv.slack_memkb);
+    return rc;
+}
+
 static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
         libxl_domain_config *d_config)
 {
@@ -223,15 +244,8 @@ static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
     if (!libxl_defbool_val(b_info->u.pv.e820_host))
         return ERROR_INVAL;
 
-    rc = xc_get_machine_memory_map(ctx->xch, map, E820MAX);
-    if (rc < 0) {
-        errno = rc;
-        return ERROR_FAIL;
-    }
-    nr = rc;
-    rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
-                       (b_info->max_memkb - b_info->target_memkb) +
-                       b_info->u.pv.slack_memkb);
+    nr = E820MAX;
+    rc = e820_host_sanitize(gc, b_info, map, &nr);
     if (rc)
         return ERROR_FAIL;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 09/21] libxl: functions to build vmemranges for PV guest
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (7 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 08/21] libxl: x86: factor out e820_host_sanitize Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 10/21] libxl: build, check and pass vNUMA info to Xen " Wei Liu
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

Introduce a arch-independent routine to generate one vmemrange per
vnode. Also introduce arch-dependent routines for different
architectures because part of the process is arch-specific -- ARM has
yet have NUMA support and E820 is x86 only.

For those x86 guests who care about machine E820 map (i.e. with
e820_host=1), vnode is further split into several vmemranges to
accommodate memory holes.  A few stubs for libxl_arm.c are created.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
1. Remove stray blank line.

Changes in v5:
1. Allocate array all in one go.
2. Reverse the logic of vmemranges generation.

Changes in v4:
1. Adapt to new interface.
2. Address Ian Jackson's comments.

Changes in v3:
1. Rewrite commit log.
---
 tools/libxl/libxl_arch.h     |  6 ++++
 tools/libxl/libxl_arm.c      |  8 +++++
 tools/libxl/libxl_internal.h |  8 +++++
 tools/libxl/libxl_vnuma.c    | 40 +++++++++++++++++++++++
 tools/libxl/libxl_x86.c      | 78 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 140 insertions(+)

diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index d3bc136..e249048 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -27,4 +27,10 @@ int libxl__arch_domain_init_hw_description(libxl__gc *gc,
 int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
                                       libxl_domain_build_info *info,
                                       struct xc_dom_image *dom);
+
+/* build vNUMA vmemrange with arch specific information */
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+                                      uint32_t domid,
+                                      libxl_domain_build_info *b_info,
+                                      libxl__domain_build_state *state);
 #endif
diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 65a762b..7da254f 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -707,6 +707,14 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
     return 0;
 }
 
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+                                      uint32_t domid,
+                                      libxl_domain_build_info *info,
+                                      libxl__domain_build_state *state)
+{
+    return libxl__vnuma_build_vmemrange_pv_generic(gc, domid, info, state);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 258be0d..7d1e1cf 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3400,6 +3400,14 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
 int libxl__vnuma_config_check(libxl__gc *gc,
                               const libxl_domain_build_info *b_info,
                               const libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
+                                            uint32_t domid,
+                                            libxl_domain_build_info *b_info,
+                                            libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
+                                    uint32_t domid,
+                                    libxl_domain_build_info *b_info,
+                                    libxl__domain_build_state *state);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index 89532ef..bef3cc5 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -14,6 +14,7 @@
  */
 #include "libxl_osdeps.h" /* must come before any other headers */
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 #include <stdlib.h>
 
 /* Sort vmemranges in ascending order with "start" */
@@ -142,6 +143,45 @@ out:
     return rc;
 }
 
+int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
+                                            uint32_t domid,
+                                            libxl_domain_build_info *b_info,
+                                            libxl__domain_build_state *state)
+{
+    int i;
+    uint64_t next;
+    xen_vmemrange_t *v = NULL;
+
+    /* Generate one vmemrange for each virtual node. */
+    GCREALLOC_ARRAY(v, b_info->num_vnuma_nodes);
+    next = 0;
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+        v[i].start = next;
+        v[i].end = next + (p->memkb << 10);
+        v[i].flags = 0;
+        v[i].nid = i;
+
+        next = v[i].end;
+    }
+
+    state->vmemranges = v;
+    state->num_vmemranges = i;
+
+    return 0;
+}
+
+/* Build vmemranges for PV guest */
+int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
+                                    uint32_t domid,
+                                    libxl_domain_build_info *b_info,
+                                    libxl__domain_build_state *state)
+{
+    assert(state->vmemranges == NULL);
+    return libxl__arch_vnuma_build_vmemrange(gc, domid, b_info, state);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index d012b4d..fd45ead 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -339,6 +339,84 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
     return 0;
 }
 
+/* Return 0 on success, ERROR_* on failure. */
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+                                      uint32_t domid,
+                                      libxl_domain_build_info *b_info,
+                                      libxl__domain_build_state *state)
+{
+    int nid, nr_vmemrange, rc;
+    uint32_t nr_e820, e820_count;
+    struct e820entry map[E820MAX];
+    xen_vmemrange_t *vmemranges;
+    unsigned int array_size;
+
+    /* If e820_host is not set, call the generic function */
+    if (!(b_info->type == LIBXL_DOMAIN_TYPE_PV &&
+          libxl_defbool_val(b_info->u.pv.e820_host)))
+        return libxl__vnuma_build_vmemrange_pv_generic(gc, domid, b_info,
+                                                       state);
+
+    assert(state->vmemranges == NULL);
+
+    nr_e820 = E820MAX;
+    rc = e820_host_sanitize(gc, b_info, map, &nr_e820);
+    if (rc) goto out;
+
+    e820_count = 0;
+    nr_vmemrange = 0;
+    vmemranges = NULL;
+    array_size = 0;
+    for (nid = 0; nid < b_info->num_vnuma_nodes; nid++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
+        uint64_t remaining_bytes = (p->memkb << 10), bytes;
+
+        while (remaining_bytes > 0) {
+            if (e820_count >= nr_e820) {
+                rc = ERROR_NOMEM;
+                goto out;
+            }
+
+            /* Skip non RAM region */
+            if (map[e820_count].type != E820_RAM) {
+                e820_count++;
+                continue;
+            }
+
+            if (nr_vmemrange >= array_size) {
+                array_size += 32;
+                GCREALLOC_ARRAY(vmemranges, array_size);
+            }
+
+            bytes = map[e820_count].size >= remaining_bytes ?
+                remaining_bytes : map[e820_count].size;
+
+            vmemranges[nr_vmemrange].start = map[e820_count].addr;
+            vmemranges[nr_vmemrange].end = map[e820_count].addr + bytes;
+
+            if (map[e820_count].size >= remaining_bytes) {
+                map[e820_count].addr += bytes;
+                map[e820_count].size -= bytes;
+            } else {
+                e820_count++;
+            }
+
+            remaining_bytes -= bytes;
+
+            vmemranges[nr_vmemrange].flags = 0;
+            vmemranges[nr_vmemrange].nid = nid;
+            nr_vmemrange++;
+        }
+    }
+
+    state->vmemranges = vmemranges;
+    state->num_vmemranges = nr_vmemrange;
+
+    rc = 0;
+out:
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 10/21] libxl: build, check and pass vNUMA info to Xen for PV guest
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (8 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 09/21] libxl: functions to build vmemranges for PV guest Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 11/21] libxc: indentation change to xc_hvm_build_x86.c Wei Liu
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

Transform the user supplied vNUMA configuration into libxl internal
representations, and finally libxc representations. Check validity of
the configuration along the line.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
1. Use "unsigned" for some variables.
2. Variable name: bit -> j.

Changes in v5:
1. Adapt to change of interface (ditching xc_vnuma_info).

Changes in v4:
1. Adapt to new interfaces.

Changes in v3:
1. Add more commit log.
---
 tools/libxl/libxl_dom.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a16d4a1..b58a19b 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -515,6 +515,51 @@ retry_transaction:
     return 0;
 }
 
+static int set_vnuma_info(libxl__gc *gc, uint32_t domid,
+                          const libxl_domain_build_info *info,
+                          const libxl__domain_build_state *state)
+{
+    int rc = 0;
+    unsigned int i, nr_vdistance;
+    unsigned int *vcpu_to_vnode, *vnode_to_pnode, *vdistance = NULL;
+
+    vcpu_to_vnode = libxl__calloc(gc, info->max_vcpus,
+                                  sizeof(unsigned int));
+    vnode_to_pnode = libxl__calloc(gc, info->num_vnuma_nodes,
+                                   sizeof(unsigned int));
+
+    nr_vdistance = info->num_vnuma_nodes * info->num_vnuma_nodes;
+    vdistance = libxl__calloc(gc, nr_vdistance, sizeof(unsigned int));
+
+    for (i = 0; i < info->num_vnuma_nodes; i++) {
+        libxl_vnode_info *v = &info->vnuma_nodes[i];
+        int j;
+
+        /* vnode to pnode mapping */
+        vnode_to_pnode[i] = v->pnode;
+
+        /* vcpu to vnode mapping */
+        libxl_for_each_set_bit(j, v->vcpus)
+            vcpu_to_vnode[j] = i;
+
+        /* node distances */
+        assert(info->num_vnuma_nodes == v->num_distances);
+        memcpy(vdistance + (i * info->num_vnuma_nodes),
+               v->distances,
+               v->num_distances * sizeof(unsigned int));
+    }
+
+    if (xc_domain_setvnuma(CTX->xch, domid, info->num_vnuma_nodes,
+                           state->num_vmemranges, info->max_vcpus,
+                           state->vmemranges, vdistance,
+                           vcpu_to_vnode, vnode_to_pnode) < 0) {
+        LOGE(ERROR, "xc_domain_setvnuma failed");
+        rc = ERROR_FAIL;
+    }
+
+    return rc;
+}
+
 int libxl__build_pv(libxl__gc *gc, uint32_t domid,
              libxl_domain_build_info *info, libxl__domain_build_state *state)
 {
@@ -572,6 +617,38 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     dom->xenstore_domid = state->store_domid;
     dom->claim_enabled = libxl_defbool_val(info->claim_mode);
 
+    if (info->num_vnuma_nodes != 0) {
+        unsigned int i;
+
+        ret = libxl__vnuma_build_vmemrange_pv(gc, domid, info, state);
+        if (ret) {
+            LOGE(ERROR, "cannot build vmemranges");
+            goto out;
+        }
+        ret = libxl__vnuma_config_check(gc, info, state);
+        if (ret) goto out;
+
+        ret = set_vnuma_info(gc, domid, info, state);
+        if (ret) goto out;
+
+        dom->nr_vmemranges = state->num_vmemranges;
+        dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges) *
+                                        dom->nr_vmemranges);
+
+        for (i = 0; i < dom->nr_vmemranges; i++) {
+            dom->vmemranges[i].start = state->vmemranges[i].start;
+            dom->vmemranges[i].end   = state->vmemranges[i].end;
+            dom->vmemranges[i].flags = state->vmemranges[i].flags;
+            dom->vmemranges[i].nid   = state->vmemranges[i].nid;
+        }
+
+        dom->nr_vnodes = info->num_vnuma_nodes;
+        dom->vnode_to_pnode = xc_dom_malloc(dom, sizeof(*dom->vnode_to_pnode) *
+                                            dom->nr_vnodes);
+        for (i = 0; i < info->num_vnuma_nodes; i++)
+            dom->vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+    }
+
     if ( (ret = xc_dom_boot_xen_init(dom, ctx->xch, domid)) != 0 ) {
         LOGE(ERROR, "xc_dom_boot_xen_init failed");
         goto out;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 11/21] libxc: indentation change to xc_hvm_build_x86.c
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (9 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 10/21] libxl: build, check and pass vNUMA info to Xen " Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 12/21] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

Move a while loop in xc_hvm_build_x86 one block to the right. No
functional change introduced.

Functional changes will be introduced in next patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/xc_hvm_build_x86.c | 153 ++++++++++++++++++++++-------------------
 1 file changed, 81 insertions(+), 72 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index c81a25b..ecc3224 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -353,98 +353,107 @@ static int setup_guest(xc_interface *xch,
     cur_pages = 0xc0;
     stat_normal_pages = 0xc0;
 
-    while ( (rc == 0) && (nr_pages > cur_pages) )
     {
-        /* Clip count to maximum 1GB extent. */
-        unsigned long count = nr_pages - cur_pages;
-        unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
-
-        if ( count > max_pages )
-            count = max_pages;
-
-        cur_pfn = page_array[cur_pages];
-
-        /* Take care the corner cases of super page tails */
-        if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-             (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
-            count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
-        else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-                  (count > SUPERPAGE_1GB_NR_PFNS) )
-            count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
-
-        /* Attemp to allocate 1GB super page. Because in each pass we only
-         * allocate at most 1GB, we don't have to clip super page boundaries.
-         */
-        if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
-             /* Check if there exists MMIO hole in the 1GB memory range */
-             !check_mmio_hole(cur_pfn << PAGE_SHIFT,
-                              SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
-                              mmio_start, mmio_size) )
+        while ( (rc == 0) && (nr_pages > cur_pages) )
         {
-            long done;
-            unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
-            xen_pfn_t sp_extents[nr_extents];
-
-            for ( i = 0; i < nr_extents; i++ )
-                sp_extents[i] = page_array[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
-
-            done = xc_domain_populate_physmap(xch, dom, nr_extents, SUPERPAGE_1GB_SHIFT,
-                                              pod_mode, sp_extents);
-
-            if ( done > 0 )
-            {
-                stat_1gb_pages += done;
-                done <<= SUPERPAGE_1GB_SHIFT;
-                cur_pages += done;
-                count -= done;
-            }
-        }
+            /* Clip count to maximum 1GB extent. */
+            unsigned long count = nr_pages - cur_pages;
+            unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
 
-        if ( count != 0 )
-        {
-            /* Clip count to maximum 8MB extent. */
-            max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
             if ( count > max_pages )
                 count = max_pages;
-            
-            /* Clip partial superpage extents to superpage boundaries. */
-            if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-                 (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
-                count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
-            else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-                      (count > SUPERPAGE_2MB_NR_PFNS) )
-                count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
-
-            /* Attempt to allocate superpage extents. */
-            if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
+
+            cur_pfn = page_array[cur_pages];
+
+            /* Take care the corner cases of super page tails */
+            if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+                 (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
+                count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
+            else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+                      (count > SUPERPAGE_1GB_NR_PFNS) )
+                count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
+
+            /* Attemp to allocate 1GB super page. Because in each pass
+             * we only allocate at most 1GB, we don't have to clip
+             * super page boundaries.
+             */
+            if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
+                 /* Check if there exists MMIO hole in the 1GB memory
+                  * range */
+                 !check_mmio_hole(cur_pfn << PAGE_SHIFT,
+                                  SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
+                                  mmio_start, mmio_size) )
             {
                 long done;
-                unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
+                unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
                 xen_pfn_t sp_extents[nr_extents];
 
                 for ( i = 0; i < nr_extents; i++ )
-                    sp_extents[i] = page_array[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+                    sp_extents[i] =
+                        page_array[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
 
-                done = xc_domain_populate_physmap(xch, dom, nr_extents, SUPERPAGE_2MB_SHIFT,
+                done = xc_domain_populate_physmap(xch, dom, nr_extents,
+                                                  SUPERPAGE_1GB_SHIFT,
                                                   pod_mode, sp_extents);
 
                 if ( done > 0 )
                 {
-                    stat_2mb_pages += done;
-                    done <<= SUPERPAGE_2MB_SHIFT;
+                    stat_1gb_pages += done;
+                    done <<= SUPERPAGE_1GB_SHIFT;
                     cur_pages += done;
                     count -= done;
                 }
             }
-        }
 
-        /* Fall back to 4kB extents. */
-        if ( count != 0 )
-        {
-            rc = xc_domain_populate_physmap_exact(
-                xch, dom, count, 0, pod_mode, &page_array[cur_pages]);
-            cur_pages += count;
-            stat_normal_pages += count;
+            if ( count != 0 )
+            {
+                /* Clip count to maximum 8MB extent. */
+                max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
+                if ( count > max_pages )
+                    count = max_pages;
+
+                /* Clip partial superpage extents to superpage
+                 * boundaries. */
+                if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
+                     (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
+                    count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
+                else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
+                          (count > SUPERPAGE_2MB_NR_PFNS) )
+                    count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
+
+                /* Attempt to allocate superpage extents. */
+                if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
+                {
+                    long done;
+                    unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
+                    xen_pfn_t sp_extents[nr_extents];
+
+                    for ( i = 0; i < nr_extents; i++ )
+                        sp_extents[i] =
+                            page_array[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+
+                    done = xc_domain_populate_physmap(xch, dom, nr_extents,
+                                                      SUPERPAGE_2MB_SHIFT,
+                                                      pod_mode, sp_extents);
+
+                    if ( done > 0 )
+                    {
+                        stat_2mb_pages += done;
+                        done <<= SUPERPAGE_2MB_SHIFT;
+                        cur_pages += done;
+                        count -= done;
+                    }
+                }
+            }
+
+            /* Fall back to 4kB extents. */
+            if ( count != 0 )
+            {
+                rc = xc_domain_populate_physmap_exact(
+                    xch, dom, count, 0, pod_mode, &page_array[cur_pages]);
+                cur_pages += count;
+                stat_normal_pages += count;
+            }
         }
     }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 12/21] libxc: allocate memory with vNUMA information for HVM guest
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (10 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 11/21] libxc: indentation change to xc_hvm_build_x86.c Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 13/21] libxl: build, check and pass vNUMA info to Xen " Wei Liu
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

The algorithm is more or less the same as the one used for PV guest.
Libxc gets hold of the mapping of vnode to pnode and size of each vnode
then allocate memory accordingly.

And then the function returns low memory end, high memory end and mmio
start to caller. Libxl needs those values to construct vmemranges for
that guest.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v7:
1. Fall back to define our own XC_NUMA_NO_NODE.

Changes in v6:
1. Use XEN_NUMA_NO_NODE.
2. Fix a minor bug discovered by Dario.

Changes in v5:
1. Use a better loop variable name vnid.

Changes in v4:
1. Adapt to new interface.
2. Shorten error message.
3. This patch includes only functional changes.

Changes in v3:
1. Rewrite commit log.
2. Add a few code comments.
---
 tools/libxc/include/xenguest.h |  11 +++++
 tools/libxc/xc_hvm_build_x86.c | 102 ++++++++++++++++++++++++++++++++++-------
 2 files changed, 97 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index b7a924f..601b108 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -232,6 +232,17 @@ struct xc_hvm_build_args {
     struct xc_hvm_firmware_module smbios_module;
     /* Whether to use claim hypercall (1 - enable, 0 - disable). */
     int claim_enabled;
+
+    /* vNUMA information*/
+    xen_vmemrange_t *vmemranges;
+    unsigned int nr_vmemranges;
+    unsigned int *vnode_to_pnode;
+    unsigned int nr_vnodes;
+
+    /* Out parameters  */
+    uint64_t lowmem_end;
+    uint64_t highmem_end;
+    uint64_t mmio_start;
 };
 
 /**
diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index ecc3224..2a1fd68 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args,
 }
 
 static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-                           uint64_t mmio_start, uint64_t mmio_size)
+                           uint64_t mmio_start, uint64_t mmio_size,
+                           struct xc_hvm_build_args *args)
 {
     struct hvm_info_table *hvm_info = (struct hvm_info_table *)
         (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
@@ -119,6 +120,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
     hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
     hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
 
+    args->lowmem_end = lowmem_end;
+    args->highmem_end = highmem_end;
+    args->mmio_start = mmio_start;
+
     /* Finish with the checksum. */
     for ( i = 0, sum = 0; i < hvm_info->length; i++ )
         sum += ((uint8_t *)hvm_info)[i];
@@ -244,7 +249,7 @@ static int setup_guest(xc_interface *xch,
                        char *image, unsigned long image_size)
 {
     xen_pfn_t *page_array = NULL;
-    unsigned long i, nr_pages = args->mem_size >> PAGE_SHIFT;
+    unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
     uint64_t mmio_start = (1ull << 32) - args->mmio_size;
     uint64_t mmio_size = args->mmio_size;
@@ -258,13 +263,13 @@ static int setup_guest(xc_interface *xch,
     xen_capabilities_info_t caps;
     unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, 
         stat_1gb_pages = 0;
-    int pod_mode = 0;
+    unsigned int memflags = 0;
     int claim_enabled = args->claim_enabled;
     xen_pfn_t special_array[NR_SPECIAL_PAGES];
     xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
-
-    if ( nr_pages > target_pages )
-        pod_mode = XENMEMF_populate_on_demand;
+    uint64_t total_pages;
+    xen_vmemrange_t dummy_vmemrange;
+    unsigned int dummy_vnode_to_pnode;
 
     memset(&elf, 0, sizeof(elf));
     if ( elf_init(&elf, image, image_size) != 0 )
@@ -276,6 +281,43 @@ static int setup_guest(xc_interface *xch,
     v_start = 0;
     v_end = args->mem_size;
 
+    if ( nr_pages > target_pages )
+        memflags |= XENMEMF_populate_on_demand;
+
+    if ( args->nr_vmemranges == 0 )
+    {
+        /* Build dummy vnode information */
+        dummy_vmemrange.start = 0;
+        dummy_vmemrange.end   = args->mem_size;
+        dummy_vmemrange.flags = 0;
+        dummy_vmemrange.nid   = 0;
+        args->nr_vmemranges = 1;
+        args->vmemranges = &dummy_vmemrange;
+
+        dummy_vnode_to_pnode = XC_NUMA_NO_NODE;
+        args->nr_vnodes = 1;
+        args->vnode_to_pnode = &dummy_vnode_to_pnode;
+    }
+    else
+    {
+        if ( nr_pages > target_pages )
+        {
+            PERROR("Cannot enable vNUMA and PoD at the same time");
+            goto error_out;
+        }
+    }
+
+    total_pages = 0;
+    for ( i = 0; i < args->nr_vmemranges; i++ )
+        total_pages += ((args->vmemranges[i].end - args->vmemranges[i].start)
+                        >> PAGE_SHIFT);
+    if ( total_pages != (args->mem_size >> PAGE_SHIFT) )
+    {
+        PERROR("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%"PRIx64")",
+               total_pages, args->mem_size >> PAGE_SHIFT);
+        goto error_out;
+    }
+
     if ( xc_version(xch, XENVER_capabilities, &caps) != 0 )
     {
         PERROR("Could not get Xen capabilities");
@@ -320,7 +362,7 @@ static int setup_guest(xc_interface *xch,
         }
     }
 
-    if ( pod_mode )
+    if ( memflags & XENMEMF_populate_on_demand )
     {
         /*
          * Subtract VGA_HOLE_SIZE from target_pages for the VGA
@@ -349,15 +391,37 @@ static int setup_guest(xc_interface *xch,
      * ensure that we can be preempted and hence dom0 remains responsive.
      */
     rc = xc_domain_populate_physmap_exact(
-        xch, dom, 0xa0, 0, pod_mode, &page_array[0x00]);
-    cur_pages = 0xc0;
-    stat_normal_pages = 0xc0;
+        xch, dom, 0xa0, 0, memflags, &page_array[0x00]);
 
+    stat_normal_pages = 0;
+    for ( vmemid = 0; vmemid < args->nr_vmemranges; vmemid++ )
     {
-        while ( (rc == 0) && (nr_pages > cur_pages) )
+        unsigned int new_memflags = memflags;
+        uint64_t end_pages;
+        unsigned int vnode = args->vmemranges[vmemid].nid;
+        unsigned int pnode = args->vnode_to_pnode[vnode];
+
+        if ( pnode != XC_NUMA_NO_NODE )
+            new_memflags |= XENMEMF_exact_node(pnode);
+
+        end_pages = args->vmemranges[vmemid].end >> PAGE_SHIFT;
+        /*
+         * Consider vga hole belongs to the vmemrange that covers
+         * 0xA0000-0xC0000. Note that 0x00000-0xA0000 is populated just
+         * before this loop.
+         */
+        if ( args->vmemranges[vmemid].start == 0 )
+        {
+            cur_pages = 0xc0;
+            stat_normal_pages += 0xc0;
+        }
+        else
+            cur_pages = args->vmemranges[vmemid].start >> PAGE_SHIFT;
+
+        while ( (rc == 0) && (end_pages > cur_pages) )
         {
             /* Clip count to maximum 1GB extent. */
-            unsigned long count = nr_pages - cur_pages;
+            unsigned long count = end_pages - cur_pages;
             unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
 
             if ( count > max_pages )
@@ -394,7 +458,7 @@ static int setup_guest(xc_interface *xch,
 
                 done = xc_domain_populate_physmap(xch, dom, nr_extents,
                                                   SUPERPAGE_1GB_SHIFT,
-                                                  pod_mode, sp_extents);
+                                                  memflags, sp_extents);
 
                 if ( done > 0 )
                 {
@@ -434,7 +498,7 @@ static int setup_guest(xc_interface *xch,
 
                     done = xc_domain_populate_physmap(xch, dom, nr_extents,
                                                       SUPERPAGE_2MB_SHIFT,
-                                                      pod_mode, sp_extents);
+                                                      memflags, sp_extents);
 
                     if ( done > 0 )
                     {
@@ -450,11 +514,14 @@ static int setup_guest(xc_interface *xch,
             if ( count != 0 )
             {
                 rc = xc_domain_populate_physmap_exact(
-                    xch, dom, count, 0, pod_mode, &page_array[cur_pages]);
+                    xch, dom, count, 0, new_memflags, &page_array[cur_pages]);
                 cur_pages += count;
                 stat_normal_pages += count;
             }
         }
+
+        if ( rc != 0 )
+            break;
     }
 
     if ( rc != 0 )
@@ -478,7 +545,7 @@ static int setup_guest(xc_interface *xch,
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
         goto error_out;
-    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size);
+    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
     munmap(hvm_info_page, PAGE_SIZE);
 
     /* Allocate and clear special pages. */
@@ -617,6 +684,9 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
             args.acpi_module.guest_addr_out;
         hvm_args->smbios_module.guest_addr_out = 
             args.smbios_module.guest_addr_out;
+        hvm_args->lowmem_end = args.lowmem_end;
+        hvm_args->highmem_end = args.highmem_end;
+        hvm_args->mmio_start = args.mmio_start;
     }
 
     free(image);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 13/21] libxl: build, check and pass vNUMA info to Xen for HVM guest
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (11 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 12/21] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 14/21] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Dario Faggioli, Wei Liu, Ian Campbell,
	Elena Ufimtseva

Transform user supplied vNUMA configuration into libxl internal
representations then libxc representations. Check validity along the
line.

Libxc has more involvement in building vmemranges in HVM case compared
to PV case. The building of vmemranges is placed after xc_hvm_build
returns, because it relies on memory hole information provided by
xc_hvm_build.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Elena Ufimtseva <ufimtseva@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
1. Fix a minor bug discovered by Dario.

Changes in v5:
1. Check vnode 0 is large enough to accommodate video ram.

Changes in v4:
1. Adapt to new interface.
2. Rename some variables.
3. Use GCREALLOC_ARRAY.

Changes in v3:
1. Rewrite commit log.
---
 tools/libxl/libxl_create.c   |  9 +++++++
 tools/libxl/libxl_dom.c      | 43 ++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |  5 ++++
 tools/libxl/libxl_vnuma.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 113 insertions(+)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 98687bd..af04248 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -853,6 +853,15 @@ static void initiate_domain_create(libxl__egc *egc,
         goto error_out;
     }
 
+    /* Disallow PoD and vNUMA to be enabled at the same time because PoD
+     * pool is not vNUMA-aware yet.
+     */
+    if (pod_enabled && d_config->b_info.num_vnuma_nodes) {
+        ret = ERROR_INVAL;
+        LOG(ERROR, "Cannot enable PoD and vNUMA at the same time");
+        goto error_out;
+    }
+
     ret = libxl__domain_create_info_setdefault(gc, &d_config->c_info);
     if (ret) goto error_out;
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index b58a19b..c1a409d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -893,12 +893,55 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
+    if (info->num_vnuma_nodes != 0) {
+        int i;
+
+        args.nr_vmemranges = state->num_vmemranges;
+        args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
+                                        args.nr_vmemranges);
+
+        for (i = 0; i < args.nr_vmemranges; i++) {
+            args.vmemranges[i].start = state->vmemranges[i].start;
+            args.vmemranges[i].end   = state->vmemranges[i].end;
+            args.vmemranges[i].flags = state->vmemranges[i].flags;
+            args.vmemranges[i].nid   = state->vmemranges[i].nid;
+        }
+
+        /* Consider video ram belongs to vmemrange 0 -- just shrink it
+         * by the size of video ram.
+         */
+        if (((args.vmemranges[0].end - args.vmemranges[0].start) >> 10)
+            < info->video_memkb) {
+            LOG(ERROR, "vmemrange 0 too small to contain video ram");
+            goto out;
+        }
+
+        args.vmemranges[0].end -= (info->video_memkb << 10);
+
+        args.nr_vnodes = info->num_vnuma_nodes;
+        args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) *
+                                            args.nr_vnodes);
+        for (i = 0; i < args.nr_vnodes; i++)
+            args.vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+    }
+
     ret = xc_hvm_build(ctx->xch, domid, &args);
     if (ret) {
         LOGEV(ERROR, ret, "hvm building failed");
         goto out;
     }
 
+    if (info->num_vnuma_nodes != 0) {
+        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
+        if (ret) {
+            LOGEV(ERROR, ret, "hvm build vmemranges failed");
+            goto out;
+        }
+        ret = libxl__vnuma_config_check(gc, info, state);
+        if (ret) goto out;
+        ret = set_vnuma_info(gc, domid, info, state);
+        if (ret) goto out;
+    }
     ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
                                &state->store_mfn, state->console_port,
                                &state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7d1e1cf..e93089a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3408,6 +3408,11 @@ int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
                                     uint32_t domid,
                                     libxl_domain_build_info *b_info,
                                     libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
+                                     uint32_t domid,
+                                     libxl_domain_build_info *b_info,
+                                     libxl__domain_build_state *state,
+                                     struct xc_hvm_build_args *args);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index bef3cc5..72339f7 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -182,6 +182,62 @@ int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
     return libxl__arch_vnuma_build_vmemrange(gc, domid, b_info, state);
 }
 
+/* Build vmemranges for HVM guest */
+int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
+                                     uint32_t domid,
+                                     libxl_domain_build_info *b_info,
+                                     libxl__domain_build_state *state,
+                                     struct xc_hvm_build_args *args)
+{
+    uint64_t hole_start, hole_end, next;
+    int nid, nr_vmemrange;
+    xen_vmemrange_t *vmemranges;
+
+    /* Derive vmemranges from vnode size and memory hole.
+     *
+     * Guest physical address space layout:
+     * [0, hole_start) [hole_start, hole_end) [hole_end, highmem_end)
+     */
+    hole_start = args->lowmem_end < args->mmio_start ?
+        args->lowmem_end : args->mmio_start;
+    hole_end = (args->mmio_start + args->mmio_size) > (1ULL << 32) ?
+        (args->mmio_start + args->mmio_size) : (1ULL << 32);
+
+    assert(state->vmemranges == NULL);
+
+    next = 0;
+    nr_vmemrange = 0;
+    vmemranges = NULL;
+    for (nid = 0; nid < b_info->num_vnuma_nodes; nid++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
+        uint64_t remaining_bytes = p->memkb << 10;
+
+        while (remaining_bytes > 0) {
+            uint64_t count = remaining_bytes;
+
+            if (next >= hole_start && next < hole_end)
+                next = hole_end;
+            if ((next < hole_start) && (next + remaining_bytes >= hole_start))
+                count = hole_start - next;
+
+            GCREALLOC_ARRAY(vmemranges, nr_vmemrange+1);
+            vmemranges[nr_vmemrange].start = next;
+            vmemranges[nr_vmemrange].end = next + count;
+            vmemranges[nr_vmemrange].flags = 0;
+            vmemranges[nr_vmemrange].nid = nid;
+
+            nr_vmemrange++;
+            remaining_bytes -= count;
+            next += count;
+        }
+    }
+
+    state->vmemranges = vmemranges;
+    state->num_vmemranges = nr_vmemrange;
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 14/21] libxl: disallow memory relocation when vNUMA is enabled
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (12 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 13/21] libxl: build, check and pass vNUMA info to Xen " Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 15/21] libxl: define LIBXL_HAVE_VNUMA Wei Liu
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Disallow memory relocation when vNUMA is enabled, because relocated
memory ends up off node. Further more, even if we dynamically expand
node coverage in hvmloader, low memory and high memory may reside
in different physical nodes, blindly relocating low memory to high
memory gives us a sub-optimal configuration.

Introduce a function called libxl__vnuma_configured and use it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
1. Introduce a helper function.
---
 tools/libxl/libxl_dm.c       | 6 ++++--
 tools/libxl/libxl_internal.h | 1 +
 tools/libxl/libxl_vnuma.c    | 5 +++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..7b09512 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1365,13 +1365,15 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
                         libxl__sprintf(gc, "%s/hvmloader/bios", path),
                         "%s", libxl_bios_type_to_string(b_info->u.hvm.bios));
         /* Disable relocating memory to make the MMIO hole larger
-         * unless we're running qemu-traditional */
+         * unless we're running qemu-traditional and vNUMA is not
+         * configured. */
         libxl__xs_write(gc, XBT_NULL,
                         libxl__sprintf(gc,
                                        "%s/hvmloader/allow-memory-relocate",
                                        path),
                         "%d",
-                        b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL);
+                        b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
+                        !libxl__vnuma_configured(b_info));
         free(path);
     }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e93089a..d04b6aa 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3413,6 +3413,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
                                      libxl_domain_build_info *b_info,
                                      libxl__domain_build_state *state,
                                      struct xc_hvm_build_args *args);
+bool libxl__vnuma_configured(const libxl_domain_build_info *b_info);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
                                    const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index 72339f7..aad297e 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -17,6 +17,11 @@
 #include "libxl_arch.h"
 #include <stdlib.h>
 
+bool libxl__vnuma_configured(const libxl_domain_build_info *b_info)
+{
+    return b_info->num_vnuma_nodes != 0;
+}
+
 /* Sort vmemranges in ascending order with "start" */
 static int compare_vmemrange(const void *a, const void *b)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 15/21] libxl: define LIBXL_HAVE_VNUMA
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (13 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 14/21] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-11 15:03   ` Ian Campbell
  2015-03-09 12:51 ` [PATCH v7 16/21] libxlu: rework internal representation of setting Wei Liu
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v6:
1. Better description of the macro.
---
 tools/libxl/libxl.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index e3d2ae8..97a7c9c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -67,6 +67,13 @@
  * the same $(XEN_VERSION) (e.g. throughout a major release).
  */
 
+/* LIBXL_HAVE_VNUMA
+ *
+ * If this is defined the type libxl_vnode_info exists, and a
+ * field 'vnuma_nodes' is present in libxl_domain_build_info.
+ */
+#define LIBXL_HAVE_VNUMA 1
+
 /* LIBXL_HAVE_USERDATA_UNLINK
  *
  * If it is defined, libxl has a library function called
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 16/21] libxlu: rework internal representation of setting
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (14 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 15/21] libxl: define LIBXL_HAVE_VNUMA Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 17/21] libxlu: nested list support Wei Liu
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

This patches does following things:

1. Properly define a XLU_ConfigList type. Originally it was defined to
   be XLU_ConfigSetting.
2. Define XLU_ConfigValue type, which can be either a string or a list
   of XLU_ConfigValue.
3. ConfigSetting now references XLU_ConfigValue. Originally it only
   worked with **string.
4. Properly construct list where necessary, see changes to .y file.

To achieve above changes:

1. xlu__cfg_set_mk and xlu__cfg_set_add are deleted, because they
   are no more needed in the new code.
2. Introduce xlu__cfg_string_mk to make a XLU_ConfigSetting that points
   to a XLU_ConfigValue that wraps a string.
3. Introduce xlu__cfg_list_mk to make a XLU_ConfigSetting that points
   to XLU_ConfigValue that is a list.
4. The parser now generates XLU_ConfigValue instead of XLU_ConfigSetting
   when construct values, which enables us to recursively generate list
   of lists.
5. XLU_ConfigSetting is generated in xlu__cfg_set_store.
6. Adapt other functions to use new types.

No change to public API. Xl compiles without problem and 'xl create -n
guest.cfg' is valgrind clean.

This patch is needed because we're going to implement nested list
support, which requires support for list of list.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v5:
1. Use standard expanding-array pattern.
---
 tools/libxl/libxlu_cfg.c      | 170 ++++++++++++++++++++++++++++++------------
 tools/libxl/libxlu_cfg_i.h    |  12 ++-
 tools/libxl/libxlu_cfg_y.c    |  24 +++---
 tools/libxl/libxlu_cfg_y.h    |   2 +-
 tools/libxl/libxlu_cfg_y.y    |  14 ++--
 tools/libxl/libxlu_internal.h |  30 ++++++--
 6 files changed, 173 insertions(+), 79 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 22adcb0..f000eed 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -131,14 +131,28 @@ int xlu_cfg_readdata(XLU_Config *cfg, const char *data, int length) {
     return ctx.err;
 }
 
-void xlu__cfg_set_free(XLU_ConfigSetting *set) {
+void xlu__cfg_value_free(XLU_ConfigValue *value)
+{
     int i;
 
+    if (!value) return;
+
+    switch (value->type) {
+    case XLU_STRING:
+        free(value->u.string);
+        break;
+    case XLU_LIST:
+        for (i = 0; i < value->u.list.nvalues; i++)
+            xlu__cfg_value_free(value->u.list.values[i]);
+        free(value->u.list.values);
+    }
+    free(value);
+}
+
+void xlu__cfg_set_free(XLU_ConfigSetting *set) {
     if (!set) return;
     free(set->name);
-    for (i=0; i<set->nvalues; i++)
-        free(set->values[i]);
-    free(set->values);
+    xlu__cfg_value_free(set->value);
     free(set);
 }
 
@@ -173,7 +187,7 @@ static int find_atom(const XLU_Config *cfg, const char *n,
     set= find(cfg,n);
     if (!set) return ESRCH;
 
-    if (set->avalues!=1) {
+    if (set->value->type!=XLU_STRING) {
         if (!dont_warn)
             fprintf(cfg->report,
                     "%s:%d: warning: parameter `%s' is"
@@ -191,7 +205,7 @@ int xlu_cfg_get_string(const XLU_Config *cfg, const char *n,
     int e;
 
     e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
-    *value_r= set->values[0];
+    *value_r= set->value->u.string;
     return 0;
 }
 
@@ -202,7 +216,7 @@ int xlu_cfg_replace_string(const XLU_Config *cfg, const char *n,
 
     e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
     free(*value_r);
-    *value_r= strdup(set->values[0]);
+    *value_r= strdup(set->value->u.string);
     return 0;
 }
 
@@ -214,7 +228,7 @@ int xlu_cfg_get_long(const XLU_Config *cfg, const char *n,
     char *ep;
 
     e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
-    errno= 0; l= strtol(set->values[0], &ep, 0);
+    errno= 0; l= strtol(set->value->u.string, &ep, 0);
     e= errno;
     if (errno) {
         e= errno;
@@ -226,7 +240,7 @@ int xlu_cfg_get_long(const XLU_Config *cfg, const char *n,
                     cfg->config_source, set->lineno, n, strerror(e));
         return e;
     }
-    if (*ep || ep==set->values[0]) {
+    if (*ep || ep==set->value->u.string) {
         if (!dont_warn)
             fprintf(cfg->report,
                     "%s:%d: warning: parameter `%s' is not a valid number\n",
@@ -253,7 +267,7 @@ int xlu_cfg_get_list(const XLU_Config *cfg, const char *n,
                      XLU_ConfigList **list_r, int *entries_r, int dont_warn) {
     XLU_ConfigSetting *set;
     set= find(cfg,n);  if (!set) return ESRCH;
-    if (set->avalues==1) {
+    if (set->value->type!=XLU_LIST) {
         if (!dont_warn) {
             fprintf(cfg->report,
                     "%s:%d: warning: parameter `%s' is a single value"
@@ -262,8 +276,8 @@ int xlu_cfg_get_list(const XLU_Config *cfg, const char *n,
         }
         return EINVAL;
     }
-    if (list_r) *list_r= set;
-    if (entries_r) *entries_r= set->nvalues;
+    if (list_r) *list_r= &set->value->u.list;
+    if (entries_r) *entries_r= set->value->u.list.nvalues;
     return 0;
 }
 
@@ -290,72 +304,130 @@ int xlu_cfg_get_list_as_string_list(const XLU_Config *cfg, const char *n,
     return 0;
 }
 
-const char *xlu_cfg_get_listitem(const XLU_ConfigList *set, int entry) {
-    if (entry < 0 || entry >= set->nvalues) return 0;
-    return set->values[entry];
+const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
+    if (entry < 0 || entry >= list->nvalues) return 0;
+    if (list->values[entry]->type != XLU_STRING) return 0;
+    return list->values[entry]->u.string;
 }
 
 
-XLU_ConfigSetting *xlu__cfg_set_mk(CfgParseContext *ctx,
-                                   int alloc, char *atom) {
-    XLU_ConfigSetting *set= 0;
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
+{
+    XLU_ConfigValue *value = NULL;
 
     if (ctx->err) goto x;
-    assert(!!alloc == !!atom);
 
-    set= malloc(sizeof(*set));
-    if (!set) goto xe;
+    value = malloc(sizeof(*value));
+    if (!value) goto xe;
+    value->type = XLU_STRING;
+    value->u.string = atom;
+
+    return value;
 
-    set->name= 0; /* tbd */
-    set->avalues= alloc;
+ xe:
+    ctx->err= errno;
+ x:
+    free(value);
+    free(atom);
+    return NULL;
+}
 
-    if (!alloc) {
-        set->nvalues= 0;
-        set->values= 0;
-    } else {
-        set->values= malloc(sizeof(*set->values) * alloc);
-        if (!set->values) goto xe;
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
+{
+    XLU_ConfigValue *value = NULL;
+    XLU_ConfigValue **values = NULL;
+    XLU_ConfigValue *val = NULL;
 
-        set->nvalues= 1;
-        set->values[0]= atom;
-    }
-    return set;
+    if (ctx->err) goto x;
+
+    val = malloc(sizeof(*val));
+    if (!val) goto xe;
+    val->type = XLU_STRING;
+    val->u.string = atom;
+
+    values = malloc(sizeof(*values));
+    if (!values) goto xe;
+    values[0] = val;
+
+    value = malloc(sizeof(*value));
+    if (!value) goto xe;
+    value->type = XLU_LIST;
+    value->u.list.nvalues = 1;
+    value->u.list.avalues = 1;
+    value->u.list.values = values;
+
+    return value;
 
  xe:
     ctx->err= errno;
  x:
-    free(set);
+    free(value);
+    free(values);
+    free(val);
     free(atom);
-    return 0;
+    return NULL;
 }
 
-void xlu__cfg_set_add(CfgParseContext *ctx, XLU_ConfigSetting *set,
-                      char *atom) {
+void xlu__cfg_list_append(CfgParseContext *ctx,
+                          XLU_ConfigValue *list,
+                          char *atom)
+{
+    XLU_ConfigValue *val = NULL;
     if (ctx->err) return;
 
     assert(atom);
+    assert(list->type == XLU_LIST);
 
-    if (set->nvalues >= set->avalues) {
+    if (list->u.list.nvalues >= list->u.list.avalues) {
         int new_avalues;
-        char **new_values;
-
-        if (set->avalues > INT_MAX / 100) { ctx->err= ERANGE; return; }
-        new_avalues= set->avalues * 4;
-        new_values= realloc(set->values,
-                            sizeof(*new_values) * new_avalues);
-        if (!new_values) { ctx->err= errno; free(atom); return; }
-        set->values= new_values;
-        set->avalues= new_avalues;
+        XLU_ConfigValue **new_values = NULL;
+
+        if (list->u.list.avalues > INT_MAX / 100) {
+            ctx->err = ERANGE;
+            free(atom);
+            return;
+        }
+
+        new_avalues = list->u.list.avalues * 4;
+        new_values  = realloc(list->u.list.values,
+                              sizeof(*new_values) * new_avalues);
+        if (!new_values) {
+            ctx->err = errno;
+            free(atom);
+            return;
+        }
+
+        list->u.list.avalues = new_avalues;
+        list->u.list.values  = new_values;
+    }
+
+    val = malloc(sizeof(*val));
+    if (!val) {
+        ctx->err = errno;
+        free(atom);
+        return;
     }
-    set->values[set->nvalues++]= atom;
+
+    val->type = XLU_STRING;
+    val->u.string = atom;
+    list->u.list.values[list->u.list.nvalues] = val;
+    list->u.list.nvalues++;
 }
 
 void xlu__cfg_set_store(CfgParseContext *ctx, char *name,
-                        XLU_ConfigSetting *set, int lineno) {
+                        XLU_ConfigValue *val, int lineno) {
+    XLU_ConfigSetting *set;
+
     if (ctx->err) return;
 
     assert(name);
+    set = malloc(sizeof(*set));
+    if (!set) {
+        ctx->err = errno;
+        return;
+    }
     set->name= name;
+    set->value = val;
     set->lineno= lineno;
     set->next= ctx->cfg->settings;
     ctx->cfg->settings= set;
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index 54d033c..b71e9fd 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -23,11 +23,15 @@
 #include "libxlu_cfg_y.h"
 
 void xlu__cfg_set_free(XLU_ConfigSetting *set);
-XLU_ConfigSetting *xlu__cfg_set_mk(CfgParseContext*, int alloc, char *atom);
-void xlu__cfg_set_add(CfgParseContext*, XLU_ConfigSetting *set, char *atom);
 void xlu__cfg_set_store(CfgParseContext*, char *name,
-                        XLU_ConfigSetting *set, int lineno);
-
+                        XLU_ConfigValue *val, int lineno);
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
+                                    char *atom);
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom);
+void xlu__cfg_list_append(CfgParseContext *ctx,
+                          XLU_ConfigValue *list,
+                          char *atom);
+void xlu__cfg_value_free(XLU_ConfigValue *value);
 char *xlu__cfgl_strdup(CfgParseContext*, const char *src);
 char *xlu__cfgl_dequote(CfgParseContext*, const char *src);
 
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index 07b5a1d..eb3884f 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -126,7 +126,7 @@ typedef union YYSTYPE
 #line 25 "libxlu_cfg_y.y"
 
   char *string;
-  XLU_ConfigSetting *setting;
+  XLU_ConfigValue *value;
 
 
 
@@ -1148,7 +1148,7 @@ yydestruct (yymsg, yytype, yyvaluep, yylocationp, ctx)
 
 /* Line 1391 of yacc.c  */
 #line 43 "libxlu_cfg_y.y"
-	{ xlu__cfg_set_free((yyvaluep->setting)); };
+	{ xlu__cfg_value_free((yyvaluep->value)); };
 
 /* Line 1391 of yacc.c  */
 #line 1155 "libxlu_cfg_y.c"
@@ -1166,7 +1166,7 @@ yydestruct (yymsg, yytype, yyvaluep, yylocationp, ctx)
 
 /* Line 1391 of yacc.c  */
 #line 43 "libxlu_cfg_y.y"
-	{ xlu__cfg_set_free((yyvaluep->setting)); };
+	{ xlu__cfg_value_free((yyvaluep->value)); };
 
 /* Line 1391 of yacc.c  */
 #line 1173 "libxlu_cfg_y.c"
@@ -1175,7 +1175,7 @@ yydestruct (yymsg, yytype, yyvaluep, yylocationp, ctx)
 
 /* Line 1391 of yacc.c  */
 #line 43 "libxlu_cfg_y.y"
-	{ xlu__cfg_set_free((yyvaluep->setting)); };
+	{ xlu__cfg_value_free((yyvaluep->value)); };
 
 /* Line 1391 of yacc.c  */
 #line 1182 "libxlu_cfg_y.c"
@@ -1508,21 +1508,21 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 57 "libxlu_cfg_y.y"
-    { xlu__cfg_set_store(ctx,(yyvsp[(1) - (3)].string),(yyvsp[(3) - (3)].setting),(yylsp[(3) - (3)]).first_line); }
+    { xlu__cfg_set_store(ctx,(yyvsp[(1) - (3)].string),(yyvsp[(3) - (3)].value),(yylsp[(3) - (3)]).first_line); }
     break;
 
   case 12:
 
 /* Line 1806 of yacc.c  */
 #line 62 "libxlu_cfg_y.y"
-    { (yyval.setting)= xlu__cfg_set_mk(ctx,1,(yyvsp[(1) - (1)].string)); }
+    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
     break;
 
   case 13:
 
 /* Line 1806 of yacc.c  */
 #line 63 "libxlu_cfg_y.y"
-    { (yyval.setting)= (yyvsp[(3) - (4)].setting); }
+    { (yyval.value)= (yyvsp[(3) - (4)].value); }
     break;
 
   case 14:
@@ -1543,35 +1543,35 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 68 "libxlu_cfg_y.y"
-    { (yyval.setting)= xlu__cfg_set_mk(ctx,0,0); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
     break;
 
   case 17:
 
 /* Line 1806 of yacc.c  */
 #line 69 "libxlu_cfg_y.y"
-    { (yyval.setting)= (yyvsp[(1) - (1)].setting); }
+    { (yyval.value)= (yyvsp[(1) - (1)].value); }
     break;
 
   case 18:
 
 /* Line 1806 of yacc.c  */
 #line 70 "libxlu_cfg_y.y"
-    { (yyval.setting)= (yyvsp[(1) - (3)].setting); }
+    { (yyval.value)= (yyvsp[(1) - (3)].value); }
     break;
 
   case 19:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.setting)= xlu__cfg_set_mk(ctx,2,(yyvsp[(1) - (2)].string)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].string)); }
     break;
 
   case 20:
 
 /* Line 1806 of yacc.c  */
 #line 73 "libxlu_cfg_y.y"
-    { xlu__cfg_set_add(ctx,(yyvsp[(1) - (5)].setting),(yyvsp[(4) - (5)].string)); (yyval.setting)= (yyvsp[(1) - (5)].setting); }
+    { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].string)); (yyval.value)= (yyvsp[(1) - (5)].value); }
     break;
 
 
diff --git a/tools/libxl/libxlu_cfg_y.h b/tools/libxl/libxlu_cfg_y.h
index d7dfaf2..37e8213 100644
--- a/tools/libxl/libxlu_cfg_y.h
+++ b/tools/libxl/libxlu_cfg_y.h
@@ -54,7 +54,7 @@ typedef union YYSTYPE
 #line 25 "libxlu_cfg_y.y"
 
   char *string;
-  XLU_ConfigSetting *setting;
+  XLU_ConfigValue *value;
 
 
 
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 5acd438..6848686 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -24,7 +24,7 @@
 
 %union {
   char *string;
-  XLU_ConfigSetting *setting;
+  XLU_ConfigValue *value;
 }
 
 %locations
@@ -39,8 +39,8 @@
 %type <string>            atom
 %destructor { free($$); } atom IDENT STRING NUMBER
 
-%type <setting>                         value valuelist values
-%destructor { xlu__cfg_set_free($$); }  value valuelist values
+%type <value>                             value valuelist values
+%destructor { xlu__cfg_value_free($$); }  value valuelist values
 
 %%
 
@@ -59,18 +59,18 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
 endstmt: NEWLINE
  |      ';'
 
-value:  atom                         { $$= xlu__cfg_set_mk(ctx,1,$1); }
+value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
  |      '[' nlok valuelist ']'       { $$= $3; }
 
 atom:   STRING                   { $$= $1; }
  |      NUMBER                   { $$= $1; }
 
-valuelist: /* empty */           { $$= xlu__cfg_set_mk(ctx,0,0); }
+valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: atom nlok                  { $$= xlu__cfg_set_mk(ctx,2,$1); }
- |      values ',' nlok atom nlok  { xlu__cfg_set_add(ctx,$1,$4); $$= $1; }
+values: atom nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+ |      values ',' nlok atom nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
         /* nothing */
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 7579158..092a17a 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -23,17 +23,35 @@
 #include <assert.h>
 #include <regex.h>
 
-#define XLU_ConfigList XLU_ConfigSetting
-
 #include "libxlutil.h"
 
-struct XLU_ConfigSetting { /* transparent */
+enum XLU_ConfigValueType {
+    XLU_STRING,
+    XLU_LIST,
+};
+
+typedef struct XLU_ConfigValue XLU_ConfigValue;
+
+typedef struct XLU_ConfigList {
+    int avalues; /* available slots */
+    int nvalues; /* actual occupied slots */
+    XLU_ConfigValue **values;
+} XLU_ConfigList;
+
+struct XLU_ConfigValue {
+    enum XLU_ConfigValueType type;
+    union {
+        char *string;
+        XLU_ConfigList list;
+    } u;
+};
+
+typedef struct XLU_ConfigSetting { /* transparent */
     struct XLU_ConfigSetting *next;
     char *name;
-    int nvalues, avalues; /* lists have avalues>1 */
-    char **values;
+    XLU_ConfigValue *value;
     int lineno;
-};
+} XLU_ConfigSetting;
 
 struct XLU_Config {
     XLU_ConfigSetting *settings;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 17/21] libxlu: nested list support
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (15 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 16/21] libxlu: rework internal representation of setting Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 18/21] libxlu: record location when parsing values Wei Liu
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

1. Extend grammar of parser.
2. Adjust internal functions to accept XLU_ConfigValue instead of
   char *.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxlu_cfg.c   | 30 +++++++-----------------------
 tools/libxl/libxlu_cfg_i.h |  5 +++--
 tools/libxl/libxlu_cfg_y.c | 26 +++++++++++++-------------
 tools/libxl/libxlu_cfg_y.y |  4 ++--
 4 files changed, 25 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index f000eed..611f5ec 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -332,19 +332,14 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
     return NULL;
 }
 
-XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
+                                  XLU_ConfigValue *val)
 {
     XLU_ConfigValue *value = NULL;
     XLU_ConfigValue **values = NULL;
-    XLU_ConfigValue *val = NULL;
 
     if (ctx->err) goto x;
 
-    val = malloc(sizeof(*val));
-    if (!val) goto xe;
-    val->type = XLU_STRING;
-    val->u.string = atom;
-
     values = malloc(sizeof(*values));
     if (!values) goto xe;
     values[0] = val;
@@ -363,19 +358,17 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
  x:
     free(value);
     free(values);
-    free(val);
-    free(atom);
+    xlu__cfg_value_free(val);
     return NULL;
 }
 
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
-                          char *atom)
+                          XLU_ConfigValue *val)
 {
-    XLU_ConfigValue *val = NULL;
     if (ctx->err) return;
 
-    assert(atom);
+    assert(val);
     assert(list->type == XLU_LIST);
 
     if (list->u.list.nvalues >= list->u.list.avalues) {
@@ -384,7 +377,7 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
 
         if (list->u.list.avalues > INT_MAX / 100) {
             ctx->err = ERANGE;
-            free(atom);
+            xlu__cfg_value_free(val);
             return;
         }
 
@@ -393,7 +386,7 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
                               sizeof(*new_values) * new_avalues);
         if (!new_values) {
             ctx->err = errno;
-            free(atom);
+            xlu__cfg_value_free(val);
             return;
         }
 
@@ -401,15 +394,6 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
         list->u.list.values  = new_values;
     }
 
-    val = malloc(sizeof(*val));
-    if (!val) {
-        ctx->err = errno;
-        free(atom);
-        return;
-    }
-
-    val->type = XLU_STRING;
-    val->u.string = atom;
     list->u.list.values[list->u.list.nvalues] = val;
     list->u.list.nvalues++;
 }
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index b71e9fd..11dc33f 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -27,10 +27,11 @@ void xlu__cfg_set_store(CfgParseContext*, char *name,
                         XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
                                     char *atom);
-XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom);
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
+                                  XLU_ConfigValue *val);
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
-                          char *atom);
+                          XLU_ConfigValue *val);
 void xlu__cfg_value_free(XLU_ConfigValue *value);
 char *xlu__cfgl_strdup(CfgParseContext*, const char *src);
 char *xlu__cfgl_dequote(CfgParseContext*, const char *src);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index eb3884f..b05e48b 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -377,7 +377,7 @@ union yyalloc
 /* YYFINAL -- State number of the termination state.  */
 #define YYFINAL  3
 /* YYLAST -- Last index in YYTABLE.  */
-#define YYLAST   24
+#define YYLAST   25
 
 /* YYNTOKENS -- Number of terminals.  */
 #define YYNTOKENS  12
@@ -444,8 +444,8 @@ static const yytype_int8 yyrhs[] =
       15,    -1,    16,    17,    -1,    17,    -1,     1,     6,    -1,
        3,     7,    18,    -1,     6,    -1,     8,    -1,    19,    -1,
        9,    22,    20,    10,    -1,     4,    -1,     5,    -1,    -1,
-      21,    -1,    21,    11,    22,    -1,    19,    22,    -1,    21,
-      11,    22,    19,    22,    -1,    -1,    22,     6,    -1
+      21,    -1,    21,    11,    22,    -1,    18,    22,    -1,    21,
+      11,    22,    18,    22,    -1,    -1,    22,     6,    -1
 };
 
 /* YYRLINE[YYN] -- source line where rule number YYN was defined.  */
@@ -517,14 +517,14 @@ static const yytype_int8 yydefgoto[] =
 static const yytype_int8 yypact[] =
 {
      -18,     4,     0,   -18,    -1,     6,   -18,   -18,   -18,     3,
-     -18,   -18,    11,   -18,   -18,   -18,   -18,   -18,   -18,    13,
-     -18,   -18,    12,    10,    17,   -18,   -18,    13,   -18,    17
+     -18,   -18,    14,   -18,   -18,   -18,   -18,   -18,   -18,    11,
+     -18,   -18,    12,    10,    18,   -18,   -18,    11,   -18,    18
 };
 
 /* YYPGOTO[NTERM-NUM].  */
 static const yytype_int8 yypgoto[] =
 {
-     -18,   -18,   -18,   -18,   -18,    15,   -18,   -17,   -18,   -18,
+     -18,   -18,   -18,   -18,   -18,    16,   -17,   -18,   -18,   -18,
      -14
 };
 
@@ -535,8 +535,8 @@ static const yytype_int8 yypgoto[] =
 static const yytype_int8 yytable[] =
 {
       -2,     4,    21,     5,     3,    11,     6,    24,     7,     6,
-      28,     7,    27,    12,    29,    14,    15,    14,    15,    20,
-      16,    26,    25,    20,    13
+      28,     7,    27,    12,    29,    14,    15,    20,    14,    15,
+      16,    26,    25,    16,    20,    13
 };
 
 #define yypact_value_is_default(yystate) \
@@ -548,8 +548,8 @@ static const yytype_int8 yytable[] =
 static const yytype_uint8 yycheck[] =
 {
        0,     1,    19,     3,     0,     6,     6,    21,     8,     6,
-      27,     8,    26,     7,    28,     4,     5,     4,     5,     6,
-       9,    11,    10,     6,     9
+      27,     8,    26,     7,    28,     4,     5,     6,     4,     5,
+       9,    11,    10,     9,     6,     9
 };
 
 /* YYSTOS[STATE-NUM] -- The (internal number of the) accessing
@@ -558,7 +558,7 @@ static const yytype_uint8 yystos[] =
 {
        0,    13,    14,     0,     1,     3,     6,     8,    15,    16,
       17,     6,     7,    17,     4,     5,     9,    18,    19,    22,
-       6,    19,    20,    21,    22,    10,    11,    22,    19,    22
+       6,    18,    20,    21,    22,    10,    11,    22,    18,    22
 };
 
 #define yyerrok		(yyerrstatus = 0)
@@ -1564,14 +1564,14 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].string)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
     break;
 
   case 20:
 
 /* Line 1806 of yacc.c  */
 #line 73 "libxlu_cfg_y.y"
-    { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].string)); (yyval.value)= (yyvsp[(1) - (5)].value); }
+    { xlu__cfg_list_append(ctx,(yyvsp[(1) - (5)].value),(yyvsp[(4) - (5)].value)); (yyval.value)= (yyvsp[(1) - (5)].value); }
     break;
 
 
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 6848686..4a5ca3a 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -69,8 +69,8 @@ valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: atom nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
- |      values ',' nlok atom nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
+values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+ |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
         /* nothing */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 18/21] libxlu: record location when parsing values
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (16 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 17/21] libxlu: nested list support Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-11 15:12   ` Ian Campbell
  2015-03-09 12:51 ` [PATCH v7 19/21] libxlu: introduce new APIs Wei Liu
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Originally only setting has line number recorded. Since we're moving to
more sophisticated API, record the location for individual value. It is
useful for error reporting.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v7:
1. Use yylloc in empty rule.
1. Use YYLTYPE instead of individual line / column values.
---
 tools/libxl/libxlu_cfg.c      | 14 ++++++++++----
 tools/libxl/libxlu_cfg_i.h    |  5 +++--
 tools/libxl/libxlu_cfg_y.c    |  6 +++---
 tools/libxl/libxlu_cfg_y.y    |  6 +++---
 tools/libxl/libxlu_internal.h |  2 ++
 5 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 611f5ec..858f894 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -311,16 +311,19 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
 }
 
 
-XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom,
+                                    YYLTYPE *loc)
 {
     XLU_ConfigValue *value = NULL;
 
     if (ctx->err) goto x;
 
-    value = malloc(sizeof(*value));
+    value = malloc(sizeof(*value)+sizeof(*loc));
     if (!value) goto xe;
     value->type = XLU_STRING;
     value->u.string = atom;
+    value->loc = (YYLTYPE *)(value+1);
+    memcpy(value->loc, loc, sizeof(*loc));
 
     return value;
 
@@ -333,7 +336,8 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
 }
 
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val)
+                                  XLU_ConfigValue *val,
+                                  YYLTYPE *loc)
 {
     XLU_ConfigValue *value = NULL;
     XLU_ConfigValue **values = NULL;
@@ -344,12 +348,14 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
     if (!values) goto xe;
     values[0] = val;
 
-    value = malloc(sizeof(*value));
+    value = malloc(sizeof(*value)+sizeof(*loc));
     if (!value) goto xe;
     value->type = XLU_LIST;
     value->u.list.nvalues = 1;
     value->u.list.avalues = 1;
     value->u.list.values = values;
+    value->loc = (YYLTYPE *)(value+1);
+    memcpy(value->loc, loc, sizeof(*loc));
 
     return value;
 
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index 11dc33f..1b59b33 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -26,9 +26,10 @@ void xlu__cfg_set_free(XLU_ConfigSetting *set);
 void xlu__cfg_set_store(CfgParseContext*, char *name,
                         XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
-                                    char *atom);
+                                    char *atom, YYLTYPE *loc);
 XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
-                                  XLU_ConfigValue *val);
+                                  XLU_ConfigValue *val,
+                                  YYLTYPE *loc);
 void xlu__cfg_list_append(CfgParseContext *ctx,
                           XLU_ConfigValue *list,
                           XLU_ConfigValue *val);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index b05e48b..fbfdd0f 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -1515,7 +1515,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 62 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
+    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string),&(yylsp[(1) - (1)])); }
     break;
 
   case 13:
@@ -1543,7 +1543,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 68 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL,&yylloc); }
     break;
 
   case 17:
@@ -1564,7 +1564,7 @@ yyreduce:
 
 /* Line 1806 of yacc.c  */
 #line 72 "libxlu_cfg_y.y"
-    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
+    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value),&(yylsp[(1) - (2)])); }
     break;
 
   case 20:
diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
index 4a5ca3a..a923f76 100644
--- a/tools/libxl/libxlu_cfg_y.y
+++ b/tools/libxl/libxlu_cfg_y.y
@@ -59,17 +59,17 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
 endstmt: NEWLINE
  |      ';'
 
-value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
+value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1,&@1); }
  |      '[' nlok valuelist ']'       { $$= $3; }
 
 atom:   STRING                   { $$= $1; }
  |      NUMBER                   { $$= $1; }
 
-valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
+valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL,&yylloc); }
  |      values                  { $$= $1; }
  |      values ',' nlok         { $$= $1; }
 
-values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
+values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1,&@1); }
  |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
 
 nlok:
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 092a17a..cc1d400 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -38,12 +38,14 @@ typedef struct XLU_ConfigList {
     XLU_ConfigValue **values;
 } XLU_ConfigList;
 
+typedef struct YYLTYPE YYLTYPE;
 struct XLU_ConfigValue {
     enum XLU_ConfigValueType type;
     union {
         char *string;
         XLU_ConfigList list;
     } u;
+    YYLTYPE *loc;
 };
 
 typedef struct XLU_ConfigSetting { /* transparent */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 19/21] libxlu: introduce new APIs
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (17 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 18/21] libxlu: record location when parsing values Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 20/21] xl: introduce xcalloc Wei Liu
  2015-03-09 12:51 ` [PATCH v7 21/21] xl: vNUMA support Wei Liu
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

These APIs can be used to manipulate XLU_ConfigValue and XLU_ConfigList.

APIs introduced:
1. xlu_cfg_value_type
2. xlu_cfg_value_get_string
3. xlu_cfg_value_get_list
4. xlu_cfg_get_listitem2

Move some definitions from private header to public header as needed.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v6:
1. Report value's line and column number on error.

Changes in v5:
1. Use calling convention like old APIs.
---
 tools/libxl/libxlu_cfg.c      | 47 +++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxlu_internal.h |  7 -------
 tools/libxl/libxlutil.h       | 13 ++++++++++++
 3 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 858f894..5a115cb 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -199,6 +199,53 @@ static int find_atom(const XLU_Config *cfg, const char *n,
     return 0;
 }
 
+
+enum XLU_ConfigValueType xlu_cfg_value_type(const XLU_ConfigValue *value)
+{
+    return value->type;
+}
+
+int xlu_cfg_value_get_string(const XLU_Config *cfg, XLU_ConfigValue *value,
+                             char **value_r, int dont_warn)
+{
+    if (value->type != XLU_STRING) {
+        if (!dont_warn)
+            fprintf(cfg->report,
+                    "%s:%d:%d: warning: value is not a string\n",
+                    cfg->config_source, value->loc->first_line,
+                    value->loc->first_column);
+        *value_r = NULL;
+        return EINVAL;
+    }
+
+    *value_r = value->u.string;
+    return 0;
+}
+
+int xlu_cfg_value_get_list(const XLU_Config *cfg, XLU_ConfigValue *value,
+                           XLU_ConfigList **value_r, int dont_warn)
+{
+    if (value->type != XLU_LIST) {
+        if (!dont_warn)
+            fprintf(cfg->report,
+                    "%s:%d:%d: warning: value is not a list\n",
+                    cfg->config_source, value->loc->first_line,
+                    value->loc->first_column);
+        *value_r = NULL;
+        return EINVAL;
+    }
+
+    *value_r = &value->u.list;
+    return 0;
+}
+
+XLU_ConfigValue *xlu_cfg_get_listitem2(const XLU_ConfigList *list,
+                                       int entry)
+{
+    if (entry < 0 || entry >= list->nvalues) return NULL;
+    return list->values[entry];
+}
+
 int xlu_cfg_get_string(const XLU_Config *cfg, const char *n,
                        const char **value_r, int dont_warn) {
     XLU_ConfigSetting *set;
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index cc1d400..b6bd63f 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -25,13 +25,6 @@
 
 #include "libxlutil.h"
 
-enum XLU_ConfigValueType {
-    XLU_STRING,
-    XLU_LIST,
-};
-
-typedef struct XLU_ConfigValue XLU_ConfigValue;
-
 typedef struct XLU_ConfigList {
     int avalues; /* available slots */
     int nvalues; /* actual occupied slots */
diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h
index 0333e55..989605a 100644
--- a/tools/libxl/libxlutil.h
+++ b/tools/libxl/libxlutil.h
@@ -20,9 +20,15 @@
 
 #include "libxl.h"
 
+enum XLU_ConfigValueType {
+    XLU_STRING,
+    XLU_LIST,
+};
+
 /* Unless otherwise stated, all functions return an errno value. */
 typedef struct XLU_Config XLU_Config;
 typedef struct XLU_ConfigList XLU_ConfigList;
+typedef struct XLU_ConfigValue XLU_ConfigValue;
 
 XLU_Config *xlu_cfg_init(FILE *report, const char *report_filename);
   /* 0 means we got ENOMEM. */
@@ -66,6 +72,13 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList*, int entry);
   /* xlu_cfg_get_listitem cannot fail, except that if entry is
    * out of range it returns 0 (not setting errno) */
 
+enum XLU_ConfigValueType xlu_cfg_value_type(const XLU_ConfigValue *value);
+int xlu_cfg_value_get_string(const XLU_Config *cfg,  XLU_ConfigValue *value,
+                             char **value_r, int dont_warn);
+int xlu_cfg_value_get_list(const XLU_Config *cfg, XLU_ConfigValue *value,
+                           XLU_ConfigList **value_r, int dont_warn);
+XLU_ConfigValue *xlu_cfg_get_listitem2(const XLU_ConfigList *list,
+                                       int entry);
 
 /*
  * Disk specification parsing.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 20/21] xl: introduce xcalloc
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (18 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 19/21] libxlu: introduce new APIs Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-09 12:51 ` [PATCH v7 21/21] xl: vNUMA support Wei Liu
  20 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v6:
1. Join two lines to make code more compact.
2. Use %zu and drop casting.
---
 tools/libxl/xl_cmdimpl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index e41f633..689bda2 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -289,6 +289,16 @@ static void *xmalloc(size_t sz) {
     return r;
 }
 
+static void *xcalloc(size_t n, size_t sz) __attribute__((unused));
+static void *xcalloc(size_t n, size_t sz) {
+    void *r = calloc(n, sz);
+    if (!r) {
+        fprintf(stderr,"xl: Unable to calloc %zu bytes.\n", sz*n);
+        exit(-ERROR_FAIL);
+    }
+    return r;
+}
+
 static void *xrealloc(void *ptr, size_t sz) {
     void *r;
     if (!sz) { free(ptr); return 0; }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v7 21/21] xl: vNUMA support
  2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
                   ` (19 preceding siblings ...)
  2015-03-09 12:51 ` [PATCH v7 20/21] xl: introduce xcalloc Wei Liu
@ 2015-03-09 12:51 ` Wei Liu
  2015-03-11 15:12   ` Ian Campbell
  20 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-09 12:51 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

This patch includes configuration options parser and documentation.

Please find the hunk to xl.cfg.pod.5 for more information.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v7:
1. Fill in max_memkb with vNUMA memory settings.
2. Check vcpus specified in vnuma matches maxvcpus=.
3. Update manpage.
4. Drop Dario's reviewed-by due to above changes.

Changes in v6:
1. Disable NUMA auto-placement.
---
 docs/man/xl.cfg.pod.5    |  59 +++++++++++++++++-
 tools/libxl/xl_cmdimpl.c | 158 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 214 insertions(+), 3 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 408653f..4fb8dc3 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -41,8 +41,8 @@ value).
 
 =item B<[ VALUE, VALUE, ... ]>
 
-A list of C<VALUES> of the above types. Lists are homogeneous and are
-not nested.
+A list of C<VALUES> of the above types. Lists can be heterogeneous and
+nested.
 
 =back
 
@@ -266,6 +266,61 @@ it will crash.
 
 =back
 
+=head3 Guest Virtual NUMA Configuration
+
+=over 4
+
+=item B<vnuma=[ VNODE_SPEC, VNODE_SPEC, ... ]
+
+Specify virtual NUMA configuration with positional arguments. The
+nth B<VNODE_SPEC> in the list specifies the configuration of nth
+virtual node.
+
+Each B<VNODE_SPEC> is a list, which has a form of
+"[VNODE_CONFIG_OPTION,VNODE_CONFIG_OPTION, ... ]"  (without quotes).
+
+For example vnuma = [ ["pnode=0","size=512","vcpus=0-4","vdistances=10,20"] ]
+means vnode 0 is mapped to pnode 0, has 512MB ram, has vcpus 0 to 4, the
+distance to itself is 10 and the distance to vnode 1 is 20.
+
+Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
+B<VNODE_CONFIG_OPTION>s are:
+
+=over 4
+
+=item B<pnode=NUMBER>
+
+Specify which physical node this virtual node maps to.
+
+=item B<size=MBYTES>
+
+Specify the size of this virtual node. The sum of memory size of all
+vnodes will become B<maxmem=>. If B<maxmem=> is specified separately,
+a check is performed to make sure the sum of all vnode memory matches
+B<maxmem=>.
+
+=item B<vcpus=CPU-STRING>
+
+Specify which vcpus belong to this node. B<CPU-STRING> is a string
+separated by comma. You can specify range and single cpu. An example
+is "vcpus=0-5,8", which means you specify vcpu 0 to vcpu 5, and vcpu
+8.
+
+=item B<vdistances=NUMBER, NUMBER, ... >
+
+Specify virtual distance from this node to all nodes (including
+itself) with positional arguments. For example, "vdistance=10,20"
+for vnode 0 means the distance from vnode 0 to vnode 0 is 10, from
+vnode 0 to vnode 1 is 20. The number of arguments supplied must match
+the total number of vnodes.
+
+Normally you can use the values from "xl info -n" or "numactl
+--hardware" to fill in vdistance list.
+
+=back
+
+=back
+
 =head3 Event Actions
 
 =over 4
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 689bda2..6ffd01c 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -158,7 +158,6 @@ struct domain_create {
 };
 
 
-static uint32_t find_domain(const char *p) __attribute__((warn_unused_result));
 static uint32_t find_domain(const char *p)
 {
     uint32_t domid;
@@ -987,6 +986,161 @@ static int parse_nic_config(libxl_device_nic *nic, XLU_Config **config, char *to
     return 0;
 }
 
+static void parse_vnuma_config(const XLU_Config *config,
+                               libxl_domain_build_info *b_info)
+{
+    libxl_physinfo physinfo;
+    uint32_t nr_nodes;
+    XLU_ConfigList *vnuma;
+    int i, j, len, num_vnuma, total_cpus = 0;
+    long l;
+    unsigned long max_memkb1 = 0, max_memkb2 = 0;
+
+    libxl_physinfo_init(&physinfo);
+    if (libxl_get_physinfo(ctx, &physinfo) != 0) {
+        libxl_physinfo_dispose(&physinfo);
+        fprintf(stderr, "libxl_get_physinfo failed\n");
+        exit(1);
+    }
+
+    nr_nodes = physinfo.nr_nodes;
+    libxl_physinfo_dispose(&physinfo);
+
+    if (xlu_cfg_get_list(config, "vnuma", &vnuma, &num_vnuma, 1))
+        return;
+
+    if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
+        max_memkb1 = l * 1024;
+
+    b_info->num_vnuma_nodes = num_vnuma;
+    b_info->vnuma_nodes = xcalloc(num_vnuma, sizeof(libxl_vnode_info));
+
+    for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+        libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+        libxl_vnode_info_init(p);
+        libxl_cpu_bitmap_alloc(ctx, &p->vcpus, b_info->max_vcpus);
+        libxl_bitmap_set_none(&p->vcpus);
+        p->distances = xcalloc(b_info->num_vnuma_nodes,
+                               sizeof(*p->distances));
+        p->num_distances = b_info->num_vnuma_nodes;
+    }
+
+    for (i = 0; i < num_vnuma; i++) {
+        XLU_ConfigValue *vnode_spec, *conf_option;
+        XLU_ConfigList *vnode_config_list;
+        int conf_count;
+        libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+        vnode_spec = xlu_cfg_get_listitem2(vnuma, i);
+        assert(vnode_spec);
+
+        xlu_cfg_value_get_list(config, vnode_spec, &vnode_config_list, 0);
+        if (!vnode_config_list) {
+            fprintf(stderr, "xl: cannot get vnode config option list\n");
+            exit(1);
+        }
+
+        for (conf_count = 0;
+             (conf_option =
+              xlu_cfg_get_listitem2(vnode_config_list, conf_count));
+             conf_count++) {
+
+            if (xlu_cfg_value_type(conf_option) == XLU_STRING) {
+                char *buf, *option_untrimmed, *value_untrimmed;
+                char *option, *value;
+                char *endptr;
+                unsigned long val;
+
+                xlu_cfg_value_get_string(config, conf_option, &buf, 0);
+
+                if (!buf) continue;
+
+                if (split_string_into_pair(buf, "=",
+                                           &option_untrimmed,
+                                           &value_untrimmed)) {
+                    fprintf(stderr, "xl: failed to split \"%s\" into pair\n",
+                            buf);
+                    exit(1);
+                }
+                trim(isspace, option_untrimmed, &option);
+                trim(isspace, value_untrimmed, &value);
+
+#define ABORT_IF_FAILED(str)                                            \
+                do {                                                    \
+                    if (endptr == value || val == ULONG_MAX) {          \
+                        fprintf(stderr,                                 \
+                                "xl: failed to convert \"%s\" to number\n", \
+                                (str));                                 \
+                        exit(1);                                        \
+                    }                                                   \
+                } while (0)
+
+                if (!strcmp("pnode", option)) {
+                    val = strtoul(value, &endptr, 10);
+                    ABORT_IF_FAILED(value);
+                    if (val >= nr_nodes) {
+                        fprintf(stderr,
+                                "xl: invalid pnode number: %lu\n", val);
+                        exit(1);
+                    }
+                    p->pnode = val;
+                    libxl_defbool_set(&b_info->numa_placement, false);
+                } else if (!strcmp("size", option)) {
+                    val = strtoul(value, &endptr, 10);
+                    ABORT_IF_FAILED(value);
+                    p->memkb = val << 10;
+                    max_memkb2 += p->memkb;
+                } else if (!strcmp("vcpus", option)) {
+                    libxl_string_list cpu_spec_list;
+                    int cpu;
+                    unsigned long s, e;
+
+                    split_string_into_string_list(value, ",", &cpu_spec_list);
+                    len = libxl_string_list_length(&cpu_spec_list);
+
+                    for (j = 0; j < len; j++) {
+                        parse_range(cpu_spec_list[j], &s, &e);
+                        for (cpu = s; cpu <=e; cpu++)
+                            libxl_bitmap_set(&p->vcpus, cpu);
+                    }
+                    total_cpus += (e - s + 1);
+                    libxl_string_list_dispose(&cpu_spec_list);
+                } else if (!strcmp("vdistances", option)) {
+                    libxl_string_list vdist;
+
+                    split_string_into_string_list(value, ",", &vdist);
+                    len = libxl_string_list_length(&vdist);
+
+                    for (j = 0; j < len; j++) {
+                        val = strtoul(vdist[j], &endptr, 10);
+                        ABORT_IF_FAILED(vdist[j]);
+                        p->distances[j] = val;
+                    }
+                    libxl_string_list_dispose(&vdist);
+                }
+#undef ABORT_IF_FAILED
+                free(option);
+                free(value);
+                free(option_untrimmed);
+                free(value_untrimmed);
+            }
+        }
+    }
+
+    if (total_cpus != b_info->max_vcpus) {
+        fprintf(stderr, "xl: vnuma vcpus and maxvcpus= mismatch\n");
+        exit(1);
+    }
+
+    /* User has specified maxmem= */
+    if (max_memkb1 != 0 && max_memkb1 != max_memkb2) {
+        fprintf(stderr, "xl: maxmem and vnuma memory size mismatch\n");
+        exit(1);
+    } else
+        b_info->max_memkb = max_memkb2;
+}
+
 static void parse_config_data(const char *config_source,
                               const char *config_data,
                               int config_len,
@@ -1177,6 +1331,8 @@ static void parse_config_data(const char *config_source,
         }
     }
 
+    parse_vnuma_config(config, b_info);
+
     if (!xlu_cfg_get_long(config, "rtc_timeoffset", &l, 0))
         b_info->rtc_timeoffset = l;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware
  2015-03-09 12:51 ` [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware Wei Liu
@ 2015-03-09 13:22   ` Jan Beulich
  2015-03-09 13:29     ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-03-09 13:22 UTC (permalink / raw)
  To: Wei Liu; +Cc: AndrewCooper, xen-devel

>>> On 09.03.15 at 13:51, <wei.liu2@citrix.com> wrote:
> @@ -759,6 +784,14 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>              return start_extent;
>          args.domain = d;
>  
> +        if ( construct_memop_from_reservation(d, &reservation, &args) )
> +        {
> +            rcu_unlock_domain(d);
> +            return start_extent;
> +        }
> +        args.nr_done      = start_extent;
> +        args.preempted    = 0;

Remembering Andrew's comment on your patch introducing
construct_memop_from_reservation(), the setting of
args.domain visible in the context above should either be
moved past that call, or be used by the function instead of
passing d explicitly (in which case it is pretty clear that the
function isn't free to clobber that structure field).

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware
  2015-03-09 13:22   ` Jan Beulich
@ 2015-03-09 13:29     ` Wei Liu
  2015-03-09 13:52       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-09 13:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: AndrewCooper, Wei Liu, xen-devel

On Mon, Mar 09, 2015 at 01:22:55PM +0000, Jan Beulich wrote:
> >>> On 09.03.15 at 13:51, <wei.liu2@citrix.com> wrote:
> > @@ -759,6 +784,14 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >              return start_extent;
> >          args.domain = d;
> >  
> > +        if ( construct_memop_from_reservation(d, &reservation, &args) )
> > +        {
> > +            rcu_unlock_domain(d);
> > +            return start_extent;
> > +        }
> > +        args.nr_done      = start_extent;
> > +        args.preempted    = 0;
> 
> Remembering Andrew's comment on your patch introducing
> construct_memop_from_reservation(), the setting of
> args.domain visible in the context above should either be
> moved past that call, or be used by the function instead of
> passing d explicitly (in which case it is pretty clear that the
> function isn't free to clobber that structure field).
> 

My understanding was that he wanted me to move setting args.nr_done and
args.preempted after calling construct_memop_from_reservation. But then
here I missed args.domain. :-/

I will just move "args.domain = d" after the function call, if that's OK
with you.

Wei.

> Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware
  2015-03-09 13:29     ` Wei Liu
@ 2015-03-09 13:52       ` Jan Beulich
  2015-03-09 13:59         ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2015-03-09 13:52 UTC (permalink / raw)
  To: Wei Liu; +Cc: AndrewCooper, xen-devel

>>> On 09.03.15 at 14:29, <wei.liu2@citrix.com> wrote:
> On Mon, Mar 09, 2015 at 01:22:55PM +0000, Jan Beulich wrote:
>> >>> On 09.03.15 at 13:51, <wei.liu2@citrix.com> wrote:
>> > @@ -759,6 +784,14 @@ long do_memory_op(unsigned long cmd, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>> >              return start_extent;
>> >          args.domain = d;
>> >  
>> > +        if ( construct_memop_from_reservation(d, &reservation, &args) )
>> > +        {
>> > +            rcu_unlock_domain(d);
>> > +            return start_extent;
>> > +        }
>> > +        args.nr_done      = start_extent;
>> > +        args.preempted    = 0;
>> 
>> Remembering Andrew's comment on your patch introducing
>> construct_memop_from_reservation(), the setting of
>> args.domain visible in the context above should either be
>> moved past that call, or be used by the function instead of
>> passing d explicitly (in which case it is pretty clear that the
>> function isn't free to clobber that structure field).
>> 
> 
> My understanding was that he wanted me to move setting args.nr_done and
> args.preempted after calling construct_memop_from_reservation. But then
> here I missed args.domain. :-/
> 
> I will just move "args.domain = d" after the function call, if that's OK
> with you.

Yes, as said, either adjustment is fine with me.

But looking at this another time points out another issue with the
patch - you now clobber MEMF_populate_on_demand possibly
being set in args.memflags, i.e. you need to move down that part
too.

Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware
  2015-03-09 13:52       ` Jan Beulich
@ 2015-03-09 13:59         ` Wei Liu
  0 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-09 13:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: AndrewCooper, Wei Liu, xen-devel

On Mon, Mar 09, 2015 at 01:52:28PM +0000, Jan Beulich wrote:
> >>> On 09.03.15 at 14:29, <wei.liu2@citrix.com> wrote:
> > On Mon, Mar 09, 2015 at 01:22:55PM +0000, Jan Beulich wrote:
> >> >>> On 09.03.15 at 13:51, <wei.liu2@citrix.com> wrote:
> >> > @@ -759,6 +784,14 @@ long do_memory_op(unsigned long cmd, 
> > XEN_GUEST_HANDLE_PARAM(void) arg)
> >> >              return start_extent;
> >> >          args.domain = d;
> >> >  
> >> > +        if ( construct_memop_from_reservation(d, &reservation, &args) )
> >> > +        {
> >> > +            rcu_unlock_domain(d);
> >> > +            return start_extent;
> >> > +        }
> >> > +        args.nr_done      = start_extent;
> >> > +        args.preempted    = 0;
> >> 
> >> Remembering Andrew's comment on your patch introducing
> >> construct_memop_from_reservation(), the setting of
> >> args.domain visible in the context above should either be
> >> moved past that call, or be used by the function instead of
> >> passing d explicitly (in which case it is pretty clear that the
> >> function isn't free to clobber that structure field).
> >> 
> > 
> > My understanding was that he wanted me to move setting args.nr_done and
> > args.preempted after calling construct_memop_from_reservation. But then
> > here I missed args.domain. :-/
> > 
> > I will just move "args.domain = d" after the function call, if that's OK
> > with you.
> 
> Yes, as said, either adjustment is fine with me.
> 
> But looking at this another time points out another issue with the
> patch - you now clobber MEMF_populate_on_demand possibly
> being set in args.memflags, i.e. you need to move down that part
> too.
> 

Sure, I will check that as well.

Wei.

> Jan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array
  2015-03-09 12:51 ` [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array Wei Liu
@ 2015-03-09 15:06   ` Konrad Rzeszutek Wilk
  2015-03-11 15:02   ` Ian Campbell
  1 sibling, 0 replies; 39+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-03-09 15:06 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, Ian Campbell, xen-devel

On Mon, Mar 09, 2015 at 12:51:18PM +0000, Wei Liu wrote:
> Currently all in tree code doesn't set the superpage flag, I would just
> remove superpage support if I can, but Konrad wants it retained for the
> moment.
> 
> As I'm going to change the p2m_host array allocation, duplicate the code
> snippet to allocate p2m_host array in this patch, so that we retain the
> behaviour in superpage case.
> 
> This patch introduces no functional change and it will make future patch
> easier to review. Also removed one stray tab while I was there.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Konrad Wilk <konrad.wilk@oracle.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> ---
>  tools/libxc/xc_dom_x86.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
> index bf06fe4..9dbaedb 100644
> --- a/tools/libxc/xc_dom_x86.c
> +++ b/tools/libxc/xc_dom_x86.c
> @@ -772,15 +772,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              return rc;
>      }
>  
> -    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages);
> -    if ( dom->p2m_host == NULL )
> -        return -EINVAL;
> -
>      if ( dom->superpages )
>      {
>          int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
>          xen_pfn_t extents[count];
>  
> +        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
> +                                      dom->total_pages);
> +        if ( dom->p2m_host == NULL )
> +            return -EINVAL;
> +
>          DOMPRINTF("Populating memory with %d superpages", count);
>          for ( pfn = 0; pfn < count; pfn++ )
>              extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
> @@ -809,9 +810,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>                  return rc;
>          }
>          /* setup initial p2m */
> +        dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
> +                                      dom->total_pages);
> +        if ( dom->p2m_host == NULL )
> +            return -EINVAL;
>          for ( pfn = 0; pfn < dom->total_pages; pfn++ )
>              dom->p2m_host[pfn] = pfn;
> -        
> +
>          /* allocate guest memory */
>          for ( i = rc = allocsz = 0;
>                (i < dom->total_pages) && !rc;
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array
  2015-03-09 12:51 ` [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array Wei Liu
  2015-03-09 15:06   ` Konrad Rzeszutek Wilk
@ 2015-03-11 15:02   ` Ian Campbell
  1 sibling, 0 replies; 39+ messages in thread
From: Ian Campbell @ 2015-03-11 15:02 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> Currently all in tree code doesn't set the superpage flag, I would just
> remove superpage support if I can, but Konrad wants it retained for the
> moment.
> 
> As I'm going to change the p2m_host array allocation, duplicate the code
> snippet to allocate p2m_host array in this patch, so that we retain the
> behaviour in superpage case.
> 
> This patch introduces no functional change and it will make future patch
> easier to review. Also removed one stray tab while I was there.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 15/21] libxl: define LIBXL_HAVE_VNUMA
  2015-03-09 12:51 ` [PATCH v7 15/21] libxl: define LIBXL_HAVE_VNUMA Wei Liu
@ 2015-03-11 15:03   ` Ian Campbell
  0 siblings, 0 replies; 39+ messages in thread
From: Ian Campbell @ 2015-03-11 15:03 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(IMHO it is acceptable to include this in the patch which adds the last
bit of functionality which it is expected to cover, but I don't mind it
separate either)

> ---
> Changes in v6:
> 1. Better description of the macro.
> ---
>  tools/libxl/libxl.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index e3d2ae8..97a7c9c 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -67,6 +67,13 @@
>   * the same $(XEN_VERSION) (e.g. throughout a major release).
>   */
>  
> +/* LIBXL_HAVE_VNUMA
> + *
> + * If this is defined the type libxl_vnode_info exists, and a
> + * field 'vnuma_nodes' is present in libxl_domain_build_info.
> + */
> +#define LIBXL_HAVE_VNUMA 1
> +
>  /* LIBXL_HAVE_USERDATA_UNLINK
>   *
>   * If it is defined, libxl has a library function called

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-09 12:51 ` [PATCH v7 21/21] xl: vNUMA support Wei Liu
@ 2015-03-11 15:12   ` Ian Campbell
  2015-03-11 20:14     ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ian Campbell @ 2015-03-11 15:12 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:

> +Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
> +B<VNODE_CONFIG_OPTION>s are:

Which of these are optional and which are mandatory? All mandatory?

> +    if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
> +        max_memkb1 = l * 1024;

I think you should arrange that if the user has specified maxmem then it
has already been parsed and is available in b_info->max_memkb for use in
this function. Reparsing it just leaves scope to get out of sync.

In particular this doesn't handle the defaulting of maxmem to memory.

> +#define ABORT_IF_FAILED(str)                                            \
> +                do {                                                    \
> +                    if (endptr == value || val == ULONG_MAX) {          \

Looking at the uses, I think you mean str here where you have value, no?

Given that you exit() here I think you could write this as a helper
function which did the parsing for you too, i.e. folds in the strtoul.

> +    if (total_cpus != b_info->max_vcpus) {
> +        fprintf(stderr, "xl: vnuma vcpus and maxvcpus= mismatch\n");
> +        exit(1);
> +    }

Can't we do as we do with memory here and set max_vcpus if it isn't
already configured?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 18/21] libxlu: record location when parsing values
  2015-03-09 12:51 ` [PATCH v7 18/21] libxlu: record location when parsing values Wei Liu
@ 2015-03-11 15:12   ` Ian Campbell
  2015-03-11 15:16     ` Wei Liu
  0 siblings, 1 reply; 39+ messages in thread
From: Ian Campbell @ 2015-03-11 15:12 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> Originally only setting has line number recorded. Since we're moving to
> more sophisticated API, record the location for individual value. It is
> useful for error reporting.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>

I'm leaving this one to Ian.

Do we need to rerun something (bison/flex?) on commit?

> ---
> Changes in v7:
> 1. Use yylloc in empty rule.
> 1. Use YYLTYPE instead of individual line / column values.
> ---
>  tools/libxl/libxlu_cfg.c      | 14 ++++++++++----
>  tools/libxl/libxlu_cfg_i.h    |  5 +++--
>  tools/libxl/libxlu_cfg_y.c    |  6 +++---
>  tools/libxl/libxlu_cfg_y.y    |  6 +++---
>  tools/libxl/libxlu_internal.h |  2 ++
>  5 files changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
> index 611f5ec..858f894 100644
> --- a/tools/libxl/libxlu_cfg.c
> +++ b/tools/libxl/libxlu_cfg.c
> @@ -311,16 +311,19 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList *list, int entry) {
>  }
>  
> 
> -XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
> +XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom,
> +                                    YYLTYPE *loc)
>  {
>      XLU_ConfigValue *value = NULL;
>  
>      if (ctx->err) goto x;
>  
> -    value = malloc(sizeof(*value));
> +    value = malloc(sizeof(*value)+sizeof(*loc));
>      if (!value) goto xe;
>      value->type = XLU_STRING;
>      value->u.string = atom;
> +    value->loc = (YYLTYPE *)(value+1);
> +    memcpy(value->loc, loc, sizeof(*loc));
>  
>      return value;
>  
> @@ -333,7 +336,8 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
>  }
>  
>  XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
> -                                  XLU_ConfigValue *val)
> +                                  XLU_ConfigValue *val,
> +                                  YYLTYPE *loc)
>  {
>      XLU_ConfigValue *value = NULL;
>      XLU_ConfigValue **values = NULL;
> @@ -344,12 +348,14 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
>      if (!values) goto xe;
>      values[0] = val;
>  
> -    value = malloc(sizeof(*value));
> +    value = malloc(sizeof(*value)+sizeof(*loc));
>      if (!value) goto xe;
>      value->type = XLU_LIST;
>      value->u.list.nvalues = 1;
>      value->u.list.avalues = 1;
>      value->u.list.values = values;
> +    value->loc = (YYLTYPE *)(value+1);
> +    memcpy(value->loc, loc, sizeof(*loc));
>  
>      return value;
>  
> diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
> index 11dc33f..1b59b33 100644
> --- a/tools/libxl/libxlu_cfg_i.h
> +++ b/tools/libxl/libxlu_cfg_i.h
> @@ -26,9 +26,10 @@ void xlu__cfg_set_free(XLU_ConfigSetting *set);
>  void xlu__cfg_set_store(CfgParseContext*, char *name,
>                          XLU_ConfigValue *val, int lineno);
>  XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
> -                                    char *atom);
> +                                    char *atom, YYLTYPE *loc);
>  XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
> -                                  XLU_ConfigValue *val);
> +                                  XLU_ConfigValue *val,
> +                                  YYLTYPE *loc);
>  void xlu__cfg_list_append(CfgParseContext *ctx,
>                            XLU_ConfigValue *list,
>                            XLU_ConfigValue *val);
> diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
> index b05e48b..fbfdd0f 100644
> --- a/tools/libxl/libxlu_cfg_y.c
> +++ b/tools/libxl/libxlu_cfg_y.c
> @@ -1515,7 +1515,7 @@ yyreduce:
>  
>  /* Line 1806 of yacc.c  */
>  #line 62 "libxlu_cfg_y.y"
> -    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string)); }
> +    { (yyval.value)= xlu__cfg_string_mk(ctx,(yyvsp[(1) - (1)].string),&(yylsp[(1) - (1)])); }
>      break;
>  
>    case 13:
> @@ -1543,7 +1543,7 @@ yyreduce:
>  
>  /* Line 1806 of yacc.c  */
>  #line 68 "libxlu_cfg_y.y"
> -    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL); }
> +    { (yyval.value)= xlu__cfg_list_mk(ctx,NULL,&yylloc); }
>      break;
>  
>    case 17:
> @@ -1564,7 +1564,7 @@ yyreduce:
>  
>  /* Line 1806 of yacc.c  */
>  #line 72 "libxlu_cfg_y.y"
> -    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value)); }
> +    { (yyval.value)= xlu__cfg_list_mk(ctx,(yyvsp[(1) - (2)].value),&(yylsp[(1) - (2)])); }
>      break;
>  
>    case 20:
> diff --git a/tools/libxl/libxlu_cfg_y.y b/tools/libxl/libxlu_cfg_y.y
> index 4a5ca3a..a923f76 100644
> --- a/tools/libxl/libxlu_cfg_y.y
> +++ b/tools/libxl/libxlu_cfg_y.y
> @@ -59,17 +59,17 @@ assignment: IDENT '=' value { xlu__cfg_set_store(ctx,$1,$3,@3.first_line); }
>  endstmt: NEWLINE
>   |      ';'
>  
> -value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1); }
> +value:  atom                         { $$= xlu__cfg_string_mk(ctx,$1,&@1); }
>   |      '[' nlok valuelist ']'       { $$= $3; }
>  
>  atom:   STRING                   { $$= $1; }
>   |      NUMBER                   { $$= $1; }
>  
> -valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL); }
> +valuelist: /* empty */           { $$= xlu__cfg_list_mk(ctx,NULL,&yylloc); }
>   |      values                  { $$= $1; }
>   |      values ',' nlok         { $$= $1; }
>  
> -values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1); }
> +values: value nlok                  { $$= xlu__cfg_list_mk(ctx,$1,&@1); }
>   |      values ',' nlok value nlok  { xlu__cfg_list_append(ctx,$1,$4); $$= $1; }
>  
>  nlok:
> diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
> index 092a17a..cc1d400 100644
> --- a/tools/libxl/libxlu_internal.h
> +++ b/tools/libxl/libxlu_internal.h
> @@ -38,12 +38,14 @@ typedef struct XLU_ConfigList {
>      XLU_ConfigValue **values;
>  } XLU_ConfigList;
>  
> +typedef struct YYLTYPE YYLTYPE;
>  struct XLU_ConfigValue {
>      enum XLU_ConfigValueType type;
>      union {
>          char *string;
>          XLU_ConfigList list;
>      } u;
> +    YYLTYPE *loc;
>  };
>  
>  typedef struct XLU_ConfigSetting { /* transparent */

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 18/21] libxlu: record location when parsing values
  2015-03-11 15:12   ` Ian Campbell
@ 2015-03-11 15:16     ` Wei Liu
  2015-03-11 15:36       ` Ian Campbell
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-11 15:16 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

On Wed, Mar 11, 2015 at 03:12:49PM +0000, Ian Campbell wrote:
> On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> > Originally only setting has line number recorded. Since we're moving to
> > more sophisticated API, record the location for individual value. It is
> > useful for error reporting.
> > 
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> I'm leaving this one to Ian.
> 
> Do we need to rerun something (bison/flex?) on commit?
> 

These commits already contains bison / flex outputs. I used Wheezy's
flex and bison.

I can exclude them from the patch if necessary.

Wei.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 18/21] libxlu: record location when parsing values
  2015-03-11 15:16     ` Wei Liu
@ 2015-03-11 15:36       ` Ian Campbell
  0 siblings, 0 replies; 39+ messages in thread
From: Ian Campbell @ 2015-03-11 15:36 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Wed, 2015-03-11 at 15:16 +0000, Wei Liu wrote:
> On Wed, Mar 11, 2015 at 03:12:49PM +0000, Ian Campbell wrote:
> > On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> > > Originally only setting has line number recorded. Since we're moving to
> > > more sophisticated API, record the location for individual value. It is
> > > useful for error reporting.
> > > 
> > > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > > Cc: Ian Campbell <ian.campbell@citrix.com>
> > > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > 
> > I'm leaving this one to Ian.
> > 
> > Do we need to rerun something (bison/flex?) on commit?
> > 
> 
> These commits already contains bison / flex outputs. I used Wheezy's
> flex and bison.
> 
> I can exclude them from the patch if necessary.

Since you've got the right versions and the diff is small I think we can
live with them there. I think my precommit scripts will force a regen,
which would highlight if there was any skew.

Ian.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-11 15:12   ` Ian Campbell
@ 2015-03-11 20:14     ` Wei Liu
  2015-03-12 10:00       ` Ian Campbell
  2015-03-12 13:22       ` Dario Faggioli
  0 siblings, 2 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-11 20:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

On Wed, Mar 11, 2015 at 03:12:06PM +0000, Ian Campbell wrote:
> On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> 
> > +Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
> > +B<VNODE_CONFIG_OPTION>s are:
> 
> Which of these are optional and which are mandatory? All mandatory?
> 

Yes, all mandatory for now.

> > +    if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
> > +        max_memkb1 = l * 1024;
> 
> I think you should arrange that if the user has specified maxmem then it
> has already been parsed and is available in b_info->max_memkb for use in
> this function. Reparsing it just leaves scope to get out of sync.
> 
> In particular this doesn't handle the defaulting of maxmem to memory.
> 

This is exactly the reason I do this here.  At this point
b_info->max_memkb is always set.

If user has specified maxmem we should check if the configuration he /
she specifies matches the vnuma value to avoid misconfiguration;
otherwise we just use vnuma memory as maxmem.

But see below regarding maxvcpus=.

> > +#define ABORT_IF_FAILED(str)                                            \
> > +                do {                                                    \
> > +                    if (endptr == value || val == ULONG_MAX) {          \
> 
> Looking at the uses, I think you mean str here where you have value, no?
> 

Yes.

> Given that you exit() here I think you could write this as a helper
> function which did the parsing for you too, i.e. folds in the strtoul.
> 

OK.

> > +    if (total_cpus != b_info->max_vcpus) {
> > +        fprintf(stderr, "xl: vnuma vcpus and maxvcpus= mismatch\n");
> > +        exit(1);
> > +    }
> 
> Can't we do as we do with memory here and set max_vcpus if it isn't
> already configured?
> 

This can be done.  It will require moving parse_vnuma_config to
beginning of parse_config_data and stubbing out parsing of maxvcpus
and maxmem if vnuma configuration presents.

Wei.

> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-11 20:14     ` Wei Liu
@ 2015-03-12 10:00       ` Ian Campbell
  2015-03-12 10:50         ` Wei Liu
  2015-03-12 13:22       ` Dario Faggioli
  1 sibling, 1 reply; 39+ messages in thread
From: Ian Campbell @ 2015-03-12 10:00 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Wed, 2015-03-11 at 20:14 +0000, Wei Liu wrote:
> On Wed, Mar 11, 2015 at 03:12:06PM +0000, Ian Campbell wrote:
> > On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> > 
> > > +Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
> > > +B<VNODE_CONFIG_OPTION>s are:
> > 
> > Which of these are optional and which are mandatory? All mandatory?
> > 
> 
> Yes, all mandatory for now.

Please can you say so.
> 
> > > +    if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
> > > +        max_memkb1 = l * 1024;
> > 
> > I think you should arrange that if the user has specified maxmem then it
> > has already been parsed and is available in b_info->max_memkb for use in
> > this function. Reparsing it just leaves scope to get out of sync.
> > 
> > In particular this doesn't handle the defaulting of maxmem to memory.
> > 
> 
> This is exactly the reason I do this here.  At this point
> b_info->max_memkb is always set.
> 
> If user has specified maxmem we should check if the configuration he /
> she specifies matches the vnuma value to avoid misconfiguration;
> otherwise we just use vnuma memory as maxmem.
> 
> But see below regarding maxvcpus=.
[...]
> > > +    if (total_cpus != b_info->max_vcpus) {
> > > +        fprintf(stderr, "xl: vnuma vcpus and maxvcpus= mismatch\n");
> > > +        exit(1);
> > > +    }
> > 
> > Can't we do as we do with memory here and set max_vcpus if it isn't
> > already configured?
> > 
> 
> This can be done.  It will require moving parse_vnuma_config to
> beginning of parse_config_data and stubbing out parsing of maxvcpus
> and maxmem if vnuma configuration presents.

I'm afraid I don't follow.

I was thinking the flow would be:

	Call libxl_foo_init to set everything to the default.

        f ( max_mem in cfg )
                cfg->maxmem = v
        if ( memory in cfg )
                cfg->memory = v
        if ( vcpus in cfg )
                cfg->vcpus = v
                
        do numa parsing, using fact that cfg->{maxmem,memory,vcpus} are
        either default value (in which case they are set here) or
        explicitly set to some value, in which case they are sanity
        checked.

        handle maxmem/memory/vcpus/etc still being default values by
        setting the appropriate defaults.

Ian.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-12 10:00       ` Ian Campbell
@ 2015-03-12 10:50         ` Wei Liu
  2015-03-12 10:58           ` Ian Campbell
  0 siblings, 1 reply; 39+ messages in thread
From: Wei Liu @ 2015-03-12 10:50 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Wei Liu, xen-devel

On Thu, Mar 12, 2015 at 10:00:24AM +0000, Ian Campbell wrote:
> On Wed, 2015-03-11 at 20:14 +0000, Wei Liu wrote:
> > On Wed, Mar 11, 2015 at 03:12:06PM +0000, Ian Campbell wrote:
> > > On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> > > 
> > > > +Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
> > > > +B<VNODE_CONFIG_OPTION>s are:
> > > 
> > > Which of these are optional and which are mandatory? All mandatory?
> > > 
> > 
> > Yes, all mandatory for now.
> 
> Please can you say so.

No problem.

> > 
> > > > +    if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
> > > > +        max_memkb1 = l * 1024;
> > > 
> > > I think you should arrange that if the user has specified maxmem then it
> > > has already been parsed and is available in b_info->max_memkb for use in
> > > this function. Reparsing it just leaves scope to get out of sync.
> > > 
> > > In particular this doesn't handle the defaulting of maxmem to memory.
> > > 
> > 
> > This is exactly the reason I do this here.  At this point
> > b_info->max_memkb is always set.
> > 
> > If user has specified maxmem we should check if the configuration he /
> > she specifies matches the vnuma value to avoid misconfiguration;
> > otherwise we just use vnuma memory as maxmem.
> > 
> > But see below regarding maxvcpus=.
> [...]
> > > > +    if (total_cpus != b_info->max_vcpus) {
> > > > +        fprintf(stderr, "xl: vnuma vcpus and maxvcpus= mismatch\n");
> > > > +        exit(1);
> > > > +    }
> > > 
> > > Can't we do as we do with memory here and set max_vcpus if it isn't
> > > already configured?
> > > 
> > 
> > This can be done.  It will require moving parse_vnuma_config to
> > beginning of parse_config_data and stubbing out parsing of maxvcpus
> > and maxmem if vnuma configuration presents.
> 
> I'm afraid I don't follow.
> 
> I was thinking the flow would be:
> 
> 	Call libxl_foo_init to set everything to the default.
> 
>         f ( max_mem in cfg )
>                 cfg->maxmem = v
>         if ( memory in cfg )
>                 cfg->memory = v
>         if ( vcpus in cfg )
>                 cfg->vcpus = v
>                 
>         do numa parsing, using fact that cfg->{maxmem,memory,vcpus} are
>         either default value (in which case they are set here) or
>         explicitly set to some value, in which case they are sanity
>         checked.
> 
>         handle maxmem/memory/vcpus/etc still being default values by
>         setting the appropriate defaults.
> 

Now I get your idea. I think we are more or less talking about the same
thing. I was proposing:

        parse vcpus
	parse memory
	parse maxmem

	parse vnuma

	if maxmem is not set by either config option or vnuma, set to memory
	if maxvcpus is not set by either config option or vnuma, set to vcpus

Wei.

> Ian.
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-12 10:50         ` Wei Liu
@ 2015-03-12 10:58           ` Ian Campbell
  0 siblings, 0 replies; 39+ messages in thread
From: Ian Campbell @ 2015-03-12 10:58 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Thu, 2015-03-12 at 10:50 +0000, Wei Liu wrote:
> Now I get your idea. I think we are more or less talking about the same
> thing. I was proposing:
> 
>         parse vcpus
> 	parse memory
> 	parse maxmem
> 
> 	parse vnuma
> 
> 	if maxmem is not set by either config option or vnuma, set to memory
> 	if maxvcpus is not set by either config option or vnuma, set to vcpus

Yes, I think we are in agreement.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-11 20:14     ` Wei Liu
  2015-03-12 10:00       ` Ian Campbell
@ 2015-03-12 13:22       ` Dario Faggioli
  2015-03-12 13:37         ` Wei Liu
  1 sibling, 1 reply; 39+ messages in thread
From: Dario Faggioli @ 2015-03-12 13:22 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, Ian Campbell, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 717 bytes --]

On Wed, 2015-03-11 at 20:14 +0000, Wei Liu wrote:
> On Wed, Mar 11, 2015 at 03:12:06PM +0000, Ian Campbell wrote:
> > On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> > 
> > > +Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
> > > +B<VNODE_CONFIG_OPTION>s are:
> > 
> > Which of these are optional and which are mandatory? All mandatory?
> > 
> 
> Yes, all mandatory for now.
> 
Indeed. I'll make some of them optional when integrating automatic
placement with vNUMA.

BTW, Wei, I've looked at this version, and you can have my:

Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Although, AFAIU, you're probably changing it further and reposting...

Regards,
Dario

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v7 21/21] xl: vNUMA support
  2015-03-12 13:22       ` Dario Faggioli
@ 2015-03-12 13:37         ` Wei Liu
  0 siblings, 0 replies; 39+ messages in thread
From: Wei Liu @ 2015-03-12 13:37 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: Ian Jackson, Wei Liu, Ian Campbell, xen-devel

On Thu, Mar 12, 2015 at 02:22:10PM +0100, Dario Faggioli wrote:
> On Wed, 2015-03-11 at 20:14 +0000, Wei Liu wrote:
> > On Wed, Mar 11, 2015 at 03:12:06PM +0000, Ian Campbell wrote:
> > > On Mon, 2015-03-09 at 12:51 +0000, Wei Liu wrote:
> > > 
> > > > +Each B<VNODE_CONFIG_OPTION> is a quoted key=value pair. Supported
> > > > +B<VNODE_CONFIG_OPTION>s are:
> > > 
> > > Which of these are optional and which are mandatory? All mandatory?
> > > 
> > 
> > Yes, all mandatory for now.
> > 
> Indeed. I'll make some of them optional when integrating automatic
> placement with vNUMA.
> 
> BTW, Wei, I've looked at this version, and you can have my:
> 
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
> 

Thanks for reviewing.

> Although, AFAIU, you're probably changing it further and reposting...

Yes, I've changed it and will repost...

Wei.

> 
> Regards,
> Dario

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-03-12 13:37 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-03-09 12:51 [PATCH v7 00/21] Virtual NUMA for PV and HVM Wei Liu
2015-03-09 12:51 ` [PATCH v7 01/21] xen: make two memory hypercalls vNUMA-aware Wei Liu
2015-03-09 13:22   ` Jan Beulich
2015-03-09 13:29     ` Wei Liu
2015-03-09 13:52       ` Jan Beulich
2015-03-09 13:59         ` Wei Liu
2015-03-09 12:51 ` [PATCH v7 02/21] libxc: duplicate snippet to allocate p2m_host array Wei Liu
2015-03-09 15:06   ` Konrad Rzeszutek Wilk
2015-03-11 15:02   ` Ian Campbell
2015-03-09 12:51 ` [PATCH v7 03/21] libxc: add p2m_size to xc_dom_image Wei Liu
2015-03-09 12:51 ` [PATCH v7 04/21] libxc: allocate memory with vNUMA information for PV guest Wei Liu
2015-03-09 12:51 ` [PATCH v7 05/21] libxl: introduce vNUMA types Wei Liu
2015-03-09 12:51 ` [PATCH v7 06/21] libxl: add vmemrange to libxl__domain_build_state Wei Liu
2015-03-09 12:51 ` [PATCH v7 07/21] libxl: introduce libxl__vnuma_config_check Wei Liu
2015-03-09 12:51 ` [PATCH v7 08/21] libxl: x86: factor out e820_host_sanitize Wei Liu
2015-03-09 12:51 ` [PATCH v7 09/21] libxl: functions to build vmemranges for PV guest Wei Liu
2015-03-09 12:51 ` [PATCH v7 10/21] libxl: build, check and pass vNUMA info to Xen " Wei Liu
2015-03-09 12:51 ` [PATCH v7 11/21] libxc: indentation change to xc_hvm_build_x86.c Wei Liu
2015-03-09 12:51 ` [PATCH v7 12/21] libxc: allocate memory with vNUMA information for HVM guest Wei Liu
2015-03-09 12:51 ` [PATCH v7 13/21] libxl: build, check and pass vNUMA info to Xen " Wei Liu
2015-03-09 12:51 ` [PATCH v7 14/21] libxl: disallow memory relocation when vNUMA is enabled Wei Liu
2015-03-09 12:51 ` [PATCH v7 15/21] libxl: define LIBXL_HAVE_VNUMA Wei Liu
2015-03-11 15:03   ` Ian Campbell
2015-03-09 12:51 ` [PATCH v7 16/21] libxlu: rework internal representation of setting Wei Liu
2015-03-09 12:51 ` [PATCH v7 17/21] libxlu: nested list support Wei Liu
2015-03-09 12:51 ` [PATCH v7 18/21] libxlu: record location when parsing values Wei Liu
2015-03-11 15:12   ` Ian Campbell
2015-03-11 15:16     ` Wei Liu
2015-03-11 15:36       ` Ian Campbell
2015-03-09 12:51 ` [PATCH v7 19/21] libxlu: introduce new APIs Wei Liu
2015-03-09 12:51 ` [PATCH v7 20/21] xl: introduce xcalloc Wei Liu
2015-03-09 12:51 ` [PATCH v7 21/21] xl: vNUMA support Wei Liu
2015-03-11 15:12   ` Ian Campbell
2015-03-11 20:14     ` Wei Liu
2015-03-12 10:00       ` Ian Campbell
2015-03-12 10:50         ` Wei Liu
2015-03-12 10:58           ` Ian Campbell
2015-03-12 13:22       ` Dario Faggioli
2015-03-12 13:37         ` Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.