From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH 3/4] libxc: rework vnuma bits in setup_guest Date: Tue, 19 May 2015 13:03:59 -0400 Message-ID: <555B6CFF.9060604@oracle.com> References: <1431963292-24765-1-git-send-email-wei.liu2@citrix.com> <1431963292-24765-4-git-send-email-wei.liu2@citrix.com> <555B636B.3080005@oracle.com> <20150519163151.GP26335@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150519163151.GP26335@zion.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu Cc: Dario Faggioli , Ian Jackson , Ian Campbell , xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 05/19/2015 12:31 PM, Wei Liu wrote: > On Tue, May 19, 2015 at 12:23:07PM -0400, Boris Ostrovsky wrote: >> On 05/18/2015 11:34 AM, Wei Liu wrote: >>> Make the setup process similar to PV counterpart. That is, to allocate a >>> P2M array that covers the whole memory range and start from there. This >>> is clearer than using an array with no holes in it. >>> >>> Also the dummy layout should take MMIO hole into consideration. We might >>> end up having two vmemranges in the dummy layout. >>> >>> Signed-off-by: Wei Liu >>> Cc: Ian Campbell >>> Cc: Ian Jackson >>> --- >>> tools/libxc/xc_hvm_build_x86.c | 66 ++++++++++++++++++++++++++++++++---------- >>> 1 file changed, 50 insertions(+), 16 deletions(-) >>> >>> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c >>> index df4b7ed..77678f1 100644 >>> --- a/tools/libxc/xc_hvm_build_x86.c >>> +++ b/tools/libxc/xc_hvm_build_x86.c >>> @@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch, >>> { >>> xen_pfn_t *page_array = NULL; >>> unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT; >>> + unsigned long p2m_size; >>> unsigned long target_pages = args->mem_target >> PAGE_SHIFT; >>> unsigned long entry_eip, cur_pages, cur_pfn; >>> void *hvm_info_page; >>> @@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch, >>> xen_pfn_t special_array[NR_SPECIAL_PAGES]; >>> xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES]; >>> uint64_t total_pages; >>> - xen_vmemrange_t dummy_vmemrange; >>> - unsigned int dummy_vnode_to_pnode; >>> + xen_vmemrange_t dummy_vmemrange[2]; >>> + unsigned int dummy_vnode_to_pnode[2]; >>> memset(&elf, 0, sizeof(elf)); >>> if ( elf_init(&elf, image, image_size) != 0 ) >>> @@ -275,17 +276,37 @@ static int setup_guest(xc_interface *xch, >>> if ( args->nr_vmemranges == 0 ) >>> { >>> - /* Build dummy vnode information */ >>> - dummy_vmemrange.start = 0; >>> - dummy_vmemrange.end = args->mem_size; >>> - dummy_vmemrange.flags = 0; >>> - dummy_vmemrange.nid = 0; >>> - args->nr_vmemranges = 1; >>> - args->vmemranges = &dummy_vmemrange; >>> + /* Build dummy vnode information >>> + * >>> + * Guest physical address space layout: >>> + * [0, hole_start) [hole_start, 4G) [4G, highmem_end) >>> + * >>> + * Of course if there is no high memory, the second vmemrange >>> + * has no effect on the actual result. >>> + */ >>> - dummy_vnode_to_pnode = XC_NUMA_NO_NODE; >>> + dummy_vmemrange[0].start = 0; >>> + dummy_vmemrange[0].end = args->lowmem_end; >>> + dummy_vmemrange[0].flags = 0; >>> + dummy_vmemrange[0].nid = 0; >>> + dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE; >>> + args->nr_vmemranges = 1; >>> args->nr_vnodes = 1; >>> - args->vnode_to_pnode = &dummy_vnode_to_pnode; >>> + >>> + if ( args->highmem_end > (1ULL << 32) ) >>> + { >>> + dummy_vmemrange[1].start = 1ULL << 32; >>> + dummy_vmemrange[1].end = args->highmem_end; >>> + dummy_vmemrange[1].flags = 0; >>> + dummy_vmemrange[1].nid = 0; >>> + dummy_vnode_to_pnode[1] = XC_NUMA_NO_NODE; >>> + >>> + args->nr_vmemranges++; >>> + args->nr_vnodes++; >> (+Dario) >> >> Does having high memory mean that we need to have 2 vnodes? We should be > Yes, see the comment above. > >> able to cope with multiple vmemranges per node, right? > > Yes. That's already done in libxl's function to build hvm vmemranges. > > This is only a simple dummy layout so nothing fancy happens here. Right. But having multiple vnodes for a dummy topology looks to me a little counter-intuitive: people often assume that when number of nodes is 1 we don't have any NUMA-ness. Here we may need to look at vnode_to_pnode (possibly at both of the elements) to realize that this is a dummy layout. And given that this layout can be expressed with nr_vnodes=1 && nr_vmemranges=2 I am not sure what we gain by having two vnodes. -boris > > Wei. > >> -boris >> >> >>> + } >>> + >>> + args->vmemranges = dummy_vmemrange; >>> + args->vnode_to_pnode = dummy_vnode_to_pnode; >>> } >>>