From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH 3/4] libxc: rework vnuma bits in setup_guest Date: Tue, 19 May 2015 13:21:48 -0400 Message-ID: <555B712C.5060704@oracle.com> References: <1431963292-24765-1-git-send-email-wei.liu2@citrix.com> <1431963292-24765-4-git-send-email-wei.liu2@citrix.com> <555B636B.3080005@oracle.com> <20150519163151.GP26335@zion.uk.xensource.com> <555B6CFF.9060604@oracle.com> <20150519171427.GR26335@zion.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20150519171427.GR26335@zion.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Wei Liu Cc: Dario Faggioli , Ian Jackson , Ian Campbell , xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 05/19/2015 01:14 PM, Wei Liu wrote: > On Tue, May 19, 2015 at 01:03:59PM -0400, Boris Ostrovsky wrote: >> On 05/19/2015 12:31 PM, Wei Liu wrote: >>> On Tue, May 19, 2015 at 12:23:07PM -0400, Boris Ostrovsky wrote: >>>> On 05/18/2015 11:34 AM, Wei Liu wrote: >>>>> Make the setup process similar to PV counterpart. That is, to allocate a >>>>> P2M array that covers the whole memory range and start from there. This >>>>> is clearer than using an array with no holes in it. >>>>> >>>>> Also the dummy layout should take MMIO hole into consideration. We might >>>>> end up having two vmemranges in the dummy layout. >>>>> >>>>> Signed-off-by: Wei Liu >>>>> Cc: Ian Campbell >>>>> Cc: Ian Jackson >>>>> --- >>>>> tools/libxc/xc_hvm_build_x86.c | 66 ++++++++++++++++++++++++++++++++---------- >>>>> 1 file changed, 50 insertions(+), 16 deletions(-) >>>>> >>>>> diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c >>>>> index df4b7ed..77678f1 100644 >>>>> --- a/tools/libxc/xc_hvm_build_x86.c >>>>> +++ b/tools/libxc/xc_hvm_build_x86.c >>>>> @@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch, >>>>> { >>>>> xen_pfn_t *page_array = NULL; >>>>> unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT; >>>>> + unsigned long p2m_size; >>>>> unsigned long target_pages = args->mem_target >> PAGE_SHIFT; >>>>> unsigned long entry_eip, cur_pages, cur_pfn; >>>>> void *hvm_info_page; >>>>> @@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch, >>>>> xen_pfn_t special_array[NR_SPECIAL_PAGES]; >>>>> xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES]; >>>>> uint64_t total_pages; >>>>> - xen_vmemrange_t dummy_vmemrange; >>>>> - unsigned int dummy_vnode_to_pnode; >>>>> + xen_vmemrange_t dummy_vmemrange[2]; >>>>> + unsigned int dummy_vnode_to_pnode[2]; >>>>> memset(&elf, 0, sizeof(elf)); >>>>> if ( elf_init(&elf, image, image_size) != 0 ) >>>>> @@ -275,17 +276,37 @@ static int setup_guest(xc_interface *xch, >>>>> if ( args->nr_vmemranges == 0 ) >>>>> { >>>>> - /* Build dummy vnode information */ >>>>> - dummy_vmemrange.start = 0; >>>>> - dummy_vmemrange.end = args->mem_size; >>>>> - dummy_vmemrange.flags = 0; >>>>> - dummy_vmemrange.nid = 0; >>>>> - args->nr_vmemranges = 1; >>>>> - args->vmemranges = &dummy_vmemrange; >>>>> + /* Build dummy vnode information >>>>> + * >>>>> + * Guest physical address space layout: >>>>> + * [0, hole_start) [hole_start, 4G) [4G, highmem_end) >>>>> + * >>>>> + * Of course if there is no high memory, the second vmemrange >>>>> + * has no effect on the actual result. >>>>> + */ >>>>> - dummy_vnode_to_pnode = XC_NUMA_NO_NODE; >>>>> + dummy_vmemrange[0].start = 0; >>>>> + dummy_vmemrange[0].end = args->lowmem_end; >>>>> + dummy_vmemrange[0].flags = 0; >>>>> + dummy_vmemrange[0].nid = 0; >>>>> + dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE; >>>>> + args->nr_vmemranges = 1; >>>>> args->nr_vnodes = 1; >>>>> - args->vnode_to_pnode = &dummy_vnode_to_pnode; >>>>> + >>>>> + if ( args->highmem_end > (1ULL << 32) ) >>>>> + { >>>>> + dummy_vmemrange[1].start = 1ULL << 32; >>>>> + dummy_vmemrange[1].end = args->highmem_end; >>>>> + dummy_vmemrange[1].flags = 0; >>>>> + dummy_vmemrange[1].nid = 0; >>>>> + dummy_vnode_to_pnode[1] = XC_NUMA_NO_NODE; >>>>> + >>>>> + args->nr_vmemranges++; >>>>> + args->nr_vnodes++; >>>> (+Dario) >>>> >>>> Does having high memory mean that we need to have 2 vnodes? We should be >>> Yes, see the comment above. >>> >>>> able to cope with multiple vmemranges per node, right? >>> >>> Yes. That's already done in libxl's function to build hvm vmemranges. >>> >>> This is only a simple dummy layout so nothing fancy happens here. >> >> >> Right. But having multiple vnodes for a dummy topology looks to me a little >> counter-intuitive: people often assume that when number of nodes is 1 we >> don't have any NUMA-ness. Here we may need to look at vnode_to_pnode >> (possibly at both of the elements) to realize that this is a dummy layout. >> >> And given that this layout can be expressed with nr_vnodes=1 && >> nr_vmemranges=2 I am not sure what we gain by having two vnodes. >> > > Ah, so that's a bug: args->nr_vnodes++ should be deleted. And dummy_vnode_to_pnode[2] should become a scalar. -boris > > We still only have one vnode (nid = 0). That vnode contains two > vmemranges. > > Wei. > >> -boris >> >>> >>> Wei. >>> >>>> -boris >>>> >>>> >>>>> + } >>>>> + >>>>> + args->vmemranges = dummy_vmemrange; >>>>> + args->vnode_to_pnode = dummy_vnode_to_pnode; >>>>> } >>>>>