From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: [PATCH V3 1/1] expand x86 arch_shared_info to support >3 level p2m tree
Date: Mon, 15 Sep 2014 10:52:30 +0200
Message-ID: <5416A8CE.5020400@suse.com>
References: <1410256709-25885-1-git-send-email-jgross@suse.com> <1410256709-25885-2-git-send-email-jgross@suse.com> <540ED600.3060102@citrix.com> <540EDB4F.30402@suse.com> <5412CB80.9030208@suse.com> <5416A379.5@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <5416A379.5@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper, ian.campbell@citrix.com, ian.jackson@eu.citrix.com, jbeulich@suse.com, keir@xen.org, tim@xen.org, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 09/15/2014 10:29 AM, Andrew Cooper wrote:
>
> On 12/09/2014 11:31, Juergen Gross wrote:
>> On 09/09/2014 12:49 PM, Juergen Gross wrote:
>>> On 09/09/2014 12:27 PM, Andrew Cooper wrote:
>>>> On 09/09/14 10:58, Juergen Gross wrote:
>>>>> The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
>>>>> currently contains the mfn of the top level page frame of the 3 level
>>>>> p2m tree, which is used by the Xen tools during saving and restoring
>>>>> (and live migration) of pv domains. With three levels of the p2m tree
>>>>> it is possible to support up to 512 GB of RAM for a 64 bit pv domain.
>>>>> A 32 bit pv domain can support more, as each memory page can hold 1024
>>>>> instead of 512 entries, leading to a limit of 4 TB. To be able to
>>>>> support more RAM on x86-64 an additional level is to be added.
>>>>>
>>>>> This patch expands struct arch_shared_info with a new p2m tree root
>>>>> and the number of levels of the p2m tree. The new information is
>>>>> indicated by the domain to be valid by storing ~0UL into
>>>>> pfn_to_mfn_frame_list_list (this should be done only if more than
>>>>> three levels are needed, of course).
>>>>
>>>> A small domain feeling a little tight on space could easily opt for a 2
>>>> or even 1 level p2m. (After all, one advantage of virt is to cram many
>>>> small VMs into a server).
>>>>
>>>> How is xen and toolstack support for n-level p2ms going to be
>>>> advertised
>>>> to guests? Simply assuming the toolstack is capable of dealing with
>>>> this new scheme wont work with a new pv guest running on an older Xen.
>>>
>>> Is it really worth doing such an optimization? This would save only very
>>> few pages.
>>>
>>> If you think it should be done we can add another SIF_* flag to
>>> start_info->flags. In this case a domain using this feature could not be
>>> migrated to a server with old tools, however. So we would probably end
>>> with the need to be able to suppress that flag on a per-domain base.
>>
>> Any further comments?
>>
>> Which way should I go?
>>
>
> There are two approaches, with different up/downsides
>
> 1) continue to use the old method, and use the new method only when
> absolutely required. This will function, but on old toolstacks, suffer
> migration/suspend failures when the toolstack fails to find the p2m.
>
> 2) Provide a Xen feature flag indicating the presence of N-level p2m
> support. Guests which can see this flag are free to use N-level, and
> guests which can't are not.
>
> Ultimately, giving more than 512GB to a current 64bit PV domain is not
> going to work, and the choice above depends on which failure mode you
> wish a new/old mix to have.

I'd prefer solution 1), as it will enable Dom0 with more than 512 GB
without requiring a change of any Xen component. Additionally large
domains can be started by users who don't care for migrating or
suspending them.

So I'd rather keep my patch as posted.

Juergen
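
For reference, the limits quoted in the patch description (512 GB with three levels on 64 bit, 4 TB on 32 bit) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the patch: it assumes 4 KiB pages and p2m entries of 8 bytes (64 bit) or 4 bytes (32 bit), and the names PAGE_SIZE, p2m_capacity_bytes, levels_needed and uses_extended_p2m are hypothetical helpers, not Xen identifiers. The last helper only mirrors the ~0UL marker described above.

```python
# Illustrative sketch only; all names here are hypothetical, not Xen identifiers.
PAGE_SIZE = 4096  # bytes per page (assumption: 4 KiB x86 pages)
GIB = 1 << 30
TIB = 1 << 40


def p2m_capacity_bytes(entry_size: int, levels: int) -> int:
    """Guest RAM covered by a p2m tree with the given number of levels.

    Each tree page holds PAGE_SIZE // entry_size entries, and every leaf
    entry maps one page of guest memory.
    """
    entries_per_page = PAGE_SIZE // entry_size
    return entries_per_page ** levels * PAGE_SIZE


def levels_needed(ram_bytes: int, entry_size: int) -> int:
    """Smallest number of p2m levels whose capacity covers ram_bytes."""
    levels = 1
    while p2m_capacity_bytes(entry_size, levels) < ram_bytes:
        levels += 1
    return levels


def uses_extended_p2m(frame_list_list: int, word_bits: int = 64) -> bool:
    """True if pfn_to_mfn_frame_list_list holds the ~0UL marker value."""
    return frame_list_list == (1 << word_bits) - 1


if __name__ == "__main__":
    print(p2m_capacity_bytes(8, 3) // GIB, "GiB")  # 64 bit, 3 levels
    print(p2m_capacity_bytes(4, 3) // TIB, "TiB")  # 32 bit, 3 levels
    print(p2m_capacity_bytes(8, 4) // TIB, "TiB")  # 64 bit, 4 levels
    print(levels_needed(1 * GIB, 8), "levels for a 1 GiB 64 bit domain")
```

Under these assumptions a 64 bit domain of up to 1 GiB would fit in a 2 level tree, which is the kind of small saving discussed in the thread, while anything above 512 GiB needs a fourth level.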