From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Guangrong
Subject: Re: KVM: MMU: improve n_max_mmu_pages calculation with TDP
Date: Fri, 22 Mar 2013 11:00:28 +0800
Message-ID: <514BC94C.8070802@linux.vnet.ibm.com>
References: <20130320201420.GA17347@amt.cnet>
 <514A9DA7.10702@linux.vnet.ibm.com>
 <20130321142919.GA30837@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: kvm, Ulrich Obergfell, Takuya Yoshikawa, Avi Kivity
To: Marcelo Tosatti
Return-path:
Received: from e28smtp05.in.ibm.com ([122.248.162.5]:46222
 "EHLO e28smtp05.in.ibm.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1753131Ab3CVDAr (ORCPT );
 Thu, 21 Mar 2013 23:00:47 -0400
Received: from /spool/local by e28smtp05.in.ibm.com with IBM ESMTP SMTP
 Gateway: Authorized Use Only! Violators will be prosecuted for from ;
 Fri, 22 Mar 2013 08:28:01 +0530
Received: from d28relay01.in.ibm.com (d28relay01.in.ibm.com [9.184.220.58])
 by d28dlp01.in.ibm.com (Postfix) with ESMTP id 16693E0057 for ;
 Fri, 22 Mar 2013 08:32:12 +0530 (IST)
Received: from d28av04.in.ibm.com (d28av04.in.ibm.com [9.184.220.66])
 by d28relay01.in.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id
 r2M30ZWt24707218 for ; Fri, 22 Mar 2013 08:30:36 +0530
Received: from d28av04.in.ibm.com (loopback [127.0.0.1])
 by d28av04.in.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id
 r2M30Ysl019465 for ; Fri, 22 Mar 2013 14:00:37 +1100
In-Reply-To: <20130321142919.GA30837@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 03/21/2013 10:29 PM, Marcelo Tosatti wrote:
> On Thu, Mar 21, 2013 at 01:41:59PM +0800, Xiao Guangrong wrote:
>> On 03/21/2013 04:14 AM, Marcelo Tosatti wrote:
>>>
>>> kvm_mmu_calculate_mmu_pages numbers,
>>>
>>> maximum number of shadow pages = 2% of mapped guest pages
>>>
>>> Does not make sense for TDP guests where mapping all of guest
>>> memory with 4k pages cannot exceed "mapped guest pages / 512"
>>> (not counting root pages).
>>>
>>> Allow that maximum for TDP, forcing the guest to recycle otherwise.
>>>
>>> Signed-off-by: Marcelo Tosatti
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 956ca35..a9694a8d7 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -4293,7 +4293,7 @@ nomem:
>>>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>>>  {
>>>          unsigned int nr_mmu_pages;
>>> -        unsigned int nr_pages = 0;
>>> +        unsigned int i, nr_pages = 0;
>>>          struct kvm_memslots *slots;
>>>          struct kvm_memory_slot *memslot;
>>>
>>> @@ -4302,7 +4302,19 @@ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>>>          kvm_for_each_memslot(memslot, slots)
>>>                  nr_pages += memslot->npages;
>>>
>>> -        nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
>>> +        if (tdp_enabled) {
>>> +                /* one root page */
>>> +                nr_mmu_pages = 1;
>>> +                /* nr_pages / (512^i) per level, due to
>>> +                 * guest RAM map being linear */
>>> +                for (i = 1; i < 4; i++) {
>>> +                        int nr_pages_round = nr_pages + (1 << (9*i));
>>> +                        nr_mmu_pages += nr_pages_round >> (9*i);
>>> +                }
>>
>> Marcelo,
>>
>> Can it work if nested guest is used? Did you see any problem in practice (direct guest
>> uses more memory than your calculation)?
>
> Direct guest can use more than the calculation by switching between
> different paging modes.

I mean a guest that runs on hardmmu (tdp is used, but no nested guest).
It uses only one page table, so it seems it cannot use more memory than
your calculation allows (except for some mmio page tables).

So your calculation only limits memory usage in the tdp + nested guest
case?

>
> About nested guest: at one point in time the working set cannot exceed
> the number of physical pages visible by the guest.

But it can cause lots of #PF, which is a nightmare for performance, no?

>
> Allowing an excessively high number of shadow pages is a security

Does the security concern mean "optimizing memory usage", or something
else?

> concern, also, as unpreemptable long operations are necessary to tear
> down the pages.

You mean limiting the shadow pages so that some paths run faster, like
remove-write-access and zap-all-sp? If so, we can optimize those paths
directly, which I think is more effective.

>
>> And mmio also can build some page table that looks like not considered
>> in this patch.
>
> Right, but its only a few pages. Same argument as above: working set at
> one given time is smaller than total RAM. Do you see any potential
> problem?

Marcelo, I am just unsure whether the limit is reasonable. As I said,
the limit is not effective enough for a hardmmu-only guest (no nested),
and it seems too low for nested guests.
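
To make the numbers under discussion concrete, here is a rough user-space
sketch of the two limits. It is not the kernel code: tdp_limit() only
mirrors the loop visible in the quoted hunk (the rest of the patch is not
shown in this thread), PERMILLE_MMU_PAGES = 20 is an assumption taken from
the "2%" in the patch description, and both function names are made up for
illustration.

```c
#include <stdint.h>

/* Assumption: 20 permille, i.e. the "2%" quoted in the patch description. */
#define PERMILLE_MMU_PAGES 20

/* Existing rule: the limit is a fixed fraction of mapped guest pages. */
uint64_t permille_limit(uint64_t nr_pages)
{
        return nr_pages * PERMILLE_MMU_PAGES / 1000;
}

/*
 * Proposed TDP rule, following the loop in the quoted hunk:
 * one root page, plus roughly nr_pages / 512^i pages for each of
 * the three lower levels, with one extra page per level for rounding.
 */
uint64_t tdp_limit(uint64_t nr_pages)
{
        uint64_t nr_mmu_pages = 1;      /* one root page */
        int i;

        for (i = 1; i < 4; i++) {
                uint64_t nr_pages_round = nr_pages + (1ULL << (9 * i));

                nr_mmu_pages += nr_pages_round >> (9 * i);
        }
        return nr_mmu_pages;
}
```

For a 4 GiB guest (1048576 pages of 4k), permille_limit() gives 20971
pages and tdp_limit() gives 2056, so the proposed TDP limit is about a
tenth of the current one.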