From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: KVM: MMU: improve n_max_mmu_pages calculation with TDP
Date: Fri, 22 Mar 2013 07:31:12 -0300
Message-ID: <20130322103112.GA7543@amt.cnet>
References: <20130320201420.GA17347@amt.cnet>
 <514A9DA7.10702@linux.vnet.ibm.com>
 <20130321142919.GA30837@amt.cnet>
 <514BC94C.8070802@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm <kvm@vger.kernel.org>, Ulrich Obergfell <uobergfe@redhat.com>,
	Takuya Yoshikawa <takuya.yoshikawa@gmail.com>,
	Avi Kivity <avi.kivity@gmail.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:5256 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753994Ab3CVKnK (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 22 Mar 2013 06:43:10 -0400
Content-Disposition: inline
In-Reply-To: <514BC94C.8070802@linux.vnet.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Fri, Mar 22, 2013 at 11:00:28AM +0800, Xiao Guangrong wrote:
> On 03/21/2013 10:29 PM, Marcelo Tosatti wrote:
> > On Thu, Mar 21, 2013 at 01:41:59PM +0800, Xiao Guangrong wrote:
> >> On 03/21/2013 04:14 AM, Marcelo Tosatti wrote:
> >>>
> >>> kvm_mmu_calculate_mmu_pages numbers, 
> >>>
> >>> maximum number of shadow pages = 2% of mapped guest pages
> >>>
> >>> Does not make sense for TDP guests where mapping all of guest
> >>> memory with 4k pages cannot exceed "mapped guest pages / 512"
> >>> (not counting root pages).
> >>>
> >>> Allow that maximum for TDP, forcing the guest to recycle otherwise.
> >>>
> >>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >>>
> >>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>> index 956ca35..a9694a8d7 100644
> >>> --- a/arch/x86/kvm/mmu.c
> >>> +++ b/arch/x86/kvm/mmu.c
> >>> @@ -4293,7 +4293,7 @@ nomem:
> >>>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
> >>>  {
> >>>  	unsigned int nr_mmu_pages;
> >>> -	unsigned int  nr_pages = 0;
> >>> +	unsigned int i, nr_pages = 0;
> >>>  	struct kvm_memslots *slots;
> >>>  	struct kvm_memory_slot *memslot;
> >>>
> >>> @@ -4302,7 +4302,19 @@ unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
> >>>  	kvm_for_each_memslot(memslot, slots)
> >>>  		nr_pages += memslot->npages;
> >>>
> >>> -	nr_mmu_pages = nr_pages * KVM_PERMILLE_MMU_PAGES / 1000;
> >>> +	if (tdp_enabled) {
> >>> +		/* one root page */
> >>> +		nr_mmu_pages = 1;
> >>> +		/* nr_pages / (512^i) per level, due to
> >>> +		 * guest RAM map being linear */
> >>> +		for (i = 1; i < 4; i++) {
> >>> +			int nr_pages_round = nr_pages + (1 << (9*i));
> >>> +			nr_mmu_pages += nr_pages_round >> (9*i);
> >>> +		}
> >>
> >> Marcelo,
> >>
> >> Can it work if nested guest is used? Did you see any problem in practice (direct guest
> >> uses more memory than your calculation)?
> > 
> > Direct guest can use more than the calculation by switching between
> > different paging modes.
> 
> I mean guest runs on hardmmu (tdp is used but no nested guest). Its only
> use one page table and seems can not use more memory than your calculation
> (except some mmio page tables).
> 
> So, you calculation is only used to limit memory used if tdp + nested guest?

Yes, you're right: with TDP there is no duplication of shadow pages,
even across paging mode switches, so the patch is not needed.
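
For reference, the two limits discussed in this thread can be compared
with a small userspace sketch. This is only an illustration, not kernel
code: the function names legacy_limit() and tdp_limit() are made up
here, PERMILLE_MMU_PAGES is assumed to be 20 (the "2%" quoted above),
and the loop copies the rounding used in the quoted patch, which adds a
full 512^i before shifting and so rounds up by at most one extra page
per level.

```c
#include <stdio.h>

/* Assumption: mirrors the "2% of mapped guest pages" rule quoted above. */
#define PERMILLE_MMU_PAGES 20

/* Existing rule: a fixed permille of the guest's 4k pages. */
static unsigned int legacy_limit(unsigned int nr_pages)
{
	return nr_pages * PERMILLE_MMU_PAGES / 1000;
}

/*
 * Bound from the quoted patch: one root page, plus roughly
 * nr_pages / 512^i pages at each of the three lower levels,
 * because the guest RAM map is linear under TDP.
 */
static unsigned int tdp_limit(unsigned int nr_pages)
{
	unsigned int nr_mmu_pages = 1;	/* one root page */
	unsigned int i;

	for (i = 1; i < 4; i++) {
		/* same (over-)rounding as the patch: add 512^i, then shift */
		unsigned int nr_pages_round = nr_pages + (1u << (9 * i));

		nr_mmu_pages += nr_pages_round >> (9 * i);
	}
	return nr_mmu_pages;
}

/* Hypothetical helper: print both limits for a guest of nr_pages 4k pages. */
static void print_limits(unsigned int nr_pages)
{
	printf("%u guest pages: legacy=%u tdp=%u\n",
	       nr_pages, legacy_limit(nr_pages), tdp_limit(nr_pages));
}
```

For a 1GB guest (262144 4k pages) this gives 5242 pages under the 2%
rule versus 517 under the TDP bound; for 4GB (1048576 pages) it gives
20971 versus 2056.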

> > About nested guest: at one point in time the working set cannot exceed 
> > the number of physical pages visible by the guest.
> 
> But it can cause lots of #PF, it is the nightmare for performance, no?
> 
> > 
> > Allowing an excessively high number of shadow pages is a security
> 
> The security concern means "optimization memory usage"? Or something else?
> 
> > concern, also, as unpreemptable long operations are necessary to tear
> > down the pages.
> 
> You mean limiting the shadow pages to let some patch run faster like
> remove-write-access and zap-all-sp etc.? If yes, we can directly optimize
> for these paths, this is more effective i think.
> 
> > 
> >> And mmio also can build some page table that looks like not considered
> >> in this patch.
> > 
> > Right, but its only a few pages. Same argument as above: working set at
> > one given time is smaller than total RAM. Do you see any potential
> > problem?
> 
> Marcelo, I just confused whether the limitation is reasonable, as i said,
> the limitation is not effective enough on hardmmu-only guest (no nested).
> and it seems too low for nested guests.
>