From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joerg Roedel
Subject: Re: [PATCH] kvm mmu: add support for 1GB pages in shadow paging code
Date: Sat, 28 Mar 2009 23:04:29 +0100
Message-ID: <20090328220429.GF31080@8bytes.org>
References: <1238164518-16261-1-git-send-email-joerg.roedel@amd.com> <20090328212834.GA4694@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Joerg Roedel, Avi Kivity, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
To: Marcelo Tosatti
Return-path:
Content-Disposition: inline
In-Reply-To: <20090328212834.GA4694@amt.cnet>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Sat, Mar 28, 2009 at 06:28:35PM -0300, Marcelo Tosatti wrote:
> > I have searched this bug for quite some time with no real luck. Maybe
> > some other reviewers have more luck than I had by now.
>
> Sorry, I can't spot what is wrong here. Avi?
>
> Perhaps it helps if you provide some info of the hang when the guest
> allocates hugepages on boot (it's probably an endless fault that can't
> be corrected?).

I will try to find out why the guest gets stuck. I also created a full
mmu trace of a boot crash case, but its size was around 170MB and I
found no real problem in it.

> Also another point is that the large huge page at 0-1GB will never
> be created, because it crosses a slot boundary.

The instabilities only occur if the guest has enough memory to use a
gbpage in its own direct mapping. They also go away when I boot the
guest with the nogbpages command line option. So it is likely that it
has something to do with the processing of guest gbpages in the softmmu
code. But I have looked over this code again and again and there seems
to be no bug.
> > Signed-off-by: Joerg Roedel
> > ---
> >  arch/x86/kvm/mmu.c         |   56 +++++++++++++++++++++++++++++++------------
> >  arch/x86/kvm/paging_tmpl.h |   35 +++++++++++++++++++++------
> >  arch/x86/kvm/svm.c         |    2 +-
> >  3 files changed, 68 insertions(+), 25 deletions(-)
> >
> > +	psize = backing_size(vcpu, vcpu->arch.update_pte.gfn);
>
> This can block, and this path holds mmu_lock. That's why it needs to be
> done in guess_page_from_pte_write.

Ah, true. Thanks for pointing this out. The previous code in the
guess_page function makes sense now.

> > +	if ((sp->role.level == PT_DIRECTORY_LEVEL) &&
> > +	    (psize >= KVM_PAGE_SIZE_2M)) {
> > +		psize = KVM_PAGE_SIZE_2M;
> > +		vcpu->arch.update_pte.gfn &= ~(KVM_PAGES_PER_2M_PAGE-1);
> > +		vcpu->arch.update_pte.pfn &= ~(KVM_PAGES_PER_2M_PAGE-1);
> > +	} else if ((sp->role.level == PT_MIDDLE_LEVEL) &&
> > +		   (psize == KVM_PAGE_SIZE_1G)) {
> > +		vcpu->arch.update_pte.gfn &= ~(KVM_PAGES_PER_1G_PAGE-1);
> > +		vcpu->arch.update_pte.pfn &= ~(KVM_PAGES_PER_1G_PAGE-1);
> > +	} else
> > +		goto out_pde;
>
> Better just zap the entry in case it's a 1GB one and let the
> fault path handle it.

Yes, that's probably better.

	Joerg