From: Marcelo Tosatti <marcelo@kvack.org>
To: Avi Kivity <avi@qumranet.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>,
	kvm-devel <kvm-devel@lists.sourceforge.net>
Subject: Re: large page support for kvm
Date: Tue, 19 Feb 2008 17:37:33 -0300
Message-ID: <20080219203733.GA7558@dmt>
In-Reply-To: <47B800AB.20905@qumranet.com>

On Sun, Feb 17, 2008 at 11:38:51AM +0200, Avi Kivity wrote:
> >+ * Return the pointer to the largepage write count for a given
> >+ * gfn, handling slots that are not large page aligned.
> >+ */
> >+static int *slot_largepage_idx(gfn_t gfn, struct kvm_memory_slot *slot)
> >+{
> >+	unsigned long idx;
> >+
> >+	idx = (gfn - slot->base_gfn) + hpage_align_diff(slot->base_gfn);
> >+	idx /= KVM_PAGES_PER_HPAGE;
> >+	return &slot->lpage_info[idx].write_count;
> >+}
> >  
> 
> Can be further simplified to (gfn / KVM_PAGES_PER_HPAGE) - 
> (slot->base_gfn / KVM_PAGES_PER_HPAGE).  Sorry for not noticing earlier.

Right.
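
To double-check the equivalence, here is a minimal standalone sketch that compares the two index computations. It assumes hpage_align_diff() returns the offset of base_gfn within its large page frame (base_gfn % KVM_PAGES_PER_HPAGE); the helper's body is not shown in this message, so that definition is a guess. The value 512 (2MB large pages over 4KB base pages) and the names lpage_idx_original/lpage_idx_simplified are also illustrative only.

```c
#include <assert.h>

typedef unsigned long gfn_t;

/* Assumption: 2MB large pages over 4KB base pages. */
#define KVM_PAGES_PER_HPAGE 512UL

/*
 * Assumed stand-in for the patch's helper: offset of a gfn inside its
 * large page frame. The real definition is not quoted in this thread.
 */
static unsigned long hpage_align_diff(gfn_t gfn)
{
	return gfn % KVM_PAGES_PER_HPAGE;
}

/* Index computation as written in the quoted patch. */
static unsigned long lpage_idx_original(gfn_t gfn, gfn_t base_gfn)
{
	unsigned long idx;

	idx = (gfn - base_gfn) + hpage_align_diff(base_gfn);
	idx /= KVM_PAGES_PER_HPAGE;
	return idx;
}

/* Simplified form suggested in the review. */
static unsigned long lpage_idx_simplified(gfn_t gfn, gfn_t base_gfn)
{
	return (gfn / KVM_PAGES_PER_HPAGE) - (base_gfn / KVM_PAGES_PER_HPAGE);
}
```

Under that assumption the two agree because base_gfn minus its in-frame offset is a multiple of KVM_PAGES_PER_HPAGE, so the integer division splits cleanly.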

> >+	/* guest has 4M pages, host 2M */
> >+	if (!is_pae(vcpu) && HPAGE_SHIFT == 21)
> >+		return 0;
> >  
> 
> Is this check necessary?  I think that if we remove it things will just 
> work.  A 4MB page will have either one or two 2MB sptes (which may 
> even belong to different slots).

You mentioned that before; I agree it's not necessary.

> >+			/*
> >+ 			 * Largepage creation is susceptible to a upper-level
> >+ 			 * table to be shadowed and write-protected in the
> >+ 			 * area being mapped. If that is the case, invalidate
> >+ 			 * the entry and let the instruction fault again
> >+ 			 * and use 4K mappings.
> >+ 			 */
> >+			if (largepage) {
> >+				spte = shadow_trap_nonpresent_pte;
> >+				kvm_x86_ops->tlb_flush(vcpu);
> >+				goto unshadowed;
> >+			}
> >  
> 
> Would it not repeat exactly the same code path?  Or is this just for the 
> case of the pte_update path?

The problem is if the instruction writing to one of the roots can't be
emulated.

kvm_mmu_unprotect_page() does not know about largepages, so it will zap
a gfn inside the large page frame, but not the large translation itself.

And zapping the gfn brings the shadowed page count in the large area to
zero, allowing has_wrprotected_page() to succeed. The result is endless,
unfixable write faults.

> >-	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
> >+	if (is_largepage_backed(vcpu, gfn & ~(KVM_PAGES_PER_HPAGE-1))
> >+	    && is_physical_memory(vcpu->kvm, gfn)) {
> >+		gfn &= ~(KVM_PAGES_PER_HPAGE-1);
> >+		largepage = 1;
> >+	}
> >  
> 
> Doesn't is_largepage_backed() imply is_physical_memory?

Hum. I'll verify... it seems that now that the ends of the slots have
write_count set to 1 that should be true.

> > 
> >Index: kvm.largepages/arch/x86/kvm/x86.c
> >===================================================================
> >--- kvm.largepages.orig/arch/x86/kvm/x86.c
> >+++ kvm.largepages/arch/x86/kvm/x86.c
> >@@ -86,6 +86,7 @@ struct kvm_stats_debugfs_item debugfs_en
> > 	{ "mmu_recycled", VM_STAT(mmu_recycled) },
> > 	{ "mmu_cache_miss", VM_STAT(mmu_cache_miss) },
> > 	{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
> >+	{ "lpages", VM_STAT(lpages) },
> > 	{ NULL }
> > };
> >  
> 
> s/lpages/largepages/, this is user visible.

OK.

> >+		new.lpage_info = vmalloc(largepages * sizeof(*new.lpage_info));
> >+
> >+		if (!new.lpage_info)
> >+			goto out_free;
> >+
> >+		memset(new.lpage_info, 0, largepages * sizeof(*new.lpage_info));
> >+		/* large page crosses memslot boundary */
> >+		if (npages % KVM_PAGES_PER_HPAGE) {
> >+			new.lpage_info[0].write_count = 1;
> >  
> 
> This seems wrong, say a 3MB slot at 1GB, you kill the first largepage 
> which is good.
> 
> >+			new.lpage_info[largepages-1].write_count = 1;
> >  
> 
> OTOH, a 3MB slot at 3MB, the last page is fine.  The check needs to be 
> against base_gfn and base_gfn + npages, not the number of pages.

Right, will fix. I will post an updated patch soon.
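
For reference, a rough sketch of the boundary check described in the review: the first and last large page frames are tested separately against base_gfn and base_gfn + npages. This is not the updated patch. struct lpage_edges and slot_lpage_edges() are hypothetical names, the way largepages is counted here is an assumption, and 512 again assumes 2MB large pages over 4KB base pages.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long gfn_t;

/* Assumption: 2MB large pages over 4KB base pages. */
#define KVM_PAGES_PER_HPAGE 512UL

/* Hypothetical result type, used only for this illustration. */
struct lpage_edges {
	unsigned long largepages;	/* large page frames touched by the slot */
	bool first_partial;		/* first frame starts before the slot */
	bool last_partial;		/* last frame ends after the slot */
};

/* Caller must pass npages > 0. */
static struct lpage_edges slot_lpage_edges(gfn_t base_gfn, unsigned long npages)
{
	struct lpage_edges e;
	gfn_t end_gfn = base_gfn + npages;	/* one past the last gfn */

	e.largepages = (end_gfn - 1) / KVM_PAGES_PER_HPAGE
		       - base_gfn / KVM_PAGES_PER_HPAGE + 1;
	/* Each edge is checked on its own, not via npages alone. */
	e.first_partial = (base_gfn % KVM_PAGES_PER_HPAGE) != 0;
	e.last_partial = (end_gfn % KVM_PAGES_PER_HPAGE) != 0;
	return e;
}
```

With these assumptions, a 3MB slot at 1GB (base_gfn 262144, npages 768) flags only the last frame, and a 3MB slot at 3MB (base_gfn 768, npages 768) flags only the first, matching the two cases above. A frame flagged as partial would get write_count = 1 so it is never mapped with a large spte.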
