Re: [PATCH] KVM: MMU: lazily drop large spte

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marcelo Tosatti <mtosatti@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte
Date: Fri, 16 Nov 2012 01:02:22 -0200	[thread overview]
Message-ID: <20121116030222.GA21822@amt.cnet> (raw)
In-Reply-To: <50A4267B.1030902@linux.vnet.ibm.com>

On Thu, Nov 15, 2012 at 07:17:15AM +0800, Xiao Guangrong wrote:
> On 11/14/2012 10:37 PM, Marcelo Tosatti wrote:
> > On Tue, Nov 13, 2012 at 04:26:16PM +0800, Xiao Guangrong wrote:
> >> Hi Marcelo,
> >>
> >> On 11/13/2012 07:10 AM, Marcelo Tosatti wrote:
> >>> On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> >>>> Do not drop large spte until it can be insteaded by small pages so that
> >>>> the guest can happliy read memory through it
> >>>>
> >>>> The idea is from Avi:
> >>>> | As I mentioned before, write-protecting a large spte is a good idea,
> >>>> | since it moves some work from protect-time to fault-time, so it reduces
> >>>> | jitter.  This removes the need for the return value.
> >>>>
> >>>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> >>>> ---
> >>>>  arch/x86/kvm/mmu.c |   34 +++++++++-------------------------
> >>>>  1 files changed, 9 insertions(+), 25 deletions(-)
> >>>
> >>> Its likely that other 4k pages are mapped read-write in the 2mb range 
> >>> covered by a read-only 2mb map. Therefore its not entirely useful to
> >>> map read-only. 
> >>>
> >>
> >> It needs a page fault to install a pte even if it is the read access.
> >> After the change, the page fault can be avoided.
> >>
> >>> Can you measure an improvement with this change?
> >>
> >> I have a test case to measure the read time which has been attached.
> >> It maps 4k pages at first (dirt-loggged), then switch to large sptes
> >> (stop dirt-logging), at the last, measure the read access time after write
> >> protect sptes.
> >>
> >> Before: 23314111 ns	After: 11404197 ns
> > 
> > Ok, i'm concerned about cases similar to e49146dce8c3dc6f44 (with shadow),
> > that is:
> > 
> > - large page must be destroyed when write protecting due to 
> > shadowed page.
> > - with shadow, it does not make sense to write protect 
> > large sptes as mentioned earlier.
> > 
> 
> This case is removed now, the code when e49146dce8c3dc6f44 was applied is:
> |
> |                pt = sp->spt;
> |                for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
> |                        /* avoid RMW */
> |                        if (is_writable_pte(pt[i]))
> |                                update_spte(&pt[i], pt[i] & ~PT_WRITABLE_MASK);
> |        }
> 
> The real problem in this code is it would write-protect the spte even if
> it is not a last spte that caused the middle-level shadow page table was
> write-protected. So e49146dce8c3dc6f44 added this code:
> |                if (sp->role.level != PT_PAGE_TABLE_LEVEL)
> |                        continue;
> |
> was good to fix this problem.
> 
> Now, the current code is:
> |		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
> |			if (!is_shadow_present_pte(pt[i]) ||
> |			      !is_last_spte(pt[i], sp->role.level))
> |				continue;
> |
> |			spte_write_protect(kvm, &pt[i], &flush, false);
> |		}
> It only write-protect the last spte. So, it allows large spte existent.
> (the large spte can be broken by drop_large_spte() on the page-fault path.)
> 
> > So i wonder why is this part from your patch
> > 
> > -               if (level > PT_PAGE_TABLE_LEVEL &&
> > -                   has_wrprotected_page(vcpu->kvm, gfn, level)) {
> > -                       ret = 1;
> > -                       drop_spte(vcpu->kvm, sptep);
> > -                       goto done;
> > -               }
> > 
> > necessary (assuming EPT is in use).
> 
> This is safe, we change these code to:
> 
> -		if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
> +		if ((level > PT_PAGE_TABLE_LEVEL &&
> +		   has_wrprotected_page(vcpu->kvm, gfn, level)) ||
> +		      mmu_need_write_protect(vcpu, gfn, can_unsync)) {
>  			pgprintk("%s: found shadow page for %llx, marking ro\n",
>  				 __func__, gfn);
>  			ret = 1;
> 
> The spte become read-only which can ensure the shadow gfn can not be changed.
> 
> Btw, the origin code allows to create readonly spte under this case if !(pte_access & WRITEABBLE)

Regarding shadow: it should be fine as long as fault path always deletes
large mappings, when shadowed pages are present in the region.

Ah, unshadowing from reexecute_instruction does not handle
large pages. I suppose that is what "simplification" refers 
to.

next prev parent reply	other threads:[~2012-11-16  3:02 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-05  9:59 [PATCH] KVM: MMU: lazily drop large spte Xiao Guangrong
2012-11-12 23:10 ` Marcelo Tosatti
2012-11-13  8:26   ` Xiao Guangrong
2012-11-14 14:37     ` Marcelo Tosatti
2012-11-14 23:17       ` Xiao Guangrong
2012-11-16  3:02         ` Marcelo Tosatti [this message]
2012-11-16  3:39           ` Xiao Guangrong
2012-11-16  3:56             ` Marcelo Tosatti
2012-11-16  4:46               ` Xiao Guangrong
2012-11-16  9:57                 ` Marcelo Tosatti
2012-11-17 14:06                   ` Xiao Guangrong
2012-11-18  3:00                     ` Marcelo Tosatti
2012-11-28  5:27                       ` Xiao Guangrong
2012-11-28 11:39                         ` Marcelo Tosatti
2012-11-13 15:33   ` Takuya Yoshikawa
2012-11-14 14:44     ` Marcelo Tosatti
2012-11-14 23:33       ` Xiao Guangrong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121116030222.GA21822@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.