From: "Joerg Roedel" <joerg.roedel-5C7GfCeVMHo@public.gmane.org>
To: "Avi Kivity" <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Cc: kvm-devel <kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: large page support for kvm
Date: Wed, 30 Jan 2008 19:40:35 +0100 [thread overview]
Message-ID: <20080130184035.GS6960@amd.com> (raw)
In-Reply-To: <479F604C.20107-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
On Tue, Jan 29, 2008 at 07:20:12PM +0200, Avi Kivity wrote:
> Here's a rough sketch of my proposal:
>
> - For every memory slot, allocate an array containing one int for every
> potential large page included within that memory slot. Each entry in
> the array contains the number of write-protected 4KB pages within the
> large page frame corresponding to that entry.
>
> For example, if we have a memory slot for gpas 1MB-1GB, we'd have an
> array of size 511, corresponding to the 511 2MB pages from 2MB upwards.
> If we shadow a pagetable at address 4MB+8KB, we'd increment the entry
> corresponding to the large page at 4MB. When we unshadow that page,
> decrement the entry.
You need to take care that the 2MB gpa is aligned to 2MB in host physical
memory to be able to map it correctly with a large pte. So maybe we need
two memslots for the 1MB-1GB range: one for 1MB-2MB using normal 4KB pages
and one for 2MB-1GB which can be allocated using HugeTLBfs.
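The alignment constraint described above can be sketched as follows (a
hypothetical helper for illustration, not actual KVM code): a 2MB guest
page can only be backed by a single host large page if the gpa-to-hva
mapping preserves 2MB alignment, i.e. both addresses are congruent modulo
the large page size.

```c
#include <stdbool.h>
#include <stdint.h>

#define LARGE_PAGE_SIZE (2ULL << 20)   /* 2MB */

/* Sketch: true if a 2MB-sized guest region starting at gpa, mapped at
 * hva in the host, could be covered by one host large page. */
static bool can_map_large(uint64_t gpa, uint64_t hva)
{
    /* gpa and hva must have identical offsets within a 2MB frame */
    return ((gpa ^ hva) & (LARGE_PAGE_SIZE - 1)) == 0;
}
```

With a memslot starting at 1MB, the first 2MB guest frame is misaligned
against a HugeTLBfs-backed hva, which is why splitting the slot at the
2MB boundary helps.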
> - If we attempt to shadow a large page (either a guest pse pte, or a
> real-mode pseudo pte), we check if the host page is a large page. If
> so, we also check the write-protect count array. If the result is zero,
> we create a shadow pse pte.
>
> - Whenever we write-protect a page, also zap any large-page mappings for
> that page. This means rmap will need some extension to handle pde rmaps
> in addition to pte rmaps.
This sounds straightforward to me. All you need is a short value for
every potential large page, initialized to -1 if the host page is a
large page and to 0 otherwise. Whenever this value is -1 we can map the
page with a large pte (provided the guest also maps with a large pte).
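The counter scheme sketched above could look roughly like this (helper
names are hypothetical, not the actual KVM implementation): the value
starts at -1 for large-backed pages, each write-protected 4KB page inside
the frame increments it, and a large pte is only allowed while it is
still -1.

```c
#include <stdbool.h>

/* Per-2MB-frame bookkeeping, one entry per potential large page:
 *   -1 : backed by a host large page, no 4KB pages write-protected
 *    0 : not backed by a large page
 *  >=0 : (if large-backed) -1 plus the number of write-protected pages
 */
struct lpage_info {
    short write_count;
};

static void init_lpage(struct lpage_info *info, bool host_is_large)
{
    info->write_count = host_is_large ? -1 : 0;
}

/* Called when a 4KB page inside the frame is write-protected
 * (e.g. because a shadowed guest pagetable lives there). */
static void account_write_protect(struct lpage_info *info)
{
    info->write_count++;
}

/* Called when that 4KB page is unshadowed again. */
static void unaccount_write_protect(struct lpage_info *info)
{
    info->write_count--;
}

static bool can_use_large_pte(const struct lpage_info *info)
{
    /* only when large-backed and nothing inside is write-protected */
    return info->write_count == -1;
}
```

The nice property is that one comparison against -1 folds both checks
(host backing and write-protect count) into a single test.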
> - qemu is extended to have a command-line option to use large pages to
> back guest memory.
>
> Large pages should improve performance significantly, both with
> traditional shadow and npt/ept.
Yes, I think so too. But with shadow paging it really depends on the
guest whether the performance increase holds up long-term. In a Linux
guest, for example, the direct-mapped memory will become fragmented over
time (together with the location of the page tables), so the number of
potential large page mappings will likely decrease over time.
But the situation is different when the Linux guest uses HugeTLBfs in
its userspace. We will always be able to map these pages using large
ptes if the guest physical memory is correctly aligned.
With Nested Paging (and EPT too) we will always get the benefit
because we don't need to write-protect anything.
I really look forward to large page support in KVM. Maybe we will reach
95% of native VCPU performance with it :-)
> Hopefully we will have transparent Linux support for them one day.
Unlikely. As far as I know, Linus doesn't like the idea...
Joerg
--
| AMD Saxony Limited Liability Company & Co. KG
Operating | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
System | Register Court Dresden: HRA 4896
Research | General Partner authorized to represent:
Center | AMD Saxony LLC (Wilmington, Delaware, US)
| General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
Thread overview: 14+ messages
2008-01-29 17:20 large page support for kvm Avi Kivity
[not found] ` <479F604C.20107-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-30 18:40 ` Joerg Roedel [this message]
[not found] ` <20080130184035.GS6960-5C7GfCeVMHo@public.gmane.org>
2008-01-31 5:44 ` Avi Kivity
2008-02-11 15:49 ` Marcelo Tosatti
2008-02-12 11:55 ` Avi Kivity
2008-02-13 0:15 ` Marcelo Tosatti
2008-02-13 6:45 ` Avi Kivity
2008-02-14 23:17 ` Marcelo Tosatti
2008-02-15 7:40 ` Roedel, Joerg
2008-02-17 9:38 ` Avi Kivity
2008-02-19 20:37 ` Marcelo Tosatti
2008-02-20 14:25 ` Avi Kivity
2008-02-22 2:01 ` Marcelo Tosatti
2008-02-22 7:16 ` Avi Kivity