* [PATCH 0/10] nEPT v2: Nested EPT support for Nested VMX
@ 2012-08-01 14:36 Nadav Har'El
From: Nadav Har'El @ 2012-08-01 14:36 UTC
  To: kvm; +Cc: Joerg.Roedel, avi, owasserm, abelg, eddie.dong, yang.z.zhang

The following patches add nested EPT support to Nested VMX.

This is the second version of this patch set. Most of the issues from the
previous reviews were handled, and in particular there is now a new variant
of paging_tmpl for EPT page tables.

However, while this version does work in my tests, there are still some known
problems/bugs with this version and unhandled issues from the previous review:

 1. 32-bit *PAE* L2s currently don't work. Non-PAE 32-bit L2s do work
    (and so, of course, do 64-bit L2s).

 2. nested_ept_inject_page_fault() assumes vm_exit_reason is already set
    to EPT_VIOLATION. However, it is conceivable that L0 emulates some
    L2 instruction, and during this emulation we read some L2 memory,
    causing a need to exit (from L2 to L1) with an EPT violation. In that
    case the exit reason left over from the original exit would be
    something other than EPT_VIOLATION.

 3. Moreover, nested_ept_inject_page_fault() now always causes an
    EPT_VIOLATION, with vmcs12->exit_qualification = fault->error_code.
    This is wrong: first, fault->error_code is not in exit qualification
    format but in PFERR_* format. Second, PFERR_RSVD_MASK would mean
    we need to cause an EPT_MISCONFIG, NOT an EPT_VIOLATION.
    Instead of trying to fix this by translating PFERR to exit_qualification,
    we should calculate and remember in walk_addr() the exit qualification
    (and an additional bit: whether it's an EPT_VIOLATION or an
    EPT_MISCONFIG). This will be remembered in new fields in x86_exception.

    Avi suggested: "[add to x86_exception] another bool, to distinguish
    between EPT VIOLATION and EPT_QUALIFICATION. The error_code field should
    be extended to 64 bits for EXIT_QUALIFICATION (though only bits 0-12 are
    defined). You need another field for the guest linear address. 
    EXIT_QUALIFICATION has to be calculated, it cannot be derived from the
    original exit. Look at kvm_propagate_fault()."
    He also added: "If we're injecting an EPT VIOLATION to L1 (because we
    weren't able to resolve it; say L1 write-protected the page), then we
    need to compute EXIT_QUALIFICATION.  Bits 3-5 of EXIT_QUALIFICATION are
    computed from EPT12 paging structure entries (easy to derive them from
    pt_access/pte_access)."

 4. Also, nested_ept_inject_page_fault() doesn't set guest linear address.
 
 5. There are several "TODO"s left in the code.
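
To make issue 3 a bit more concrete, here is a minimal standalone sketch
(not code from this patch set) of the decision it describes: a reserved-bit
fault becomes an EPT_MISCONFIG, and otherwise an exit qualification is
built from the access type (bits 0-2) and the EPT12 permissions (bits 3-5).
The PFERR_* values are the ones KVM uses; struct ept_fault,
ept_fault_from_walk() and the EPT_READ/EPT_WRITE/EPT_EXEC names are
hypothetical, and the sketch ignores the guest-linear-address bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page-fault error code bits, as defined by KVM on x86. */
#define PFERR_PRESENT_MASK (1U << 0)
#define PFERR_WRITE_MASK   (1U << 1)
#define PFERR_USER_MASK    (1U << 2)
#define PFERR_RSVD_MASK    (1U << 3)
#define PFERR_FETCH_MASK   (1U << 4)

/* Permission bits of an EPT entry (hypothetical names for this sketch). */
#define EPT_READ  (1U << 0)
#define EPT_WRITE (1U << 1)
#define EPT_EXEC  (1U << 2)

/* Hypothetical result type: which exit to inject into L1, and with what. */
struct ept_fault {
	bool misconfig;               /* true: EPT_MISCONFIG, false: EPT_VIOLATION */
	uint64_t exit_qualification;  /* only meaningful for EPT_VIOLATION */
};

/*
 * pferr:      PFERR_*-style error code produced by the EPT12 walk.
 * ept_access: permissions accumulated (ANDed) over the EPT12 entries walked.
 */
static struct ept_fault ept_fault_from_walk(uint32_t pferr, unsigned int ept_access)
{
	struct ept_fault f = { .misconfig = false, .exit_qualification = 0 };

	/* Reserved bits set in an EPT12 entry: misconfiguration, no qualification. */
	if (pferr & PFERR_RSVD_MASK) {
		f.misconfig = true;
		return f;
	}

	/* Bits 0-2: the kind of access that faulted (read / write / fetch). */
	if (pferr & PFERR_FETCH_MASK)
		f.exit_qualification |= 1ULL << 2;
	else if (pferr & PFERR_WRITE_MASK)
		f.exit_qualification |= 1ULL << 1;
	else
		f.exit_qualification |= 1ULL << 0;

	/* Bits 3-5: what the EPT12 entries would have allowed (R/W/X). */
	f.exit_qualification |= (uint64_t)(ept_access & 7U) << 3;

	return f;
}
```

For example, a write to a page that L1 mapped read-only in EPT12 would
give ept_fault_from_walk(PFERR_PRESENT_MASK | PFERR_WRITE_MASK, EPT_READ),
i.e. an EPT_VIOLATION with bits 1 and 3 set.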

If there's any volunteer willing to help me with some of these issues,
it would be great :-)


About nested EPT:
-----------------

Nested EPT means emulating EPT for an L1 guest, allowing it to use EPT when
running a nested guest L2. When L1 uses EPT, it allows the L2 guest to set
its own cr3 and take its own page faults without either of L0 or L1 getting
involved. In many workloads this significantly improves L2's performance over
the previous two alternatives (shadow page tables over EPT, and shadow page
tables over shadow page tables). Our paper [1] described these three options,
and the advantages of nested EPT ("multidimensional paging" in the paper).
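
As a toy illustration of the idea (not the code in these patches), the
sketch below composes two flat, single-level "EPT" tables: ept12 (L2
guest-physical to L1 guest-physical, owned by L1) and ept01 (L1
guest-physical to host-physical, owned by L0) into the entry L0 would
actually run L2 on. Real EPT tables are multi-level; TOY_PAGES,
struct toy_entry and toy_compose() are all made-up names.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGES 8U   /* hypothetical tiny address space, counted in pages */
#define PERM_R    1U
#define PERM_W    2U
#define PERM_X    4U

/* One entry of a flat toy translation table. */
struct toy_entry {
	bool present;
	uint32_t target_page;  /* page number in the next address space */
	unsigned int perms;    /* PERM_* bits */
};

/*
 * Translate one L2 guest-physical page through ept12 and then ept01.
 * Returns false if either level has no mapping: a miss in ept12 is
 * L1's to handle (an EPT violation injected into L1), while a miss in
 * ept01 is L0's own business. On success, *out maps the L2 page to a
 * host page with the intersection of both levels' permissions.
 */
static bool toy_compose(const struct toy_entry *ept12,
			const struct toy_entry *ept01,
			uint32_t l2_page, struct toy_entry *out)
{
	if (l2_page >= TOY_PAGES || !ept12[l2_page].present)
		return false;

	uint32_t l1_page = ept12[l2_page].target_page;

	if (l1_page >= TOY_PAGES || !ept01[l1_page].present)
		return false;

	out->present = true;
	out->target_page = ept01[l1_page].target_page;
	out->perms = ept12[l2_page].perms & ept01[l1_page].perms;
	return true;
}
```

The point of the sketch is only that L2's own page tables never appear in
it: L2 can change its cr3 or take guest page faults without anything here
needing to be redone.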

Nested EPT is enabled by default (if the hardware supports EPT), so users do
not have to do anything special to enjoy the performance improvement that
this patch gives to L2 guests. L1 may of course choose not to use nested
EPT, by simply not using EPT (e.g., a KVM in L1 may use the "ept=0" option).

Just as a non-scientific, non-representative indication of the kind of
dramatic performance improvement you may see in workloads that have a lot of
context switches and page faults, here is a measurement of the time
an example single-threaded "make" took in L2 (kvm over kvm):

 shadow over shadow: 105 seconds
 ("ept=0" in L0 forces this)

 shadow over EPT: 87 seconds
 (the previous default; can be forced with "ept=0" in L1)

 EPT over EPT: 29 seconds
 (the default after this patch)

Note that the same test on L1 (with EPT) took 25 seconds, so for this example
workload, performance of nested virtualization is now very close to that of
single-level virtualization.
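
For readers who want the ratios behind these timings, a small helper like
the following (hypothetical names, just arithmetic on the numbers above)
gives them: EPT over EPT is 87/29 = 3 times faster than shadow over EPT,
about 3.6 times faster than shadow over shadow, and about 16% slower than
the same build on L1.

```c
/* Ratio of an old run time to a new one; > 1 means the new run is faster. */
static double speedup(double old_seconds, double new_seconds)
{
	return old_seconds / new_seconds;
}

/* Extra time of the nested run relative to the single-level run, in percent. */
static double overhead_percent(double nested_seconds, double single_seconds)
{
	return (nested_seconds / single_seconds - 1.0) * 100.0;
}
```

With the measurements above, speedup(105, 29) and speedup(87, 29) compare
the two older configurations against EPT over EPT, and
overhead_percent(29, 25) compares L2 against L1.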


[1] "The Turtles Project: Design and Implementation of Nested Virtualization",
    http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf


Patch statistics:
-----------------

 Documentation/virtual/kvm/nested-vmx.txt |    4 
 arch/x86/include/asm/vmx.h               |    2 
 arch/x86/kvm/mmu.c                       |   52 +++-
 arch/x86/kvm/mmu.h                       |    1 
 arch/x86/kvm/paging_tmpl.h               |   98 ++++++++-
 arch/x86/kvm/vmx.c                       |  227 +++++++++++++++++++--
 arch/x86/kvm/x86.c                       |   11 -
 7 files changed, 354 insertions(+), 41 deletions(-)

--
Nadav Har'El
IBM Haifa Research Lab



Thread overview: 15+ messages
2012-08-01 14:36 [PATCH 0/10] nEPT v2: Nested EPT support for Nested VMX Nadav Har'El
2012-08-01 14:37 ` [PATCH 01/10] nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 Nadav Har'El
2012-08-01 14:37 ` [PATCH 02/10] nEPT: Add EPT tables support to paging_tmpl.h Nadav Har'El
2012-08-02  4:00   ` Xiao Guangrong
2012-08-02 21:25     ` Nadav Har'El
2012-08-03  8:08       ` Xiao Guangrong
2012-08-01 14:38 ` [PATCH 03/10] nEPT: MMU context for nested EPT Nadav Har'El
2012-08-01 14:38 ` [PATCH 04/10] nEPT: Fix cr3 handling in nested exit and entry Nadav Har'El
2012-08-01 14:39 ` [PATCH 05/10] nEPT: Fix wrong test in kvm_set_cr3 Nadav Har'El
2012-08-01 14:39 ` [PATCH 06/10] nEPT: Some additional comments Nadav Har'El
2012-08-01 14:40 ` [PATCH 07/10] nEPT: Advertise EPT to L1 Nadav Har'El
2012-08-01 14:40 ` [PATCH 08/10] nEPT: Nested INVEPT Nadav Har'El
2012-08-01 14:41 ` [PATCH 09/10] nEPT: Documentation Nadav Har'El
2012-08-01 14:41 ` [PATCH 10/10] nEPT: Miscelleneous cleanups Nadav Har'El
2012-08-01 15:07 ` [PATCH 0/10] nEPT v2: Nested EPT support for Nested VMX Avi Kivity
