kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/30] nVMX: Nested VMX, v9
@ 2011-05-08  8:15 Nadav Har'El
  2011-05-08  8:15 ` [PATCH 01/30] nVMX: Add "nested" module option to kvm_intel Nadav Har'El
                   ` (30 more replies)
  0 siblings, 31 replies; 83+ messages in thread
From: Nadav Har'El @ 2011-05-08  8:15 UTC (permalink / raw)
  To: kvm; +Cc: gleb, avi

Hi,

This is the ninth iteration of the nested VMX patch set. This iteration
addresses all of the comments and requests that were raised by reviewers in
the previous rounds, with only a few exception listed below.

Some of the issues which were solved in this version include:

 * Overhauled the hardware VMCS (vmcs02) allocation. Previously we had up to
   256 vmcs02s, one for each L2. Now we only have one, which is reused.
   We also have a compile-time option VMCS02_POOL_SIZE to keep a bigger pool
   of vmcs02s. This option will be useful in the future if vmcs02 won't be
   filled from scratch on each entry from L1 to L2 (currently, it is).

 * The vmcs01 structure, containing a copy of all fields from L1's VMCS, was
   unnecessary, as all the necessary values are either known to KVM or appear
   in vmcs12. This structure is now gone for good.

 * There is no longer a "vmcs_fields" sub-structure that everyone disliked.
   All the VMCS fields appear directly in the vmcs12 structure, which makes
   the code simpler and more readable.

 * Make sure that the vmcs12 fields have fixed sizes and location, and add
   some extra padding, to support live migration and improve future-proofing.

 * For some fields, nested exit used to fail to return the host-state as set
   by L1. Fixed that.

 * nested_vmx_exit_handled (deciding if to let L1 handle an exit, or handle it
   in L0 and return to L2) is now more correct, and handles more exit reasons.

 * Complete overhaul of the cr0, exception bitmap, cr3 and cr4 handling code.
   The code is now shorter (uses existing functions like kvm_set_cr3, etc.),
   more readable, and more uniform (no pieces of code for enable_ept and not,
   less special code for cr0.TS, and none of that ugly cr0.PG monkey-business).

 * Use kvm_register_write(), kvm_rip_read(), etc. Got rid of new and now
   unneeded function sync_cached_regs_to_vcms().

 * Fix return value of the VMX msrs to be more correct, and more constant
   (not to needlessly vary on different hosts).

 * Added some more missing verifications to vmcs12's fields (cleanly failing
   the nested entry if these verifications fail).

 * Expose the MSR-bitmap feature to L1. Every MSR access still exits to L0,
   but slow exits to L1 are avoided when L1's MSR bitmap doesn't want it.

 * Removed or rate limited printouts which could be exploited by guests.

 * Fix VM_ENTRY_LOAD_IA32_PAT feature handling.

 * Fixed potential bug and verified that nested vmx now works with both
   CONFIG_PREEMPT and CONFIG_SMP enabled.

 * Dozens of other code cleanups and bug fixes.

Only a few issues from previous reviews remain unaddressed. These are:

 * The interrupt injection and IDT_VECTORING_INFO_FIELD handling code was
   still not rewritten. It works, though ;-)

 * No KVM autotests for nested VMX yet.

 * Merging of L0's and L1's MSR bitmaps (and IO bitmaps) is still not
   supported. As explained above, the current code uses L1's MSR bitmap
   to avoid costly exits to L1, but still suffers exits to L0 on each
   MSR access in L2.

 * Still no option for disabling some capabilities advertised to L1.

 * No support for TPR_SHADOW feature for L1.

This new set of patches applies to the current KVM trunk (I checked with
082f9eced53d50c136e42d072598da4be4b9ba23).
If you wish, you can also check out an already-patched version of KVM from
branch "nvmx9" of the repository:
	 git://github.com/nyh/kvm-nested-vmx.git


About nested VMX:
-----------------

The following 30 patches implement nested VMX support. This feature enables
a guest to use the VMX APIs in order to run its own nested guests.
In other words, it allows running hypervisors (that use VMX) under KVM.
Multiple guest hypervisors can be run concurrently, and each of those can
in turn host multiple guests.

The theory behind this work, our implementation, and its performance
characteristics were presented in OSDI 2010 (the USENIX Symposium on
Operating Systems Design and Implementation). Our paper was titled
"The Turtles Project: Design and Implementation of Nested Virtualization",
and was awarded "Jay Lepreau Best Paper". The paper is available online, at:

	http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf

This patch set does not include all the features described in the paper.
In particular, this patch set is missing nested EPT (L1 can't use EPT and
must use shadow page tables). It is also missing some features required to
run VMWare hypervisors as a guest. These missing features will be sent as
follow-on patchs.

Running nested VMX:
------------------

The nested VMX feature is currently disabled by default. It must be
explicitly enabled with the "nested=1" option to the kvm-intel module.

No modifications are required to user space (qemu). However, qemu's default
emulated CPU type (qemu64) does not list the "VMX" CPU feature, so it must be
explicitly enabled, by giving qemu one of the following options:

     -cpu host              (emulated CPU has all features of the real CPU)

     -cpu qemu64,+vmx       (add just the vmx feature to a named CPU type)


This version was only tested with KVM (64-bit) as a guest hypervisor, and
Linux as a nested guest.


Patch statistics:
-----------------

 Documentation/kvm/nested-vmx.txt |  243 ++
 arch/x86/include/asm/kvm_host.h  |    2 
 arch/x86/include/asm/msr-index.h |   12 
 arch/x86/include/asm/vmx.h       |   31 
 arch/x86/kvm/svm.c               |    6 
 arch/x86/kvm/vmx.c               | 2558 +++++++++++++++++++++++++++--
 arch/x86/kvm/x86.c               |   11 
 arch/x86/kvm/x86.h               |    8 
 8 files changed, 2773 insertions(+), 98 deletions(-)

--
Nadav Har'El
IBM Haifa Research Lab

^ permalink raw reply	[flat|nested] 83+ messages in thread

end of thread, other threads:[~2011-05-24 13:07 UTC | newest]

Thread overview: 83+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-08  8:15 [PATCH 0/30] nVMX: Nested VMX, v9 Nadav Har'El
2011-05-08  8:15 ` [PATCH 01/30] nVMX: Add "nested" module option to kvm_intel Nadav Har'El
2011-05-08  8:16 ` [PATCH 02/30] nVMX: Implement VMXON and VMXOFF Nadav Har'El
2011-05-08  8:16 ` [PATCH 03/30] nVMX: Allow setting the VMXE bit in CR4 Nadav Har'El
2011-05-08  8:17 ` [PATCH 04/30] nVMX: Introduce vmcs12: a VMCS structure for L1 Nadav Har'El
2011-05-08  8:17 ` [PATCH 05/30] nVMX: Implement reading and writing of VMX MSRs Nadav Har'El
2011-05-08  8:18 ` [PATCH 06/30] nVMX: Decoding memory operands of VMX instructions Nadav Har'El
2011-05-09  9:47   ` Avi Kivity
2011-05-08  8:18 ` [PATCH 07/30] nVMX: Introduce vmcs02: VMCS used to run L2 Nadav Har'El
2011-05-16 15:30   ` Marcelo Tosatti
2011-05-16 18:32     ` Nadav Har'El
2011-05-17 13:20       ` Marcelo Tosatti
2011-05-08  8:19 ` [PATCH 08/30] nVMX: Fix local_vcpus_link handling Nadav Har'El
2011-05-08  8:19 ` [PATCH 09/30] nVMX: Add VMCS fields to the vmcs12 Nadav Har'El
2011-05-08  8:20 ` [PATCH 10/30] nVMX: Success/failure of VMX instructions Nadav Har'El
2011-05-08  8:20 ` [PATCH 11/30] nVMX: Implement VMCLEAR Nadav Har'El
2011-05-08  8:21 ` [PATCH 12/30] nVMX: Implement VMPTRLD Nadav Har'El
2011-05-16 14:34   ` Marcelo Tosatti
2011-05-16 18:58     ` Nadav Har'El
2011-05-16 19:09       ` Nadav Har'El
2011-05-08  8:21 ` [PATCH 13/30] nVMX: Implement VMPTRST Nadav Har'El
2011-05-08  8:22 ` [PATCH 14/30] nVMX: Implement VMREAD and VMWRITE Nadav Har'El
2011-05-08  8:22 ` [PATCH 15/30] nVMX: Move host-state field setup to a function Nadav Har'El
2011-05-09  9:56   ` Avi Kivity
2011-05-09 10:40     ` Nadav Har'El
2011-05-08  8:23 ` [PATCH 16/30] nVMX: Move control field setup to functions Nadav Har'El
2011-05-08  8:23 ` [PATCH 17/30] nVMX: Prepare vmcs02 from vmcs01 and vmcs12 Nadav Har'El
2011-05-09 10:12   ` Avi Kivity
2011-05-09 10:27     ` Nadav Har'El
2011-05-09 10:45       ` Avi Kivity
2011-05-08  8:24 ` [PATCH 18/30] nVMX: Implement VMLAUNCH and VMRESUME Nadav Har'El
2011-05-08  8:24 ` [PATCH 19/30] nVMX: No need for handle_vmx_insn function any more Nadav Har'El
2011-05-08  8:25 ` [PATCH 20/30] nVMX: Exiting from L2 to L1 Nadav Har'El
2011-05-09 10:45   ` Avi Kivity
2011-05-08  8:25 ` [PATCH 21/30] nVMX: Deciding if L0 or L1 should handle an L2 exit Nadav Har'El
2011-05-08  8:26 ` [PATCH 22/30] nVMX: Correct handling of interrupt injection Nadav Har'El
2011-05-09 10:57   ` Avi Kivity
2011-05-08  8:27 ` [PATCH 23/30] nVMX: Correct handling of exception injection Nadav Har'El
2011-05-08  8:27 ` [PATCH 24/30] nVMX: Correct handling of idt vectoring info Nadav Har'El
2011-05-09 11:04   ` Avi Kivity
2011-05-08  8:28 ` [PATCH 25/30] nVMX: Handling of CR0 and CR4 modifying instructions Nadav Har'El
2011-05-08  8:28 ` [PATCH 26/30] nVMX: Further fixes for lazy FPU loading Nadav Har'El
2011-05-08  8:29 ` [PATCH 27/30] nVMX: Additional TSC-offset handling Nadav Har'El
2011-05-09 17:27   ` Zachary Amsden
2011-05-08  8:29 ` [PATCH 28/30] nVMX: Add VMX to list of supported cpuid features Nadav Har'El
2011-05-08  8:30 ` [PATCH 29/30] nVMX: Miscellenous small corrections Nadav Har'El
2011-05-08  8:30 ` [PATCH 30/30] nVMX: Documentation Nadav Har'El
2011-05-09 11:18 ` [PATCH 0/30] nVMX: Nested VMX, v9 Avi Kivity
2011-05-09 11:37   ` Nadav Har'El
2011-05-11  8:20   ` Gleb Natapov
2011-05-12 15:42     ` Nadav Har'El
2011-05-12 15:57       ` Gleb Natapov
2011-05-12 16:08         ` Avi Kivity
2011-05-12 16:14           ` Gleb Natapov
2011-05-12 16:31         ` Nadav Har'El
2011-05-12 16:51           ` Gleb Natapov
2011-05-12 17:00             ` Avi Kivity
2011-05-15 23:11               ` Nadav Har'El
2011-05-16  6:38                 ` Gleb Natapov
2011-05-16  7:44                   ` Nadav Har'El
2011-05-16  7:57                     ` Gleb Natapov
2011-05-16  9:50                 ` Avi Kivity
2011-05-16 10:20                   ` Avi Kivity
2011-05-22 19:32             ` Nadav Har'El
2011-05-23  9:37               ` Joerg Roedel
2011-05-23  9:52               ` Avi Kivity
2011-05-23 13:02                 ` Joerg Roedel
2011-05-23 13:08                   ` Avi Kivity
2011-05-23 13:40                     ` Joerg Roedel
2011-05-23 13:52                       ` Avi Kivity
2011-05-23 14:10                         ` Nadav Har'El
2011-05-23 14:32                           ` Avi Kivity
2011-05-23 14:44                             ` Nadav Har'El
2011-05-23 15:23                               ` Avi Kivity
2011-05-23 18:06                                 ` Alexander Graf
2011-05-24 11:09                                   ` Avi Kivity
2011-05-24 13:07                                     ` Joerg Roedel
2011-05-23 14:28                         ` Joerg Roedel
2011-05-23 14:34                           ` Avi Kivity
2011-05-23 14:58                             ` Joerg Roedel
2011-05-23 15:19                               ` Avi Kivity
2011-05-23 13:18                   ` Nadav Har'El
2011-05-12 16:18       ` Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).