From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 0/10] nEPT: Nested EPT support for Nested VMX
Date: Thu, 10 Nov 2011 14:26:30 +0200
Message-ID: <4EBBC2F6.8050903@redhat.com>
References: <1320919040-nyh@il.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, "Roedel, Joerg", owasserm@redhat.com, abelg@il.ibm.com
To: "Nadav Har'El"
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:27370 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933683Ab1KJM0j
	(ORCPT ); Thu, 10 Nov 2011 07:26:39 -0500
In-Reply-To: <1320919040-nyh@il.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 11/10/2011 11:57 AM, Nadav Har'El wrote:
> The following patches add nested EPT support to Nested VMX.
>
> Nested EPT means emulating EPT for an L1 guest, allowing it to use EPT when
> running a nested guest L2. When L1 uses EPT, it allows the L2 guest to set
> its own cr3 and take its own page faults without either of L0 or L1 getting
> involved. In many workloads this significantly improves L2's performance over
> the previous two alternatives (shadow page tables over EPT, and shadow page
> tables over shadow page tables). Our paper [1] described these three options,
> and the advantages of nested EPT ("multidimensional paging").
>
> Nested EPT is enabled by default (if the hardware supports EPT), so users do
> not have to do anything special to enjoy the performance improvement that
> this patch gives to L2 guests.
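The "multidimensional paging" idea quoted above can be pictured as L0 composing the two guest-physical translations into one shadow table for the hardware. The following is a toy model only (page numbers in dicts, not real KVM code); the names ept01, ept12, and ept02 are taken from the usual nested-EPT terminology, and everything else here is an illustrative assumption:

```python
# Toy sketch, not KVM code: each "EPT" is a dict mapping page numbers.
# ept01: L1-physical -> host-physical (L0's real EPT for L1)
# ept12: L2-physical -> L1-physical  (the EPT that L1 built for L2)
# ept02: L2-physical -> host-physical (the shadow table L0 hands to hardware,
#        so L2's own cr3 writes and page faults need no exits at all)

def compose(ept12, ept01):
    """Build ept02 by pushing each L1-physical result of ept12
    through L0's ept01; entries L0 cannot resolve are simply absent."""
    return {l2_gpa: ept01[l1_gpa]
            for l2_gpa, l1_gpa in ept12.items()
            if l1_gpa in ept01}

ept01 = {0x10: 0xA0, 0x11: 0xA1}   # hypothetical example mappings
ept12 = {0x01: 0x10, 0x02: 0x11}

ept02 = compose(ept12, ept01)
assert ept02 == {0x01: 0xA0, 0x02: 0xA1}
```

In this picture the two shadow-paging alternatives benchmarked below correspond to doing one or both of these compositions in software on every L2 page fault, which is exactly the overhead nested EPT removes.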
>
> Just as a non-scientific, non-representative indication of the kind of
> dramatic performance improvement you may see in workloads that have a lot of
> context switches and page faults, here is a measurement of the time
> an example single-threaded "make" took in L2 (kvm over kvm):
>
> shadow over shadow: 105 seconds
> ("ept=0" forces this)
>
> shadow over EPT: 87 seconds
> (the previous default; can be forced now with "nested_ept=0")
>
> EPT over EPT: 29 seconds
> (the default after this patch)
>
> Note that the same test on L1 (with EPT) took 25 seconds, so for this example
> workload, performance of nested virtualization is now very close to that of
> single-level virtualization.

This patchset is missing a fairly hairy patch that makes reading L2 virtual
addresses work. The standard example is L1 passing a bit of hardware
(emulated in L0) to an L2 guest; when L2 accesses it, the instruction will
fault and need to be handled in L0, transparently to L1. The emulation can
cause a fault to be injected into L2, or an EPT violation or
misconfiguration injected into L1.

-- 
error compiling committee.c: too many arguments to function
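[To make the missing-patch scenario above concrete: when L0 emulates an instruction on behalf of L2, translating the L2 virtual address can fail at two distinct levels, and each failure must be reflected to a different guest. A toy sketch of that decision, with invented names and dict-based page tables standing in for real walkers:]

```python
# Illustrative only, not KVM's actual emulation code.
PAGE_FAULT_TO_L2 = "inject #PF into L2"        # L2's own mapping is missing
EPT_VIOLATION_TO_L1 = "inject EPT violation into L1"  # L1's EPT lacks the page
OK = "emulate in L0, transparently to both guests"

def handle_l2_access(gva, l2_page_table, ept12):
    # First walk L2's own page tables (guest-virtual -> L2-physical).
    gpa2 = l2_page_table.get(gva)
    if gpa2 is None:
        return PAGE_FAULT_TO_L2
    # Then walk the EPT that L1 built for L2 (L2-physical -> L1-physical).
    if gpa2 not in ept12:
        return EPT_VIOLATION_TO_L1
    return OK

assert handle_l2_access(0x7, {}, {}) == PAGE_FAULT_TO_L2
assert handle_l2_access(0x7, {0x7: 0x42}, {}) == EPT_VIOLATION_TO_L1
assert handle_l2_access(0x7, {0x7: 0x42}, {0x42: 0x99}) == OK
```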