From: Jan Kiszka
Subject: Re: [PATCH v3 01/13] nEPT: Support LOAD_IA32_EFER entry/exit controls for L1
Date: Tue, 02 Jul 2013 17:34:56 +0200
Message-ID: <51D2F320.9070901@web.de>
To: Gleb Natapov
Cc: "Zhang, Yang Z", Paolo Bonzini, "Nakajima, Jun", "kvm@vger.kernel.org"

On 2013-07-02 17:15, Gleb Natapov wrote:
> On Tue, Jul 02, 2013 at 04:28:56PM +0200, Jan Kiszka wrote:
>> On 2013-07-02 15:59, Gleb Natapov wrote:
>>> On Tue, Jul 02, 2013 at 03:01:24AM +0000, Zhang, Yang Z wrote:
>>>> Since this series is pending in mail list for long time. And it's really a big feature for Nested. Also, I doubt the original authors(Jun and Nahav)should not have enough time to continue it. So I will pick it up. :)
>>>>
>>>> See comments below:
>>>>
>>>> Paolo Bonzini wrote on 2013-05-20:
>>>>> On 19/05/2013 06:52, Jun Nakajima wrote:
>>>>>> From: Nadav Har'El
>>>>>>
>>>>>> Recent KVM, since http://kerneltrap.org/mailarchive/linux-kvm/2010/5/2/6261577
>>>>>> switch the EFER MSR when EPT is used and the host and guest have different
>>>>>> NX bits.
>>>>>> So if we add support for nested EPT (L1 guest using EPT to run L2)
>>>>>> and want to be able to run recent KVM as L1, we need to allow L1 to use this
>>>>>> EFER switching feature.
>>>>>>
>>>>>> To do this EFER switching, KVM uses VM_ENTRY/EXIT_LOAD_IA32_EFER if available,
>>>>>> and if it isn't, it uses the generic VM_ENTRY/EXIT_MSR_LOAD. This patch adds
>>>>>> support for the former (the latter is still unsupported).
>>>>>>
>>>>>> Nested entry and exit emulation (prepare_vmcs_02 and load_vmcs12_host_state,
>>>>>> respectively) already handled VM_ENTRY/EXIT_LOAD_IA32_EFER correctly. So all
>>>>>> that's left to do in this patch is to properly advertise this feature to L1.
>>>>>>
>>>>>> Note that vmcs12's VM_ENTRY/EXIT_LOAD_IA32_EFER are emulated by L0, by using
>>>>>> vmx_set_efer (which itself sets one of several vmcs02 fields), so we always
>>>>>> support this feature, regardless of whether the host supports it.
>>>>>>
>>>>>> Signed-off-by: Nadav Har'El
>>>>>> Signed-off-by: Jun Nakajima
>>>>>> Signed-off-by: Xinhao Xu
>>>>>> ---
>>>>>>  arch/x86/kvm/vmx.c | 23 ++++++++++++++++-------
>>>>>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>>> index 260a919..fb9cae5 100644
>>>>>> --- a/arch/x86/kvm/vmx.c
>>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>>> @@ -2192,7 +2192,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>>>>>>  #else
>>>>>>      nested_vmx_exit_ctls_high = 0;
>>>>>>  #endif
>>>>>> -    nested_vmx_exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
>>>>>> +    nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
>>>>>> +                                  VM_EXIT_LOAD_IA32_EFER);
>>>>>>
>>>>>>      /* entry controls */
>>>>>>      rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
>>>>>> @@ -2201,8 +2202,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>>>>>>      nested_vmx_entry_ctls_low = VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
>>>>>>      nested_vmx_entry_ctls_high &=
>>>>>>          VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_IA32E_MODE;
>>>>>> -    nested_vmx_entry_ctls_high |= VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
>>>>>> -
>>>>>> +    nested_vmx_entry_ctls_high |= (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR |
>>>>>> +                                   VM_ENTRY_LOAD_IA32_EFER);
>>>>>>      /* cpu-based controls */
>>>>>>      rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
>>>>>>          nested_vmx_procbased_ctls_low, nested_vmx_procbased_ctls_high);
>>>>>> @@ -7492,10 +7493,18 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>>>>>      vcpu->arch.cr0_guest_owned_bits &= ~vmcs12->cr0_guest_host_mask;
>>>>>>      vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
>>>>>>
>>>>>> -    /* Note: IA32_MODE, LOAD_IA32_EFER are modified by vmx_set_efer below */
>>>>>> -    vmcs_write32(VM_EXIT_CONTROLS,
>>>>>> -        vmcs12->vm_exit_controls | vmcs_config.vmexit_ctrl);
>>>>>> -    vmcs_write32(VM_ENTRY_CONTROLS, vmcs12->vm_entry_controls |
>>>>>> +    /* L2->L1 exit controls are emulated - the hardware exit is to L0 so
>>>>>> +     * we should use its exit controls. Note that IA32_MODE, LOAD_IA32_EFER
>>>>>> +     * bits are further modified by vmx_set_efer() below.
>>>>>> +     */
>>>>>> +    vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>>>> This is wrong. We cannot use L0 exit control directly.
>>>> LOAD_PERF_GLOBAL_CTRL, LOAD_HOST_EFE, LOAD_HOST_PAT, ACK_INTR_ON_EXIT should use host's exit control. But others, still need use (vmcs12|host).
>>>>
>>> I do not see why. We always intercept DR7/PAT/EFER, so save is emulated
>>> too. Host address space size always come from L0 and preemption timer is
>>> not supported for nested IIRC and when it will be host will have to save
>>> it on exit anyway for correct emulation.
>>
>> Preemption timer is already supported and works fine as far as I tested.
>> KVM doesn't use it for L1, so we do not need to save/restore it - IIRC.
>>
> So what happens if L1 configures it to value X after X/2 ticks L0 exit
> happen and L0 gets back to L2 directly. The counter will be X again
> instead of X/2.

Likely. Yes, we need to improve our emulation by setting "Save
VMX-preemption timer value" or by emulating this in software if the
hardware lacks support for it (was this flag introduced after the
preemption timer itself?).

Jan
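
To make the scenario above concrete, here is a minimal user-space sketch of the timer reload problem. It is not KVM code: the struct, its fields and all function names are hypothetical, and it only assumes that VM entry loads the countdown from a VMCS field and that the "Save VMX-preemption timer value" exit control writes the remaining count back to that field on exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the preemption timer seen by L2; not a KVM structure. */
struct timer_model {
	uint32_t vmcs_timer_field; /* value VM entry loads the countdown from */
	uint32_t remaining;        /* ticks left in the running countdown */
	bool save_on_exit;         /* models "Save VMX-preemption timer value" */
};

/* VM entry (L1 launching L2, or L0 resuming L2): reload from the field. */
static void enter_l2(struct timer_model *t)
{
	t->remaining = t->vmcs_timer_field;
}

/* L2 runs for `ticks`, then exits to L0 for a reason unrelated to the timer. */
static void run_l2_then_exit_to_l0(struct timer_model *t, uint32_t ticks)
{
	assert(ticks <= t->remaining);
	t->remaining -= ticks;
	if (t->save_on_exit)
		t->vmcs_timer_field = t->remaining; /* remaining count written back */
}

/* Ticks L2 has left after L0 handles one exit and resumes L2 directly. */
static uint32_t ticks_left_after_round_trip(uint32_t x, uint32_t ran, bool save)
{
	struct timer_model t = {
		.vmcs_timer_field = x,
		.remaining = 0,
		.save_on_exit = save,
	};

	enter_l2(&t);                    /* L1 programs X and launches L2 */
	run_l2_then_exit_to_l0(&t, ran); /* L0 intercepts after `ran` ticks */
	enter_l2(&t);                    /* L0 goes back to L2 directly */
	return t.remaining;
}
```

Without the save, ticks_left_after_round_trip(X, X/2, false) gives X again, so L2 could run longer than L1 asked for. With the save it gives X/2. A software fallback would have to compute the elapsed ticks itself and adjust the field before re-entry.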