From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Ladi Prosek <lprosek@redhat.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, bsd@redhat.com
Subject: Re: [PATCH] KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
Date: Thu, 24 Nov 2016 19:25:34 +0100 [thread overview]
Message-ID: <20161124182534.GB17619@potion> (raw)
In-Reply-To: <1479994178-27703-1-git-send-email-lprosek@redhat.com>
2016-11-24 14:29+0100, Ladi Prosek:
> KVM does not correctly handle L1 hypervisors that emulate L2 real mode with
> PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest
> PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used
> (see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always
> dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two
> related issues:
>
> 1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are
> overwritten in ept_load_pdptrs because the registers are believed to have
> been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is
> running with PAE enabled but PDPTRs have been set up by L1.
>
> 2) When L2 is about to enable paging and loads its CR3, we, again, attempt
> to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees
> that this will succeed (it's just a CR3 load, paging is not enabled yet) and
> if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is
> then lost and L2 crashes right after it enables paging.
>
> This patch replaces the kvm_set_cr3 call with a simple register write if PAE
> and EPT are both on. CR3 is not to be interpreted in this case.
>
> Signed-off-by: Ladi Prosek <lprosek@redhat.com>
> ---
>
> tl;dr This makes Hyper-V (Windows Server 2016) work on KVM.
Great work!
> @@ -10113,8 +10115,17 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> vmx_set_cr4(vcpu, vmcs12->guest_cr4);
> vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(vmcs12));
>
> - /* shadow page tables on either EPT or shadow page tables */
> - kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> + /*
> + * Shadow page tables on either EPT or shadow page tables.
> + * If PAE and EPT are both on, CR3 is not used by the CPU and must not
> + * be dereferenced.
> + */
> + if (is_pae(vcpu) && is_paging(vcpu) && nested_ept_enabled) {
The spec also has a "&& !is_long_mode(vcpu)" condition for PAE paging.
I'm pretty sure that is_pae() and is_paging() is set in long mode ...
the stuff in kvm_set_cr3() is not needed even then?
> + vcpu->arch.cr3 = vmcs12->guest_cr3;
> + __set_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail);
> + } else
> + kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> +
> kvm_mmu_reset_context(vcpu);
>
> if (!enable_ept)
> --
> 2.7.4
>
next prev parent reply other threads:[~2016-11-24 18:25 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-24 13:29 [PATCH] KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT Ladi Prosek
2016-11-24 16:28 ` Paolo Bonzini
2016-11-24 18:25 ` Radim Krčmář [this message]
2016-11-25 8:44 ` Ladi Prosek
2016-11-25 14:15 ` Radim Krčmář
2016-11-28 18:12 ` Ladi Prosek
2016-11-29 16:15 ` Radim Krčmář
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161124182534.GB17619@potion \
--to=rkrcmar@redhat.com \
--cc=bsd@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=lprosek@redhat.com \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox