From: Bandan Das <bsd@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 3/3] KVM: nVMX: Emulate EPTP switching for the L1 hypervisor
Date: Tue, 11 Jul 2017 13:58:33 -0400 [thread overview]
Message-ID: <jpgy3ruam9i.fsf@linux.bootlegged.copy> (raw)
In-Reply-To: <2d50ebc4-9328-ce08-b55b-6a331ee13cc3@redhat.com> (David Hildenbrand's message of "Tue, 11 Jul 2017 09:51:39 +0200")
David Hildenbrand <david@redhat.com> writes:
> On 10.07.2017 22:49, Bandan Das wrote:
>> When L2 uses vmfunc, L0 utilizes the associated vmexit to
>> emulate a switching of the ept pointer by reloading the
>> guest MMU.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>> arch/x86/include/asm/vmx.h | 6 +++++
>> arch/x86/kvm/vmx.c | 58 +++++++++++++++++++++++++++++++++++++++++++---
>> 2 files changed, 61 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index da5375e..5f63a2e 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -115,6 +115,10 @@
>> #define VMX_MISC_SAVE_EFER_LMA 0x00000020
>> #define VMX_MISC_ACTIVITY_HLT 0x00000040
>>
>> +/* VMFUNC functions */
>> +#define VMX_VMFUNC_EPTP_SWITCHING 0x00000001
>> +#define VMFUNC_EPTP_ENTRIES 512
>> +
>> static inline u32 vmx_basic_vmcs_revision_id(u64 vmx_basic)
>> {
>> return vmx_basic & GENMASK_ULL(30, 0);
>> @@ -200,6 +204,8 @@ enum vmcs_field {
>> EOI_EXIT_BITMAP2_HIGH = 0x00002021,
>> EOI_EXIT_BITMAP3 = 0x00002022,
>> EOI_EXIT_BITMAP3_HIGH = 0x00002023,
>> + EPTP_LIST_ADDRESS = 0x00002024,
>> + EPTP_LIST_ADDRESS_HIGH = 0x00002025,
>> VMREAD_BITMAP = 0x00002026,
>> VMWRITE_BITMAP = 0x00002028,
>> XSS_EXIT_BITMAP = 0x0000202C,
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index fe8f5fc..0a969fb 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -246,6 +246,7 @@ struct __packed vmcs12 {
>> u64 eoi_exit_bitmap1;
>> u64 eoi_exit_bitmap2;
>> u64 eoi_exit_bitmap3;
>> + u64 eptp_list_address;
>> u64 xss_exit_bitmap;
>> u64 guest_physical_address;
>> u64 vmcs_link_pointer;
>> @@ -771,6 +772,7 @@ static const unsigned short vmcs_field_to_offset_table[] = {
>> FIELD64(EOI_EXIT_BITMAP1, eoi_exit_bitmap1),
>> FIELD64(EOI_EXIT_BITMAP2, eoi_exit_bitmap2),
>> FIELD64(EOI_EXIT_BITMAP3, eoi_exit_bitmap3),
>> + FIELD64(EPTP_LIST_ADDRESS, eptp_list_address),
>> FIELD64(XSS_EXIT_BITMAP, xss_exit_bitmap),
>> FIELD64(GUEST_PHYSICAL_ADDRESS, guest_physical_address),
>> FIELD64(VMCS_LINK_POINTER, vmcs_link_pointer),
>> @@ -1402,6 +1404,13 @@ static inline bool nested_cpu_has_vmfunc(struct vmcs12 *vmcs12)
>> return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_VMFUNC);
>> }
>>
>> +static inline bool nested_cpu_has_eptp_switching(struct vmcs12 *vmcs12)
>> +{
>> + return nested_cpu_has_vmfunc(vmcs12) &&
>> + (vmcs12->vm_function_control &
>
> I wonder if it makes sense to rename vm_function_control to
> - vmfunc_control
> - vmfunc_controls (so it matches nested_vmx_vmfunc_controls)
> - vmfunc_ctrl
I tend to follow the SDM names because it's easy to look for them.
>> + VMX_VMFUNC_EPTP_SWITCHING);
>> +}
>> +
>> static inline bool is_nmi(u32 intr_info)
>> {
>> return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
>> @@ -2791,7 +2800,12 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>> if (cpu_has_vmx_vmfunc()) {
>> vmx->nested.nested_vmx_secondary_ctls_high |=
>> SECONDARY_EXEC_ENABLE_VMFUNC;
>> - vmx->nested.nested_vmx_vmfunc_controls = 0;
>> + /*
>> + * Advertise EPTP switching unconditionally
>> + * since we emulate it
>> + */
>> + vmx->nested.nested_vmx_vmfunc_controls =
>> + VMX_VMFUNC_EPTP_SWITCHING;> }
>>
>> /*
>> @@ -7772,6 +7786,9 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
>> struct vcpu_vmx *vmx = to_vmx(vcpu);
>> struct vmcs12 *vmcs12;
>> u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
>> + u32 index = vcpu->arch.regs[VCPU_REGS_RCX];
>> + struct page *page = NULL;
>> + u64 *l1_eptp_list, address;
>>
>> /*
>> * VMFUNC is only supported for nested guests, but we always enable the
>> @@ -7784,11 +7801,46 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
>> }
>>
>> vmcs12 = get_vmcs12(vcpu);
>> - if ((vmcs12->vm_function_control & (1 << function)) == 0)
>> + if (((vmcs12->vm_function_control & (1 << function)) == 0) ||
>> + WARN_ON_ONCE(function))
>
> "... instruction causes a VM exit if the bit at position EAX is 0 in the
> VM-function controls (the selected VM function is
> not enabled)."
>
> So g2 can trigger this WARN_ON_ONCE, no? I think we should drop it then
> completely.
It's a good hint to see if L2 misbehaved and WARN_ON_ONCE makes sure it's
not misused.
>> + goto fail;
>> +
>> + if (!nested_cpu_has_ept(vmcs12) ||
>> + !nested_cpu_has_eptp_switching(vmcs12))
>> + goto fail;
>> +
>> + if (!vmcs12->eptp_list_address || index >= VMFUNC_EPTP_ENTRIES)
>> + goto fail;
>
> I can find the definition for an vmexit in case of index >=
> VMFUNC_EPTP_ENTRIES, but not for !vmcs12->eptp_list_address in the SDM.
>
> Can you give me a hint?
I don't think there is. Since, we are basically emulating eptp switching
for L2, this is a good check to have.
>> +
>> + page = nested_get_page(vcpu, vmcs12->eptp_list_address);
>> + if (!page)
>> goto fail;
>> - WARN_ONCE(1, "VMCS12 VM function control should have been zero");
>> +
>> + l1_eptp_list = kmap(page);
>> + address = l1_eptp_list[index];
>> + if (!address)
>> + goto fail;
>
> Can you move that check to the other address checks below? (or rework if
> this make sense, see below)
>
>> + /*
>> + * If the (L2) guest does a vmfunc to the currently
>> + * active ept pointer, we don't have to do anything else
>> + */
>> + if (vmcs12->ept_pointer != address) {
>> + if (address >> cpuid_maxphyaddr(vcpu) ||
>> + !IS_ALIGNED(address, 4096))
>
> Couldn't the pfn still be invalid and make kvm_mmu_reload() fail?
> (triggering a KVM_REQ_TRIPLE_FAULT)
If there's a triple fault, I think it's a good idea to inject it
back. Basically, there's no need to take care of damage control
that L1 is intentionally doing.
>> + goto fail;
>> + kvm_mmu_unload(vcpu);
>> + vmcs12->ept_pointer = address;
>> + kvm_mmu_reload(vcpu);
>
> I was thinking about something like this:
>
> kvm_mmu_unload(vcpu);
> old = vmcs12->ept_pointer;
> vmcs12->ept_pointer = address;
> if (kvm_mmu_reload(vcpu)) {
> /* pointer invalid, restore previous state */
> kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> vmcs12->ept_pointer = old;
> kvm_mmu_reload(vcpu);
> goto fail;
> }
>
> The you can inherit the checks from mmu_check_root().
>
>
> Wonder why I can't spot checks for cpuid_maxphyaddr() /
> IS_ALIGNED(address, 4096) for ordinary use of vmcs12->ept_pointer. The
> checks should be identical.
I think the reason is vmcs12->ept_pointer is never used directly. It's
used to create a shadow table but nevertheless, the check should be there.
>
>> + kunmap(page);
>> + nested_release_page_clean(page);
>
> shouldn't the kunmap + nested_release_page_clean go outside the if clause?
:) Indeed, thanks for the catch.
Bandan
>> + }
>> + return kvm_skip_emulated_instruction(vcpu);
>>
>> fail:
>> + if (page) {
>> + kunmap(page);
>> + nested_release_page_clean(page);
>> + }
>> nested_vmx_vmexit(vcpu, vmx->exit_reason,
>> vmcs_read32(VM_EXIT_INTR_INFO),
>> vmcs_readl(EXIT_QUALIFICATION));
>>
>
> David and mmu code are not yet best friends. So sorry if I am missing
> something.
next prev parent reply other threads:[~2017-07-11 17:58 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-10 20:49 [PATCH v4 0/3] Expose VMFUNC to the nested hypervisor Bandan Das
2017-07-10 20:49 ` [PATCH v4 1/3] KVM: vmx: Enable VMFUNCs Bandan Das
2017-07-10 20:49 ` [PATCH v4 2/3] KVM: nVMX: Enable VMFUNC for the L1 hypervisor Bandan Das
2017-07-10 20:49 ` [PATCH v4 3/3] KVM: nVMX: Emulate EPTP switching " Bandan Das
2017-07-11 7:51 ` David Hildenbrand
2017-07-11 8:39 ` Paolo Bonzini
2017-07-11 13:52 ` Radim Krčmář
2017-07-11 18:05 ` Bandan Das
2017-07-11 19:12 ` Radim Krčmář
2017-07-11 19:34 ` Bandan Das
2017-07-11 17:58 ` Bandan Das [this message]
2017-07-11 18:22 ` Jim Mattson
2017-07-11 18:35 ` Bandan Das
2017-07-11 19:13 ` Radim Krčmář
2017-07-11 19:38 ` Bandan Das
2017-07-11 20:22 ` Radim Krčmář
2017-07-11 20:45 ` Bandan Das
2017-07-12 13:41 ` Radim Krčmář
2017-07-12 18:04 ` Bandan Das
2017-07-11 18:24 ` Bandan Das
2017-07-11 19:32 ` Radim Krčmář
2017-07-11 19:50 ` Bandan Das
2017-07-11 20:21 ` Radim Krčmář
2017-07-11 20:34 ` Bandan Das
2017-07-11 20:45 ` Radim Krčmář
2017-07-11 21:08 ` Bandan Das
2017-07-12 13:24 ` Radim Krčmář
2017-07-12 18:11 ` Bandan Das
2017-07-12 19:18 ` Radim Krčmář
2017-07-17 17:58 ` Bandan Das
2017-07-19 9:30 ` Radim Krčmář
2017-07-19 17:54 ` Bandan Das
2017-07-13 15:39 ` David Hildenbrand
2017-07-13 17:08 ` Bandan Das
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jpgy3ruam9i.fsf@linux.bootlegged.copy \
--to=bsd@redhat.com \
--cc=david@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.