public inbox for kvm@vger.kernel.org
* [PATCH 1/3] kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry
@ 2017-09-14 23:31 Jim Mattson
  2017-09-14 23:31 ` Jim Mattson
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Jim Mattson @ 2017-09-14 23:31 UTC (permalink / raw)
  To: kvm; +Cc: Jim Mattson

After a successful VM-entry, RFLAGS is cleared, with the exception of
bit 1, which is always set. load_vmcs12_host_state already loads RFLAGS
with exactly this value on the emulated VM-exit to L1, so the
nested_vmx_succeed call in nested_vmx_vmexit is redundant.
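As a sketch of the arithmetic (helper names are hypothetical; the flag
layout follows the SDM's VMsucceed convention): RFLAGS after a
successful VM-entry already equals the value VMsucceed would produce,
which is why the explicit call adds nothing.

```c
#include <assert.h>
#include <stdint.h>

#define RFLAGS_FIXED_BIT (1ULL << 1)	/* bit 1: reserved, always 1 */
#define RFLAGS_CF (1ULL << 0)
#define RFLAGS_PF (1ULL << 2)
#define RFLAGS_AF (1ULL << 4)
#define RFLAGS_ZF (1ULL << 6)
#define RFLAGS_SF (1ULL << 7)
#define RFLAGS_OF (1ULL << 11)

/* RFLAGS after a successful VM-entry: everything clear except bit 1. */
static uint64_t rflags_after_vm_entry(void)
{
	return RFLAGS_FIXED_BIT;
}

/* VMsucceed clears CF, PF, AF, ZF, SF and OF, leaving other bits alone. */
static uint64_t vmsucceed(uint64_t rflags)
{
	return rflags & ~(RFLAGS_CF | RFLAGS_PF | RFLAGS_AF |
			  RFLAGS_ZF | RFLAGS_SF | RFLAGS_OF);
}
```

Since vmsucceed() is a no-op on the post-VM-entry value, dropping the
call is behavior-preserving.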

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/vmx.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 09204993a739..84af1268860c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11486,16 +11486,18 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	 */
 	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 
-	/*
-	 * Exiting from L2 to L1, we're now back to L1 which thinks it just
-	 * finished a VMLAUNCH or VMRESUME instruction, so we need to set the
-	 * success or failure flag accordingly.
-	 */
 	if (unlikely(vmx->fail)) {
+		/*
+		 * After an early L2 VM-entry failure, we're now back
+		 * in L1 which thinks it just finished a VMLAUNCH or
+		 * VMRESUME instruction, so we need to set the failure
+		 * flag and the VM-instruction error field of the VMCS
+		 * accordingly.
+		 */
 		vmx->fail = 0;
 		nested_vmx_failValid(vcpu, vm_inst_error);
-	} else
-		nested_vmx_succeed(vcpu);
+	}
+
 	if (enable_shadow_vmcs)
 		vmx->nested.sync_shadow_vmcs = true;
 
-- 
2.14.1.690.gbb1197296e-goog


* [PATCH 2/3] kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly
  2017-09-14 23:31 [PATCH 1/3] kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry Jim Mattson
  2017-09-14 23:31 ` Jim Mattson
@ 2017-09-14 23:31 ` Jim Mattson
  2017-09-14 23:31 ` Jim Mattson
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2017-09-14 23:31 UTC (permalink / raw)
  To: kvm; +Cc: Jim Mattson

On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the
VM-instruction error field of the current VMCS), the launch state of
the current VMCS is not set to "launched," and the VM-exit information
fields of the current VMCS (including IDT-vectoring information and
exit reason) are stale.

On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit
of the exit reason field), the launch state of the current VMCS is not
set to "launched," and only two of the VM-exit information fields of
the current VMCS are modified (exit reason and exit
qualification). The remaining VM-exit information fields of the
current VMCS (including IDT-vectoring information, in particular) are
stale.
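The two failure modes described above can be sketched as a small
classifier (illustrative only; VMX_EXIT_REASONS_FAILED_VMENTRY is
bit 31 of the exit reason, matching the kernel's definition):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of the exit reason flags a VM-entry failure (late failure). */
#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u

enum vmentry_outcome {
	VMENTRY_OK,		/* launch state set; all exit info valid */
	VMENTRY_EARLY_FAIL,	/* VM-instruction error set; exit info stale */
	VMENTRY_LATE_FAIL,	/* only exit reason/qualification updated */
};

/*
 * Classify a VMLAUNCH/VMRESUME outcome from the early-failure flag
 * (set before any VM-exit occurs) and the exit reason field.
 */
static enum vmentry_outcome classify_vmentry(bool fail, uint32_t exit_reason)
{
	if (fail)
		return VMENTRY_EARLY_FAIL;
	if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)
		return VMENTRY_LATE_FAIL;
	return VMENTRY_OK;
}
```

Only in the VMENTRY_OK case may IDT-vectoring information be read and
the launch state be set, which is the reordering the patch performs.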

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/vmx.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 84af1268860c..7f494d9278f6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9424,12 +9424,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 				  | (1 << VCPU_EXREG_CR3));
 	vcpu->arch.regs_dirty = 0;
 
-	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
-
-	vmx->loaded_vmcs->launched = 1;
-
-	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
-
 	/*
 	 * eager fpu is enabled if PKEY is supported and CR4 is switched
 	 * back on host, so it is safe to read guest PKRU from current
@@ -9451,6 +9445,14 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
 
 	vmx->nested.nested_run_pending = 0;
+	vmx->idt_vectoring_info = 0;
+
+	vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON);
+	if (vmx->fail || (vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
+		return;
+
+	vmx->loaded_vmcs->launched = 1;
+	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
 	vmx_complete_atomic_exit(vmx);
 	vmx_recover_nmi_blocking(vmx);
-- 
2.14.1.690.gbb1197296e-goog


* [PATCH 3/3] kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly
  2017-09-14 23:31 [PATCH 1/3] kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry Jim Mattson
                   ` (2 preceding siblings ...)
  2017-09-14 23:31 ` Jim Mattson
@ 2017-09-14 23:31 ` Jim Mattson
  2017-09-14 23:31 ` Jim Mattson
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2017-09-14 23:31 UTC (permalink / raw)
  To: kvm; +Cc: Jim Mattson

When emulating a nested VM-entry from L1 to L2, several control field
validation checks are deferred to the hardware. Should one of these
validation checks fail, vmx_vcpu_run will set the vmx->fail flag. When
this happens, the L2 guest state is not loaded (even in part), and
execution should continue in L1 with the next instruction after the
VMLAUNCH/VMRESUME.

The VMCS12 is not modified (except for the VM-instruction error
field), the VMCS12 MSR save/load lists are not processed, and the CPU
state is not loaded from the VMCS12 host area. Moreover, the vmcs02
exit reason is stale, so it should not be consulted for any reason.
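A sketch of which side effects nested_vmx_vmexit should and should not
perform in the early-failure case (the struct and helper below are
illustrative, not kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Side effects of the emulated VM-exit, keyed on an early-failure flag. */
struct vmexit_actions {
	bool update_vmcs12_exit_info;	/* prepare_vmcs12 */
	bool process_msr_lists;		/* VM-exit MSR store list */
	bool load_host_state;		/* load_vmcs12_host_state */
	bool set_instruction_error;	/* nested_vmx_failValid + skip insn */
};

static struct vmexit_actions nested_vmexit_actions(bool early_fail)
{
	struct vmexit_actions a = {
		/* VMCS12 is untouched except the VM-instruction error field. */
		.update_vmcs12_exit_info = !early_fail,
		.process_msr_lists = !early_fail,
		.load_host_state = !early_fail,
		.set_instruction_error = early_fail,
	};
	return a;
}
```

This matches the restructured function below: the success path does the
full VM-exit emulation and returns; the failure path only sets the
VM-instruction error and advances RIP past the VMLAUNCH/VMRESUME.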

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/vmx.c | 134 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 75 insertions(+), 59 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7f494d9278f6..69b219da1c93 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8344,12 +8344,14 @@ static bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason)
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 
-	trace_kvm_nested_vmexit(kvm_rip_read(vcpu), exit_reason,
-				vmcs_readl(EXIT_QUALIFICATION),
-				vmx->idt_vectoring_info,
-				intr_info,
-				vmcs_read32(VM_EXIT_INTR_ERROR_CODE),
-				KVM_ISA_VMX);
+	if (vmx->nested.nested_run_pending)
+		return false;
+
+	if (unlikely(vmx->fail)) {
+		pr_info_ratelimited("%s failed vm entry %x\n", __func__,
+				    vmcs_read32(VM_INSTRUCTION_ERROR));
+		return true;
+	}
 
 	/*
 	 * The host physical addresses of some pages of guest memory
@@ -8363,14 +8365,12 @@ static bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason)
 	 */
 	nested_mark_vmcs12_pages_dirty(vcpu);
 
-	if (vmx->nested.nested_run_pending)
-		return false;
-
-	if (unlikely(vmx->fail)) {
-		pr_info_ratelimited("%s failed vm entry %x\n", __func__,
-				    vmcs_read32(VM_INSTRUCTION_ERROR));
-		return true;
-	}
+	trace_kvm_nested_vmexit(kvm_rip_read(vcpu), exit_reason,
+				vmcs_readl(EXIT_QUALIFICATION),
+				vmx->idt_vectoring_info,
+				intr_info,
+				vmcs_read32(VM_EXIT_INTR_ERROR_CODE),
+				KVM_ISA_VMX);
 
 	switch (exit_reason) {
 	case EXIT_REASON_EXCEPTION_NMI:
@@ -11390,46 +11390,30 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-	u32 vm_inst_error = 0;
 
 	/* trying to cancel vmlaunch/vmresume is a bug */
 	WARN_ON_ONCE(vmx->nested.nested_run_pending);
 
+	/*
+	 * The only expected VM-instruction error is "VM entry with
+	 * invalid control field(s)." Anything else indicates a
+	 * problem with L0.
+	 */
+	WARN_ON_ONCE(vmx->fail && (vmcs_read32(VM_INSTRUCTION_ERROR) !=
+				   VMXERR_ENTRY_INVALID_CONTROL_FIELD));
+
 	leave_guest_mode(vcpu);
-	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
-		       exit_qualification);
 
-	if (nested_vmx_store_msr(vcpu, vmcs12->vm_exit_msr_store_addr,
-				 vmcs12->vm_exit_msr_store_count))
-		nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL);
+	if (likely(!vmx->fail)) {
+		prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
+			       exit_qualification);
 
-	if (unlikely(vmx->fail))
-		vm_inst_error = vmcs_read32(VM_INSTRUCTION_ERROR);
+		if (nested_vmx_store_msr(vcpu, vmcs12->vm_exit_msr_store_addr,
+					 vmcs12->vm_exit_msr_store_count))
+			nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL);
+	}
 
 	vmx_switch_vmcs(vcpu, &vmx->vmcs01);
-
-	/*
-	 * TODO: SDM says that with acknowledge interrupt on exit, bit 31 of
-	 * the VM-exit interrupt information (valid interrupt) is always set to
-	 * 1 on EXIT_REASON_EXTERNAL_INTERRUPT, so we shouldn't need
-	 * kvm_cpu_has_interrupt().  See the commit message for details.
-	 */
-	if (nested_exit_intr_ack_set(vcpu) &&
-	    exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
-	    kvm_cpu_has_interrupt(vcpu)) {
-		int irq = kvm_cpu_get_interrupt(vcpu);
-		WARN_ON(irq < 0);
-		vmcs12->vm_exit_intr_info = irq |
-			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
-	}
-
-	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
-				       vmcs12->exit_qualification,
-				       vmcs12->idt_vectoring_info_field,
-				       vmcs12->vm_exit_intr_info,
-				       vmcs12->vm_exit_intr_error_code,
-				       KVM_ISA_VMX);
-
 	vm_entry_controls_reset_shadow(vmx);
 	vm_exit_controls_reset_shadow(vmx);
 	vmx_segment_cache_clear(vmx);
@@ -11438,8 +11422,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	if (VMCS02_POOL_SIZE == 0)
 		nested_free_vmcs02(vmx, vmx->nested.current_vmptr);
 
-	load_vmcs12_host_state(vcpu, vmcs12);
-
 	/* Update any VMCS fields that might have changed while L2 ran */
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.nr);
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.nr);
@@ -11488,23 +11470,57 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	 */
 	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 
-	if (unlikely(vmx->fail)) {
-		/*
-		 * After an early L2 VM-entry failure, we're now back
-		 * in L1 which thinks it just finished a VMLAUNCH or
-		 * VMRESUME instruction, so we need to set the failure
-		 * flag and the VM-instruction error field of the VMCS
-		 * accordingly.
-		 */
-		vmx->fail = 0;
-		nested_vmx_failValid(vcpu, vm_inst_error);
-	}
-
 	if (enable_shadow_vmcs)
 		vmx->nested.sync_shadow_vmcs = true;
 
 	/* in case we halted in L2 */
 	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+
+	if (likely(!vmx->fail)) {
+		/*
+		 * TODO: SDM says that with acknowledge interrupt on
+		 * exit, bit 31 of the VM-exit interrupt information
+		 * (valid interrupt) is always set to 1 on
+		 * EXIT_REASON_EXTERNAL_INTERRUPT, so we shouldn't
+		 * need kvm_cpu_has_interrupt().  See the commit
+		 * message for details.
+		 */
+		if (nested_exit_intr_ack_set(vcpu) &&
+		    exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
+		    kvm_cpu_has_interrupt(vcpu)) {
+			int irq = kvm_cpu_get_interrupt(vcpu);
+			WARN_ON(irq < 0);
+			vmcs12->vm_exit_intr_info = irq |
+				INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
+		}
+
+		trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
+					       vmcs12->exit_qualification,
+					       vmcs12->idt_vectoring_info_field,
+					       vmcs12->vm_exit_intr_info,
+					       vmcs12->vm_exit_intr_error_code,
+					       KVM_ISA_VMX);
+
+		load_vmcs12_host_state(vcpu, vmcs12);
+
+		return;
+	}
+
+	/*
+	 * After an early L2 VM-entry failure, we're now back
+	 * in L1 which thinks it just finished a VMLAUNCH or
+	 * VMRESUME instruction, so we need to set the failure
+	 * flag and the VM-instruction error field of the VMCS
+	 * accordingly.
+	 */
+	nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+	/*
+	 * The emulated instruction was already skipped in
+	 * nested_vmx_run, but the updated RIP was never
+	 * written back to the vmcs01.
+	 */
+	skip_emulated_instruction(vcpu);
+	vmx->fail = 0;
 }
 
 /*
-- 
2.14.1.690.gbb1197296e-goog


* [kvm-unit-tests PATCH 1/2] x86: Skip some VMX control field tests
  2017-09-14 23:31 [PATCH 1/3] kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry Jim Mattson
                   ` (4 preceding siblings ...)
  2017-09-14 23:31 ` Jim Mattson
@ 2017-09-14 23:31 ` Jim Mattson
  2017-09-14 23:31 ` Jim Mattson
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2017-09-14 23:31 UTC (permalink / raw)
  To: kvm; +Cc: Jim Mattson

When the APIC virtualization address references unbacked memory, kvm
incorrectly fails a VM-entry with "invalid control field(s)." Until
this can be fixed, just skip the VMX control field tests that populate
a VMCS field with a physical address that isn't backed.
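The skip condition in the diff below reduces to a backing-check
predicate (hypothetical helper; assumes a flat guest RAM layout
starting at physical address 0, as the test environment does):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/*
 * A page-sized VMCS region starting at addr is fully backed only if
 * its last byte lies within guest RAM, i.e. addr <= ram_size - PAGE_SIZE.
 */
static bool vmcs_page_backed(uint64_t addr, uint64_t ram_size)
{
	return addr <= ram_size - PAGE_SIZE;
}
```

test_vmcs_page_addr skips exactly the cases where this predicate is
false, since kvm mishandles unbacked addresses for now.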

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 x86/vmx_tests.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 088d32aaee0b..e82bc04d72e2 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -3359,6 +3359,12 @@ static void test_vmcs_page_addr(const char *name,
 				bool ignored,
 				u64 addr)
 {
+	/*
+	 * Skip tests that reference unbacked memory for now, because
+	 * kvm doesn't always handle this situation correctly.
+	 */
+	if (addr > fwcfg_get_u64(FW_CFG_RAM_SIZE) - PAGE_SIZE)
+		return;
 	report_prefix_pushf("%s = %lx", name, addr);
 	vmcs_write(encoding, addr);
 	test_vmx_controls(ignored || (IS_ALIGNED(addr, PAGE_SIZE) &&
-- 
2.14.1.690.gbb1197296e-goog


* [kvm-unit-tests PATCH 2/2] x86: Add test for TPR threshold check on VM-entry
  2017-09-14 23:31 [PATCH 1/3] kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry Jim Mattson
                   ` (6 preceding siblings ...)
  2017-09-14 23:31 ` Jim Mattson
@ 2017-09-14 23:31 ` Jim Mattson
  2017-09-14 23:31 ` Jim Mattson
  8 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2017-09-14 23:31 UTC (permalink / raw)
  To: kvm; +Cc: Jim Mattson

This test checks that a nested VM-entry fails if the "use TPR shadow"
VM-execution control is set, the "virtual-interrupt delivery"
VM-execution control is clear, and the TPR threshold is greater than
0xf.
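The rule being tested reduces to a small predicate (hypothetical
helper; mirrors the SDM's VM-entry check on the TPR threshold field):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * With "use TPR shadow" set and "virtual-interrupt delivery" clear,
 * bits 31:4 of the TPR threshold must be zero for VM-entry to succeed.
 * Otherwise the threshold is unconstrained by this check.
 */
static bool tpr_threshold_valid(bool tpr_shadow, bool vintd,
				uint32_t threshold)
{
	if (tpr_shadow && !vintd)
		return (threshold >> 4) == 0;
	return true;
}
```

The test below walks thresholds 0..0xf (always valid) and a range of
values with bits above 3 set, which must fail only in the
TPR-shadow-without-VINTD configuration.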

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 x86/vmx.h       |  3 ++-
 x86/vmx_tests.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/x86/vmx.h b/x86/vmx.h
index eb14ecc9b055..f0b87767c0e7 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -392,8 +392,9 @@ enum Ctrl1 {
 	CPU_DESC_TABLE		= 1ul << 2,
 	CPU_RDTSCP		= 1ul << 3,
 	CPU_VPID		= 1ul << 5,
-	CPU_URG			= 1ul << 7,
 	CPU_WBINVD		= 1ul << 6,
+	CPU_URG			= 1ul << 7,
+	CPU_VINTD		= 1ul << 9,
 	CPU_RDRAND		= 1ul << 11,
 	CPU_RDSEED		= 1ul << 16,
 	CPU_PML                 = 1ul << 17,
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index e82bc04d72e2..8afdf7c22411 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -3464,6 +3464,86 @@ static void test_apic_virt_addr(void)
 				 "virtual-APIC address", "Use TPR shadow");
 }
 
+static void try_tpr_threshold(unsigned val)
+{
+	bool valid = true;
+
+	if ((vmcs_read(CPU_EXEC_CTRL0) & CPU_TPR_SHADOW) &&
+	    !((vmcs_read(CPU_EXEC_CTRL0) & CPU_SECONDARY) &&
+	      (vmcs_read(CPU_EXEC_CTRL1) & CPU_VINTD)))
+		valid = !(val >> 4);
+	report_prefix_pushf("TPR threshold 0x%x", val);
+	vmcs_write(TPR_THRESHOLD, val);
+	test_vmx_controls(valid);
+	report_prefix_pop();
+}
+
+/*
+ * Test interesting TPR threshold values.
+ */
+static void test_tpr_threshold_values(void)
+{
+	unsigned i;
+
+	for (i = 0; i < 0x10; i++)
+		try_tpr_threshold(i);
+	for (i = 4; i < 32; i++)
+		try_tpr_threshold(1u << i);
+	try_tpr_threshold(-1u);
+	try_tpr_threshold(0x7fffffff);
+}
+
+/*
+ * If the "use TPR shadow" VM-execution control is 1 and the
+ * "virtual-interrupt delivery" VM-execution control is 0, bits 31:4
+ * of the TPR threshold VM-execution control field must be 0.
+ * [Intel SDM]
+ */
+static void test_tpr_threshold(void)
+{
+	u32 primary = vmcs_read(CPU_EXEC_CTRL0);
+	void *virtual_apic_page;
+
+	if (!(ctrl_cpu_rev[0].clr & CPU_TPR_SHADOW))
+		return;
+
+	virtual_apic_page = alloc_page();
+	memset(virtual_apic_page, 0xff, PAGE_SIZE);
+	vmcs_write(APIC_VIRT_ADDR, virt_to_phys(virtual_apic_page));
+
+	vmcs_write(CPU_EXEC_CTRL0, primary & ~(CPU_TPR_SHADOW | CPU_SECONDARY));
+	report_prefix_pushf("Use TPR shadow disabled");
+	test_tpr_threshold_values();
+	report_prefix_pop();
+	vmcs_write(CPU_EXEC_CTRL0, vmcs_read(CPU_EXEC_CTRL0) | CPU_TPR_SHADOW);
+	report_prefix_pushf("Use TPR shadow enabled");
+	test_tpr_threshold_values();
+	report_prefix_pop();
+
+	if ((ctrl_cpu_rev[0].clr & CPU_SECONDARY) &&
+	    (ctrl_cpu_rev[1].clr & CPU_VINTD)) {
+		u32 secondary = vmcs_read(CPU_EXEC_CTRL1);
+
+		vmcs_write(CPU_EXEC_CTRL1, CPU_VINTD);
+		report_prefix_pushf("Use TPR shadow enabled; secondary controls disabled");
+		test_tpr_threshold_values();
+		report_prefix_pop();
+		vmcs_write(CPU_EXEC_CTRL0,
+			   vmcs_read(CPU_EXEC_CTRL0) | CPU_SECONDARY);
+		report_prefix_pushf("Use TPR shadow enabled; virtual-interrupt delivery enabled");
+		test_tpr_threshold_values();
+		report_prefix_pop();
+
+		vmcs_write(CPU_EXEC_CTRL1, secondary);
+	}
+
+	vmcs_write(CPU_EXEC_CTRL0, primary);
+}
+
+/*
+ * Check that the virtual CPU checks all of the VMX controls as
+ * documented in the Intel SDM.
+ */
 static void vmx_controls_test(void)
 {
 	/*
@@ -3480,6 +3560,7 @@ static void vmx_controls_test(void)
 	test_io_bitmaps();
 	test_msr_bitmap();
 	test_apic_virt_addr();
+	test_tpr_threshold();
 }
 
 static bool valid_vmcs_for_vmentry(void)
-- 
2.14.1.690.gbb1197296e-goog


* Re: [kvm-unit-tests PATCH 1/2] x86: Skip some VMX control field tests
  2017-09-14 23:31 ` Jim Mattson
@ 2017-09-15 16:40   ` Paolo Bonzini
  2017-09-15 17:49     ` Jim Mattson
  2017-09-17 12:28     ` Andrew Jones
  0 siblings, 2 replies; 13+ messages in thread
From: Paolo Bonzini @ 2017-09-15 16:40 UTC (permalink / raw)
  To: Jim Mattson, kvm

On 15/09/2017 01:31, Jim Mattson wrote:
> When the APIC virtualization address references unbacked memory, kvm
> incorrectly fails a VM-entry with "invalid control field(s)." Until
> this can be fixed, just skip the VMX control field tests that populate
> a VMCS field with a physical address that isn't backed.
> 
> Tested: <multi line explanation of how the code was tested>
> 
> Effort: <Your effort group>
> Google-Bug-Id: <buganizer-id>
> Change-Id: I624091cb5cd7cfaae887de8f017e18b5f95d8f61
> Signed-off-by: Jim Mattson <jmattson@google.com>
> ---
>  x86/vmx_tests.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> index 088d32aaee0b..e82bc04d72e2 100644
> --- a/x86/vmx_tests.c
> +++ b/x86/vmx_tests.c
> @@ -3359,6 +3359,12 @@ static void test_vmcs_page_addr(const char *name,
>  				bool ignored,
>  				u64 addr)
>  {
> +	/*
> +	 * Skip tests that reference unbacked memory for now, because
> +	 * kvm doesn't always handle this situation correctly.

Since in practice it's only 16 tests that fail, it would be nice to only 
XFAIL those.  The following is a bit intrusive, but it works.  What do 
you think?

diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 729e85f..13b45dd 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -3177,14 +3177,14 @@ success:
 /*
  * Try to launch the current VMCS.
  */
-static void test_vmx_controls(bool controls_valid)
+static void test_vmx_controls(bool controls_valid, bool xfail)
 {
 	bool success = vmlaunch_succeeds();
 	u32 vmx_inst_err;
 
-	report("vmlaunch %s", success == controls_valid,
-	       controls_valid ? "succeeds" : "fails");
-	if (!controls_valid) {
+	report_xfail("vmlaunch %s", xfail, success == controls_valid,
+		     controls_valid ? "succeeds" : "fails");
+	if (!success) {
 		vmx_inst_err = vmcs_read(VMX_INST_ERROR);
 		report("VMX inst error is %d (actual %d)",
 		       vmx_inst_err == VMXERR_ENTRY_INVALID_CONTROL_FIELD,
@@ -3226,7 +3226,7 @@ static void test_rsvd_ctl_bit_value(const char *name, union vmx_ctrl_msr msr,
 		vmcs_write(encoding, msr.set & ~mask);
 		expected = !(msr.set & mask);
 	}
-	test_vmx_controls(expected);
+	test_vmx_controls(expected, false);
 	vmcs_write(encoding, controls);
 	report_prefix_pop();
 }
@@ -3322,7 +3322,7 @@ static void try_cr3_target_count(unsigned i, unsigned max)
 {
 	report_prefix_pushf("CR3 target count 0x%x", i);
 	vmcs_write(CR3_TARGET_COUNT, i);
-	test_vmx_controls(i <= max);
+	test_vmx_controls(i <= max, false);
 	report_prefix_pop();
 }
 
@@ -3357,13 +3357,21 @@ static void test_cr3_targets(void)
 static void test_vmcs_page_addr(const char *name,
 				enum Encoding encoding,
 				bool ignored,
+				bool xfail_beyond_mapped_ram,
 				u64 addr)
 {
+	bool xfail =
+		(xfail_beyond_mapped_ram &&
+		 addr > fwcfg_get_u64(FW_CFG_RAM_SIZE) - PAGE_SIZE &&
+		 addr < (1ul << cpuid_maxphyaddr()));
+
 	report_prefix_pushf("%s = %lx", name, addr);
 	vmcs_write(encoding, addr);
 	test_vmx_controls(ignored || (IS_ALIGNED(addr, PAGE_SIZE) &&
-				  addr < (1ul << cpuid_maxphyaddr())));
+				  addr < (1ul << cpuid_maxphyaddr())),
+			  xfail);
 	report_prefix_pop();
+	xfail = false;
 }
 
 /*
@@ -3371,19 +3379,26 @@ static void test_vmcs_page_addr(const char *name,
  */
 static void test_vmcs_page_values(const char *name,
 				  enum Encoding encoding,
-				  bool ignored)
+				  bool ignored,
+				  bool xfail_beyond_mapped_ram)
 {
 	unsigned i;
 	u64 orig_val = vmcs_read(encoding);
 
 	for (i = 0; i < 64; i++)
-		test_vmcs_page_addr(name, encoding, ignored, 1ul << i);
+		test_vmcs_page_addr(name, encoding, ignored,
+				    xfail_beyond_mapped_ram, 1ul << i);
 
-	test_vmcs_page_addr(name, encoding, ignored, PAGE_SIZE - 1);
-	test_vmcs_page_addr(name, encoding, ignored, PAGE_SIZE);
+	test_vmcs_page_addr(name, encoding, ignored, 
+			    xfail_beyond_mapped_ram, PAGE_SIZE - 1);
+	test_vmcs_page_addr(name, encoding, ignored, 
+			    xfail_beyond_mapped_ram, PAGE_SIZE);
 	test_vmcs_page_addr(name, encoding, ignored,
+			    xfail_beyond_mapped_ram, 
 			    (1ul << cpuid_maxphyaddr()) - PAGE_SIZE);
-	test_vmcs_page_addr(name, encoding, ignored, -1ul);
+	test_vmcs_page_addr(name, encoding, ignored,
+			    xfail_beyond_mapped_ram,
+			    -1ul);
 
 	vmcs_write(encoding, orig_val);
 }
@@ -3394,7 +3409,8 @@ static void test_vmcs_page_values(const char *name,
  */
 static void test_vmcs_page_reference(u32 control_bit, enum Encoding field,
 				     const char *field_name,
-				     const char *control_name)
+				     const char *control_name,
+				     bool xfail_beyond_mapped_ram)
 {
 	u32 primary = vmcs_read(CPU_EXEC_CTRL0);
 	u64 page_addr;
@@ -3406,12 +3422,12 @@ static void test_vmcs_page_reference(u32 control_bit, enum Encoding field,
 
 	report_prefix_pushf("%s enabled", control_name);
 	vmcs_write(CPU_EXEC_CTRL0, primary | control_bit);
-	test_vmcs_page_values(field_name, field, false);
+	test_vmcs_page_values(field_name, field, false, xfail_beyond_mapped_ram);
 	report_prefix_pop();
 
 	report_prefix_pushf("%s disabled", control_name);
 	vmcs_write(CPU_EXEC_CTRL0, primary & ~control_bit);
-	test_vmcs_page_values(field_name, field, true);
+	test_vmcs_page_values(field_name, field, true, false);
 	report_prefix_pop();
 
 	vmcs_write(field, page_addr);
@@ -3427,9 +3443,9 @@ static void test_vmcs_page_reference(u32 control_bit, enum Encoding field,
 static void test_io_bitmaps(void)
 {
 	test_vmcs_page_reference(CPU_IO_BITMAP, IO_BITMAP_A,
-				 "I/O bitmap A", "Use I/O bitmaps");
+				 "I/O bitmap A", "Use I/O bitmaps", false);
 	test_vmcs_page_reference(CPU_IO_BITMAP, IO_BITMAP_B,
-				 "I/O bitmap B", "Use I/O bitmaps");
+				 "I/O bitmap B", "Use I/O bitmaps", false);
 }
 
 /*
@@ -3441,7 +3457,7 @@ static void test_io_bitmaps(void)
 static void test_msr_bitmap(void)
 {
 	test_vmcs_page_reference(CPU_MSR_BITMAP, MSR_BITMAP,
-				 "MSR bitmap", "Use MSR bitmaps");
+				 "MSR bitmap", "Use MSR bitmaps", false);
 }
 
 /*
@@ -3455,7 +3471,7 @@ static void test_msr_bitmap(void)
 static void test_apic_virt_addr(void)
 {
 	test_vmcs_page_reference(CPU_TPR_SHADOW, APIC_VIRT_ADDR,
-				 "virtual-APIC address", "Use TPR shadow");
+				 "virtual-APIC address", "Use TPR shadow", true);
 }
 
 static void try_tpr_threshold(unsigned val)
@@ -3468,7 +3484,7 @@ static void try_tpr_threshold(unsigned val)
 		valid = !(val >> 4);
 	report_prefix_pushf("TPR threshold 0x%x", val);
 	vmcs_write(TPR_THRESHOLD, val);
-	test_vmx_controls(valid);
+	test_vmx_controls(valid, false);
 	report_prefix_pop();
 }
 

Paolo

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [kvm-unit-tests PATCH 1/2] x86: Skip some VMX control field tests
  2017-09-15 16:40   ` Paolo Bonzini
@ 2017-09-15 17:49     ` Jim Mattson
  2017-09-17 12:28     ` Andrew Jones
  1 sibling, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2017-09-15 17:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm list

Yes, that works for me. An alternative I've toyed with is modifying
nested_get_vmcs12_pages to generate a VM-exit for "illegal guest
state" rather than a VM-instruction failure for "VM entry with invalid
control field(s)" when kvm_vcpu_gpa_to_page(vcpu,
vmcs12->virtual_apic_page_addr) returns an error page. Still wrong,
but the kvm-unit-test isn't yet smart enough to notice the error.

On Fri, Sep 15, 2017 at 9:40 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 15/09/2017 01:31, Jim Mattson wrote:
>> When the APIC virtualization address references unbacked memory, kvm
>> incorrectly fails a VM-entry with "invalid control field(s)." Until
>> this can be fixed, just skip the VMX control field tests that populate
>> a VMCS field with a physical address that isn't backed.
>>
>> Tested: <multi line explanation of how the code was tested>
>>
>> Effort: <Your effort group>
>> Google-Bug-Id: <buganizer-id>
>> Change-Id: I624091cb5cd7cfaae887de8f017e18b5f95d8f61
>> Signed-off-by: Jim Mattson <jmattson@google.com>
>> ---
>>  x86/vmx_tests.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
>> index 088d32aaee0b..e82bc04d72e2 100644
>> --- a/x86/vmx_tests.c
>> +++ b/x86/vmx_tests.c
>> @@ -3359,6 +3359,12 @@ static void test_vmcs_page_addr(const char *name,
>>                               bool ignored,
>>                               u64 addr)
>>  {
>> +     /*
>> +      * Skip tests that reference unbacked memory for now, because
>> +      * kvm doesn't always handle this situation correctly.
>
> Since in practice it's only 16 tests that fail, it would be nice to only
> XFAIL those.  The following is a bit intrusive, but it works.  What do
> you think?
>
> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> index 729e85f..13b45dd 100644
> --- a/x86/vmx_tests.c
> +++ b/x86/vmx_tests.c
> @@ -3177,14 +3177,14 @@ success:
>  /*
>   * Try to launch the current VMCS.
>   */
> -static void test_vmx_controls(bool controls_valid)
> +static void test_vmx_controls(bool controls_valid, bool xfail)
>  {
>         bool success = vmlaunch_succeeds();
>         u32 vmx_inst_err;
>
> -       report("vmlaunch %s", success == controls_valid,
> -              controls_valid ? "succeeds" : "fails");
> -       if (!controls_valid) {
> +       report_xfail("vmlaunch %s", xfail, success == controls_valid,
> +                    controls_valid ? "succeeds" : "fails");
> +       if (!success) {
>                 vmx_inst_err = vmcs_read(VMX_INST_ERROR);
>                 report("VMX inst error is %d (actual %d)",
>                        vmx_inst_err == VMXERR_ENTRY_INVALID_CONTROL_FIELD,
> @@ -3226,7 +3226,7 @@ static void test_rsvd_ctl_bit_value(const char *name, union vmx_ctrl_msr msr,
>                 vmcs_write(encoding, msr.set & ~mask);
>                 expected = !(msr.set & mask);
>         }
> -       test_vmx_controls(expected);
> +       test_vmx_controls(expected, false);
>         vmcs_write(encoding, controls);
>         report_prefix_pop();
>  }
> @@ -3322,7 +3322,7 @@ static void try_cr3_target_count(unsigned i, unsigned max)
>  {
>         report_prefix_pushf("CR3 target count 0x%x", i);
>         vmcs_write(CR3_TARGET_COUNT, i);
> -       test_vmx_controls(i <= max);
> +       test_vmx_controls(i <= max, false);
>         report_prefix_pop();
>  }
>
> @@ -3357,13 +3357,21 @@ static void test_cr3_targets(void)
>  static void test_vmcs_page_addr(const char *name,
>                                 enum Encoding encoding,
>                                 bool ignored,
> +                               bool xfail_beyond_mapped_ram,
>                                 u64 addr)
>  {
> +       bool xfail =
> +               (xfail_beyond_mapped_ram &&
> +                addr > fwcfg_get_u64(FW_CFG_RAM_SIZE) - PAGE_SIZE &&
> +                addr < (1ul << cpuid_maxphyaddr()));
> +
>         report_prefix_pushf("%s = %lx", name, addr);
>         vmcs_write(encoding, addr);
>         test_vmx_controls(ignored || (IS_ALIGNED(addr, PAGE_SIZE) &&
> -                                 addr < (1ul << cpuid_maxphyaddr())));
> +                                 addr < (1ul << cpuid_maxphyaddr())),
> +                         xfail);
>         report_prefix_pop();
> +       xfail = false;
>  }
>
>  /*
> @@ -3371,19 +3379,26 @@ static void test_vmcs_page_addr(const char *name,
>   */
>  static void test_vmcs_page_values(const char *name,
>                                   enum Encoding encoding,
> -                                 bool ignored)
> +                                 bool ignored,
> +                                 bool xfail_beyond_mapped_ram)
>  {
>         unsigned i;
>         u64 orig_val = vmcs_read(encoding);
>
>         for (i = 0; i < 64; i++)
> -               test_vmcs_page_addr(name, encoding, ignored, 1ul << i);
> +               test_vmcs_page_addr(name, encoding, ignored,
> +                                   xfail_beyond_mapped_ram, 1ul << i);
>
> -       test_vmcs_page_addr(name, encoding, ignored, PAGE_SIZE - 1);
> -       test_vmcs_page_addr(name, encoding, ignored, PAGE_SIZE);
> +       test_vmcs_page_addr(name, encoding, ignored,
> +                           xfail_beyond_mapped_ram, PAGE_SIZE - 1);
> +       test_vmcs_page_addr(name, encoding, ignored,
> +                           xfail_beyond_mapped_ram, PAGE_SIZE);
>         test_vmcs_page_addr(name, encoding, ignored,
> +                           xfail_beyond_mapped_ram,
>                             (1ul << cpuid_maxphyaddr()) - PAGE_SIZE);
> -       test_vmcs_page_addr(name, encoding, ignored, -1ul);
> +       test_vmcs_page_addr(name, encoding, ignored,
> +                           xfail_beyond_mapped_ram,
> +                           -1ul);
>
>         vmcs_write(encoding, orig_val);
>  }
> @@ -3394,7 +3409,8 @@ static void test_vmcs_page_values(const char *name,
>   */
>  static void test_vmcs_page_reference(u32 control_bit, enum Encoding field,
>                                      const char *field_name,
> -                                    const char *control_name)
> +                                    const char *control_name,
> +                                    bool xfail_beyond_mapped_ram)
>  {
>         u32 primary = vmcs_read(CPU_EXEC_CTRL0);
>         u64 page_addr;
> @@ -3406,12 +3422,12 @@ static void test_vmcs_page_reference(u32 control_bit, enum Encoding field,
>
>         report_prefix_pushf("%s enabled", control_name);
>         vmcs_write(CPU_EXEC_CTRL0, primary | control_bit);
> -       test_vmcs_page_values(field_name, field, false);
> +       test_vmcs_page_values(field_name, field, false, xfail_beyond_mapped_ram);
>         report_prefix_pop();
>
>         report_prefix_pushf("%s disabled", control_name);
>         vmcs_write(CPU_EXEC_CTRL0, primary & ~control_bit);
> -       test_vmcs_page_values(field_name, field, true);
> +       test_vmcs_page_values(field_name, field, true, false);
>         report_prefix_pop();
>
>         vmcs_write(field, page_addr);
> @@ -3427,9 +3443,9 @@ static void test_vmcs_page_reference(u32 control_bit, enum Encoding field,
>  static void test_io_bitmaps(void)
>  {
>         test_vmcs_page_reference(CPU_IO_BITMAP, IO_BITMAP_A,
> -                                "I/O bitmap A", "Use I/O bitmaps");
> +                                "I/O bitmap A", "Use I/O bitmaps", false);
>         test_vmcs_page_reference(CPU_IO_BITMAP, IO_BITMAP_B,
> -                                "I/O bitmap B", "Use I/O bitmaps");
> +                                "I/O bitmap B", "Use I/O bitmaps", false);
>  }
>
>  /*
> @@ -3441,7 +3457,7 @@ static void test_io_bitmaps(void)
>  static void test_msr_bitmap(void)
>  {
>         test_vmcs_page_reference(CPU_MSR_BITMAP, MSR_BITMAP,
> -                                "MSR bitmap", "Use MSR bitmaps");
> +                                "MSR bitmap", "Use MSR bitmaps", false);
>  }
>
>  /*
> @@ -3455,7 +3471,7 @@ static void test_msr_bitmap(void)
>  static void test_apic_virt_addr(void)
>  {
>         test_vmcs_page_reference(CPU_TPR_SHADOW, APIC_VIRT_ADDR,
> -                                "virtual-APIC address", "Use TPR shadow");
> +                                "virtual-APIC address", "Use TPR shadow", true);
>  }
>
>  static void try_tpr_threshold(unsigned val)
> @@ -3468,7 +3484,7 @@ static void try_tpr_threshold(unsigned val)
>                 valid = !(val >> 4);
>         report_prefix_pushf("TPR threshold 0x%x", val);
>         vmcs_write(TPR_THRESHOLD, val);
> -       test_vmx_controls(valid);
> +       test_vmx_controls(valid, false);
>         report_prefix_pop();
>  }
>
>
> Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [kvm-unit-tests PATCH 1/2] x86: Skip some VMX control field tests
  2017-09-15 16:40   ` Paolo Bonzini
  2017-09-15 17:49     ` Jim Mattson
@ 2017-09-17 12:28     ` Andrew Jones
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew Jones @ 2017-09-17 12:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jim Mattson, kvm

On Fri, Sep 15, 2017 at 06:40:40PM +0200, Paolo Bonzini wrote:
> On 15/09/2017 01:31, Jim Mattson wrote:
> > When the APIC virtualization address references unbacked memory, kvm
> > incorrectly fails a VM-entry with "invalid control field(s)." Until
> > this can be fixed, just skip the VMX control field tests that populate
> > a VMCS field with a physical address that isn't backed.
> > 
> > Tested: <multi line explanation of how the code was tested>
> > 
> > Effort: <Your effort group>
> > Google-Bug-Id: <buganizer-id>
> > Change-Id: I624091cb5cd7cfaae887de8f017e18b5f95d8f61

Nice Google commit message template :-)

> > Signed-off-by: Jim Mattson <jmattson@google.com>
> > ---
> >  x86/vmx_tests.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> > index 088d32aaee0b..e82bc04d72e2 100644
> > --- a/x86/vmx_tests.c
> > +++ b/x86/vmx_tests.c
> > @@ -3359,6 +3359,12 @@ static void test_vmcs_page_addr(const char *name,
> >  				bool ignored,
> >  				u64 addr)
> >  {
> > +	/*
> > +	 * Skip tests that reference unbacked memory for now, because
> > +	 * kvm doesn't always handle this situation correctly.
> 
> Since in practice it's only 16 tests that fail, it would be nice to only 
> XFAIL those.  The following is a bit intrusive, but it works.  What do 
> you think?
>

A nice thing about using xfail=true|false now is that it may later be
assigned to !ERRATA(...) in order to start expecting the tests to pass
when run on hosts that have the fix.

drew

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-09-17 12:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-14 23:31 [PATCH 1/3] kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry Jim Mattson
2017-09-14 23:31 ` [PATCH 2/3] kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly Jim Mattson
2017-09-14 23:31 ` [PATCH 3/3] kvm: nVMX: Handle deferred early " Jim Mattson
2017-09-14 23:31 ` [kvm-unit-tests PATCH 1/2] x86: Skip some VMX control field tests Jim Mattson
2017-09-15 16:40   ` Paolo Bonzini
2017-09-15 17:49     ` Jim Mattson
2017-09-17 12:28     ` Andrew Jones
2017-09-14 23:31 ` [kvm-unit-tests PATCH 2/2] x86: Add test for TPR threshold check on VM-entry Jim Mattson
