* [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE
@ 2024-05-18 0:04 Sean Christopherson
2024-05-18 0:04 ` [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Sean Christopherson
` (9 more replies)
0 siblings, 10 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Fixes and debug help for "prove #VE" support. I'm not in love with the sanity
check implementation, but I also don't love the idea of plumbing in @kvm to
the low-level SPTE helpers.
Not super well tested, but I wanted to get this posted ASAP in case someone
wants to debug the unexpected #VEs we're seeing.
Note, Isaku's patch needs his SoB.
Isaku Yamahata (1):
KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU
Sean Christopherson (8):
KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE
support
KVM: nVMX: Always handle #VEs in L0 (never forward #VEs from L2 to L1)
KVM: x86/mmu: Add sanity checks that KVM doesn't create EPT #VE SPTEs
KVM: VMX: Dump VMCS on unexpected #VE
KVM: x86/mmu: Print SPTEs on unexpected #VE
KVM: VMX: Don't kill the VM on an unexpected #VE
KVM: VMX: Enumerate EPT Violation #VE support in /proc/cpuinfo
KVM: x86: Disable KVM_INTEL_PROVE_VE by default
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/asm/vmxfeatures.h | 2 +-
arch/x86/kvm/Kconfig | 6 ++--
arch/x86/kvm/mmu/mmu.c | 45 ++++++++++++++++++++++++------
arch/x86/kvm/mmu/spte.h | 9 ++++++
arch/x86/kvm/mmu/tdp_iter.h | 2 ++
arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
arch/x86/kvm/vmx/nested.c | 5 ++++
arch/x86/kvm/vmx/vmx.c | 11 ++++++--
9 files changed, 67 insertions(+), 16 deletions(-)
base-commit: 4aad0b1893a141f114ba40ed509066f3c9bc24b0
--
2.45.0.215.g3402c0e53f-goog
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-20 12:38 ` Huang, Kai
2024-05-21 7:21 ` Isaku Yamahata
2024-05-18 0:04 ` [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support Sean Christopherson
` (8 subsequent siblings)
9 siblings, 2 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
From: Isaku Yamahata <isaku.yamahata@intel.com>
Use SHADOW_NONPRESENT_VALUE when zapping TDP MMU SPTEs with mmu_lock held
for read; tdp_mmu_zap_spte_atomic() was simply missed during the initial
development.
Fixes: 7f01cab84928 ("KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE")
Not-yet-signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
[sean: write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 1259dd63defc..36539c1b36cd 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -626,7 +626,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
* SPTEs.
*/
handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
- 0, iter->level, true);
+ SHADOW_NONPRESENT_VALUE, iter->level, true);
return 0;
}
--
2.45.0.215.g3402c0e53f-goog
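For readers outside the KVM tree, the invariant this patch restores can be
sketched in user-space C. When EPT Violation #VE is enabled, a non-present
SPTE must keep the suppress-#VE bit set so an EPT violation on it exits to
the host instead of raising a #VE in the guest; zapping to literal 0 drops
that bit. The constants below are illustrative stand-ins for the kernel's
definitions, not authoritative values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of an EPT PTE suppresses #VE generation (illustrative). */
#define VMX_EPT_SUPPRESS_VE_BIT (1ULL << 63)

/*
 * Non-present SPTEs must carry the suppress-#VE bit so that EPT
 * violations on them exit to the host rather than becoming a #VE.
 */
#define SHADOW_NONPRESENT_VALUE VMX_EPT_SUPPRESS_VE_BIT

/* A zap value is safe iff it cannot generate an unexpected #VE. */
static bool zap_value_is_safe(uint64_t spte)
{
    return (spte & VMX_EPT_SUPPRESS_VE_BIT) != 0;
}
```

Under this model, the pre-fix zap value of 0 is exactly the unsafe case.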
* [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
2024-05-18 0:04 ` [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-20 23:09 ` Huang, Kai
2024-05-18 0:04 ` [PATCH 3/9] KVM: nVMX: Always handle #VEs in L0 (never forward #VEs from L2 to L1) Sean Christopherson
` (7 subsequent siblings)
9 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
enabled and a VE info address pointing at pfn 0.
Fixes: 8131cf5b4fd8 ("KVM: VMX: Introduce test mode related to EPT violation VE")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d5b832126e34..6798fadaa335 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2242,6 +2242,9 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
vmcs_write64(EPT_POINTER,
construct_eptp(&vmx->vcpu, 0, PT64_ROOT_4LEVEL));
+ if (vmx->ve_info)
+ vmcs_write64(VE_INFORMATION_ADDRESS, __pa(vmx->ve_info));
+
/* All VMFUNCs are currently emulated through L0 vmexits. */
if (cpu_has_vmx_vmfunc())
vmcs_write64(VM_FUNCTION_CONTROL, 0);
--
2.45.0.215.g3402c0e53f-goog
* [PATCH 3/9] KVM: nVMX: Always handle #VEs in L0 (never forward #VEs from L2 to L1)
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
2024-05-18 0:04 ` [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Sean Christopherson
2024-05-18 0:04 ` [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 4/9] KVM: x86/mmu: Add sanity checks that KVM doesn't create EPT #VE SPTEs Sean Christopherson
` (6 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
are KVM's responsibility.
Fixes: 8131cf5b4fd8 ("KVM: VMX: Introduce test mode related to EPT violation VE")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6798fadaa335..643935a0f70a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6233,6 +6233,8 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
else if (is_alignment_check(intr_info) &&
!vmx_guest_inject_ac(vcpu))
return true;
+ else if (is_ve_fault(intr_info))
+ return true;
return false;
case EXIT_REASON_EXTERNAL_INTERRUPT:
return true;
--
2.45.0.215.g3402c0e53f-goog
* [PATCH 4/9] KVM: x86/mmu: Add sanity checks that KVM doesn't create EPT #VE SPTEs
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (2 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 3/9] KVM: nVMX: Always handle #VEs in L0 (never forward #VEs from L2 to L1) Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 5/9] KVM: VMX: Dump VMCS on unexpected #VE Sean Christopherson
` (5 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Assert that KVM doesn't set a SPTE to a value that could trigger an EPT
Violation #VE on a non-MMIO SPTE, e.g. to help detect bugs even without
KVM_INTEL_PROVE_VE enabled, and to help debug actual #VE failures.
Note, this will run afoul of TDX support, which needs to reflect emulated
MMIO accesses into the guest as #VEs (which was the whole point of adding
EPT Violation #VE support in KVM). The obvious fix for that is to exempt
MMIO SPTEs, but that's annoyingly difficult now that is_mmio_spte() relies
on a per-VM value. However, resolving that conundrum is a future problem,
whereas getting KVM_INTEL_PROVE_VE healthy is a current problem.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 3 +++
arch/x86/kvm/mmu/spte.h | 9 +++++++++
arch/x86/kvm/mmu/tdp_iter.h | 2 ++
3 files changed, 14 insertions(+)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5095fb46713e..d2af077d8b34 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -336,16 +336,19 @@ static int is_cpuid_PSE36(void)
#ifdef CONFIG_X86_64
static void __set_spte(u64 *sptep, u64 spte)
{
+ KVM_MMU_WARN_ON(is_ept_ve_possible(spte));
WRITE_ONCE(*sptep, spte);
}
static void __update_clear_spte_fast(u64 *sptep, u64 spte)
{
+ KVM_MMU_WARN_ON(is_ept_ve_possible(spte));
WRITE_ONCE(*sptep, spte);
}
static u64 __update_clear_spte_slow(u64 *sptep, u64 spte)
{
+ KVM_MMU_WARN_ON(is_ept_ve_possible(spte));
return xchg(sptep, spte);
}
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 5dd5405fa07a..52fa004a1fbc 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -3,6 +3,8 @@
#ifndef KVM_X86_MMU_SPTE_H
#define KVM_X86_MMU_SPTE_H
+#include <asm/vmx.h>
+
#include "mmu.h"
#include "mmu_internal.h"
@@ -276,6 +278,13 @@ static inline bool is_shadow_present_pte(u64 pte)
return !!(pte & SPTE_MMU_PRESENT_MASK);
}
+static inline bool is_ept_ve_possible(u64 spte)
+{
+ return (shadow_present_mask & VMX_EPT_SUPPRESS_VE_BIT) &&
+ !(spte & VMX_EPT_SUPPRESS_VE_BIT) &&
+ (spte & VMX_EPT_RWX_MASK) != VMX_EPT_MISCONFIG_WX_VALUE;
+}
+
/*
* Returns true if A/D bits are supported in hardware and are enabled by KVM.
* When enabled, KVM uses A/D bits for all non-nested MMUs. Because L1 can
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index fae559559a80..2880fd392e0c 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -21,11 +21,13 @@ static inline u64 kvm_tdp_mmu_read_spte(tdp_ptep_t sptep)
static inline u64 kvm_tdp_mmu_write_spte_atomic(tdp_ptep_t sptep, u64 new_spte)
{
+ KVM_MMU_WARN_ON(is_ept_ve_possible(new_spte));
return xchg(rcu_dereference(sptep), new_spte);
}
static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
{
+ KVM_MMU_WARN_ON(is_ept_ve_possible(new_spte));
WRITE_ONCE(*rcu_dereference(sptep), new_spte);
}
--
2.45.0.215.g3402c0e53f-goog
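The three conditions in the new helper can be exercised in isolation. The
sketch below mirrors is_ept_ve_possible() with illustrative constants (the
real values live in asm/vmx.h), assuming bit 63 is suppress-#VE and that a
write+exec-without-read entry is an EPT misconfig, which always exits and
therefore cannot become a #VE:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VMX_EPT_SUPPRESS_VE_BIT    (1ULL << 63)
#define VMX_EPT_RWX_MASK           0x7ULL /* read | write | exec */
#define VMX_EPT_MISCONFIG_WX_VALUE 0x6ULL /* write+exec w/o read */

/* Models shadow_present_mask on a host with #VE support enabled. */
static const uint64_t shadow_present_mask = VMX_EPT_SUPPRESS_VE_BIT;

/*
 * A SPTE can generate a #VE only if #VE is enabled at all, the SPTE
 * does not suppress #VE, and an access would be an EPT violation
 * rather than an EPT misconfig.
 */
static bool is_ept_ve_possible(uint64_t spte)
{
    return (shadow_present_mask & VMX_EPT_SUPPRESS_VE_BIT) &&
           !(spte & VMX_EPT_SUPPRESS_VE_BIT) &&
           (spte & VMX_EPT_RWX_MASK) != VMX_EPT_MISCONFIG_WX_VALUE;
}
```

A bare-zero SPTE trips the check; a SPTE with suppress-#VE set, or a
misconfigured W+X entry, does not.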
* [PATCH 5/9] KVM: VMX: Dump VMCS on unexpected #VE
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (3 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 4/9] KVM: x86/mmu: Add sanity checks that KVM doesn't create EPT #VE SPTEs Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 6/9] KVM: x86/mmu: Print SPTEs " Sean Christopherson
` (4 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Dump the VMCS on an unexpected #VE, otherwise it's practically impossible
to figure out why the #VE occurred.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 51b2cd13250a..0c68643d982b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5217,8 +5217,10 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
if (is_invalid_opcode(intr_info))
return handle_ud(vcpu);
- if (KVM_BUG_ON(is_ve_fault(intr_info), vcpu->kvm))
+ if (KVM_BUG_ON(is_ve_fault(intr_info), vcpu->kvm)) {
+ dump_vmcs(vcpu);
return -EIO;
+ }
error_code = 0;
if (intr_info & INTR_INFO_DELIVER_CODE_MASK)
--
2.45.0.215.g3402c0e53f-goog
* [PATCH 6/9] KVM: x86/mmu: Print SPTEs on unexpected #VE
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (4 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 5/9] KVM: VMX: Dump VMCS on unexpected #VE Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 7/9] KVM: VMX: Don't kill the VM on an " Sean Christopherson
` (3 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Print the SPTEs that correspond to the faulting GPA on an unexpected EPT
Violation #VE to help the user debug failures, e.g. to pinpoint which SPTE
didn't have SUPPRESS_VE set.
Opportunistically assert that the underlying exit reason was indeed an EPT
Violation, as the CPU has *really* gone off the rails if a #VE occurs due
to a completely unexpected exit reason.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu/mmu.c | 42 ++++++++++++++++++++++++++-------
arch/x86/kvm/vmx/vmx.c | 5 ++++
3 files changed, 39 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index aabf1648a56a..9bb2e164c523 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2159,6 +2159,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
void *insn, int insn_len);
+void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);
void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva);
void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
u64 addr, unsigned long roots);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d2af077d8b34..f2c9580d9588 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4124,6 +4124,22 @@ static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level
return leaf;
}
+static int get_sptes_lockless(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+ int *root_level)
+{
+ int leaf;
+
+ walk_shadow_page_lockless_begin(vcpu);
+
+ if (is_tdp_mmu_active(vcpu))
+ leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, root_level);
+ else
+ leaf = get_walk(vcpu, addr, sptes, root_level);
+
+ walk_shadow_page_lockless_end(vcpu);
+ return leaf;
+}
+
/* return true if reserved bit(s) are detected on a valid, non-MMIO SPTE. */
static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
{
@@ -4132,15 +4148,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
int root, leaf, level;
bool reserved = false;
- walk_shadow_page_lockless_begin(vcpu);
-
- if (is_tdp_mmu_active(vcpu))
- leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
- else
- leaf = get_walk(vcpu, addr, sptes, &root);
-
- walk_shadow_page_lockless_end(vcpu);
-
+ leaf = get_sptes_lockless(vcpu, addr, sptes, &root);
if (unlikely(leaf < 0)) {
*sptep = 0ull;
return reserved;
@@ -5963,6 +5971,22 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
}
EXPORT_SYMBOL_GPL(kvm_mmu_page_fault);
+void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg)
+{
+ u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
+ int root_level, leaf, level;
+
+ leaf = get_sptes_lockless(vcpu, gpa, sptes, &root_level);
+ if (unlikely(leaf < 0))
+ return;
+
+ pr_err("%s %llx", msg, gpa);
+ for (level = root_level; level >= leaf; level--)
+ pr_cont(", spte[%d] = 0x%llx", level, sptes[level]);
+ pr_cont("\n");
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_print_sptes);
+
static void __kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
u64 addr, hpa_t root_hpa)
{
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0c68643d982b..2a3fce61c785 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5218,7 +5218,12 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
return handle_ud(vcpu);
if (KVM_BUG_ON(is_ve_fault(intr_info), vcpu->kvm)) {
+ struct vmx_ve_information *ve_info = vmx->ve_info;
+
+ WARN_ONCE(ve_info->exit_reason != EXIT_REASON_EPT_VIOLATION,
+ "Unexpected #VE on VM-Exit reason 0x%x", ve_info->exit_reason);
dump_vmcs(vcpu);
+ kvm_mmu_print_sptes(vcpu, ve_info->guest_physical_address, "#VE");
return -EIO;
}
--
2.45.0.215.g3402c0e53f-goog
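The print helper walks from the root level down to the last level the
lockless walk populated. A user-space sketch of the formatting loop, with
the SPTE walk itself stubbed out and the buffer-based output standing in
for pr_err()/pr_cont():

```c
#include <stdint.h>
#include <stdio.h>

#define PT64_ROOT_MAX_LEVEL 5

/*
 * Format a "#VE <gpa>, spte[N] = ..." line the way
 * kvm_mmu_print_sptes() prints it, from root_level down to the
 * leaf-most level the walk filled in.
 */
static void format_sptes(char *buf, size_t len, uint64_t gpa,
                         const uint64_t sptes[PT64_ROOT_MAX_LEVEL + 1],
                         int root_level, int leaf)
{
    int n = snprintf(buf, len, "#VE %llx", (unsigned long long)gpa);

    for (int level = root_level; level >= leaf && n < (int)len; level--)
        n += snprintf(buf + n, len - n, ", spte[%d] = 0x%llx",
                      level, (unsigned long long)sptes[level]);
}
```

This makes it easy to see the output shape: one line per #VE, with one
SPTE per paging level, which is what pinpoints the entry missing
SUPPRESS_VE.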
* [PATCH 7/9] KVM: VMX: Don't kill the VM on an unexpected #VE
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (5 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 6/9] KVM: x86/mmu: Print SPTEs " Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 8/9] KVM: VMX: Enumerate EPT Violation #VE support in /proc/cpuinfo Sean Christopherson
` (2 subsequent siblings)
9 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Don't terminate the VM on an unexpected #VE, as it's extremely unlikely
the #VE is fatal to the guest, and even less likely that it presents a
danger to the host. Simply resume the guest on "failure", as the #VE info
page's BUSY field will prevent converting any more EPT Violations to #VEs
for the vCPU (at least, that's what the BUSY field is supposed to do).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2a3fce61c785..58832aae2248 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5217,14 +5217,14 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
if (is_invalid_opcode(intr_info))
return handle_ud(vcpu);
- if (KVM_BUG_ON(is_ve_fault(intr_info), vcpu->kvm)) {
+ if (WARN_ON_ONCE(is_ve_fault(intr_info))) {
struct vmx_ve_information *ve_info = vmx->ve_info;
WARN_ONCE(ve_info->exit_reason != EXIT_REASON_EPT_VIOLATION,
"Unexpected #VE on VM-Exit reason 0x%x", ve_info->exit_reason);
dump_vmcs(vcpu);
kvm_mmu_print_sptes(vcpu, ve_info->guest_physical_address, "#VE");
- return -EIO;
+ return 1;
}
error_code = 0;
--
2.45.0.215.g3402c0e53f-goog
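The "resume and rely on BUSY" reasoning leans on the #VE information area:
the CPU marks the area busy when it delivers a #VE, and will not convert
further EPT violations to #VEs for that vCPU until software clears the
field. A sketch of that gate, with a layout modeled loosely on the
kernel's struct vmx_ve_information (field names and the exact busy value
here are illustrative):

```c
#include <stdbool.h>
#include <stdint.h>

/* Modeled on struct vmx_ve_information; illustrative layout. */
struct ve_info {
    uint32_t exit_reason;
    uint32_t delivery;   /* set by the CPU on #VE delivery ("busy") */
    uint64_t exit_qualification;
    uint64_t guest_linear_address;
    uint64_t guest_physical_address;
    uint16_t eptp_index;
};

/* The CPU mutates an EPT violation into a #VE only when not busy. */
static bool cpu_would_inject_ve(const struct ve_info *ve)
{
    return ve->delivery == 0;
}

/* Model #VE delivery: record the reason and mark the area busy. */
static void deliver_ve(struct ve_info *ve, uint32_t reason)
{
    ve->exit_reason = reason;
    ve->delivery = 0xFFFFFFFFu;
}
```

Because KVM never clears the busy field on this path, resuming the guest
after the first unexpected #VE should self-limit further #VEs on that
vCPU, at least when the hardware honors the field.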
* [PATCH 8/9] KVM: VMX: Enumerate EPT Violation #VE support in /proc/cpuinfo
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (6 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 7/9] KVM: VMX: Don't kill the VM on an " Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default Sean Christopherson
2024-05-23 16:41 ` [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Paolo Bonzini
9 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Don't suppress printing EPT_VIOLATION_VE in /proc/cpuinfo; knowing whether
or not KVM_INTEL_PROVE_VE actually does anything is extremely valuable.
A privileged user can get at the information by reading the raw MSR, but
the whole point of the VMX flags is to avoid needing to glean information
from raw MSR reads.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/vmxfeatures.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
index 266daf5b5b84..695f36664889 100644
--- a/arch/x86/include/asm/vmxfeatures.h
+++ b/arch/x86/include/asm/vmxfeatures.h
@@ -77,7 +77,7 @@
#define VMX_FEATURE_ENCLS_EXITING ( 2*32+ 15) /* "" VM-Exit on ENCLS (leaf dependent) */
#define VMX_FEATURE_RDSEED_EXITING ( 2*32+ 16) /* "" VM-Exit on RDSEED */
#define VMX_FEATURE_PAGE_MOD_LOGGING ( 2*32+ 17) /* "pml" Log dirty pages into buffer */
-#define VMX_FEATURE_EPT_VIOLATION_VE ( 2*32+ 18) /* "" Conditionally reflect EPT violations as #VE exceptions */
+#define VMX_FEATURE_EPT_VIOLATION_VE ( 2*32+ 18) /* Conditionally reflect EPT violations as #VE exceptions */
#define VMX_FEATURE_PT_CONCEAL_VMX ( 2*32+ 19) /* "" Suppress VMX indicators in Processor Trace */
#define VMX_FEATURE_XSAVES ( 2*32+ 20) /* "" Enable XSAVES and XRSTORS in guest */
#define VMX_FEATURE_MODE_BASED_EPT_EXEC ( 2*32+ 22) /* "ept_mode_based_exec" Enable separate EPT EXEC bits for supervisor vs. user */
--
2.45.0.215.g3402c0e53f-goog
* [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (7 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 8/9] KVM: VMX: Enumerate EPT Violation #VE support in /proc/cpuinfo Sean Christopherson
@ 2024-05-18 0:04 ` Sean Christopherson
2024-05-21 17:36 ` Paolo Bonzini
2024-05-23 16:41 ` [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Paolo Bonzini
9 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-05-18 0:04 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
Disable KVM's "prove #VE" support by default, as it provides no functional
value, and even its sanity checking benefits are relatively limited. I.e.
it should be fully opt-in even on debug kernels, especially since EPT
Violation #VE suppression appears to be buggy on some CPUs.
Opportunistically add a line in the help text to make it abundantly clear
that KVM_INTEL_PROVE_VE should never be enabled in a production
environment.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/Kconfig | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 2a7f69abcac3..3468efc4be55 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -97,15 +97,15 @@ config KVM_INTEL
config KVM_INTEL_PROVE_VE
bool "Check that guests do not receive #VE exceptions"
- default KVM_PROVE_MMU || DEBUG_KERNEL
- depends on KVM_INTEL
+ depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
help
-
Checks that KVM's page table management code will not incorrectly
let guests receive a virtualization exception. Virtualization
exceptions will be trapped by the hypervisor rather than injected
in the guest.
+ This should never be enabled in a production environment.
+
If unsure, say N.
config X86_SGX_KVM
--
2.45.0.215.g3402c0e53f-goog
* Re: [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU
2024-05-18 0:04 ` [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Sean Christopherson
@ 2024-05-20 12:38 ` Huang, Kai
2024-05-21 7:21 ` Isaku Yamahata
1 sibling, 0 replies; 23+ messages in thread
From: Huang, Kai @ 2024-05-20 12:38 UTC (permalink / raw)
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
On Fri, 2024-05-17 at 17:04 -0700, Sean Christopherson wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> Use SHADOW_NONPRESENT_VALUE when zapping TDP MMU SPTEs with mmu_lock held
> for read; tdp_mmu_zap_spte_atomic() was simply missed during the initial
> development.
>
> Fixes: 7f01cab84928 ("KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE")
> Not-yet-signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> [sean: write changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 1259dd63defc..36539c1b36cd 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -626,7 +626,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
> * SPTEs.
> */
> handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
> - 0, iter->level, true);
> + SHADOW_NONPRESENT_VALUE, iter->level, true);
>
> return 0;
> }
Reviewed-by: Kai Huang <kai.huang@intel.com>
* Re: [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-18 0:04 ` [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support Sean Christopherson
@ 2024-05-20 23:09 ` Huang, Kai
2024-05-20 23:22 ` Sean Christopherson
0 siblings, 1 reply; 23+ messages in thread
From: Huang, Kai @ 2024-05-20 23:09 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel
On 18/05/2024 12:04 pm, Sean Christopherson wrote:
> Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
> initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
> enabled and a VE info address pointing at pfn 0.
How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
unconditionally for vmcs02? Your next patch says:
"
Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
are KVM's responsibility.
"
>
> Fixes: 8131cf5b4fd8 ("KVM: VMX: Introduce test mode related to EPT violation VE")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/vmx/nested.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index d5b832126e34..6798fadaa335 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2242,6 +2242,9 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
> vmcs_write64(EPT_POINTER,
> construct_eptp(&vmx->vcpu, 0, PT64_ROOT_4LEVEL));
>
> + if (vmx->ve_info)
> + vmcs_write64(VE_INFORMATION_ADDRESS, __pa(vmx->ve_info));
> +
> /* All VMFUNCs are currently emulated through L0 vmexits. */
> if (cpu_has_vmx_vmfunc())
> vmcs_write64(VM_FUNCTION_CONTROL, 0);
* Re: [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-20 23:09 ` Huang, Kai
@ 2024-05-20 23:22 ` Sean Christopherson
2024-05-20 23:49 ` Huang, Kai
0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-05-20 23:22 UTC (permalink / raw)
To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel
On Tue, May 21, 2024, Kai Huang wrote:
> On 18/05/2024 12:04 pm, Sean Christopherson wrote:
> > Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
> > initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
> > enabled and a VE info address pointing at pfn 0.
>
> How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
> unconditionally for vmcs02?
Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
provides unique coverage. Doing so definitely provides coverage beyond what is
strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
as it is to clear EPT_VIOLATION_VE, so why not.
> Your next patch says:
>
> "
> Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
> as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
> are KVM's responsibility.
> "
I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
while L2 is running. That patch simply routes all #VEs to L0, it doesn't affect
whether or not it's safe to enable EPT Violation #VEs for L2.
* Re: [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-20 23:22 ` Sean Christopherson
@ 2024-05-20 23:49 ` Huang, Kai
2024-05-21 0:21 ` Sean Christopherson
0 siblings, 1 reply; 23+ messages in thread
From: Huang, Kai @ 2024-05-20 23:49 UTC (permalink / raw)
To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel
On 21/05/2024 11:22 am, Sean Christopherson wrote:
> On Tue, May 21, 2024, Kai Huang wrote:
>> On 18/05/2024 12:04 pm, Sean Christopherson wrote:
>>> Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
>>> initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
>>> enabled and a VE info address pointing at pfn 0.
>>
>> How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
>> unconditionally for vmcs02?
>
> Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
> evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
> provides unique coverage. Doing so definitely provides coverage beyond what is
> strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
> as it is to clear EPT_VIOLATION_VE, so why not.
>
>> Your next patch says:
>>
>> "
>> Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
>> as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
>> are KVM's responsibility.
>> "
>
> I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
> while L2 is running. That patch simply routes all #VEs to L0, it doesn't affect
> whether or not it's safe to enable EPT Violation #VEs for L2.
My logic is, if a #VE exit cannot possibly happen for L2, then we don't
need to decide whether to route #VE exits to L1. :-)
Well, actually I think conceptually, it kinda makes sense to route #VE
exits to L1:
L1 should never enable #VE related bits, so L1 is certainly not expecting
to see a #VE from L2. But how to act should depend on L1's logic?
E.g., it can choose to ignore, or just kill the L2 etc?
Unconditionally disabling #VE in vmcs02 can avoid such an issue because
it's just not possible for L2 to have a #VE exit.
* Re: [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-20 23:49 ` Huang, Kai
@ 2024-05-21 0:21 ` Sean Christopherson
2024-05-21 0:42 ` Huang, Kai
0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-05-21 0:21 UTC (permalink / raw)
To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel
On Tue, May 21, 2024, Kai Huang wrote:
> On 21/05/2024 11:22 am, Sean Christopherson wrote:
> > On Tue, May 21, 2024, Kai Huang wrote:
> > > On 18/05/2024 12:04 pm, Sean Christopherson wrote:
> > > > Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
> > > > initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
> > > > enabled and a VE info address pointing at pfn 0.
> > >
> > > How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
> > > unconditionally for vmcs02?
> >
> > Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
> > evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
> > provides unique coverage. Doing so definitely provides coverage beyond what is
> > strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
> > as it is to clear EPT_VIOLATION_VE, so why not.
> >
> > > Your next patch says:
> > >
> > > "
> > > Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
> > > as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
> > > are KVM's responsibility.
> > > "
> >
> > I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
> > while L2 is running. That patch simply routes all #VEs to L0, it doesn't affect
> > whether or not it's safe to enable EPT Violation #VEs for L2.
>
> My logic is, if a #VE exit cannot possibly happen for L2, then we don't need
> to decide whether to route #VE exits to L1. :-)
>
> Well, actually I think conceptually, it kinda makes sense to route #VE exits
> to L1:
>
> L1 should never enable #VE related bits so L1 is certainly not expecting to
Not "should never", "can never". If L1 attempts to enable EPT_VIOLATION_VE, then
VM-Enter will VM-Fail.
> see a #VE from L2. But how to act should depend on L1's logic? E.g., it
> can choose to ignore it, or just kill the L2, etc.?
No. Architecturally, from L1's perspective, a #VE VM-Exit _cannot_ occur in L2.
L1 can inject a #VE into L2, but a #VE cannot be generated by the CPU and thus
cannot cause a VM-Exit.
> Unconditionally disabling #VE in vmcs02 can avoid such an issue because it's just
> not possible for L2 to have a #VE exit.
Sure, but by that argument we could just avoid all nested VMX issues by never
enabling anything for L2.
If there's an argument to be made for disabling EPT_VIOLATION_VE in vmcs02, it's
that the potential maintenance cost of keeping nEPT, nVMX, and the shadow MMU
healthy outweighs the benefits. I.e. we don't have a use case for enabling
EPT_VIOLATION_VE while L2 is running, so why validate it?
If whatever bug the KUT EPT found ends up being a KVM bug that specifically only
affects nVMX, then it'd be worth revisiting whether or not it's worth enabling
EPT_VIOLATION_VE in vmcs02. But that's a rather big "if" at this point.
* Re: [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-21 0:21 ` Sean Christopherson
@ 2024-05-21 0:42 ` Huang, Kai
2024-05-21 1:02 ` Sean Christopherson
0 siblings, 1 reply; 23+ messages in thread
From: Huang, Kai @ 2024-05-21 0:42 UTC (permalink / raw)
To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel
On 21/05/2024 12:21 pm, Sean Christopherson wrote:
> On Tue, May 21, 2024, Kai Huang wrote:
>> On 21/05/2024 11:22 am, Sean Christopherson wrote:
>>> On Tue, May 21, 2024, Kai Huang wrote:
>>>> On 18/05/2024 12:04 pm, Sean Christopherson wrote:
>>>>> Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
>>>>> initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
>>>>> enabled and a VE info address pointing at pfn 0.
>>>>
>>>> How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
>>>> unconditionally for vmcs02?
>>>
>>> Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
>>> evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
>>> provides unique coverage. Doing so definitely provides coverage beyond what is
>>> strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
>>> as it is to clear EPT_VIOLATION_VE, so why not.
>>>
>>>> Your next patch says:
>>>>
>>>> "
>>>> Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
>>>> as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
>>>> are KVM's responsibility.
>>>> "
>>>
>>> I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
>>> while L2 is running. That patch simply routes all #VEs to L0, it doesn't affect
>>> whether or not it's safe to enable EPT Violation #VEs for L2.
>>
>> My logic is, if a #VE exit cannot possibly happen for L2, then we don't need
>> to decide whether to route #VE exits to L1. :-)
>>
>> Well, actually I think conceptually, it kinda makes sense to route #VE exits
>> to L1:
>>
>> L1 should never enable #VE related bits so L1 is certainly not expecting to
>
> Not "should never", "can never". If L1 attempts to enable EPT_VIOLATION_VE, then
> VM-Enter will VM-Fail.
>
>> see a #VE from L2. But how to act should depend on L1's logic? E.g., it
>> can choose to ignore it, or just kill the L2, etc.?
>
> No. Architecturally, from L1's perspective, a #VE VM-Exit _cannot_ occur in L2.
> L1 can inject a #VE into L2, but a #VE cannot be generated by the CPU and thus
> cannot cause a VM-Exit.
OK. The point is not to argue about how L1 should handle it, but whether we
should inject into L1 -- L1 can do whatever it believes is legal/sane.
But I understand the purpose is to test/validate, so it's fine for L0 to
handle it, where "handle" eventually means we just want to dump that #VE
exit.
But for now L0 always handles #VE exits from L2, and AFAICT L0 will just
kill L1, until the patch:
KVM: VMX: Don't kill the VM on an unexpected #VE
lands.
So it looks like that patch at least should be done first. Otherwise it doesn't
make a lot of sense to kill L1 for #VE exits from L2.
>
>> Unconditionally disabling #VE in vmcs02 can avoid such an issue because it's just
>> not possible for L2 to have a #VE exit.
>
> Sure, but by that argument we could just avoid all nested VMX issues by never
> enabling anything for L2.
>
> If there's an argument to be made for disabling EPT_VIOLATION_VE in vmcs02, it's
> that the potential maintenance cost of keeping nEPT, nVMX, and the shadow MMU
> healthy outweighs the benefits. I.e. we don't have a use case for enabling
> EPT_VIOLATION_VE while L2 is running, so why validate it?
Yeah. I am not sure about the purpose of validating #VE exits from L2.
>
> If whatever bug the KUT EPT found ends up being a KVM bug that specifically only
> affects nVMX, then it'd be worth revisiting whether or not it's worth enabling
> EPT_VIOLATION_VE in vmcs02. But that's a rather big "if" at this point.
OK.
* Re: [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support
2024-05-21 0:42 ` Huang, Kai
@ 2024-05-21 1:02 ` Sean Christopherson
0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-21 1:02 UTC (permalink / raw)
To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel
On Tue, May 21, 2024, Kai Huang wrote:
> But for now L0 always handles #VE exits from L2, and AFAICT L0 will just kill
> L1, until the patch:
>
> KVM: VMX: Don't kill the VM on an unexpected #VE
>
> lands.
>
> So it looks like that patch at least should be done first. Otherwise it doesn't
> make a lot of sense to kill L1 for #VE exits from L2.
I have no objection to changing the order.
* Re: [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU
2024-05-18 0:04 ` [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Sean Christopherson
2024-05-20 12:38 ` Huang, Kai
@ 2024-05-21 7:21 ` Isaku Yamahata
1 sibling, 0 replies; 23+ messages in thread
From: Isaku Yamahata @ 2024-05-21 7:21 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, isaku.yamahata, isaku.yamahata
On Fri, May 17, 2024 at 05:04:22PM -0700,
Sean Christopherson <seanjc@google.com> wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> Use SHADOW_NONPRESENT_VALUE when zapping TDP MMU SPTEs with mmu_lock held
> for read; tdp_mmu_zap_spte_atomic() was simply missed during the initial
> development.
>
> Fixes: 7f01cab84928 ("KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE")
> Not-yet-signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> [sean: write changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 1259dd63defc..36539c1b36cd 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -626,7 +626,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
> * SPTEs.
> */
> handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte,
> - 0, iter->level, true);
> + SHADOW_NONPRESENT_VALUE, iter->level, true);
>
> return 0;
> }
> --
> 2.45.0.215.g3402c0e53f-goog
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
--
Isaku Yamahata <isaku.yamahata@intel.com>
* Re: [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default
2024-05-18 0:04 ` [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default Sean Christopherson
@ 2024-05-21 17:36 ` Paolo Bonzini
2024-05-21 18:18 ` Sean Christopherson
0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2024-05-21 17:36 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm, linux-kernel
On Sat, May 18, 2024 at 2:04 AM Sean Christopherson <seanjc@google.com> wrote:
> Disable KVM's "prove #VE" support by default, as it provides no functional
> value, and even its sanity checking benefits are relatively limited. I.e.
> it should be fully opt-in even on debug kernels, especially since EPT
> Violation #VE suppression appears to be buggy on some CPUs.
More #VE trapping than #VE suppression.
I wouldn't go so far as making it *depend* on DEBUG_KERNEL. EXPERT
plus the scary help message is good enough.
What about this:
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index b6831e17ec31..2864608c7016 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -97,14 +97,15 @@ config KVM_INTEL
config KVM_INTEL_PROVE_VE
bool "Check that guests do not receive #VE exceptions"
- depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
+ depends on KVM_INTEL && EXPERT
help
Checks that KVM's page table management code will not incorrectly
let guests receive a virtualization exception. Virtualization
exceptions will be trapped by the hypervisor rather than injected
in the guest.
- This should never be enabled in a production environment.
+ Note that #VE trapping appears to be buggy on some CPUs.
+ This should never be enabled in a production environment!
If unsure, say N.
Paolo
> Opportunistically add a line in the help text to make it abundantly clear
> that KVM_INTEL_PROVE_VE should never be enabled in a production
> environment.
>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/Kconfig | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 2a7f69abcac3..3468efc4be55 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -97,15 +97,15 @@ config KVM_INTEL
>
> config KVM_INTEL_PROVE_VE
> bool "Check that guests do not receive #VE exceptions"
> - default KVM_PROVE_MMU || DEBUG_KERNEL
> - depends on KVM_INTEL
> + depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
> help
> -
> Checks that KVM's page table management code will not incorrectly
> let guests receive a virtualization exception. Virtualization
> exceptions will be trapped by the hypervisor rather than injected
> in the guest.
>
> + This should never be enabled in a production environment.
> +
> If unsure, say N.
>
> config X86_SGX_KVM
> --
> 2.45.0.215.g3402c0e53f-goog
>
* Re: [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default
2024-05-21 17:36 ` Paolo Bonzini
@ 2024-05-21 18:18 ` Sean Christopherson
2024-05-21 20:25 ` Paolo Bonzini
0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2024-05-21 18:18 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, linux-kernel
On Tue, May 21, 2024, Paolo Bonzini wrote:
> On Sat, May 18, 2024 at 2:04 AM Sean Christopherson <seanjc@google.com> wrote:
> > Disable KVM's "prove #VE" support by default, as it provides no functional
> > value, and even its sanity checking benefits are relatively limited. I.e.
> > it should be fully opt-in even on debug kernels, especially since EPT
> > Violation #VE suppression appears to be buggy on some CPUs.
>
> More #VE trapping than #VE suppression.
>
> I wouldn't go so far as making it *depend* on DEBUG_KERNEL. EXPERT
> plus the scary help message is good enough.
Works for me.
>
> What about this:
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index b6831e17ec31..2864608c7016 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -97,14 +97,15 @@ config KVM_INTEL
>
> config KVM_INTEL_PROVE_VE
> bool "Check that guests do not receive #VE exceptions"
> - depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
> + depends on KVM_INTEL && EXPERT
> help
> Checks that KVM's page table management code will not incorrectly
> let guests receive a virtualization exception. Virtualization
> exceptions will be trapped by the hypervisor rather than injected
> in the guest.
>
> - This should never be enabled in a production environment.
> + Note that #VE trapping appears to be buggy on some CPUs.
I see where you're coming from, but I don't think "trapping" is much better,
e.g. it suggests there's something broken with the interception of #VEs. Ah,
the entire help text is weird.
This?
config KVM_INTEL_PROVE_VE
bool "Verify guests do not receive unexpected EPT Violation #VEs"
depends on KVM_INTEL && EXPERT
help
Enable EPT Violation #VEs (when supported) for all VMs, to verify
that KVM's EPT management code will not incorrectly result in a #VE
(KVM is supposed to suppress #VEs by default). Unexpected #VEs will
be intercepted by KVM and will trigger a WARN, but are otherwise
transparent to the guest.
Note, EPT Violation #VE support appears to be buggy on some CPUs.
This should never be enabled in a production environment!
If unsure, say N.
* Re: [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default
2024-05-21 18:18 ` Sean Christopherson
@ 2024-05-21 20:25 ` Paolo Bonzini
2024-05-22 0:29 ` Sean Christopherson
0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2024-05-21 20:25 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm, linux-kernel
On Tue, May 21, 2024 at 8:18 PM Sean Christopherson <seanjc@google.com> wrote:
> > - This should never be enabled in a production environment.
> > + Note that #VE trapping appears to be buggy on some CPUs.
>
> I see where you're coming from, but I don't think "trapping" is much better,
> e.g. it suggests there's something broken with the interception of #VEs. Ah,
> the entire help text is weird.
Yeah, I didn't want to say #VE is broken altogether - interception is
where we saw issues, and #VE is used in production as far as I know
(not just by TDX; at least Xen and maybe Hyper-V use it for
anti-malware purposes?).
Maybe "Note: there appear to be bugs in some CPUs that will trigger
the WARN, in particular with eptad=0 and/or nested virtualization"
covers all bases.
Paolo
>
> This?
>
> config KVM_INTEL_PROVE_VE
> bool "Verify guests do not receive unexpected EPT Violation #VEs"
> depends on KVM_INTEL && EXPERT
> help
> Enable EPT Violation #VEs (when supported) for all VMs, to verify
> that KVM's EPT management code will not incorrectly result in a #VE
> (KVM is supposed to suppress #VEs by default). Unexpected #VEs will
> be intercepted by KVM and will trigger a WARN, but are otherwise
> transparent to the guest.
>
> Note, EPT Violation #VE support appears to be buggy on some CPUs.
>
> This should never be enabled in a production environment!
>
> If unsure, say N.
>
* Re: [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default
2024-05-21 20:25 ` Paolo Bonzini
@ 2024-05-22 0:29 ` Sean Christopherson
0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2024-05-22 0:29 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, linux-kernel
On Tue, May 21, 2024, Paolo Bonzini wrote:
> On Tue, May 21, 2024 at 8:18 PM Sean Christopherson <seanjc@google.com> wrote:
> > > - This should never be enabled in a production environment.
> > > + Note that #VE trapping appears to be buggy on some CPUs.
> >
> > I see where you're coming from, but I don't think "trapping" is much better,
> > e.g. it suggests there's something broken with the interception of #VEs. Ah,
> > the entire help text is weird.
>
> Yeah, I didn't want to say #VE is broken altogether -
Ah, yeah, good call. The #VE isn't broken per se, just spurious/unexpected.
> interception is where we saw issues,
It's not an issue with interception; disabling #VE intercepts results in the #VE
being delivered to the guest.
Test suite: ept_access_test_not_present
PTE[4] @ 109fff8 = 9fed0007
PTE[3] @ 9fed0ff0 = 9fed1007
PTE[2] @ 9fed1000 = 9fed2007
VA PTE @ 9fed2000 = 8000000007
Created EPT @ 9feca008 = 11d2007
Created EPT @ 11d2000 = 11d3007
Created EPT @ 11d3000 = 11d4007
L1 hva = 40000000, hpa = 40000000, L2 gva = ffffffff80000000, gpa = 8000000000
Unhandled exception 8 #DF at ip 0000000000410d39
error_code=0000 rflags=00010097 cs=00000008
rax=ffffffff80000000 rcx=0000000000000000 rdx=0000000000000000 rbx=0000000000000000
rbp=000000009fec6fe0 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000
r12=ffffffff80000008 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000
cr0=0000000080010031 cr2=0000000000000000 cr3=000000000109f000 cr4=0000000000002020
cr8=0000000000000000
STACK: @410d39 40144a 4002dd
> and #VE is used in production as far as I know (not just by TDX; at least Xen
> and maybe Hyper-V use it for anti-malware purposes?).
Hmm, maybe a spurious #VE is benign? Or it really is limited to A/D bits being
disabled? Not that us speculating is going to change anything :-)
> Maybe "Note: there appear to be bugs in some CPUs that will trigger
> the WARN, in particular with eptad=0 and/or nested virtualization"
> covers all bases.
Works for me. Maybe tweak it slightly to explain why the WARN is triggered?
Note, some CPUs appear to generate spurious EPT Violation #VEs that trigger
KVM's WARN, in particular with eptad=0 and/or nested virtualization.
* Re: [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
` (8 preceding siblings ...)
2024-05-18 0:04 ` [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default Sean Christopherson
@ 2024-05-23 16:41 ` Paolo Bonzini
9 siblings, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2024-05-23 16:41 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm, linux-kernel
Queued, thanks.
I moved "KVM: VMX: Don't kill the VM on an unexpected #VE" as the second patch
in the series.
Paolo
Thread overview: 23+ messages
2024-05-18 0:04 [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Sean Christopherson
2024-05-18 0:04 ` [PATCH 1/9] KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Sean Christopherson
2024-05-20 12:38 ` Huang, Kai
2024-05-21 7:21 ` Isaku Yamahata
2024-05-18 0:04 ` [PATCH 2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support Sean Christopherson
2024-05-20 23:09 ` Huang, Kai
2024-05-20 23:22 ` Sean Christopherson
2024-05-20 23:49 ` Huang, Kai
2024-05-21 0:21 ` Sean Christopherson
2024-05-21 0:42 ` Huang, Kai
2024-05-21 1:02 ` Sean Christopherson
2024-05-18 0:04 ` [PATCH 3/9] KVM: nVMX: Always handle #VEs in L0 (never forward #VEs from L2 to L1) Sean Christopherson
2024-05-18 0:04 ` [PATCH 4/9] KVM: x86/mmu: Add sanity checks that KVM doesn't create EPT #VE SPTEs Sean Christopherson
2024-05-18 0:04 ` [PATCH 5/9] KVM: VMX: Dump VMCS on unexpected #VE Sean Christopherson
2024-05-18 0:04 ` [PATCH 6/9] KVM: x86/mmu: Print SPTEs " Sean Christopherson
2024-05-18 0:04 ` [PATCH 7/9] KVM: VMX: Don't kill the VM on an " Sean Christopherson
2024-05-18 0:04 ` [PATCH 8/9] KVM: VMX: Enumerate EPT Violation #VE support in /proc/cpuinfo Sean Christopherson
2024-05-18 0:04 ` [PATCH 9/9] KVM: x86: Disable KVM_INTEL_PROVE_VE by default Sean Christopherson
2024-05-21 17:36 ` Paolo Bonzini
2024-05-21 18:18 ` Sean Christopherson
2024-05-21 20:25 ` Paolo Bonzini
2024-05-22 0:29 ` Sean Christopherson
2024-05-23 16:41 ` [PATCH 0/9] KVM: x86: Fixes for KVM_INTEL_PROVE_VE Paolo Bonzini