* [PATCH v2 0/6] Enhance event delivery error handling
@ 2024-11-11 10:27 Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation Ivan Orlov
` (6 more replies)
0 siblings, 7 replies; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Currently, the situation when the guest accesses MMIO during vectoring is
handled differently on VMX and SVM: on VMX, KVM returns an internal
error, while on SVM the guest enters an infinite loop in which KVM tries
to deliver the event again and again.
This patch series eliminates this difference by returning a KVM internal
error when the guest performs MMIO during vectoring, for both VMX and
SVM. It also introduces a selftest test case which covers the error
handling mentioned above.
V1 -> V2:
- Make commit messages more concise, avoid using pronouns
- Extract SVM error handling into a separate commit
- Introduce a new X86EMUL_ return type and detect the unhandleable
vectoring error in vendor-specific check_emulate_instruction instead of
handling it in the common MMU code (which is specific for cached MMIO)
Ivan Orlov (6):
KVM: x86: Add function for vectoring error generation
KVM: x86: Add emulation status for vectoring during MMIO
KVM: VMX: Handle vectoring error in check_emulate_instruction
KVM: SVM: Handle MMIO during vectoring error
selftests: KVM: extract lidt into helper function
selftests: KVM: Add test case for MMIO during vectoring
arch/x86/include/asm/kvm_host.h | 12 ++++-
arch/x86/kvm/kvm_emulate.h | 2 +
arch/x86/kvm/svm/svm.c | 9 +++-
arch/x86/kvm/vmx/vmx.c | 33 +++++-------
arch/x86/kvm/x86.c | 27 ++++++++++
.../selftests/kvm/include/x86_64/processor.h | 7 +++
.../selftests/kvm/set_memory_region_test.c | 53 ++++++++++++++++++-
.../selftests/kvm/x86_64/sev_smoke_test.c | 2 +-
8 files changed, 119 insertions(+), 26 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
@ 2024-11-11 10:27 ` Ivan Orlov
2024-12-11 18:02 ` Sean Christopherson
2024-11-11 10:27 ` [PATCH v2 2/6] KVM: x86: Add emulation status for vectoring during MMIO Ivan Orlov
` (5 subsequent siblings)
6 siblings, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Extract the VMX code for handling an unhandleable VM-Exit during
vectoring into a vendor-agnostic function so that the boilerplate code
can be shared by SVM.
Report the actual GPA for an EPT misconfig, or INVALID_GPA for any other
exit code, in internal.data[3].
Signed-off-by: Ivan Orlov <iorlov@amazon.com>
---
V1 -> V2:
- Return GPA for any exit reason, using reported GPA when it is valid or
INVALID_GPA otherwise.
- Rename the error preparation function
- Fix indentation
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/vmx/vmx.c | 16 ++++------------
arch/x86/kvm/x86.c | 22 ++++++++++++++++++++++
3 files changed, 28 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6d9f763a7bb9..eb413079b7c6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2060,6 +2060,8 @@ void __kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu,
u64 *data, u8 ndata);
void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu);
+void kvm_prepare_event_vectoring_exit(struct kvm_vcpu *vcpu, gpa_t gpa);
+
void kvm_enable_efer_bits(u64);
bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer);
int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f6900bec4874..f92740e7e107 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6452,6 +6452,7 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
union vmx_exit_reason exit_reason = vmx->exit_reason;
u32 vectoring_info = vmx->idt_vectoring_info;
u16 exit_handler_index;
+ gpa_t gpa;
/*
* Flush logged GPAs PML buffer, this will make dirty_bitmap more
@@ -6550,19 +6551,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
exit_reason.basic != EXIT_REASON_APIC_ACCESS &&
exit_reason.basic != EXIT_REASON_TASK_SWITCH &&
exit_reason.basic != EXIT_REASON_NOTIFY)) {
- int ndata = 3;
+ gpa = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG
+ ? vmcs_read64(GUEST_PHYSICAL_ADDRESS) : INVALID_GPA;
- vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
- vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
- vcpu->run->internal.data[0] = vectoring_info;
- vcpu->run->internal.data[1] = exit_reason.full;
- vcpu->run->internal.data[2] = vmx_get_exit_qual(vcpu);
- if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG) {
- vcpu->run->internal.data[ndata++] =
- vmcs_read64(GUEST_PHYSICAL_ADDRESS);
- }
- vcpu->run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu;
- vcpu->run->internal.ndata = ndata;
+ kvm_prepare_event_vectoring_exit(vcpu, gpa);
return 0;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83fe0a78146f..e338d583f48f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8828,6 +8828,28 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvm_prepare_emulation_failure_exit);
+void kvm_prepare_event_vectoring_exit(struct kvm_vcpu *vcpu, gpa_t gpa)
+{
+ u32 reason, intr_info, error_code;
+ struct kvm_run *run = vcpu->run;
+ u64 info1, info2;
+ int ndata = 0;
+
+ kvm_x86_call(get_exit_info)(vcpu, &reason, &info1, &info2,
+ &intr_info, &error_code);
+
+ run->internal.data[ndata++] = info2;
+ run->internal.data[ndata++] = reason;
+ run->internal.data[ndata++] = info1;
+ run->internal.data[ndata++] = (u64)gpa;
+ run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu;
+
+ run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+ run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
+ run->internal.ndata = ndata;
+}
+EXPORT_SYMBOL_GPL(kvm_prepare_event_vectoring_exit);
+
static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
{
struct kvm *kvm = vcpu->kvm;
--
2.43.0
* [PATCH v2 2/6] KVM: x86: Add emulation status for vectoring during MMIO
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation Ivan Orlov
@ 2024-11-11 10:27 ` Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction Ivan Orlov
` (4 subsequent siblings)
6 siblings, 0 replies; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Add an emulation status for a vectoring error due to MMIO. Such a
situation can occur if the guest sets the IDT descriptor base to point
to an MMIO region and then triggers an exception.
Exit to userspace with an event delivery error when MMIO happens during
vectoring.
Signed-off-by: Ivan Orlov <iorlov@amazon.com>
---
V1 -> V2:
- This patch wasn't included in V1.
arch/x86/kvm/kvm_emulate.h | 2 ++
arch/x86/kvm/x86.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 55a18e2f2dcd..f856bc979bdb 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,8 @@ struct x86_instruction_info {
#define X86EMUL_CMPXCHG_FAILED 4 /* cmpxchg did not see expected value */
#define X86EMUL_IO_NEEDED 5 /* IO is needed to complete emulation */
#define X86EMUL_INTERCEPTED 6 /* Intercepted by nested VMCB/VMCS */
+/* Vectoring requires MMIO and can't be emulated */
+#define X86EMUL_UNHANDLEABLE_VECTORING_IO 7
/* x86-specific emulation flags */
#define X86EMUL_F_WRITE BIT(0)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e338d583f48f..4ba371040685 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9122,6 +9122,11 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
if (r == X86EMUL_RETRY_INSTR || r == X86EMUL_PROPAGATE_FAULT)
return 1;
+ if (r == X86EMUL_UNHANDLEABLE_VECTORING_IO) {
+ kvm_prepare_event_vectoring_exit(vcpu, cr2_or_gpa);
+ return 0;
+ }
+
WARN_ON_ONCE(r != X86EMUL_UNHANDLEABLE);
return handle_emulation_failure(vcpu, emulation_type);
}
--
2.43.0
* [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 2/6] KVM: x86: Add emulation status for vectoring during MMIO Ivan Orlov
@ 2024-11-11 10:27 ` Ivan Orlov
2024-12-11 18:15 ` Sean Christopherson
2024-11-11 10:27 ` [PATCH v2 4/6] KVM: SVM: Handle MMIO during vectoring error Ivan Orlov
` (3 subsequent siblings)
6 siblings, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Move the detection of an unhandleable VM-exit due to MMIO during
vectoring into check_emulate_instruction. Implement a function which
checks whether emul_type indicates MMIO so it can be used for both VMX
and SVM.
Fix the comment about EMULTYPE_PF, as this flag no longer necessarily
means MMIO: it can also be set due to a write protection violation.
Signed-off-by: Ivan Orlov <iorlov@amazon.com>
---
V1 -> V2:
- Detect the unhandleable vectoring error in vmx_check_emulate_instruction
instead of handling it in the common MMU code (which is specific for
cached MMIO)
arch/x86/include/asm/kvm_host.h | 10 ++++++++--
arch/x86/kvm/vmx/vmx.c | 25 ++++++++++++-------------
2 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index eb413079b7c6..3de9702a9135 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2017,8 +2017,8 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
* VMware backdoor emulation handles select instructions
* and reinjects the #GP for all other cases.
*
- * EMULTYPE_PF - Set when emulating MMIO by way of an intercepted #PF, in which
- * case the CR2/GPA value pass on the stack is valid.
+ * EMULTYPE_PF - Set when an intercepted #PF triggers the emulation, in which case
+ * the CR2/GPA value pass on the stack is valid.
*
* EMULTYPE_COMPLETE_USER_EXIT - Set when the emulator should update interruptibility
* state and inject single-step #DBs after skipping
@@ -2053,6 +2053,12 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
#define EMULTYPE_COMPLETE_USER_EXIT (1 << 7)
#define EMULTYPE_WRITE_PF_TO_SP (1 << 8)
+static inline bool kvm_is_emul_type_mmio(int emul_type)
+{
+ return (emul_type & EMULTYPE_PF) &&
+ !(emul_type & EMULTYPE_WRITE_PF_TO_SP);
+}
+
int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type);
int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu,
void *insn, int insn_len);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f92740e7e107..a10f35d9704b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1693,6 +1693,8 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
void *insn, int insn_len)
{
+ bool is_vect;
+
/*
* Emulation of instructions in SGX enclaves is impossible as RIP does
* not point at the failing instruction, and even if it did, the code
@@ -1704,6 +1706,13 @@ int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
kvm_queue_exception(vcpu, UD_VECTOR);
return X86EMUL_PROPAGATE_FAULT;
}
+
+ is_vect = to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK;
+
+ /* Emulation is not possible when MMIO happens during event vectoring. */
+ if (kvm_is_emul_type_mmio(emul_type) && is_vect)
+ return X86EMUL_UNHANDLEABLE_VECTORING_IO;
+
return X86EMUL_CONTINUE;
}
@@ -6452,7 +6461,6 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
union vmx_exit_reason exit_reason = vmx->exit_reason;
u32 vectoring_info = vmx->idt_vectoring_info;
u16 exit_handler_index;
- gpa_t gpa;
/*
* Flush logged GPAs PML buffer, this will make dirty_bitmap more
@@ -6537,24 +6545,15 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
return 0;
}
- /*
- * Note:
- * Do not try to fix EXIT_REASON_EPT_MISCONFIG if it caused by
- * delivery event since it indicates guest is accessing MMIO.
- * The vm-exit can be triggered again after return to guest that
- * will cause infinite loop.
- */
if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
(exit_reason.basic != EXIT_REASON_EXCEPTION_NMI &&
exit_reason.basic != EXIT_REASON_EPT_VIOLATION &&
exit_reason.basic != EXIT_REASON_PML_FULL &&
exit_reason.basic != EXIT_REASON_APIC_ACCESS &&
exit_reason.basic != EXIT_REASON_TASK_SWITCH &&
- exit_reason.basic != EXIT_REASON_NOTIFY)) {
- gpa = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG
- ? vmcs_read64(GUEST_PHYSICAL_ADDRESS) : INVALID_GPA;
-
- kvm_prepare_event_vectoring_exit(vcpu, gpa);
+ exit_reason.basic != EXIT_REASON_NOTIFY &&
+ exit_reason.basic != EXIT_REASON_EPT_MISCONFIG)) {
+ kvm_prepare_event_vectoring_exit(vcpu, INVALID_GPA);
return 0;
}
--
2.43.0
* [PATCH v2 4/6] KVM: SVM: Handle MMIO during vectoring error
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
` (2 preceding siblings ...)
2024-11-11 10:27 ` [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction Ivan Orlov
@ 2024-11-11 10:27 ` Ivan Orlov
2024-12-11 18:16 ` Sean Christopherson
2024-11-11 10:27 ` [PATCH v2 5/6] selftests: KVM: extract lidt into helper function Ivan Orlov
` (2 subsequent siblings)
6 siblings, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Handle the MMIO-during-vectoring error in check_emulate_instruction to
prevent an infinite loop on SVM, and to eliminate the difference in how
a guest access to MMIO during vectoring is handled on SVM and VMX.
Signed-off-by: Ivan Orlov <iorlov@amazon.com>
---
V1 -> V2:
- Detect the unhandleable vectoring error in svm_check_emulate_instruction
instead of handling it in the common MMU code (which is specific for
cached MMIO)
arch/x86/kvm/svm/svm.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c1e29307826b..b69f0f98c576 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4797,9 +4797,16 @@ static void svm_enable_smi_window(struct kvm_vcpu *vcpu)
static int svm_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
void *insn, int insn_len)
{
- bool smep, smap, is_user;
+ bool smep, smap, is_user, is_vect;
u64 error_code;
+ is_vect = to_svm(vcpu)->vmcb->control.exit_int_info &
+ SVM_EXITINTINFO_TYPE_MASK;
+
+ /* Emulation is not possible when MMIO happens during event vectoring. */
+ if (kvm_is_emul_type_mmio(emul_type) && is_vect)
+ return X86EMUL_UNHANDLEABLE_VECTORING_IO;
+
/* Emulation is always possible when KVM has access to all guest state. */
if (!sev_guest(vcpu->kvm))
return X86EMUL_CONTINUE;
--
2.43.0
* [PATCH v2 5/6] selftests: KVM: extract lidt into helper function
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
` (3 preceding siblings ...)
2024-11-11 10:27 ` [PATCH v2 4/6] KVM: SVM: Handle MMIO during vectoring error Ivan Orlov
@ 2024-11-11 10:27 ` Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring Ivan Orlov
2024-12-11 18:20 ` [PATCH v2 0/6] Enhance event delivery error handling Sean Christopherson
6 siblings, 0 replies; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Implement a helper function for setting the IDT descriptor from guest
code. Since `lidt` is used in multiple places, replace the existing
occurrences with calls to this helper.
Signed-off-by: Ivan Orlov <iorlov@amazon.com>
---
V1 -> V2:
- This patch wasn't included in V1.
tools/testing/selftests/kvm/include/x86_64/processor.h | 5 +++++
tools/testing/selftests/kvm/set_memory_region_test.c | 2 +-
tools/testing/selftests/kvm/x86_64/sev_smoke_test.c | 2 +-
3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index e247f99e0473..1a60c99b5833 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -571,6 +571,11 @@ static inline void set_cr4(uint64_t val)
__asm__ __volatile__("mov %0, %%cr4" : : "r" (val) : "memory");
}
+static inline void set_idt(const struct desc_ptr *idt_desc)
+{
+ __asm__ __volatile__("lidt %0"::"m"(*idt_desc));
+}
+
static inline u64 xgetbv(u32 index)
{
u32 eax, edx;
diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c
index a8267628e9ed..a1c53cc854a5 100644
--- a/tools/testing/selftests/kvm/set_memory_region_test.c
+++ b/tools/testing/selftests/kvm/set_memory_region_test.c
@@ -235,7 +235,7 @@ static void guest_code_delete_memory_region(void)
* in the guest will never succeed, and so isn't an option.
*/
memset(&idt, 0, sizeof(idt));
- __asm__ __volatile__("lidt %0" :: "m"(idt));
+ set_idt(&idt);
GUEST_SYNC(0);
diff --git a/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c b/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
index 2e9197eb1652..8c33e02a3183 100644
--- a/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
+++ b/tools/testing/selftests/kvm/x86_64/sev_smoke_test.c
@@ -166,7 +166,7 @@ static void guest_shutdown_code(void)
/* Clobber the IDT so that #UD is guaranteed to trigger SHUTDOWN. */
memset(&idt, 0, sizeof(idt));
- __asm__ __volatile__("lidt %0" :: "m"(idt));
+ set_idt(&idt);
__asm__ __volatile__("ud2");
}
--
2.43.0
* [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
` (4 preceding siblings ...)
2024-11-11 10:27 ` [PATCH v2 5/6] selftests: KVM: extract lidt into helper function Ivan Orlov
@ 2024-11-11 10:27 ` Ivan Orlov
2024-12-11 18:19 ` Sean Christopherson
2024-12-11 18:20 ` [PATCH v2 0/6] Enhance event delivery error handling Sean Christopherson
6 siblings, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-11-11 10:27 UTC (permalink / raw)
To: bp, dave.hansen, mingo, pbonzini, seanjc, shuah, tglx
Cc: Ivan Orlov, hpa, kvm, linux-kernel, linux-kselftest, x86,
pdurrant, dwmw
Extend the 'set_memory_region_test' with a test case which covers the
MMIO-during-vectoring error handling. The test case
1) Sets the IDT descriptor base to point to an MMIO address
2) Generates a #GP in the guest
3) Verifies that the exit reason and suberror code are correct
4) Verifies that the GPA reported in internal.data[3] is correct
Also, add a definition of a non-canonical address to processor.h
Signed-off-by: Ivan Orlov <iorlov@amazon.com>
---
V1 -> V2:
- Get rid of pronouns, redundant comments and incorrect wording
- Define noncanonical address in processor.h
- Fix indentation and wrap lines at 80 columns
.../selftests/kvm/include/x86_64/processor.h | 2 +
.../selftests/kvm/set_memory_region_test.c | 51 +++++++++++++++++++
2 files changed, 53 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 1a60c99b5833..997df5003edb 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1165,6 +1165,8 @@ void vm_install_exception_handler(struct kvm_vm *vm, int vector,
/* If a toddler were to say "abracadabra". */
#define KVM_EXCEPTION_MAGIC 0xabacadabaULL
+#define NONCANONICAL 0xaaaaaaaaaaaaaaaaull
+
/*
* KVM selftest exception fixup uses registers to coordinate with the exception
* handler, versus the kernel's in-memory tables and KVM-Unit-Tests's in-memory
diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c
index a1c53cc854a5..d65a9f20aa1a 100644
--- a/tools/testing/selftests/kvm/set_memory_region_test.c
+++ b/tools/testing/selftests/kvm/set_memory_region_test.c
@@ -553,6 +553,56 @@ static void test_add_overlapping_private_memory_regions(void)
close(memfd);
kvm_vm_free(vm);
}
+
+static void guest_code_mmio_during_vectoring(void)
+{
+ const struct desc_ptr idt_desc = {
+ .address = MEM_REGION_GPA,
+ .size = 0xFFF,
+ };
+
+ set_idt(&idt_desc);
+
+ /* Generate a #GP by dereferencing a non-canonical address */
+ *((uint8_t *)NONCANONICAL) = 0x1;
+
+ GUEST_ASSERT(0);
+}
+
+/*
+ * This test points the IDT descriptor base to an MMIO address. It should cause
+ * a KVM internal error when an event occurs in the guest.
+ */
+static void test_mmio_during_vectoring(void)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_run *run;
+ struct kvm_vm *vm;
+ u64 expected_gpa;
+
+ pr_info("Testing MMIO during vectoring error handling\n");
+
+ vm = vm_create_with_one_vcpu(&vcpu, guest_code_mmio_during_vectoring);
+ virt_map(vm, MEM_REGION_GPA, MEM_REGION_GPA, 1);
+
+ run = vcpu->run;
+
+ vcpu_run(vcpu);
+ TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_INTERNAL_ERROR);
+ TEST_ASSERT(run->internal.suberror == KVM_INTERNAL_ERROR_DELIVERY_EV,
+ "Unexpected suberror = %d", vcpu->run->internal.suberror);
+ TEST_ASSERT(run->internal.ndata != 4, "Unexpected internal error data array size = %d",
+ run->internal.ndata);
+
+ /* The reported GPA should be IDT base + offset of the GP vector */
+ expected_gpa = MEM_REGION_GPA + GP_VECTOR * sizeof(struct idt_entry);
+
+ TEST_ASSERT(run->internal.data[3] == expected_gpa,
+ "Unexpected GPA = %llx (expected %lx)",
+ vcpu->run->internal.data[3], expected_gpa);
+
+ kvm_vm_free(vm);
+}
#endif
int main(int argc, char *argv[])
@@ -568,6 +618,7 @@ int main(int argc, char *argv[])
* KVM_RUN fails with ENOEXEC or EFAULT.
*/
test_zero_memory_regions();
+ test_mmio_during_vectoring();
#endif
test_invalid_memory_region_flags();
--
2.43.0
* Re: [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation
2024-11-11 10:27 ` [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation Ivan Orlov
@ 2024-12-11 18:02 ` Sean Christopherson
0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2024-12-11 18:02 UTC (permalink / raw)
To: Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Mon, Nov 11, 2024, Ivan Orlov wrote:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f6900bec4874..f92740e7e107 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6452,6 +6452,7 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
> union vmx_exit_reason exit_reason = vmx->exit_reason;
> u32 vectoring_info = vmx->idt_vectoring_info;
> u16 exit_handler_index;
> + gpa_t gpa;
I've gone back and forth on where to declare scoped variables, but in this case,
I think it makes sense to declare "gpa" inside the if-statement. Making it
visible at function scope when it's valid in a _super_ limited case is bound
to cause issues.
Of course, this code goes away by the end of the series, so that point is moot.
But on the other hand, declaring the variable in the if-statement is desirable
as the churn is precisely limited to the code that's being changed.
> /*
> * Flush logged GPAs PML buffer, this will make dirty_bitmap more
> @@ -6550,19 +6551,10 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
> exit_reason.basic != EXIT_REASON_APIC_ACCESS &&
> exit_reason.basic != EXIT_REASON_TASK_SWITCH &&
> exit_reason.basic != EXIT_REASON_NOTIFY)) {
> - int ndata = 3;
> + gpa = exit_reason.basic == EXIT_REASON_EPT_MISCONFIG
> + ? vmcs_read64(GUEST_PHYSICAL_ADDRESS) : INVALID_GPA;
Again a moot point, but IMO using a ternary operator here makes it unnecessarily
difficult to see that gpa is valid if and only if the exit was an EPT misconfig.
gpa_t gpa = INVALID_GPA;
if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG)
gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
> - vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> - vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
> - vcpu->run->internal.data[0] = vectoring_info;
> - vcpu->run->internal.data[1] = exit_reason.full;
> - vcpu->run->internal.data[2] = vmx_get_exit_qual(vcpu);
> - if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG) {
> - vcpu->run->internal.data[ndata++] =
> - vmcs_read64(GUEST_PHYSICAL_ADDRESS);
> - }
> - vcpu->run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu;
> - vcpu->run->internal.ndata = ndata;
> + kvm_prepare_event_vectoring_exit(vcpu, gpa);
> return 0;
> }
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 83fe0a78146f..e338d583f48f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8828,6 +8828,28 @@ void kvm_prepare_emulation_failure_exit(struct kvm_vcpu *vcpu)
> }
> EXPORT_SYMBOL_GPL(kvm_prepare_emulation_failure_exit);
>
> +void kvm_prepare_event_vectoring_exit(struct kvm_vcpu *vcpu, gpa_t gpa)
> +{
> + u32 reason, intr_info, error_code;
> + struct kvm_run *run = vcpu->run;
> + u64 info1, info2;
> + int ndata = 0;
> +
> + kvm_x86_call(get_exit_info)(vcpu, &reason, &info1, &info2,
> + &intr_info, &error_code);
> +
> + run->internal.data[ndata++] = info2;
> + run->internal.data[ndata++] = reason;
> + run->internal.data[ndata++] = info1;
> + run->internal.data[ndata++] = (u64)gpa;
No need for the cast.
> + run->internal.data[ndata++] = vcpu->arch.last_vmentry_cpu;
> +
> + run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> + run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
> + run->internal.ndata = ndata;
> +}
> +EXPORT_SYMBOL_GPL(kvm_prepare_event_vectoring_exit);
> +
> static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
> {
> struct kvm *kvm = vcpu->kvm;
> --
> 2.43.0
>
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-11-11 10:27 ` [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction Ivan Orlov
@ 2024-12-11 18:15 ` Sean Christopherson
2024-12-11 22:05 ` Ivan Orlov
2024-12-11 23:12 ` Ivan Orlov
0 siblings, 2 replies; 21+ messages in thread
From: Sean Christopherson @ 2024-12-11 18:15 UTC (permalink / raw)
To: Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Mon, Nov 11, 2024, Ivan Orlov wrote:
> Move unhandleable vmexit due to MMIO during vectoring error detection
> into check_emulate_instruction. Implement a function which checks if
> emul_type indicates MMIO so it can be used for both VMX and SVM.
>
> Fix the comment about EMULTYPE_PF as this flag doesn't necessarily
> mean MMIO anymore: it can also be set due to the write protection
> violation.
>
> Signed-off-by: Ivan Orlov <iorlov@amazon.com>
> ---
> V1 -> V2:
> - Detect the unhandleable vectoring error in vmx_check_emulate_instruction
> instead of handling it in the common MMU code (which is specific for
> cached MMIO)
>
> arch/x86/include/asm/kvm_host.h | 10 ++++++++--
> arch/x86/kvm/vmx/vmx.c | 25 ++++++++++++-------------
> 2 files changed, 20 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index eb413079b7c6..3de9702a9135 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2017,8 +2017,8 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
> * VMware backdoor emulation handles select instructions
> * and reinjects the #GP for all other cases.
> *
> - * EMULTYPE_PF - Set when emulating MMIO by way of an intercepted #PF, in which
> - * case the CR2/GPA value pass on the stack is valid.
> + * EMULTYPE_PF - Set when an intercepted #PF triggers the emulation, in which case
> + * the CR2/GPA value pass on the stack is valid.
> *
> * EMULTYPE_COMPLETE_USER_EXIT - Set when the emulator should update interruptibility
> * state and inject single-step #DBs after skipping
> @@ -2053,6 +2053,12 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
> #define EMULTYPE_COMPLETE_USER_EXIT (1 << 7)
> #define EMULTYPE_WRITE_PF_TO_SP (1 << 8)
>
> +static inline bool kvm_is_emul_type_mmio(int emul_type)
Hmm, this should probably be "pf_mmio", not just "mmio". E.g. if KVM is emulating
large swaths of guest code because unrestricted guest is disabled, then KVM can
end up emulating an MMIO access during "normal" emulation.
Hmm, actually, what if we go with this?
static inline bool kvm_can_emulate_event_vectoring(int emul_type)
{
return !(emul_type & EMULTYPE_PF) ||
(emul_type & EMULTYPE_WRITE_PF_TO_SP);
}
The MMIO aspect isn't unique to VMX or SVM, and the above would allow handling
other incompatible emulation types, should they ever arise (unlikely).
More importantly, it provides a single location to document (via comment) exactly
why KVM can't deal with MMIO emulation during event vectoring (and why KVM allows
emulation of write-protect page faults).
It would require using X86EMUL_UNHANDLEABLE_VECTORING instead of
X86EMUL_UNHANDLEABLE_VECTORING_IO, but I don't think that's necessarily a bad
thing.
> +{
> + return (emul_type & EMULTYPE_PF) &&
> + !(emul_type & EMULTYPE_WRITE_PF_TO_SP);
> +}
> +
> int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type);
> int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu,
> void *insn, int insn_len);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f92740e7e107..a10f35d9704b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1693,6 +1693,8 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
> int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> void *insn, int insn_len)
> {
> + bool is_vect;
> +
> /*
> * Emulation of instructions in SGX enclaves is impossible as RIP does
> * not point at the failing instruction, and even if it did, the code
> @@ -1704,6 +1706,13 @@ int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> kvm_queue_exception(vcpu, UD_VECTOR);
> return X86EMUL_PROPAGATE_FAULT;
> }
> +
> + is_vect = to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK;
> +
> + /* Emulation is not possible when MMIO happens during event vectoring. */
> + if (kvm_is_emul_type_mmio(emul_type) && is_vect)
I definitely prefer to omit the local variable.
if ((to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
!kvm_can_emulate_event_vectoring(emul_type))
return X86EMUL_UNHANDLEABLE_VECTORING;
* Re: [PATCH v2 4/6] KVM: SVM: Handle MMIO during vectoring error
2024-11-11 10:27 ` [PATCH v2 4/6] KVM: SVM: Handle MMIO during vectoring error Ivan Orlov
@ 2024-12-11 18:16 ` Sean Christopherson
0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2024-12-11 18:16 UTC (permalink / raw)
To: Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Mon, Nov 11, 2024, Ivan Orlov wrote:
> Handle MMIO during vectoring error in check_emulate_instruction to
> prevent infinite loop on SVM and eliminate the difference in how the
> situation when the guest accesses MMIO during vectoring is handled on
> SVM and VMX.
>
> Signed-off-by: Ivan Orlov <iorlov@amazon.com>
> ---
> V1 -> V2:
> - Detect the unhandleable vectoring error in svm_check_emulate_instruction
> instead of handling it in the common MMU code (which is specific for
> cached MMIO)
>
> arch/x86/kvm/svm/svm.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index c1e29307826b..b69f0f98c576 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4797,9 +4797,16 @@ static void svm_enable_smi_window(struct kvm_vcpu *vcpu)
> static int svm_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> void *insn, int insn_len)
> {
> - bool smep, smap, is_user;
> + bool smep, smap, is_user, is_vect;
> u64 error_code;
>
> + is_vect = to_svm(vcpu)->vmcb->control.exit_int_info &
> + SVM_EXITINTINFO_TYPE_MASK;
> +
> + /* Emulation is not possible when MMIO happens during event vectoring. */
> + if (kvm_is_emul_type_mmio(emul_type) && is_vect)
Same nit here, omit the local variable.
> + return X86EMUL_UNHANDLEABLE_VECTORING_IO;
> +
> /* Emulation is always possible when KVM has access to all guest state. */
> if (!sev_guest(vcpu->kvm))
> return X86EMUL_CONTINUE;
> --
> 2.43.0
>
* Re: [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring
2024-11-11 10:27 ` [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring Ivan Orlov
@ 2024-12-11 18:19 ` Sean Christopherson
2024-12-12 17:11 ` Ivan Orlov
0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2024-12-11 18:19 UTC (permalink / raw)
To: Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Mon, Nov 11, 2024, Ivan Orlov wrote:
> Extend the 'set_memory_region_test' with a test case which covers the
> MMIO during vectoring error handling. The test case
>
> 1) Sets an IDT descriptor base to point to an MMIO address
> 2) Generates a #GP in the guest
> 3) Verifies that we got a correct exit reason and suberror code
> 4) Verifies that we got the correct reported GPA in internal.data[3]
>
> Also, add a definition of non-canonical address to processor.h
>
> Signed-off-by: Ivan Orlov <iorlov@amazon.com>
> ---
> V1 -> V2:
> - Get rid of pronouns, redundant comments and incorrect wording
> - Define noncanonical address in processor.h
> - Fix indentation and wrap lines at 80 columns
>
> .../selftests/kvm/include/x86_64/processor.h | 2 +
> .../selftests/kvm/set_memory_region_test.c | 51 +++++++++++++++++++
> 2 files changed, 53 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
> index 1a60c99b5833..997df5003edb 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/processor.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
> @@ -1165,6 +1165,8 @@ void vm_install_exception_handler(struct kvm_vm *vm, int vector,
> /* If a toddler were to say "abracadabra". */
> #define KVM_EXCEPTION_MAGIC 0xabacadabaULL
>
> +#define NONCANONICAL 0xaaaaaaaaaaaaaaaaull
Uber nit, I much prefer to place this definition at the top of the header. More
specifically, it needs to not be in the middle of the selftest's exception fixup
code.
> +
> /*
> * KVM selftest exception fixup uses registers to coordinate with the exception
> * handler, versus the kernel's in-memory tables and KVM-Unit-Tests's in-memory
> diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c
> index a1c53cc854a5..d65a9f20aa1a 100644
> --- a/tools/testing/selftests/kvm/set_memory_region_test.c
> +++ b/tools/testing/selftests/kvm/set_memory_region_test.c
> @@ -553,6 +553,56 @@ static void test_add_overlapping_private_memory_regions(void)
> close(memfd);
> kvm_vm_free(vm);
> }
> +
> +static void guest_code_mmio_during_vectoring(void)
> +{
> + const struct desc_ptr idt_desc = {
> + .address = MEM_REGION_GPA,
> + .size = 0xFFF,
> + };
> +
> + set_idt(&idt_desc);
> +
> + /* Generate a #GP by dereferencing a non-canonical address */
> + *((uint8_t *)NONCANONICAL) = 0x1;
Now I'm curious what happens if this uses vcpu_arch_put_guest(), i.e. if the
test forces KVM to emulate the write.
No action needed, the test is a-ok as-is. I'm really just curious :-)
* Re: [PATCH v2 0/6] Enhance event delivery error handling
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
` (5 preceding siblings ...)
2024-11-11 10:27 ` [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring Ivan Orlov
@ 2024-12-11 18:20 ` Sean Christopherson
2024-12-11 21:45 ` Ivan Orlov
6 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2024-12-11 18:20 UTC (permalink / raw)
To: Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Mon, Nov 11, 2024, Ivan Orlov wrote:
> Currently, the situation when guest accesses MMIO during vectoring is
> handled differently on VMX and SVM: on VMX, KVM returns an internal error,
> whereas SVM goes into an infinite loop trying to deliver an event again and
> again.
>
> This patch series eliminates this difference by returning a KVM internal
> error when guest performs MMIO during vectoring for both VMX and SVM.
>
> Also, introduce a selftest test case which covers the error handling
> mentioned above.
>
> V1 -> V2:
> - Make commit messages more brief, avoid using pronouns
> - Extract SVM error handling into a separate commit
> - Introduce a new X86EMUL_ return type and detect the unhandleable
> vectoring error in vendor-specific check_emulate_instruction instead of
> handling it in the common MMU code (which is specific for cached MMIO)
>
> Ivan Orlov (6):
> KVM: x86: Add function for vectoring error generation
> KVM: x86: Add emulation status for vectoring during MMIO
> KVM: VMX: Handle vectoring error in check_emulate_instruction
> KVM: SVM: Handle MMIO during vectoring error
> selftests: KVM: extract lidt into helper function
> selftests: KVM: Add test case for MMIO during vectoring
Minor nits throughout, but unless you disagree with my suggestions, I'll fix them
up when applying, i.e. no need to post a v3.
* Re: [PATCH v2 0/6] Enhance event delivery error handling
2024-12-11 18:20 ` [PATCH v2 0/6] Enhance event delivery error handling Sean Christopherson
@ 2024-12-11 21:45 ` Ivan Orlov
0 siblings, 0 replies; 21+ messages in thread
From: Ivan Orlov @ 2024-12-11 21:45 UTC (permalink / raw)
To: Sean Christopherson, Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On 12/11/24 18:20, Sean Christopherson wrote:
> On Mon, Nov 11, 2024, Ivan Orlov wrote:
>> Currently, the situation when guest accesses MMIO during vectoring is
>> handled differently on VMX and SVM: on VMX, KVM returns an internal error,
>> whereas SVM goes into an infinite loop trying to deliver an event again and
>> again.
>>
>> This patch series eliminates this difference by returning a KVM internal
>> error when guest performs MMIO during vectoring for both VMX and SVM.
>>
>> Also, introduce a selftest test case which covers the error handling
>> mentioned above.
>>
>> V1 -> V2:
>> - Make commit messages more brief, avoid using pronouns
>> - Extract SVM error handling into a separate commit
>> - Introduce a new X86EMUL_ return type and detect the unhandleable
>> vectoring error in vendor-specific check_emulate_instruction instead of
>> handling it in the common MMU code (which is specific for cached MMIO)
>>
>> Ivan Orlov (6):
>> KVM: x86: Add function for vectoring error generation
>> KVM: x86: Add emulation status for vectoring during MMIO
>> KVM: VMX: Handle vectoring error in check_emulate_instruction
>> KVM: SVM: Handle MMIO during vectoring error
>> selftests: KVM: extract lidt into helper function
>> selftests: KVM: Add test case for MMIO during vectoring
>
> Minor nits throughout, but unless you disagree with my suggestions, I'll fix them
> up when applying, i.e. no need to post a v3.
>
Hi Sean,
Thanks a lot for the review :)
I don't have any conceptual disagreement with your suggestions, so
please feel free to fix them when applying the patches. Thanks!
--
Kind regards,
Ivan Orlov
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-11 18:15 ` Sean Christopherson
@ 2024-12-11 22:05 ` Ivan Orlov
2024-12-11 23:12 ` Ivan Orlov
1 sibling, 0 replies; 21+ messages in thread
From: Ivan Orlov @ 2024-12-11 22:05 UTC (permalink / raw)
To: Sean Christopherson, Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On 12/11/24 18:15, Sean Christopherson wrote:
> On Mon, Nov 11, 2024, Ivan Orlov wrote:
>> Move unhandleable vmexit due to MMIO during vectoring error detection
>> into check_emulate_instruction. Implement a function which checks if
>> emul_type indicates MMIO so it can be used for both VMX and SVM.
>>
>> Fix the comment about EMULTYPE_PF as this flag doesn't necessarily
>> mean MMIO anymore: it can also be set due to the write protection
>> violation.
>>
>> Signed-off-by: Ivan Orlov <iorlov@amazon.com>
>> ---
>> V1 -> V2:
>> - Detect the unhandleable vectoring error in vmx_check_emulate_instruction
>> instead of handling it in the common MMU code (which is specific for
>> cached MMIO)
>>
>> arch/x86/include/asm/kvm_host.h | 10 ++++++++--
>> arch/x86/kvm/vmx/vmx.c | 25 ++++++++++++-------------
>> 2 files changed, 20 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index eb413079b7c6..3de9702a9135 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -2017,8 +2017,8 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
>> * VMware backdoor emulation handles select instructions
>> * and reinjects the #GP for all other cases.
>> *
>> - * EMULTYPE_PF - Set when emulating MMIO by way of an intercepted #PF, in which
>> - * case the CR2/GPA value pass on the stack is valid.
>> + * EMULTYPE_PF - Set when an intercepted #PF triggers the emulation, in which case
>> + * the CR2/GPA value pass on the stack is valid.
>> *
>> * EMULTYPE_COMPLETE_USER_EXIT - Set when the emulator should update interruptibility
>> * state and inject single-step #DBs after skipping
>> @@ -2053,6 +2053,12 @@ u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu);
>> #define EMULTYPE_COMPLETE_USER_EXIT (1 << 7)
>> #define EMULTYPE_WRITE_PF_TO_SP (1 << 8)
>>
>> +static inline bool kvm_is_emul_type_mmio(int emul_type)
>
> Hmm, this should probably be "pf_mmio", not just "mmio". E.g. if KVM is emulating
> large swaths of guest code because unrestricted guest is disabled, then KVM can
> end up emulating an MMIO access for "normal" emulation.
>
> Hmm, actually, what if we go with this?
>
> static inline bool kvm_can_emulate_event_vectoring(int emul_type)
> {
> return !(emul_type & EMULTYPE_PF) ||
> (emul_type & EMULTYPE_WRITE_PF_TO_SP);
> }
>
I don't mind using either option here; in fact, both of them would
require an update if there is a new emulated page fault type. Let's use
the more generic one (which is kvm_can_emulate_event_vectoring) :)
I'm thinking about a static assert we could add to it, to be sure the
condition gets updated if such an EMUL_TYPE is introduced, but I can't
think of something neat... Anyway, it can be done in a separate patch I
guess (if we really need it).
--
Kind regards,
Ivan Orlov
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-11 18:15 ` Sean Christopherson
2024-12-11 22:05 ` Ivan Orlov
@ 2024-12-11 23:12 ` Ivan Orlov
2024-12-12 1:01 ` Sean Christopherson
1 sibling, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-12-11 23:12 UTC (permalink / raw)
To: Sean Christopherson, Ivan Orlov
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On 12/11/24 18:15, Sean Christopherson wrote:
> Hmm, this should probably be "pf_mmio", not just "mmio". E.g. if KVM is emulating
> large swaths of guest code because unrestricted guest is disabled, then KVM can
> end up emulating an MMIO access for "normal" emulation.
>
> Hmm, actually, what if we go with this?
>
> static inline bool kvm_can_emulate_event_vectoring(int emul_type)
> {
> return !(emul_type & EMULTYPE_PF) ||
> (emul_type & EMULTYPE_WRITE_PF_TO_SP);
> }
>
Hm, by the way, what is the desired behaviour if EMULTYPE_ALLOW_RETRY_PF
is set? Is it correct that we return an internal error if it is set
during vectoring? Or KVM may try to unprotect the page and re-execute?
If so, we may need something like
static inline bool kvm_can_emulate_event_vectoring(int emul_type)
{
return !(emul_type & EMULTYPE_PF) ||
(emul_type & ~(EMULTYPE_PF));
}
So it returns true if EMULTYPE_PF is not set or if it's not the only set
bit.
--
Kind regards,
Ivan Orlov
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-11 23:12 ` Ivan Orlov
@ 2024-12-12 1:01 ` Sean Christopherson
2024-12-12 16:41 ` Ivan Orlov
0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2024-12-12 1:01 UTC (permalink / raw)
To: Ivan Orlov
Cc: Ivan Orlov, bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa,
kvm, linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Wed, Dec 11, 2024, Ivan Orlov wrote:
> On 12/11/24 18:15, Sean Christopherson wrote:
> > Hmm, this should probably be "pf_mmio", not just "mmio". E.g. if KVM is emulating
> > large swaths of guest code because unrestricted guest is disabled, then can end up
> > emulating an MMIO access for "normal" emulation.
> >
> > Hmm, actually, what if we go with this?
> >
> > static inline bool kvm_can_emulate_event_vectoring(int emul_type)
> > {
> > return !(emul_type & EMULTYPE_PF) ||
> > (emul_type & EMULTYPE_WRITE_PF_TO_SP);
> > }
> >
>
> Hm, by the way, what is the desired behaviour if EMULTYPE_ALLOW_RETRY_PF is
> set? Is it correct that we return an internal error if it is set during
> vectoring? Or KVM may try to unprotect the page and re-execute?
Heh, it's sneaky, but EMULTYPE_ALLOW_RETRY_PF can be set if and only if
RET_PF_WRITE_PROTECTED is set. Hmm, that makes me think we should do the below
(EMULTYPE_WRITE_PF_TO_SP was a recent addition).
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2e713480933a..de5f6985d123 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9077,7 +9077,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
if ((emulation_type & EMULTYPE_ALLOW_RETRY_PF) &&
(WARN_ON_ONCE(is_guest_mode(vcpu)) ||
- WARN_ON_ONCE(!(emulation_type & EMULTYPE_PF))))
+ WARN_ON_ONCE(!(emulation_type & EMULTYPE_WRITE_PF_TO_SP))))
emulation_type &= ~EMULTYPE_ALLOW_RETRY_PF;
r = kvm_check_emulate_insn(vcpu, emulation_type, insn, insn_len);
That said, let me get back to you on this when my brain is less tired. I'm not
sure emulating when an exit occurred during event delivery is _ever_ correct.
> If so, we may need something like
>
> static inline bool kvm_can_emulate_event_vectoring(int emul_type)
> {
> return !(emul_type & EMULTYPE_PF) ||
> (emul_type & ~(EMULTYPE_PF));
> }
>
> So it returns true if EMULTYPE_PF is not set or if it's not the only set
> bit.
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-12 1:01 ` Sean Christopherson
@ 2024-12-12 16:41 ` Ivan Orlov
2024-12-12 19:42 ` Sean Christopherson
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-12-12 16:41 UTC (permalink / raw)
To: Sean Christopherson
Cc: Ivan Orlov, bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa,
kvm, linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Wed, Dec 11, 2024 at 05:01:07PM -0800, Sean Christopherson wrote:
> > Hm, by the way, what is the desired behaviour if EMULTYPE_ALLOW_RETRY_PF is
> > set? Is it correct that we return an internal error if it is set during
> > vectoring? Or KVM may try to unprotect the page and re-execute?
>
> Heh, it's sneaky, but EMULTYPE_ALLOW_RETRY_PF can be set if and only if
> RET_PF_WRITE_PROTECTED is set. Hmm, that makes me think we should do the below
> (EMULTYPE_WRITE_PF_TO_SP was a recent addition).
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2e713480933a..de5f6985d123 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9077,7 +9077,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>
> if ((emulation_type & EMULTYPE_ALLOW_RETRY_PF) &&
> (WARN_ON_ONCE(is_guest_mode(vcpu)) ||
> - WARN_ON_ONCE(!(emulation_type & EMULTYPE_PF))))
> + WARN_ON_ONCE(!(emulation_type & EMULTYPE_WRITE_PF_TO_SP))))
> emulation_type &= ~EMULTYPE_ALLOW_RETRY_PF;
>
What if we are handling a write to write-protected page, but not a write to
the page table? We still can retry after unprotecting the page, so
EMULTYPE_ALLOW_RETRY_PF should be enabled, right?
> r = kvm_check_emulate_insn(vcpu, emulation_type, insn, insn_len);
>
> That said, let me get back to you on this when my brain is less tired. I'm not
> sure emulating when an exit occurred during event delivery is _ever_ correct.
>
I believe we can re-execute the instruction if exit happened during vectoring
due to exception (and if the address is not MMIO, e.g. when it's a write to
write-protected page, for instance when stack points to it).
KVM unprotects the page, executes the instruction one more time and
(probably) gets this exception once again (but the page is already
unprotected, so vectoring succeeds without vmexit). If not
(e.g. exception conditions are not met anymore), guest shouldn't really
care and it can continue execution.
However, I'm not sure what happens if vectoring is caused by external
interrupt: if we unprotect the page and re-execute the instruction,
will IRQ be delivered nonetheless, or it will be lost as irq is
already in ISR? Do we need to re-inject it in such a case?
--
Kind regards,
Ivan Orlov
* Re: [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring
2024-12-11 18:19 ` Sean Christopherson
@ 2024-12-12 17:11 ` Ivan Orlov
0 siblings, 0 replies; 21+ messages in thread
From: Ivan Orlov @ 2024-12-12 17:11 UTC (permalink / raw)
To: Sean Christopherson
Cc: bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa, kvm,
linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Wed, Dec 11, 2024 at 10:19:40AM -0800, Sean Christopherson wrote:
> > +static void guest_code_mmio_during_vectoring(void)
> > +{
> > + const struct desc_ptr idt_desc = {
> > + .address = MEM_REGION_GPA,
> > + .size = 0xFFF,
> > + };
> > +
> > + set_idt(&idt_desc);
> > +
> > + /* Generate a #GP by dereferencing a non-canonical address */
> > + *((uint8_t *)NONCANONICAL) = 0x1;
>
> Now I'm curious what happens if this uses vcpu_arch_put_guest(), i.e. if the
> test forces KVM to emulate the write.
>
> No action needed, the test is a-ok as-is. I'm really just curious :-)
:) Just tried enabling the `force_emulation_prefix` kvm parameter and replacing the
write with
vcpu_arch_put_guest(*((uint8_t *)NONCANONICAL), 0x1);
And the test simply passes (so it returns the same internal error with
suberror=3, patches applied)
--
Kind regards,
Ivan Orlov
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-12 16:41 ` Ivan Orlov
@ 2024-12-12 19:42 ` Sean Christopherson
2024-12-13 17:38 ` Ivan Orlov
0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2024-12-12 19:42 UTC (permalink / raw)
To: Ivan Orlov
Cc: Ivan Orlov, bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa,
kvm, linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Thu, Dec 12, 2024, Ivan Orlov wrote:
> On Wed, Dec 11, 2024 at 05:01:07PM -0800, Sean Christopherson wrote:
> > > Hm, by the way, what is the desired behaviour if EMULTYPE_ALLOW_RETRY_PF is
> > > set? Is it correct that we return an internal error if it is set during
> > > vectoring? Or KVM may try to unprotect the page and re-execute?
> >
> > Heh, it's sneaky, but EMULTYPE_ALLOW_RETRY_PF can be set if and only if
> > RET_PF_WRITE_PROTECTED is set. Hmm, that makes me think we should do the below
> > (EMULTYPE_WRITE_PF_TO_SP was a recent addition).
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 2e713480933a..de5f6985d123 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -9077,7 +9077,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> >
> > if ((emulation_type & EMULTYPE_ALLOW_RETRY_PF) &&
> > (WARN_ON_ONCE(is_guest_mode(vcpu)) ||
> > - WARN_ON_ONCE(!(emulation_type & EMULTYPE_PF))))
> > + WARN_ON_ONCE(!(emulation_type & EMULTYPE_WRITE_PF_TO_SP))))
> > emulation_type &= ~EMULTYPE_ALLOW_RETRY_PF;
> >
>
> What if we are handling a write to write-protected page, but not a write to
> the page table? We still can retry after unprotecting the page, so
> EMULTYPE_ALLOW_RETRY_PF should be enabled, right?
Gah, I got my enums mixed up. I conflated RET_PF_WRITE_PROTECTED with
EMULTYPE_WRITE_PF_TO_SP. Ignore the above.
FWIW, KVM _can't_ unprotect and retry in the EMULTYPE_WRITE_PF_TO_SP case. From
kvm_unprotect_and_retry_on_failure():
/*
* If the failed instruction faulted on an access to page tables that
* are used to translate any part of the instruction, KVM can't resolve
* the issue by unprotecting the gfn, as zapping the shadow page will
* result in the instruction taking a !PRESENT page fault and thus put
* the vCPU into an infinite loop of page faults. E.g. KVM will create
* a SPTE and write-protect the gfn to resolve the !PRESENT fault, and
* then zap the SPTE to unprotect the gfn, and then do it all over
* again. Report the error to userspace.
*/
if (emulation_type & EMULTYPE_WRITE_PF_TO_SP)
return false;
> > r = kvm_check_emulate_insn(vcpu, emulation_type, insn, insn_len);
> >
> > That said, let me get back to you on this when my brain is less tired. I'm not
> > sure emulating when an exit occurred during event delivery is _ever_ correct.
> >
>
> I believe we can re-execute the instruction if exit happened during vectoring
> due to exception (and if the address is not MMIO, e.g. when it's a write to
> write-protected page, for instance when stack points to it).
Unprotect and re-execute is fine, what I'm worried about is *successfully*
emulating the instruction. E.g.
1. CPU executes instruction X and hits a #GP.
2. While vectoring the #GP, a shadow #PF is taken.
3. On VM-Exit, KVM re-injects the #GP (see __vmx_complete_interrupts()).
4. KVM emulates because of the write-protected page.
5. KVM "successfully" emulates and also detects the #GP
6. KVM synthesizes a #GP, and because the vCPU already has injected #GP,
incorrectly escalates to a #DF.
The above is a bit contrived, but I think it could happen if the guest reused a
page that _was_ a page table, for a vCPU's kernel stack.
> KVM unprotects the page, executes the instruction one more time and
> (probably) gets this exception once again (but the page is already
> unprotected, so vectoring succeeds without vmexit). If not
> (e.g. exception conditions are not met anymore), guest shouldn't really
> care and it can continue execution.
>
> However, I'm not sure what happens if vectoring is caused by external
> interrupt: if we unprotect the page and re-execute the instruction,
> will IRQ be delivered nonetheless, or it will be lost as irq is
> already in ISR? Do we need to re-inject it in such a case?
In all cases, the event that was being vectored is re-injected. Restarting from
scratch would be a bug. E.g. if the cause of initial exception was "fixed", say
because the initial exception was #BP, and the guest finished patching out the INT3,
then restarting would execute the _new_ instruction, and the INT3 would be lost.
In most cases, the guest would never notice, but it's still undesirable for KVM
to effectively rewrite history.
As far as unprotect+retry being viable, I think we're on the same page. What I'm
getting at is that I think KVM should never allow emulating on #PF when the #PF
occurred while vectoring. E.g. this:
static inline bool kvm_can_emulate_event_vectoring(int emul_type)
{
return !(emul_type & EMULTYPE_PF);
}
and then I believe this? Where this diff can be a separate prep patch (though I'm
pretty sure it's technically pointless without the vectoring angle, because shadow
#PF can't coincide with any of the failure paths for kvm_check_emulate_insn()).
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 07c6f1d5323d..63361b2da450 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9107,6 +9107,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
if (r == X86EMUL_RETRY_INSTR || r == X86EMUL_PROPAGATE_FAULT)
return 1;
+ if (kvm_unprotect_and_retry_on_failure(vcpu, cr2_or_gpa,
+ emulation_type))
+ return 1;
+
if (r == X86EMUL_UNHANDLEABLE_VECTORING_IO) {
kvm_prepare_event_vectoring_exit(vcpu, cr2_or_gpa);
return 0;
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-12 19:42 ` Sean Christopherson
@ 2024-12-13 17:38 ` Ivan Orlov
2024-12-13 20:09 ` Sean Christopherson
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Orlov @ 2024-12-13 17:38 UTC (permalink / raw)
To: Sean Christopherson
Cc: Ivan Orlov, bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa,
kvm, linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Thu, Dec 12, 2024 at 11:42:37AM -0800, Sean Christopherson wrote:
> Gah, I got my enums mixed up. I conflated RET_PF_WRITE_PROTECTED with
> EMULTYPE_WRITE_PF_TO_SP. Ignore the above.
>
> FWIW, KVM _can't_ unprotect and retry in the EMULTYPE_WRITE_PF_TO_SP case. From
> kvm_unprotect_and_retry_on_failure():
>
Ah alright, I guess I get it now :) So, EMULTYPE_ALLOW_RETRY_PF is set
whenever there is a PF when accessing write-protected page, and
EMULTYPE_WRITE_PF_TO_SP is set when this access touches SPTE used to
translate the write itself. If both are set, we can't unprotect and
retry, and the latter can be set only if the former is set.
> Unprotect and re-execute is fine, what I'm worried about is *successfully*
> emulating the instruction. E.g.
>
> 1. CPU executes instruction X and hits a #GP.
> 2. While vectoring the #GP, a shadow #PF is taken.
> 3. On VM-Exit, KVM re-injects the #GP (see __vmx_complete_interrupts()).
> 4. KVM emulates because of the write-protected page.
> 5. KVM "successfully" emulates and also detects the #GP
> 6. KVM synthesizes a #GP, and because the vCPU already has injected #GP,
> incorrectly escalates to a #DF.
>
> The above is a bit contrived, but I think it could happen if the guest reused a
> page that _was_ a page table, for a vCPU's kernel stack.
>
Does it work like that only for contributory exceptions / page faults?
If it's not a #GP but (for instance) a #UD, (as far as I understand)
KVM will queue only one of them without causing a #DF, so it's going to
be valid?
> > However, I'm not sure what happens if vectoring is caused by external
> > interrupt: if we unprotect the page and re-execute the instruction,
> > will IRQ be delivered nonetheless, or it will be lost as irq is
> > already in ISR? Do we need to re-inject it in such a case?
>
> In all cases, the event that was being vectored is re-injected. Restarting from
> scratch would be a bug. E.g. if the cause of initial exception was "fixed", say
> because the initial exception was #BP, and the guest finished patching out the INT3,
> then restarting would execute the _new_ instruction, and the INT3 would be lost.
>
Cool, that is what I was concerned about, glad that it is already
implemented :)
>
> As far as unprotect+retry being viable, I think we're on the same page. What I'm
> getting at is that I think KVM should never allow emulating on #PF when the #PF
> occurred while vectoring. E.g. this:
>
> static inline bool kvm_can_emulate_event_vectoring(int emul_type)
> {
> return !(emul_type & EMULTYPE_PF);
> }
>
Yeah, I agree. I'll post a V3 with suggested fixes (after running all of the
selftests to be sure that it doesn't break anything).
> and then I believe this? Where this diff can be a separate prep patch (though I'm
> pretty sure it's technically pointless without the vectoring angle, because shadow
> #PF can't coincide with any of the failure paths for kvm_check_emulate_insn()).
>
Looks good. If you don't mind, I could add this patch to the series with `Suggested-by`
tag, since it's necessary to allow unprotect+retry in case of shadow #PF during
vectoring.
--
Kind regards,
Ivan Orlov
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 07c6f1d5323d..63361b2da450 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9107,6 +9107,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> if (r == X86EMUL_RETRY_INSTR || r == X86EMUL_PROPAGATE_FAULT)
> return 1;
>
> + if (kvm_unprotect_and_retry_on_failure(vcpu, cr2_or_gpa,
> + emulation_type))
> + return 1;
> +
> if (r == X86EMUL_UNHANDLEABLE_VECTORING_IO) {
> kvm_prepare_event_vectoring_exit(vcpu, cr2_or_gpa);
> return 0;
>
* Re: [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction
2024-12-13 17:38 ` Ivan Orlov
@ 2024-12-13 20:09 ` Sean Christopherson
0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2024-12-13 20:09 UTC (permalink / raw)
To: Ivan Orlov
Cc: Ivan Orlov, bp, dave.hansen, mingo, pbonzini, shuah, tglx, hpa,
kvm, linux-kernel, linux-kselftest, x86, pdurrant, dwmw
On Fri, Dec 13, 2024, Ivan Orlov wrote:
> On Thu, Dec 12, 2024 at 11:42:37AM -0800, Sean Christopherson wrote:
> > Unprotect and re-execute is fine, what I'm worried about is *successfully*
> > emulating the instruction. E.g.
> >
> > 1. CPU executes instruction X and hits a #GP.
> > 2. While vectoring the #GP, a shadow #PF is taken.
> > 3. On VM-Exit, KVM re-injects the #GP (see __vmx_complete_interrupts()).
> > 4. KVM emulates because of the write-protected page.
> > 5. KVM "successfully" emulates and also detects the #GP
> > 6. KVM synthesizes a #GP, and because the vCPU already has injected #GP,
> > incorrectly escalates to a #DF.
> >
> > The above is a bit contrived, but I think it could happen if the guest reused a
> > page that _was_ a page table, for a vCPU's kernel stack.
> >
>
> Does it work like that only for contributory exceptions / page faults?
The #DF case, yes.
> If it's not a #GP but (for instance) a #UD, (as far as I understand)
> KVM will queue only one of them without causing a #DF, so it's going to
> be valid?
No, it can still be invalid. E.g. the guest initially hit a #BP and replaced it with
a #UD, but there may be guest-visible side effects from the original #BP.
> > > However, I'm not sure what happens if vectoring is caused by an external
> > > interrupt: if we unprotect the page and re-execute the instruction,
> > > will the IRQ be delivered nonetheless, or will it be lost as the IRQ is
> > > already in the ISR? Do we need to re-inject it in such a case?
> >
> > In all cases, the event that was being vectored is re-injected. Restarting from
> > scratch would be a bug. E.g. if the cause of the initial exception was "fixed", say
> > because the initial exception was #BP, and the guest finished patching out the INT3,
> > then restarting would execute the _new_ instruction, and the INT3 would be lost.
> >
>
> Cool, that is what I was concerned about, glad that it is already
> implemented :)
>
> >
> > As far as unprotect+retry being viable, I think we're on the same page. What I'm
> > getting at is that I think KVM should never allow emulating on #PF when the #PF
> > occurred while vectoring. E.g. this:
> >
> > static inline bool kvm_can_emulate_event_vectoring(int emul_type)
> > {
> > return !(emul_type & EMULTYPE_PF);
> > }
> >
>
> Yeah, I agree. I'll post a V3 with the suggested fixes (after running all of the
> selftests to be sure that it doesn't break anything).
>
> > and then I believe this? Where this diff can be a separate prep patch (though I'm
> > pretty sure it's technically pointless without the vectoring angle, because shadow
> > #PF can't coincide with any of the failure paths for kvm_check_emulate_insn()).
> >
>
> Looks good. If you don't mind, I could add this patch to the series with a
> `Suggested-by` tag since it's necessary to allow unprotect+retry in the case
> of a shadow #PF during vectoring.
Ya, go for it.
2024-11-11 10:27 [PATCH v2 0/6] Enhance event delivery error handling Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 1/6] KVM: x86: Add function for vectoring error generation Ivan Orlov
2024-12-11 18:02 ` Sean Christopherson
2024-11-11 10:27 ` [PATCH v2 2/6] KVM: x86: Add emulation status for vectoring during MMIO Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 3/6] KVM: VMX: Handle vectoring error in check_emulate_instruction Ivan Orlov
2024-12-11 18:15 ` Sean Christopherson
2024-12-11 22:05 ` Ivan Orlov
2024-12-11 23:12 ` Ivan Orlov
2024-12-12 1:01 ` Sean Christopherson
2024-12-12 16:41 ` Ivan Orlov
2024-12-12 19:42 ` Sean Christopherson
2024-12-13 17:38 ` Ivan Orlov
2024-12-13 20:09 ` Sean Christopherson
2024-11-11 10:27 ` [PATCH v2 4/6] KVM: SVM: Handle MMIO during vectoring error Ivan Orlov
2024-12-11 18:16 ` Sean Christopherson
2024-11-11 10:27 ` [PATCH v2 5/6] selftests: KVM: extract lidt into helper function Ivan Orlov
2024-11-11 10:27 ` [PATCH v2 6/6] selftests: KVM: Add test case for MMIO during vectoring Ivan Orlov
2024-12-11 18:19 ` Sean Christopherson
2024-12-12 17:11 ` Ivan Orlov
2024-12-11 18:20 ` [PATCH v2 0/6] Enhance event delivery error handling Sean Christopherson
2024-12-11 21:45 ` Ivan Orlov