* [PATCH v2 1/4] KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insn
2023-08-25 1:36 [PATCH v2 0/4] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests Sean Christopherson
@ 2023-08-25 1:36 ` Sean Christopherson
2023-08-25 1:36 ` [PATCH v2 2/4] KVM: SVM: Require nrips support for SEV guests (and beyond) Sean Christopherson
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2023-08-25 1:36 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Wu Zongyo, Tom Lendacky
Don't inject a #UD if KVM attempts to "emulate" to skip an instruction
for an SEV guest, and instead resume the guest and hope that it can make
forward progress. When commit 04c40f344def ("KVM: SVM: Inject #UD on
attempted emulation for SEV guest w/o insn buffer") added the completely
arbitrary #UD behavior, there were no known scenarios where a well-behaved
guest would induce a VM-Exit that triggered emulation, i.e. it was thought
that injecting #UD would be helpful.
However, now that KVM (correctly) attempts to re-inject INT3/INTO, e.g. if
a #NPF is encountered when attempting to deliver the INT3/INTO, an SEV
guest can trigger emulation without a buffer, through no fault of its own.
Resuming the guest and retrying the INT3/INTO is architecturally wrong,
e.g. the vCPU will incorrectly re-hit code #DBs, but for SEV guests there
is literally no other option that has a chance of making forward progress.
Drop the #UD injection for all "skip" emulation, not just those related to
INT3/INTO, even though that means that the guest will likely end up in an
infinite loop instead of getting a #UD (the vCPU may also crash, e.g. if
KVM emulated everything about an instruction except for advancing RIP).
There's no evidence that suggests that an unexpected #UD is actually
better than hanging the vCPU, e.g. a soft-hung vCPU can still respond to
IRQs and NMIs to generate a backtrace.
Reported-by: Wu Zongyo <wuzongyo@mail.ustc.edu.cn>
Closes: https://lore.kernel.org/all/8eb933fd-2cf3-d7a9-32fe-2a1d82eac42a@mail.ustc.edu.cn
Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
Cc: stable@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/svm.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a139c626fa8b..bd53b2d497d0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -364,6 +364,8 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
}
+static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len);
static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
bool commit_side_effects)
@@ -384,6 +386,14 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
}
if (!svm->next_rip) {
+ /*
+ * FIXME: Drop this when kvm_emulate_instruction() does the
+ * right thing and treats "can't emulate" as outright failure
+ * for EMULTYPE_SKIP.
+ */
+ if (!svm_can_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0))
+ return 0;
+
if (unlikely(!commit_side_effects))
old_rflags = svm->vmcb->save.rflags;
@@ -4724,16 +4734,25 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* and cannot be decrypted by KVM, i.e. KVM would read cyphertext and
* decode garbage.
*
- * Inject #UD if KVM reached this point without an instruction buffer.
- * In practice, this path should never be hit by a well-behaved guest,
- * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path
- * is still theoretically reachable, e.g. via unaccelerated fault-like
- * AVIC access, and needs to be handled by KVM to avoid putting the
- * guest into an infinite loop. Injecting #UD is somewhat arbitrary,
- * but its the least awful option given lack of insight into the guest.
+ * If KVM is NOT trying to simply skip an instruction, inject #UD if
+ * KVM reached this point without an instruction buffer. In practice,
+ * this path should never be hit by a well-behaved guest, e.g. KVM
+ * doesn't intercept #UD or #GP for SEV guests, but this path is still
+ * theoretically reachable, e.g. via unaccelerated fault-like AVIC
+ * access, and needs to be handled by KVM to avoid putting the guest
+ * into an infinite loop. Injecting #UD is somewhat arbitrary, but
+ * its the least awful option given lack of insight into the guest.
+ *
+ * If KVM is trying to skip an instruction, simply resume the guest.
+ * If a #NPF occurs while the guest is vectoring an INT3/INTO, then KVM
+ * will attempt to re-inject the INT3/INTO and skip the instruction.
+ * In that scenario, retrying the INT3/INTO and hoping the guest will
+ * make forward progress is the only option that has a chance of
+ * success (and in practice it will work the vast majority of the time).
*/
if (unlikely(!insn)) {
- kvm_queue_exception(vcpu, UD_VECTOR);
+ if (!(emul_type & EMULTYPE_SKIP))
+ kvm_queue_exception(vcpu, UD_VECTOR);
return false;
}
--
2.42.0.rc2.253.gd59a3bf2b4-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH v2 3/4] KVM: x86: Refactor can_emulate_instruction() return to be more expressive
2023-08-25 1:36 [PATCH v2 0/4] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests Sean Christopherson
2023-08-25 1:36 ` [PATCH v2 1/4] KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insn Sean Christopherson
2023-08-25 1:36 ` [PATCH v2 2/4] KVM: SVM: Require nrips support for SEV guests (and beyond) Sean Christopherson
@ 2023-08-25 1:36 ` Sean Christopherson
2023-08-25 1:36 ` [PATCH v2 4/4] KVM: SVM: Treat all "skip" emulation for SEV guests as outright failures Sean Christopherson
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2023-08-25 1:36 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Wu Zongyo, Tom Lendacky
Refactor and rename can_emulate_instruction() to allow vendor code to
return more than true/false, e.g. to explicitly differentiate between
"retry", "fault", and "unhandleable". For now, just do the plumbing, a
future patch will expand SVM's implementation to signal outright failure
if KVM attempts EMULTYPE_SKIP on an SEV guest.
No functional change intended (or rather, none that are visible to the
guest or userspace).
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm-x86-ops.h | 2 +-
arch/x86/include/asm/kvm_host.h | 4 ++--
arch/x86/kvm/svm/svm.c | 31 ++++++++++++++++--------------
arch/x86/kvm/vmx/vmx.c | 12 ++++++------
arch/x86/kvm/x86.c | 15 +++++++++------
5 files changed, 35 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 13bc212cd4bc..ac01552316e1 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -125,7 +125,7 @@ KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
KVM_X86_OP_OPTIONAL(vm_move_enc_context_from)
KVM_X86_OP_OPTIONAL(guest_memory_reclaimed)
KVM_X86_OP(get_msr_feature)
-KVM_X86_OP(can_emulate_instruction)
+KVM_X86_OP(check_emulate_instruction)
KVM_X86_OP(apic_init_signal_blocked)
KVM_X86_OP_OPTIONAL(enable_l2_tlb_flush)
KVM_X86_OP_OPTIONAL(migrate_timers)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9f57aa33798b..4760e60fad44 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1734,8 +1734,8 @@ struct kvm_x86_ops {
int (*get_msr_feature)(struct kvm_msr_entry *entry);
- bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, int emul_type,
- void *insn, int insn_len);
+ int (*check_emulate_instruction)(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len);
bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
int (*enable_l2_tlb_flush)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b21253c9ceb4..39ce680013c4 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -364,8 +364,8 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
}
-static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
- void *insn, int insn_len);
+static int svm_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len);
static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
bool commit_side_effects)
@@ -391,7 +391,7 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
* right thing and treats "can't emulate" as outright failure
* for EMULTYPE_SKIP.
*/
- if (!svm_can_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0))
+ if (svm_check_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0) != X86EMUL_CONTINUE)
return 0;
if (unlikely(!commit_side_effects))
@@ -4698,15 +4698,15 @@ static void svm_enable_smi_window(struct kvm_vcpu *vcpu)
}
#endif
-static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
- void *insn, int insn_len)
+static int svm_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len)
{
bool smep, smap, is_user;
u64 error_code;
/* Emulation is always possible when KVM has access to all guest state. */
if (!sev_guest(vcpu->kvm))
- return true;
+ return X86EMUL_CONTINUE;
/* #UD and #GP should never be intercepted for SEV guests. */
WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD |
@@ -4718,14 +4718,14 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* to guest register state.
*/
if (sev_es_guest(vcpu->kvm))
- return false;
+ return X86EMUL_RETRY_INSTR;
/*
* Emulation is possible if the instruction is already decoded, e.g.
* when completing I/O after returning from userspace.
*/
if (emul_type & EMULTYPE_NO_DECODE)
- return true;
+ return X86EMUL_CONTINUE;
/*
* Emulation is possible for SEV guests if and only if a prefilled
@@ -4751,9 +4751,11 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* success (and in practice it will work the vast majority of the time).
*/
if (unlikely(!insn)) {
- if (!(emul_type & EMULTYPE_SKIP))
- kvm_queue_exception(vcpu, UD_VECTOR);
- return false;
+ if (emul_type & EMULTYPE_SKIP)
+ return X86EMUL_RETRY_INSTR;
+
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return X86EMUL_PROPAGATE_FAULT;
}
/*
@@ -4764,7 +4766,7 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* table used to translate CS:RIP resides in emulated MMIO.
*/
if (likely(insn_len))
- return true;
+ return X86EMUL_CONTINUE;
/*
* Detect and workaround Errata 1096 Fam_17h_00_0Fh.
@@ -4822,6 +4824,7 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
kvm_inject_gp(vcpu, 0);
else
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ return X86EMUL_PROPAGATE_FAULT;
}
resume_guest:
@@ -4839,7 +4842,7 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* doesn't explicitly define "ignored", i.e. doing nothing and letting
* the guest spin is technically "ignoring" the access.
*/
- return false;
+ return X86EMUL_RETRY_INSTR;
}
static bool svm_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
@@ -4998,7 +5001,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.vm_copy_enc_context_from = sev_vm_copy_enc_context_from,
.vm_move_enc_context_from = sev_vm_move_enc_context_from,
- .can_emulate_instruction = svm_can_emulate_instruction,
+ .check_emulate_instruction = svm_check_emulate_instruction,
.apic_init_signal_blocked = svm_apic_init_signal_blocked,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e6849f780dba..2d4a80c406cb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1642,8 +1642,8 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
return 0;
}
-static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
- void *insn, int insn_len)
+static int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len)
{
/*
* Emulation of instructions in SGX enclaves is impossible as RIP does
@@ -1654,9 +1654,9 @@ static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
*/
if (to_vmx(vcpu)->exit_reason.enclave_mode) {
kvm_queue_exception(vcpu, UD_VECTOR);
- return false;
+ return X86EMUL_PROPAGATE_FAULT;
}
- return true;
+ return X86EMUL_CONTINUE;
}
static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
@@ -5770,7 +5770,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
{
gpa_t gpa;
- if (!vmx_can_emulate_instruction(vcpu, EMULTYPE_PF, NULL, 0))
+ if (vmx_check_emulate_instruction(vcpu, EMULTYPE_PF, NULL, 0))
return 1;
/*
@@ -8317,7 +8317,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.enable_smi_window = vmx_enable_smi_window,
#endif
- .can_emulate_instruction = vmx_can_emulate_instruction,
+ .check_emulate_instruction = vmx_check_emulate_instruction,
.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
.migrate_timers = vmx_migrate_timers,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e4a939471df1..f897d582d560 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,11 +7458,11 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
}
EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
-static int kvm_can_emulate_insn(struct kvm_vcpu *vcpu, int emul_type,
- void *insn, int insn_len)
+static int kvm_check_emulate_insn(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len)
{
- return static_call(kvm_x86_can_emulate_instruction)(vcpu, emul_type,
- insn, insn_len);
+ return static_call(kvm_x86_check_emulate_instruction)(vcpu, emul_type,
+ insn, insn_len);
}
int handle_ud(struct kvm_vcpu *vcpu)
@@ -7472,8 +7472,10 @@ int handle_ud(struct kvm_vcpu *vcpu)
int emul_type = EMULTYPE_TRAP_UD;
char sig[5]; /* ud2; .ascii "kvm" */
struct x86_exception e;
+ int r;
- if (unlikely(!kvm_can_emulate_insn(vcpu, emul_type, NULL, 0)))
+ r = kvm_check_emulate_insn(vcpu, emul_type, NULL, 0);
+ if (r != X86EMUL_CONTINUE)
return 1;
if (fep_flags &&
@@ -8855,7 +8857,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
bool writeback = true;
- if (unlikely(!kvm_can_emulate_insn(vcpu, emulation_type, insn, insn_len)))
+ r = kvm_check_emulate_insn(vcpu, emulation_type, insn, insn_len);
+ if (r != X86EMUL_CONTINUE)
return 1;
vcpu->arch.l1tf_flush_l1d = true;
--
2.42.0.rc2.253.gd59a3bf2b4-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH v2 4/4] KVM: SVM: Treat all "skip" emulation for SEV guests as outright failures
2023-08-25 1:36 [PATCH v2 0/4] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests Sean Christopherson
` (2 preceding siblings ...)
2023-08-25 1:36 ` [PATCH v2 3/4] KVM: x86: Refactor can_emulate_instruction() return to be more expressive Sean Christopherson
@ 2023-08-25 1:36 ` Sean Christopherson
2023-08-25 13:43 ` Tom Lendacky
2023-08-25 19:02 ` [PATCH v2 0/4] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests Sean Christopherson
2023-10-05 1:29 ` Sean Christopherson
5 siblings, 1 reply; 10+ messages in thread
From: Sean Christopherson @ 2023-08-25 1:36 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Wu Zongyo, Tom Lendacky
Treat EMULTYPE_SKIP failures on SEV guests as unhandleable emulation
instead of simply resuming the guest, and drop the hack-a-fix which
effects that behavior for the INT3/INTO injection path. If KVM can't
skip an instruction for which KVM has already done partial emulation,
resuming the guest is undesirable as doing so may corrupt guest state.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/svm.c | 12 +-----------
arch/x86/kvm/x86.c | 9 +++++++--
2 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 39ce680013c4..fc2cd5585349 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -364,8 +364,6 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
}
-static int svm_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
- void *insn, int insn_len);
static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
bool commit_side_effects)
@@ -386,14 +384,6 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
}
if (!svm->next_rip) {
- /*
- * FIXME: Drop this when kvm_emulate_instruction() does the
- * right thing and treats "can't emulate" as outright failure
- * for EMULTYPE_SKIP.
- */
- if (svm_check_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0) != X86EMUL_CONTINUE)
- return 0;
-
if (unlikely(!commit_side_effects))
old_rflags = svm->vmcb->save.rflags;
@@ -4752,7 +4742,7 @@ static int svm_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
*/
if (unlikely(!insn)) {
if (emul_type & EMULTYPE_SKIP)
- return X86EMUL_RETRY_INSTR;
+ return X86EMUL_UNHANDLEABLE;
kvm_queue_exception(vcpu, UD_VECTOR);
return X86EMUL_PROPAGATE_FAULT;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f897d582d560..1f4a8fbc5390 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8858,8 +8858,13 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
bool writeback = true;
r = kvm_check_emulate_insn(vcpu, emulation_type, insn, insn_len);
- if (r != X86EMUL_CONTINUE)
- return 1;
+ if (r != X86EMUL_CONTINUE) {
+ if (r == X86EMUL_RETRY_INSTR || r == X86EMUL_PROPAGATE_FAULT)
+ return 1;
+
+ WARN_ON_ONCE(r != X86EMUL_UNHANDLEABLE);
+ return handle_emulation_failure(vcpu, emulation_type);
+ }
vcpu->arch.l1tf_flush_l1d = true;
--
2.42.0.rc2.253.gd59a3bf2b4-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread