From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Yosry Ahmed <yosry.ahmed@linux.dev>,
stable@vger.kernel.org
Subject: [PATCH v5 14/26] KVM: nSVM: Unify handling of VMRUN failures with proper cleanup
Date: Fri, 6 Feb 2026 19:08:39 +0000 [thread overview]
Message-ID: <20260206190851.860662-15-yosry.ahmed@linux.dev> (raw)
In-Reply-To: <20260206190851.860662-1-yosry.ahmed@linux.dev>
There are currently two possible causes of VMRUN failures emulated by
KVM:
1) Consistency checks failures. In this case, KVM updates the exit code
in the mapped VMCB12 and exits early in nested_svm_vmrun(). This
causes a few problems:
A) KVM does not clear the GIF if the early consistency checks fail
(because nested_svm_vmexit() is not called). Nothing requires
GIF=0 before a VMRUN, from the APM:
It is assumed that VMM software cleared GIF some time before
executing the VMRUN instruction, to ensure an atomic state
switch.
So an early #VMEXIT from early consistency checks could leave the
GIF set.
B) svm_leave_smm() is missing consistency checks on the newly loaded
guest state, because the checks aren't performed by
enter_svm_guest_mode().
2) Failure to load L2's CR3 or merge the MSR bitmaps. In this case, a
fully-fledged #VMEXIT injection is performed as VMCB02 is already
prepared.
Arguably all VMRUN failures should be handled before the VMCB02 is
prepared, but with proper cleanup (e.g. clear the GIF). Move all the
potential failure checks inside enter_svm_guest_mode() before switching
to VMCB02. On failure of any of these checks, nested_svm_vmrun()
synthesizes a minimal #VMEXIT through the new nested_svm_failed_vmrun()
helper.
__nested_svm_vmexit() already performs the necessary cleanup for a
failed VMRUN, including uninitializing the nested MMU and reloading L1's
CR3. This ensures that consistency check failures do proper necessary
cleanup, while other failures do not doo too much cleanup. It also
leaves a unified path for handling VMRUN failures.
Cc: stable@vger.kernel.org
Fixes: 52c65a30a5c6 ("KVM: SVM: Check for nested vmrun intercept before emulating vmrun")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
arch/x86/kvm/svm/nested.c | 66 +++++++++++++++++++++++++--------------
1 file changed, 42 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a852508d7419..918f6a6eaf56 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -934,22 +934,19 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
vmcb12->control.intercepts[INTERCEPT_WORD4],
vmcb12->control.intercepts[INTERCEPT_WORD5]);
-
svm->nested.vmcb12_gpa = vmcb12_gpa;
WARN_ON(svm->vmcb == svm->nested.vmcb02.ptr);
enter_guest_mode(vcpu);
+ if (!nested_vmcb_check_save(vcpu, &svm->nested.save) ||
+ !nested_vmcb_check_controls(vcpu, &svm->nested.ctl))
+ return -EINVAL;
+
if (nested_npt_enabled(svm))
nested_svm_init_mmu_context(vcpu);
- nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
-
- svm_switch_vmcb(svm, &svm->nested.vmcb02);
- nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base);
- nested_vmcb02_prepare_save(svm, vmcb12);
-
ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3,
nested_npt_enabled(svm), from_vmrun);
if (ret)
@@ -961,6 +958,17 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
return ret;
}
+ /*
+ * Any VMRUN failure needs to happen before this point, such that the
+ * nested #VMEXIT is injected properly by nested_svm_vmrun_error_vmexit().
+ */
+
+ nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
+
+ svm_switch_vmcb(svm, &svm->nested.vmcb02);
+ nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base);
+ nested_vmcb02_prepare_save(svm, vmcb12);
+
if (!from_vmrun)
kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
@@ -979,6 +987,8 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm)
struct vmcb *vmcb01 = svm->vmcb01.ptr;
struct kvm_vcpu *vcpu = &svm->vcpu;
+ WARN_ON_ONCE(is_guest_mode(vcpu));
+
svm->nested.vmcb12_gpa = 0;
svm->nested.ctl.nested_cr3 = 0;
@@ -1002,6 +1012,20 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm)
kvm_queue_exception(vcpu, DB_VECTOR);
}
+static void nested_svm_vmrun_error_vmexit(struct kvm_vcpu *vcpu, struct vmcb *vmcb12)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ WARN_ON_ONCE(svm->vmcb == svm->nested.vmcb02.ptr);
+
+ leave_guest_mode(vcpu);
+
+ vmcb12->control.exit_code = SVM_EXIT_ERR;
+ vmcb12->control.exit_info_1 = 0;
+ vmcb12->control.exit_info_2 = 0;
+ __nested_svm_vmexit(svm);
+}
+
int nested_svm_vmrun(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -1044,14 +1068,6 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
- if (!nested_vmcb_check_save(vcpu, &svm->nested.save) ||
- !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
- vmcb12->control.exit_code = SVM_EXIT_ERR;
- vmcb12->control.exit_info_1 = 0;
- vmcb12->control.exit_info_2 = 0;
- goto out;
- }
-
/*
* Since vmcb01 is not in use, we can use it to store some of the L1
* state.
@@ -1072,14 +1088,9 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
svm->nmi_l1_to_l2 = false;
svm->soft_int_injected = false;
- svm->vmcb->control.exit_code = SVM_EXIT_ERR;
- svm->vmcb->control.exit_info_1 = 0;
- svm->vmcb->control.exit_info_2 = 0;
-
- nested_svm_vmexit(svm);
+ nested_svm_vmrun_error_vmexit(vcpu, vmcb12);
}
-out:
kvm_vcpu_unmap(vcpu, &map);
return ret;
@@ -1217,6 +1228,13 @@ void nested_svm_vmexit(struct vcpu_svm *svm)
if (guest_cpu_cap_has(vcpu, X86_FEATURE_ERAPS))
vmcb01->control.erap_ctl |= ERAP_CONTROL_CLEAR_RAP;
+ /*
+ * nested_svm_vmexit() is intended for use only when KVM is synthesizing
+ * a #VMEXIT after a successful nested VMRUN. All VMRUN consistency
+ * checks must be performed before loading guest state, and so should
+ * use __nested_svm_vmexit().
+ */
+ WARN_ON_ONCE(svm->vmcb != svm->nested.vmcb02.ptr);
svm_switch_vmcb(svm, &svm->vmcb01);
/*
@@ -1903,9 +1921,6 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
if (nested_npt_enabled(svm))
nested_svm_init_mmu_context(vcpu);
- svm_switch_vmcb(svm, &svm->nested.vmcb02);
- nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, svm->vmcb->save.cs.base);
-
/*
* While the nested guest CR3 is already checked and set by
* KVM_SET_SREGS, it was set when nested state was yet loaded,
@@ -1917,6 +1932,9 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
if (ret)
goto out_free;
+ svm_switch_vmcb(svm, &svm->nested.vmcb02);
+ nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, svm->vmcb->save.cs.base);
+
svm->nested.force_msr_bitmap_recalc = true;
kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
--
2.53.0.rc2.204.g2597b5adb4-goog
next prev parent reply other threads:[~2026-02-06 19:09 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260206190851.860662-1-yosry.ahmed@linux.dev>
2026-02-06 19:08 ` [PATCH v5 01/26] KVM: nSVM: Avoid clearing VMCB_LBR in vmcb12 Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 02/26] KVM: SVM: Switch svm_copy_lbrs() to a macro Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 03/26] KVM: SVM: Add missing save/restore handling of LBR MSRs Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 05/26] KVM: nSVM: Always inject a #GP if mapping VMCB12 fails on nested VMRUN Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 06/26] KVM: nSVM: Triple fault if mapping VMCB12 fails on nested #VMEXIT Yosry Ahmed
2026-02-24 0:35 ` Sean Christopherson
2026-02-24 0:51 ` Yosry Ahmed
2026-02-24 1:17 ` Sean Christopherson
2026-02-06 19:08 ` [PATCH v5 07/26] KVM: nSVM: Triple fault if restore host CR3 " Yosry Ahmed
2026-02-24 0:38 ` Sean Christopherson
2026-02-06 19:08 ` [PATCH v5 08/26] KVM: nSVM: Drop nested_vmcb_check_{save/control}() wrappers Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 09/26] KVM: nSVM: Call enter_guest_mode() before switching to VMCB02 Yosry Ahmed
2026-02-21 1:12 ` Sean Christopherson
2026-02-21 1:26 ` Jim Mattson
2026-02-21 9:06 ` Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 10/26] KVM: nSVM: Make nested_svm_merge_msrpm() return an errno Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 11/26] KVM: nSVM: Call nested_svm_merge_msrpm() from enter_svm_guest_mode() Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 12/26] KVM: nSVM: Call nested_svm_init_mmu_context() before switching to VMCB02 Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 13/26] KVM: nSVM: Refactor minimal #VMEXIT handling out of nested_svm_vmexit() Yosry Ahmed
2026-02-06 19:08 ` Yosry Ahmed [this message]
2026-02-06 19:08 ` [PATCH v5 15/26] KVM: nSVM: Clear EVENTINJ field in VMCB12 on nested #VMEXIT Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 16/26] KVM: nSVM: Drop the non-architectural consistency check for NP_ENABLE Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 17/26] KVM: nSVM: Add missing consistency check for nCR3 validity Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 18/26] KVM: nSVM: Add missing consistency check for hCR0.PG and NP_ENABLE Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 19/26] KVM: nSVM: Add missing consistency check for EFER, CR0, CR4, and CS Yosry Ahmed
2026-02-06 19:08 ` [PATCH v5 20/26] KVM: nSVM: Add missing consistency check for event_inj Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260206190851.860662-15-yosry.ahmed@linux.dev \
--to=yosry.ahmed@linux.dev \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox