public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yosry Ahmed <yosry@kernel.org>,
	stable@vger.kernel.org
Subject: [PATCH v6 16/31] KVM: nSVM: Unify handling of VMRUN failures with proper cleanup
Date: Tue, 24 Feb 2026 22:33:50 +0000	[thread overview]
Message-ID: <20260224223405.3270433-17-yosry@kernel.org> (raw)
In-Reply-To: <20260224223405.3270433-1-yosry@kernel.org>

There are currently two possible causes of VMRUN failures emulated by
KVM:

1) Consistency checks failures. In this case, KVM updates the exit code
   in the mapped VMCB12 and exits early in nested_svm_vmrun(). This
   causes a few problems:

  A) KVM does not clear the GIF if the early consistency checks fail
     (because nested_svm_vmexit() is not called). Nothing requires
     GIF=0 before a VMRUN, from the APM:

	It is assumed that VMM software cleared GIF some time before
	executing the VMRUN instruction, to ensure an atomic state
	switch.

     So an early #VMEXIT from early consistency checks could leave the
     GIF set.

  B) svm_leave_smm() is missing consistency checks on the newly loaded
     guest state, because the checks aren't performed by
     enter_svm_guest_mode().

2) Failure to load L2's CR3 or merge the MSR bitmaps. In this case, a
   fully-fledged #VMEXIT injection is performed as VMCB02 is already
   prepared.

Arguably all VMRUN failures should be handled before the VMCB02 is
prepared, but with proper cleanup (e.g. clear the GIF). Move all the
potential failure checks inside enter_svm_guest_mode() before switching
to VMCB02. On failure of any of these checks, nested_svm_vmrun()
synthesizes a minimal #VMEXIT through the new nested_svm_failed_vmrun()
helper.

__nested_svm_vmexit() already performs the necessary cleanup for a
failed VMRUN, including uninitializing the nested MMU and reloading L1's
CR3. This ensures that consistency check failures do proper necessary
cleanup, while other failures do not doo too much cleanup. It also
leaves a unified path for handling VMRUN failures.

Cc: stable@vger.kernel.org
Fixes: 52c65a30a5c6 ("KVM: SVM: Check for nested vmrun intercept before emulating vmrun")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/nested.c | 66 +++++++++++++++++++++++++--------------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a6e8c0d26c64e..2c5878cbb3940 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -939,22 +939,19 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
 				    vmcb12->control.intercepts[INTERCEPT_WORD4],
 				    vmcb12->control.intercepts[INTERCEPT_WORD5]);
 
-
 	svm->nested.vmcb12_gpa = vmcb12_gpa;
 
 	WARN_ON(svm->vmcb == svm->nested.vmcb02.ptr);
 
 	enter_guest_mode(vcpu);
 
+	if (!nested_vmcb_check_save(vcpu, &svm->nested.save) ||
+	    !nested_vmcb_check_controls(vcpu, &svm->nested.ctl))
+		return -EINVAL;
+
 	if (nested_npt_enabled(svm))
 		nested_svm_init_mmu_context(vcpu);
 
-	nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
-
-	svm_switch_vmcb(svm, &svm->nested.vmcb02);
-	nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base);
-	nested_vmcb02_prepare_save(svm, vmcb12);
-
 	ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3,
 				  nested_npt_enabled(svm), from_vmrun);
 	if (ret)
@@ -966,6 +963,17 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
 			return ret;
 	}
 
+	/*
+	 * Any VMRUN failure needs to happen before this point, such that the
+	 * nested #VMEXIT is injected properly by nested_svm_vmrun_error_vmexit().
+	 */
+
+	nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
+
+	svm_switch_vmcb(svm, &svm->nested.vmcb02);
+	nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base);
+	nested_vmcb02_prepare_save(svm, vmcb12);
+
 	if (!from_vmrun)
 		kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
 
@@ -984,6 +992,8 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm)
 	struct vmcb *vmcb01 = svm->vmcb01.ptr;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 
+	WARN_ON_ONCE(is_guest_mode(vcpu));
+
 	svm->nested.vmcb12_gpa = 0;
 	svm->nested.ctl.nested_cr3 = 0;
 
@@ -1006,6 +1016,20 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm)
 		kvm_queue_exception(vcpu, DB_VECTOR);
 }
 
+static void nested_svm_vmrun_error_vmexit(struct kvm_vcpu *vcpu, struct vmcb *vmcb12)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	WARN_ON_ONCE(svm->vmcb == svm->nested.vmcb02.ptr);
+
+	leave_guest_mode(vcpu);
+
+	vmcb12->control.exit_code = SVM_EXIT_ERR;
+	vmcb12->control.exit_info_1 = 0;
+	vmcb12->control.exit_info_2 = 0;
+	__nested_svm_vmexit(svm);
+}
+
 int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1048,14 +1072,6 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
 
-	if (!nested_vmcb_check_save(vcpu, &svm->nested.save) ||
-	    !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
-		vmcb12->control.exit_code    = SVM_EXIT_ERR;
-		vmcb12->control.exit_info_1  = 0;
-		vmcb12->control.exit_info_2  = 0;
-		goto out;
-	}
-
 	/*
 	 * Since vmcb01 is not in use, we can use it to store some of the L1
 	 * state.
@@ -1076,14 +1092,9 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 		svm->nmi_l1_to_l2 = false;
 		svm->soft_int_injected = false;
 
-		svm->vmcb->control.exit_code    = SVM_EXIT_ERR;
-		svm->vmcb->control.exit_info_1  = 0;
-		svm->vmcb->control.exit_info_2  = 0;
-
-		nested_svm_vmexit(svm);
+		nested_svm_vmrun_error_vmexit(vcpu, vmcb12);
 	}
 
-out:
 	kvm_vcpu_unmap(vcpu, &map);
 
 	return ret;
@@ -1241,6 +1252,13 @@ void nested_svm_vmexit(struct vcpu_svm *svm)
 	if (guest_cpu_cap_has(vcpu, X86_FEATURE_ERAPS))
 		vmcb01->control.erap_ctl |= ERAP_CONTROL_CLEAR_RAP;
 
+	/*
+	 * nested_svm_vmexit() is intended for use only when KVM is synthesizing
+	 * a #VMEXIT after a successful nested VMRUN.  All VMRUN consistency
+	 * checks must be performed before loading guest state, and so should
+	 * use __nested_svm_vmexit().
+	 */
+	WARN_ON_ONCE(svm->vmcb != svm->nested.vmcb02.ptr);
 	svm_switch_vmcb(svm, &svm->vmcb01);
 
 	/*
@@ -1912,9 +1930,6 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	if (nested_npt_enabled(svm))
 		nested_svm_init_mmu_context(vcpu);
 
-	svm_switch_vmcb(svm, &svm->nested.vmcb02);
-	nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, svm->vmcb->save.cs.base);
-
 	/*
 	 * While the nested guest CR3 is already checked and set by
 	 * KVM_SET_SREGS, it was set when nested state was yet loaded,
@@ -1926,6 +1941,9 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	if (ret)
 		goto out_free;
 
+	svm_switch_vmcb(svm, &svm->nested.vmcb02);
+	nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, svm->vmcb->save.cs.base);
+
 	svm->nested.force_msr_bitmap_recalc = true;
 
 	kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
-- 
2.53.0.414.gf7e9f6c205-goog


  parent reply	other threads:[~2026-02-24 22:34 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260224223405.3270433-1-yosry@kernel.org>
2026-02-24 22:33 ` [PATCH v6 01/31] KVM: nSVM: Avoid clearing VMCB_LBR in vmcb12 Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 02/31] KVM: SVM: Switch svm_copy_lbrs() to a macro Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 03/31] KVM: SVM: Add missing save/restore handling of LBR MSRs Yosry Ahmed
2026-03-03  0:02   ` Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 05/31] KVM: nSVM: Always inject a #GP if mapping VMCB12 fails on nested VMRUN Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 06/31] KVM: nSVM: Refactor checking LBRV enablement in vmcb12 into a helper Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 07/31] KVM: nSVM: Refactor writing vmcb12 on nested #VMEXIT as " Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 08/31] KVM: nSVM: Triple fault if mapping VMCB12 fails on nested #VMEXIT Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 09/31] KVM: nSVM: Triple fault if restore host CR3 " Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 10/31] KVM: nSVM: Drop nested_vmcb_check_{save/control}() wrappers Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 11/31] KVM: nSVM: Call enter_guest_mode() before switching to VMCB02 Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 12/31] KVM: nSVM: Make nested_svm_merge_msrpm() return an errno Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 13/31] KVM: nSVM: Call nested_svm_merge_msrpm() from enter_svm_guest_mode() Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 14/31] KVM: nSVM: Call nested_svm_init_mmu_context() before switching to VMCB02 Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 15/31] KVM: nSVM: Refactor minimal #VMEXIT handling out of nested_svm_vmexit() Yosry Ahmed
2026-02-24 22:33 ` Yosry Ahmed [this message]
2026-02-25 22:44   ` [PATCH v6 16/31] KVM: nSVM: Unify handling of VMRUN failures with proper cleanup Yosry Ahmed
2026-02-28  0:07   ` Sean Christopherson
2026-02-28  0:46     ` Yosry Ahmed
2026-02-28  1:05       ` Sean Christopherson
2026-03-02 16:02         ` Yosry Ahmed
2026-03-02 16:11           ` Sean Christopherson
2026-02-24 22:33 ` [PATCH v6 17/31] KVM: nSVM: Clear EVENTINJ field in VMCB12 on nested #VMEXIT Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 18/31] KVM: nSVM: Drop the non-architectural consistency check for NP_ENABLE Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 19/31] KVM: nSVM: Add missing consistency check for nCR3 validity Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 20/31] KVM: nSVM: Add missing consistency check for hCR0.PG and NP_ENABLE Yosry Ahmed
2026-03-03  0:00   ` Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 21/31] KVM: nSVM: Add missing consistency check for EFER, CR0, CR4, and CS Yosry Ahmed
2026-03-03  0:03   ` Yosry Ahmed
2026-02-24 22:33 ` [PATCH v6 22/31] KVM: nSVM: Add missing consistency check for EVENTINJ Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260224223405.3270433-17-yosry@kernel.org \
    --to=yosry@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox