All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosry@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] KVM: x86: Check for injected exceptions before queuing a debug exception
Date: Mon, 2 Mar 2026 15:22:47 -0800	[thread overview]
Message-ID: <aaYbx59lQf5beYSv@google.com> (raw)
In-Reply-To: <CAO9r8zOFWHZ5LHRRKL4KU8TctjNs+vQYDr9OoBmao=eG9Q8C2w@mail.gmail.com>

On Fri, Feb 27, 2026, Yosry Ahmed wrote:
> > > That being said, I hate nested_run_in_progress. It's too close to
> > > nested_run_pending and I am pretty sure they will be mixed up.
> >
> > Agreed, though the fact that name is _too_ close means that, aside from the
> > potential for disaster (minor detail), it's accurate.
> >
> > One thought is to hide nested_run_in_progress beyond a KConfig, so that attempts
> > to use it for anything but the sanity check(s) would fail the build.  I don't
> > really want to create yet another KVM_PROVE_xxx though, but unlike KVM_PROVE_MMU,
> > I think we want to this enabled in production.
> >
> > I'll chew on this a bit...
> 
> Maybe (if we go this direction) name it very explicitly
> warn_on_nested_exception if it's only intended to be used for the
> sanity checks?

It's not just about exceptions though.  That's the case that has caused a rash
of recent problems, but the rule isn't specific to exceptions, it's very broadly
Thou Shalt Not Cancel VMRUN.

I think that's where there's some disconnect.  We can't make the nested_run_pending
warnings go away by adding more sanity checks, and I am dead set against removing
those warnings.

Aha!  Idea.  What if we turn nested_run_pending into a u8, and use a magic value
of '2' to indicate that userspace gained control of the CPU since nested_run_pending
was set, and then only WARN on nested_run_pending==1?  That way we don't have to
come up with a new name, and there's zero chance of nested_run_pending and something
like nested_run_in_progress getting out of sync.

---
 arch/x86/include/asm/kvm_host.h |  6 +++++-
 arch/x86/kvm/svm/nested.c       |  3 ++-
 arch/x86/kvm/vmx/nested.c       |  4 ++--
 arch/x86/kvm/x86.c              |  7 +++++++
 arch/x86/kvm/x86.h              | 10 ++++++++++
 5 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 19b3790e5e99..a8d39b3aff6a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1104,8 +1104,12 @@ struct kvm_vcpu_arch {
 	 * can only occur at instruction boundaries.  The only exception is
 	 * VMX's "notify" exits, which exist in large part to break the CPU out
 	 * of infinite ucode loops, but can corrupt vCPU state in the process!
+	 *
+	 * For all intents and purposes, this is a boolean, but it's tracked as
+	 * a u8 so that KVM can detect when userspace may have stuffed vCPU
+	 * state and generated an architecturally-impossible VM-Exit.
 	 */
-	bool nested_run_pending;
+	u8 nested_run_pending;
 
 #if IS_ENABLED(CONFIG_HYPERV)
 	hpa_t hv_root_tdp;
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index c2d4c9c63146..77ff9ead957c 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1138,7 +1138,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	/* Exit Guest-Mode */
 	leave_guest_mode(vcpu);
 	svm->nested.vmcb12_gpa = 0;
-	WARN_ON_ONCE(vcpu->arch.nested_run_pending);
+
+	kvm_warn_on_nested_run_pending(vcpu);
 
 	kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 031075467a6d..5659545360dc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5042,7 +5042,7 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
 	vmx->nested.mtf_pending = false;
 
 	/* trying to cancel vmlaunch/vmresume is a bug */
-	WARN_ON_ONCE(vcpu->arch.nested_run_pending);
+	kvm_warn_on_nested_run_pending(vcpu);
 
 #ifdef CONFIG_KVM_HYPERV
 	if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
@@ -6665,7 +6665,7 @@ bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
 	unsigned long exit_qual;
 	u32 exit_intr_info;
 
-	WARN_ON_ONCE(vcpu->arch.nested_run_pending);
+	kvm_warn_on_nested_run_pending(vcpu);
 
 	/*
 	 * Late nested VM-Fail shares the same flow as nested VM-Exit since KVM
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index db3f393192d9..30ff5a755572 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12023,6 +12023,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	if (r <= 0)
 		goto out;
 
+	/*
+	 * If userspace may have modified vCPU state, mark nested_run_pending
+	 * as "untrusted" to avoid triggering false-positive WARNs.
+	 */
+	if (vcpu->arch.nested_run_pending == 1)
+		vcpu->arch.nested_run_pending = 2;
+
 	r = vcpu_run(vcpu);
 
 out:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 94d4f07aaaa0..d3003c8be961 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -188,6 +188,16 @@ static inline bool kvm_can_set_cpuid_and_feature_msrs(struct kvm_vcpu *vcpu)
 	return vcpu->arch.last_vmentry_cpu == -1 && !is_guest_mode(vcpu);
 }
 
+/*
+ * WARN if a nested VM-Enter is pending completion, and userspace hasn't gained
+ * control since the nested VM-Enter was initiated (in which case, userspace
+ * may have modified vCPU state to induce an architecturally invalid VM-Exit).
+ */
+static inline void kvm_warn_on_nested_run_pending(struct kvm_vcpu *vcpu)
+{
+	WARN_ON_ONCE(vcpu->arch.nested_run_pending == 1);
+}
+
 static inline void kvm_set_mp_state(struct kvm_vcpu *vcpu, int mp_state)
 {
 	vcpu->arch.mp_state = mp_state;

base-commit: a68a4bbc5b9ce5b722473399f05cb05217abaee8
--

  reply	other threads:[~2026-03-02 23:22 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-27  1:13 [PATCH 0/3] KVM: x86: Fix incorrect handling of triple faults Yosry Ahmed
2026-02-27  1:13 ` [PATCH 1/3] KVM: x86: Move nested_run_pending to kvm_vcpu_arch Yosry Ahmed
2026-02-27  1:13 ` [PATCH 2/3] KVM: x86: Do not inject triple faults into an L2 with a pending run Yosry Ahmed
2026-02-27  1:13 ` [PATCH 3/3] KVM: x86: Check for injected exceptions before queuing a debug exception Yosry Ahmed
2026-02-27 16:06   ` Sean Christopherson
2026-02-27 16:34     ` Sean Christopherson
2026-02-27 17:31       ` Yosry Ahmed
2026-02-27 18:18         ` Sean Christopherson
2026-02-27 18:34           ` Yosry Ahmed
2026-03-02 23:22             ` Sean Christopherson [this message]
2026-03-02 23:36               ` Yosry Ahmed
2026-03-02 23:47                 ` Sean Christopherson
2026-03-05 17:26 ` [PATCH 0/3] KVM: x86: Fix incorrect handling of triple faults Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaYbx59lQf5beYSv@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=yosry@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.