From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 zhanghao <zhanghao1@kylinos.cn>,
	Wanpeng Li <kernellwp@gmail.com>
Subject: [PATCH] KVM: Drop kvm_vcpu.ready to squash race where "ready" can get stuck "true"
Date: Thu,  9 Apr 2026 14:33:33 -0700
Message-ID: <20260409213333.1995382-1-seanjc@google.com>

Drop kvm_vcpu.ready and instead detect the case where a recently awakened
vCPU is runnable but not yet scheduled in by explicitly checking for a
target vCPU that (a) is scheduled out, (b) wants to run, (c) is marked as
blocking in its stat, but (d) is not actually flagged as blocking.
I.e. treat a runnable vCPU that's in the blocking sequence but not truly
blocking as a candidate for directed yield.

Keying off vcpu->stat.generic.blocking will yield some number of false
positives, e.g. if the vCPU is preempted _before_ blocking, but the rate of
false positives should be roughly the same as the existing approach, as
kvm_sched_out() would previously mark the vCPU as ready when it's scheduled
out and runnable.

Eliminating the write to vcpu->ready in kvm_vcpu_wake_up() fixes a race
where vcpu->ready could be set *after* the target vCPU is scheduled in,
e.g. if the task waking the target vCPU is preempted (or otherwise delayed)
after waking the vCPU, but before setting vcpu->ready.  Hitting the race
leads to a badly degraded state, as KVM will constantly attempt to yield
to a vCPU that is already running.

Fixes: d73eb57b80b9 ("KVM: Boost vCPUs that are delivering interrupts")
Reported-by: zhanghao <zhanghao1@kylinos.cn>
Closes: https://lore.kernel.org/all/tencent_AE2873502605DBDD4CD1E810F06C410F0105@qq.com
Cc: stable@vger.kernel.org
Cc: Wanpeng Li <kernellwp@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
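Illustration only, not part of the change: the conditions (a)-(d) from the
changelog, plus the existing "preempted" case, can be modeled in user
space as below.  This is a minimal sketch; struct vcpu_model and
model_runnable_and_scheduled_out() are hypothetical names, and plain bools
stand in for the READ_ONCE() loads and for kvm_vcpu_is_blocking().

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kvm_vcpu state the helper reads. */
struct vcpu_model {
	bool preempted;      /* set in kvm_sched_out() if runnable and wants_to_run */
	bool scheduled_out;  /* (a) the vCPU task is not on a CPU */
	bool wants_to_run;   /* (b) */
	bool stat_blocking;  /* (c) models vcpu->stat.generic.blocking */
	bool rcuwait_active; /* (d) models kvm_vcpu_is_blocking() */
};

/* Same boolean structure as kvm_vcpu_is_runnable_and_scheduled_out(). */
static bool model_runnable_and_scheduled_out(const struct vcpu_model *v)
{
	return v->preempted ||
	       (v->scheduled_out &&
		v->wants_to_run &&
		v->stat_blocking &&
		!v->rcuwait_active);
}
```

In this model, a vCPU with (a), (b) and (c) set and (d) clear is a
candidate, a vCPU that is still truly blocking is not, and a preempted
vCPU is always a candidate.
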
 arch/x86/kvm/x86.c       |  2 +-
 include/linux/kvm_host.h |  9 +++++++++
 virt/kvm/kvm_main.c      | 10 +++-------
 3 files changed, 13 insertions(+), 8 deletions(-)
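
Illustration only, not part of the change: the race described in the
changelog can be replayed deterministically in user space.  This is a
sketch under assumptions: all names are hypothetical, the interleaving is
hard-coded rather than produced by real threads, and it assumes that
scheduling the vCPU in clears scheduled_out (that write is not visible in
this diff).

```c
#include <stdbool.h>

/* Hypothetical model of the flags involved in the race. */
struct vcpu_flags {
	bool ready;         /* the flag removed by this patch */
	bool preempted;
	bool scheduled_out;
};

/* Old kvm_sched_in(): clears preempted and ready; scheduled_out assumed cleared. */
static void model_sched_in(struct vcpu_flags *v)
{
	v->preempted = false;
	v->ready = false;
	v->scheduled_out = false;
}

/* Second half of the old kvm_vcpu_wake_up(): the write that can land late. */
static void model_waker_set_ready(struct vcpu_flags *v)
{
	v->ready = true;
}

/* Old yield check: trusts the stored flag. */
static bool model_old_candidate(const struct vcpu_flags *v)
{
	return v->ready;
}

/* Necessary condition for the new helper: every term needs one of these. */
static bool model_new_gate(const struct vcpu_flags *v)
{
	return v->preempted || v->scheduled_out;
}

/* Replay: wake succeeds, waker is delayed, target runs, stale write lands. */
static struct vcpu_flags model_replay_race(void)
{
	struct vcpu_flags v = { .ready = false, .preempted = false,
				.scheduled_out = true };

	/* 1. Waker: __kvm_vcpu_wake_up() succeeds, then the waker is delayed. */
	/* 2. Target: scheduled in, which clears ready. */
	model_sched_in(&v);
	/* 3. Waker: resumes and sets ready on an already running vCPU. */
	model_waker_set_ready(&v);
	return v;
}
```

After the replay, the stored flag reports a yield candidate even though
the vCPU is running, and nothing clears it until the next sched-in.  State
derived from preempted/scheduled_out has no such stale write to get stuck.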

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0a1b63c63d1a..eebb2eb39ec0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10399,7 +10399,7 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
 
 	rcu_read_unlock();
 
-	if (!target || !READ_ONCE(target->ready))
+	if (!target || !kvm_vcpu_is_runnable_and_scheduled_out(target))
 		goto no_yield;
 
 	/* Ignore requests to yield to self */
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7981e9cab2eb..241a976e3410 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1753,6 +1753,15 @@ static inline bool kvm_vcpu_is_blocking(struct kvm_vcpu *vcpu)
 	return rcuwait_active(kvm_arch_vcpu_get_wait(vcpu));
 }
 
+static inline bool kvm_vcpu_is_runnable_and_scheduled_out(struct kvm_vcpu *vcpu)
+{
+	return READ_ONCE(vcpu->preempted) ||
+	       (READ_ONCE(vcpu->scheduled_out) &&
+		READ_ONCE(vcpu->wants_to_run) &&
+		READ_ONCE(vcpu->stat.generic.blocking) &&
+		!kvm_vcpu_is_blocking(vcpu));
+}
+
 #ifdef __KVM_HAVE_ARCH_INTC_INITIALIZED
 /*
  * returns true if the virtual interrupt controller is initialized and
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9faf70ccae7a..9f71e32daac5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -455,7 +455,6 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	kvm_vcpu_set_in_spin_loop(vcpu, false);
 	kvm_vcpu_set_dy_eligible(vcpu, false);
 	vcpu->preempted = false;
-	vcpu->ready = false;
 	preempt_notifier_init(&vcpu->preempt_notifier, &kvm_preempt_ops);
 	vcpu->last_used_slot = NULL;
 
@@ -3803,7 +3802,6 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_vcpu_halt);
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 {
 	if (__kvm_vcpu_wake_up(vcpu)) {
-		WRITE_ONCE(vcpu->ready, true);
 		++vcpu->stat.generic.halt_wakeup;
 		return true;
 	}
@@ -4008,7 +4006,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 			continue;
 
 		vcpu = xa_load(&kvm->vcpu_array, idx);
-		if (!READ_ONCE(vcpu->ready))
+		if (!kvm_vcpu_is_runnable_and_scheduled_out(vcpu))
 			continue;
 		if (kvm_vcpu_is_blocking(vcpu) && !vcpu_dy_runnable(vcpu))
 			continue;
@@ -6393,7 +6391,6 @@ static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
 
 	WRITE_ONCE(vcpu->preempted, false);
-	WRITE_ONCE(vcpu->ready, false);
 
 	__this_cpu_write(kvm_running_vcpu, vcpu);
 	kvm_arch_vcpu_load(vcpu, cpu);
@@ -6408,10 +6405,9 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 
 	WRITE_ONCE(vcpu->scheduled_out, true);
 
-	if (task_is_runnable(current) && vcpu->wants_to_run) {
+	if (task_is_runnable(current) && vcpu->wants_to_run)
 		WRITE_ONCE(vcpu->preempted, true);
-		WRITE_ONCE(vcpu->ready, true);
-	}
+
 	kvm_arch_vcpu_put(vcpu);
 	__this_cpu_write(kvm_running_vcpu, NULL);
 }

base-commit: b89df297a47e641581ee67793592e5c6ae0428f4
-- 
2.53.0.1213.gd9a14994de-goog