From: Wanpeng Li <kernellwp@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Juri Lelli <juri.lelli@redhat.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Wanpeng Li <wanpengli@tencent.com>
Subject: [PATCH v2 8/9] KVM: Implement IPI-aware directed yield candidate selection
Date: Fri, 19 Dec 2025 11:53:32 +0800
Message-ID: <20251219035334.39790-9-kernellwp@gmail.com>
In-Reply-To: <20251219035334.39790-1-kernellwp@gmail.com>
From: Wanpeng Li <wanpengli@tencent.com>

Integrate IPI tracking with directed yield to improve scheduling when
vCPUs spin waiting for IPI responses.

Implement priority-based candidate selection in kvm_vcpu_on_spin()
with three tiers:

Priority 1: Use kvm_vcpu_is_ipi_receiver() to identify confirmed IPI
targets within the recency window, addressing lock holders spinning
on IPI acknowledgment.

Priority 2: Leverage the existing kvm_arch_dy_has_pending_interrupt()
for compatibility with arch-specific fast paths.

Priority 3: Fall back to conventional preemption-based logic: require
that the target was preempted and, when yield_to_kernel_mode is
requested, that it was preempted in kernel mode. This provides a
safety net for non-IPI scenarios.

Add a kvm_vcpu_is_good_yield_candidate() helper to consolidate these
checks, preventing over-aggressive boosting while enabling targeted
optimization when IPI patterns are detected.

Performance testing (16-pCPU host, 16 vCPUs per VM):

  Dedup (simlarge):
    2 VMs: +47.1% throughput
    3 VMs: +28.1% throughput
    4 VMs:  +1.7% throughput

  VIPS (simlarge):
    2 VMs: +26.2% throughput
    3 VMs: +12.7% throughput
    4 VMs:  +6.0% throughput

Gains stem from effective directed yield when vCPUs spin on IPI
delivery, reducing synchronization overhead. The improvement is most
pronounced at moderate overcommit (2-3 VMs), where contention reduction
outweighs context-switching cost.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
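The tiers described in the changelog can be modelled outside the kernel
as follows. This is a minimal userspace sketch, not part of the patch:
struct mock_vcpu, enum yield_tier and classify_candidate() are
hypothetical names, and each boolean field merely stands in for the
result of the kernel helper named in its comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_vcpu; fields are illustrative only. */
struct mock_vcpu {
	bool ipi_receiver;        /* models kvm_vcpu_is_ipi_receiver(me, vcpu) */
	bool pending_interrupt;   /* models kvm_arch_dy_has_pending_interrupt(vcpu) */
	bool preempted;           /* models READ_ONCE(vcpu->preempted) */
	bool preempted_in_kernel; /* models kvm_arch_vcpu_preempted_in_kernel(vcpu) */
};

/* Which tier accepted the candidate; TIER_SKIP means "do not yield to it". */
enum yield_tier {
	TIER_SKIP = 0,
	TIER_IPI = 1,
	TIER_PENDING_IRQ = 2,
	TIER_PREEMPTED = 3,
};

/* Same check order as the helper in the diff, but reporting the tier. */
enum yield_tier classify_candidate(const struct mock_vcpu *v,
				   bool yield_to_kernel_mode)
{
	/* Priority 1: confirmed IPI receiver, boosted unconditionally. */
	if (v->ipi_receiver)
		return TIER_IPI;

	/* Priority 2: arch-specific pending-interrupt hint. */
	if (v->pending_interrupt)
		return TIER_PENDING_IRQ;

	/* Priority 3: the target must at least have been preempted ... */
	if (!v->preempted)
		return TIER_SKIP;

	/* ... and preempted in kernel mode if the caller asks for that. */
	if (yield_to_kernel_mode && !v->preempted_in_kernel)
		return TIER_SKIP;

	return TIER_PREEMPTED;
}
```

In this model a vCPU that was preempted in user mode is skipped when
yield_to_kernel_mode is true, unless one of the first two tiers accepts
it first. Any non-TIER_SKIP result corresponds to the helper returning
true.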
 virt/kvm/kvm_main.c | 46 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ff771a872c6d..45ede950314b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3970,6 +3970,41 @@ bool __weak kvm_vcpu_is_ipi_receiver(struct kvm_vcpu *sender,
 	return false;
 }
 
+/*
+ * IPI-aware candidate selection for directed yield.
+ *
+ * Priority order:
+ * 1) Confirmed IPI receiver of 'me' within recency window (always boost)
+ * 2) Arch-provided fast pending interrupt hint (user-mode boost)
+ * 3) Kernel-mode yield: preempted-in-kernel vCPU (traditional boost)
+ * 4) Otherwise, be conservative and skip
+ */
+static bool kvm_vcpu_is_good_yield_candidate(struct kvm_vcpu *me,
+					     struct kvm_vcpu *vcpu,
+					     bool yield_to_kernel_mode)
+{
+	/* Priority 1: recently targeted IPI receiver */
+	if (kvm_vcpu_is_ipi_receiver(me, vcpu))
+		return true;
+
+	/* Priority 2: fast pending-interrupt hint (arch-specific) */
+	if (kvm_arch_dy_has_pending_interrupt(vcpu))
+		return true;
+
+	/*
+	 * Minimal preempted gate for remaining cases:
+	 * Require that the target has been preempted, and if yielding to
+	 * kernel mode, additionally require preempted-in-kernel.
+	 */
+	if (!READ_ONCE(vcpu->preempted))
+		return false;
+
+	if (yield_to_kernel_mode && !kvm_arch_vcpu_preempted_in_kernel(vcpu))
+		return false;
+
+	return true;
+}
+
 void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 {
 	int nr_vcpus, start, i, idx, yielded;
@@ -4017,15 +4052,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 		if (kvm_vcpu_is_blocking(vcpu) && !vcpu_dy_runnable(vcpu))
 			continue;
 
-		/*
-		 * Treat the target vCPU as being in-kernel if it has a pending
-		 * interrupt, as the vCPU trying to yield may be spinning
-		 * waiting on IPI delivery, i.e. the target vCPU is in-kernel
-		 * for the purposes of directed yield.
-		 */
-		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
-		    !kvm_arch_dy_has_pending_interrupt(vcpu) &&
-		    !kvm_arch_vcpu_preempted_in_kernel(vcpu))
+		/* IPI-aware candidate selection */
+		if (!kvm_vcpu_is_good_yield_candidate(me, vcpu, yield_to_kernel_mode))
 			continue;
 
 		if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
--
2.43.0
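
For readers comparing the condition removed by the second hunk with the
new helper, the following userspace sketch enumerates every combination
of the five boolean inputs and counts where the skip decision changes.
It is an illustration under stated assumptions, not kernel code:
old_skips(), new_skips(), struct diff_counts and compare_conditions()
are hypothetical names, and only the condition touched by this patch is
modelled. Other checks in the kvm_vcpu_on_spin() loop that are not shown
in the diff may filter candidates further.

```c
#include <stdbool.h>

/* Skip condition removed by the patch ('-' lines of the second hunk). */
static bool old_skips(bool preempted, bool pending_irq,
		      bool in_kernel, bool yield_to_kernel_mode)
{
	return preempted && yield_to_kernel_mode &&
	       !pending_irq && !in_kernel;
}

/*
 * Skip condition after the patch, i.e. the negation of
 * kvm_vcpu_is_good_yield_candidate(). ipi_rx models the result of
 * kvm_vcpu_is_ipi_receiver(me, vcpu).
 */
static bool new_skips(bool ipi_rx, bool preempted, bool pending_irq,
		      bool in_kernel, bool yield_to_kernel_mode)
{
	if (ipi_rx || pending_irq)
		return false;
	if (!preempted)
		return true;
	return yield_to_kernel_mode && !in_kernel;
}

struct diff_counts {
	int newly_skipped; /* old logic kept the candidate, new logic skips it */
	int newly_allowed; /* old logic skipped the candidate, new logic keeps it */
};

/* Walk all 2^5 input combinations and count the disagreements. */
struct diff_counts compare_conditions(void)
{
	struct diff_counts c = { 0, 0 };

	for (int m = 0; m < 32; m++) {
		bool ipi_rx    = (m & 1) != 0;
		bool preempted = (m & 2) != 0;
		bool pending   = (m & 4) != 0;
		bool in_kernel = (m & 8) != 0;
		bool y2k       = (m & 16) != 0;

		bool o = old_skips(preempted, pending, in_kernel, y2k);
		bool n = new_skips(ipi_rx, preempted, pending, in_kernel, y2k);

		if (!o && n)
			c.newly_skipped++;
		if (o && !n)
			c.newly_allowed++;
	}
	return c;
}
```

In this model the two conditions disagree in 5 of the 32 combinations.
One is newly allowed: a confirmed IPI receiver that was preempted in
user mode while yield_to_kernel_mode is set. Four are newly skipped: a
target that is not marked preempted and has neither a tracked IPI nor a
pending interrupt, which the removed condition never rejected on its
own.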