From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Li RongQing <lirongqing@baidu.com>,
Sean Christopherson <seanjc@google.com>,
Wangyang Guo <wangyang.guo@intel.com>,
Sasha Levin <sashal@kernel.org>,
pbonzini@redhat.com, kvm@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-5.10] x86/kvm: Prefer native qspinlock for dedicated vCPUs irrespective of PV_UNHALT
Date: Sat, 25 Oct 2025 11:56:01 -0400
Message-ID: <20251025160905.3857885-130-sashal@kernel.org>
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>
From: Li RongQing <lirongqing@baidu.com>
[ Upstream commit 960550503965094b0babd7e8c83ec66c8a763b0b ]
The commit b2798ba0b876 ("KVM: X86: Choose qspinlock when dedicated
physical CPUs are available") states that when PV_DEDICATED=1
(vCPU has dedicated pCPU), qspinlock should be preferred regardless of
PV_UNHALT. However, the current implementation doesn't reflect this: when
PV_UNHALT=0, we still use virt_spin_lock() even with dedicated pCPUs.
This is suboptimal because:
1. Native qspinlocks should outperform virt_spin_lock() for dedicated
vCPUs irrespective of HALT exiting
2. virt_spin_lock() should only be preferred when vCPUs may be preempted
(non-dedicated case)
So reorder the PV spinlock checks to:
1. First handle dedicated pCPU case (disable virt_spin_lock_key)
2. Second check single CPU, and nopvspin configuration
3. Only then check PV_UNHALT support
This ensures we always use native qspinlock for dedicated vCPUs, delivering
significant performance gains at high contention levels.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Wangyang Guo <wangyang.guo@intel.com>
Link: https://lore.kernel.org/r/20250722110005.4988-1-lirongqing@baidu.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
- Aligns behavior with the earlier policy “Choose qspinlock when
dedicated physical CPUs are available” (commit b2798ba0b876):
dedicated vCPUs should prefer native qspinlock regardless of
PV_UNHALT support. Previously, if the host lacked
`KVM_FEATURE_PV_UNHALT`, `kvm_spinlock_init()` returned early and
never disabled `virt_spin_lock_key`, leaving guests on the
test-and-set (TAS) fallback even on dedicated pCPUs, which is
suboptimal for performance under contention.
- Key code changes and their effect
- Reorders checks in `kvm_spinlock_init()` so the “dedicated pCPUs”
path is handled before testing for `KVM_FEATURE_PV_UNHALT`:
- Dedicated vCPU: `if (kvm_para_has_hint(KVM_HINTS_REALTIME)) { ...
goto out; }` now runs first, followed by single-CPU and `nopvspin`
checks; only then does it test
`!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT)`
(arch/x86/kernel/kvm.c:1095–1135).
- The `out:` label disables `virt_spin_lock_key` with
`static_branch_disable(&virt_spin_lock_key);`
(arch/x86/kernel/kvm.c:1135). This forces native qspinlock instead
of the virt TAS path.
- Why this matters:
- In guests, `native_pv_lock_init()` enables the
`virt_spin_lock_key` when running under a hypervisor
(arch/x86/kernel/paravirt.c:60). If `kvm_spinlock_init()` bails
out early on “no PV_UNHALT”, the key remains enabled and
`virt_spin_lock()` gets used.
- `virt_spin_lock()` is gated by the key; when enabled it uses a
Test-and-Set fallback for hypervisors without PV spinlock support
(arch/x86/include/asm/qspinlock.h:88–110). For dedicated vCPUs,
this fallback is slower than native qspinlock and unnecessary.
- After this change, dedicated vCPUs always hit `goto out;` →
`static_branch_disable(&virt_spin_lock_key);`, so
`virt_spin_lock()` immediately returns false
(arch/x86/include/asm/qspinlock.h:92), and the native qspinlock
path is used, matching the intended behavior.
- Scope and containment
- Single function change in `arch/x86/kernel/kvm.c`; no ABI or
architectural changes.
- Behavior when `KVM_FEATURE_PV_UNHALT` is present remains unchanged;
the fix only corrects the corner case when PV_UNHALT is absent.
- Also harmonizes single-CPU and `nopvspin` behavior in the
  no-PV_UNHALT case by ensuring the static key is disabled via the
  same `goto out` path, consistent with the printed messages and the
  expected semantics.
- Risk assessment
- Low risk: selection between native qspinlock and virt TAS fallback
is internal and controlled by KVM hints; the change makes behavior
consistent across PV_UNHALT presence/absence.
- The only behavior change is for guests on hosts without
`KVM_FEATURE_PV_UNHALT` that advertise `KVM_HINTS_REALTIME`: they
now get native qspinlock (preferred) instead of TAS fallback. This
mirrors what already happens on hosts with PV_UNHALT support, so it
does not introduce a new class of risk.
- Stable backport rationale
- Small, self-contained change; no API/ABI changes.
- Corrects a logic mismatch with an earlier change’s documented intent
(dedicated vCPU → native qspinlock), yielding concrete performance
benefits under contention.
- Fits stable criteria as a low-risk correctness/performance fix
rather than a new feature.
Code references:
- arch/x86/kernel/kvm.c:1095 (KVM_HINTS_REALTIME → goto out), :1101
(single CPU → goto out), :1107 (`nopvspin` → goto out), :1120–1126
(PV_UNHALT check now after the above), :1135
(`static_branch_disable(&virt_spin_lock_key);`).
- arch/x86/include/asm/qspinlock.h:88–110 (`virt_spin_lock()` gated by
`virt_spin_lock_key`, uses TAS fallback when enabled).
- arch/x86/kernel/paravirt.c:60 (`native_pv_lock_init()` enables
`virt_spin_lock_key` for guests).
arch/x86/kernel/kvm.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 57379698015ed..2ecb2ec06aebc 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1089,16 +1089,6 @@ static void kvm_wait(u8 *ptr, u8 val)
*/
void __init kvm_spinlock_init(void)
{
- /*
- * In case host doesn't support KVM_FEATURE_PV_UNHALT there is still an
- * advantage of keeping virt_spin_lock_key enabled: virt_spin_lock() is
- * preferred over native qspinlock when vCPU is preempted.
- */
- if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT)) {
- pr_info("PV spinlocks disabled, no host support\n");
- return;
- }
-
/*
* Disable PV spinlocks and use native qspinlock when dedicated pCPUs
* are available.
@@ -1118,6 +1108,16 @@ void __init kvm_spinlock_init(void)
goto out;
}
+ /*
+ * In case host doesn't support KVM_FEATURE_PV_UNHALT there is still an
+ * advantage of keeping virt_spin_lock_key enabled: virt_spin_lock() is
+ * preferred over native qspinlock when vCPU is preempted.
+ */
+ if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT)) {
+ pr_info("PV spinlocks disabled, no host support\n");
+ return;
+ }
+
pr_info("PV spinlocks enabled\n");
__pv_init_lock_hash();
--
2.51.0