public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Li RongQing <lirongqing@baidu.com>,
	Sean Christopherson <seanjc@google.com>,
	Wangyang Guo <wangyang.guo@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	pbonzini@redhat.com, kvm@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-5.10] x86/kvm: Prefer native qspinlock for dedicated vCPUs irrespective of PV_UNHALT
Date: Sat, 25 Oct 2025 11:56:01 -0400	[thread overview]
Message-ID: <20251025160905.3857885-130-sashal@kernel.org> (raw)
In-Reply-To: <20251025160905.3857885-1-sashal@kernel.org>

From: Li RongQing <lirongqing@baidu.com>

[ Upstream commit 960550503965094b0babd7e8c83ec66c8a763b0b ]

The commit b2798ba0b876 ("KVM: X86: Choose qspinlock when dedicated
physical CPUs are available") states that when PV_DEDICATED=1
(vCPU has dedicated pCPU), qspinlock should be preferred regardless of
PV_UNHALT.  However, the current implementation doesn't reflect this: when
PV_UNHALT=0, we still use virt_spin_lock() even with dedicated pCPUs.

This is suboptimal because:
1. Native qspinlocks should outperform virt_spin_lock() for dedicated
   vCPUs irrespective of HALT exiting
2. virt_spin_lock() should only be preferred when vCPUs may be preempted
   (non-dedicated case)

So reorder the PV spinlock checks to:
1. First handle dedicated pCPU case (disable virt_spin_lock_key)
2. Second check single CPU, and nopvspin configuration
3. Only then check PV_UNHALT support

This ensures we always use native qspinlock for dedicated vCPUs, delivering
pretty performance gains at high contention levels.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Wangyang Guo <wangyang.guo@intel.com>
Link: https://lore.kernel.org/r/20250722110005.4988-1-lirongqing@baidu.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

- What it fixes
  - Aligns behavior with the earlier policy “Choose qspinlock when
    dedicated physical CPUs are available” (commit b2798ba0b876):
    dedicated vCPUs should prefer native qspinlock regardless of
    PV_UNHALT support. Previously, if the host lacked
    `KVM_FEATURE_PV_UNHALT`, `kvm_spinlock_init()` returned early and
    never disabled the `virt_spin_lock()` hijack, leaving guests with
    the TAS fallback even on dedicated pCPUs, which is suboptimal for
    performance under contention.

- Key code changes and their effect
  - Reorders checks in `kvm_spinlock_init()` so the “dedicated pCPUs”
    path is handled before testing for `KVM_FEATURE_PV_UNHALT`:
    - Dedicated vCPU: `if (kvm_para_has_hint(KVM_HINTS_REALTIME)) { ...
      goto out; }` now runs first, followed by single-CPU and `nopvspin`
      checks; only then does it test
      `!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT)`
      (arch/x86/kernel/kvm.c:1095–1135).
    - The `out:` label disables `virt_spin_lock_key` with
      `static_branch_disable(&virt_spin_lock_key);`
      (arch/x86/kernel/kvm.c:1135). This forces native qspinlock instead
      of the virt TAS path.
  - Why this matters:
    - In guests, `native_pv_lock_init()` enables the
      `virt_spin_lock_key` when running under a hypervisor
      (arch/x86/kernel/paravirt.c:60). If `kvm_spinlock_init()` bails
      out early on “no PV_UNHALT”, the key remains enabled and
      `virt_spin_lock()` gets used.
    - `virt_spin_lock()` is gated by the key; when enabled it uses a
      Test-and-Set fallback for hypervisors without PV spinlock support
      (arch/x86/include/asm/qspinlock.h:88–110). For dedicated vCPUs,
      this fallback is slower than native qspinlock and unnecessary.
    - After this change, dedicated vCPUs always hit `goto out;` →
      `static_branch_disable(&virt_spin_lock_key);`, so
      `virt_spin_lock()` immediately returns false
      (arch/x86/include/asm/qspinlock.h:92), and the native qspinlock
      path is used, matching the intended behavior.

- Scope and containment
  - Single function change in `arch/x86/kernel/kvm.c`; no ABI or
    architectural changes.
  - Behavior when `KVM_FEATURE_PV_UNHALT` is present remains unchanged;
    the fix only corrects the corner case when PV_UNHALT is absent.
  - Also harmonizes single-CPU and `nopvspin` behavior in the no-
    PV_UNHALT case by ensuring the static key is disabled via the same
    `goto out` path, which is consistent with the printed messages and
    expected semantics.

- Risk assessment
  - Low risk: selection between native qspinlock and virt TAS fallback
    is internal and controlled by KVM hints; the change makes behavior
    consistent across PV_UNHALT presence/absence.
  - The only behavior change is for guests on hosts without
    `KVM_FEATURE_PV_UNHALT` that advertise `KVM_HINTS_REALTIME`: they
    now get native qspinlock (preferred) instead of TAS fallback. This
    mirrors what already happens on hosts with PV_UNHALT support, so it
    does not introduce a new class of risk.

- Stable backport rationale
  - Small, self-contained change; no API/ABI changes.
  - Corrects a logic mismatch with an earlier change’s documented intent
    (dedicated vCPU → native qspinlock), yielding concrete performance
    benefits under contention.
  - Fits stable criteria as a low-risk correctness/performance fix
    rather than a new feature.

Code references:
- arch/x86/kernel/kvm.c:1095 (KVM_HINTS_REALTIME → goto out), :1101
  (single CPU → goto out), :1107 (`nopvspin` → goto out), :1120–1126
  (PV_UNHALT check now after the above), :1135
  (`static_branch_disable(&virt_spin_lock_key);`).
- arch/x86/include/asm/qspinlock.h:88–110 (`virt_spin_lock()` gated by
  `virt_spin_lock_key`, uses TAS fallback when enabled).
- arch/x86/kernel/paravirt.c:60 (`native_pv_lock_init()` enables
  `virt_spin_lock_key` for guests).

 arch/x86/kernel/kvm.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 57379698015ed..2ecb2ec06aebc 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1089,16 +1089,6 @@ static void kvm_wait(u8 *ptr, u8 val)
  */
 void __init kvm_spinlock_init(void)
 {
-	/*
-	 * In case host doesn't support KVM_FEATURE_PV_UNHALT there is still an
-	 * advantage of keeping virt_spin_lock_key enabled: virt_spin_lock() is
-	 * preferred over native qspinlock when vCPU is preempted.
-	 */
-	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT)) {
-		pr_info("PV spinlocks disabled, no host support\n");
-		return;
-	}
-
 	/*
 	 * Disable PV spinlocks and use native qspinlock when dedicated pCPUs
 	 * are available.
@@ -1118,6 +1108,16 @@ void __init kvm_spinlock_init(void)
 		goto out;
 	}
 
+	/*
+	 * In case host doesn't support KVM_FEATURE_PV_UNHALT there is still an
+	 * advantage of keeping virt_spin_lock_key enabled: virt_spin_lock() is
+	 * preferred over native qspinlock when vCPU is preempted.
+	 */
+	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT)) {
+		pr_info("PV spinlocks disabled, no host support\n");
+		return;
+	}
+
 	pr_info("PV spinlocks enabled\n");
 
 	__pv_init_lock_hash();
-- 
2.51.0


  parent reply	other threads:[~2025-10-25 16:15 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20251025160905.3857885-1-sashal@kernel.org>
2025-10-25 15:55 ` [PATCH AUTOSEL 6.17] vfio/nvgrace-gpu: Add GB300 SKU to the devid table Sasha Levin
2025-10-25 15:56 ` Sasha Levin [this message]
2025-10-25 15:58 ` [PATCH AUTOSEL 6.17] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum Sasha Levin
2025-10-26 22:24   ` Huang, Kai
2025-11-03  9:26     ` Huang, Kai
2025-11-04 14:46       ` Sasha Levin
2025-11-04 21:27         ` Huang, Kai
2025-10-25 15:59 ` [PATCH AUTOSEL 6.17] x86/virt/tdx: Mark memory cache state incoherent when making SEAMCALL Sasha Levin
2025-10-26 22:25   ` Huang, Kai
2025-10-28 17:49     ` Sasha Levin
2025-10-25 16:00 ` [PATCH AUTOSEL 6.17] x86/virt/tdx: Use precalculated TDVPR page physical address Sasha Levin
2025-10-25 16:00 ` [PATCH AUTOSEL 6.17-6.1] vfio: return -ENOTTY for unsupported device feature Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251025160905.3857885-130-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=lirongqing@baidu.com \
    --cc=patches@lists.linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=stable@vger.kernel.org \
    --cc=wangyang.guo@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox