Linux Documentation
 help / color / mirror / Atom feed
From: David Woodhouse <dwmw2@infradead.org>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Sean Christopherson <seanjc@google.com>,
	Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Juergen Gross <jgross@suse.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Paul Durrant <paul@xen.org>, Jonathan Cameron <jic23@kernel.org>,
	Sascha Bischoff <Sascha.Bischoff@arm.com>,
	Marc Zyngier <maz@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
	Jack Allister <jalliste@amazon.com>,
	Dongli Zhang <dongli.zhang@oracle.com>,
	joe.jin@oracle.com, kvm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	xen-devel@lists.xenproject.org, linux-kselftest@vger.kernel.org
Subject: [PATCH v4 29/30] x86/kvm: Obtain TSC frequency from CPUID if present
Date: Sat,  9 May 2026 23:46:55 +0100	[thread overview]
Message-ID: <20260509224824.3264567-30-dwmw2@infradead.org> (raw)
In-Reply-To: <20260509224824.3264567-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

In https://lore.kernel.org/all/1222881242.9381.17.camel@alok-dev1/
a proposal was made for generic CPUID conventions across hypervisors.
It was mostly shot down in flames, but the leaf at 0x40000010
containing timing information didn't die.

It's used by XNU and FreeBSD guests under all hypervisors to determine
the TSC frequency, and also exposed by the EC2 Nitro hypervisor and
VMware. Use it under KVM to obtain the TSC frequency more accurately,
instead of reverse-calculating the frequency from the mul/shift values
in the KVM clock.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/kvm_para.h      |  1 +
 arch/x86/include/uapi/asm/kvm_para.h | 11 +++++++++++
 arch/x86/kernel/kvm.c                | 10 ++++++++++
 arch/x86/kernel/kvmclock.c           |  7 ++++++-
 4 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 4a47c16e2df8..03fa1228fcf2 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -121,6 +121,7 @@ static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
 void kvmclock_init(void);
 void kvmclock_disable(void);
 bool kvm_para_available(void);
+unsigned int kvm_para_tsc_khz(void);
 unsigned int kvm_arch_para_features(void);
 unsigned int kvm_arch_para_hints(void);
 void kvm_async_pf_task_wait_schedule(u32 token);
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index a1efa7907a0b..dc0d036fe678 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -44,6 +44,17 @@
  */
 #define KVM_FEATURE_CLOCKSOURCE_STABLE_BIT	24
 
+/*
+ * In https://lore.kernel.org/all/1222881242.9381.17.camel@alok-dev1/
+ * VMware proposed a timing information leaf providing the TSC and
+ * local APIC timer frequencies:
+ *
+ *  # EAX: (Virtual) TSC frequency in kHz.
+ *  # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
+ *  # ECX, EDX: RESERVED (reserved fields are set to zero).
+ */
+#define KVM_CPUID_TIMING_INFO	0x40000010
+
 #define MSR_KVM_WALL_CLOCK  0x11
 #define MSR_KVM_SYSTEM_TIME 0x12
 
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 29226d112029..60375165b66c 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -910,6 +910,16 @@ bool kvm_para_available(void)
 }
 EXPORT_SYMBOL_GPL(kvm_para_available);
 
+unsigned int kvm_para_tsc_khz(void)
+{
+	u32 base = kvm_cpuid_base();
+
+	if (base && cpuid_eax(base) >= (base | KVM_CPUID_TIMING_INFO))
+		return cpuid_eax(base | KVM_CPUID_TIMING_INFO);
+
+	return 0;
+}
+
 unsigned int kvm_arch_para_features(void)
 {
 	return cpuid_eax(kvm_cpuid_base() | KVM_CPUID_FEATURES);
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index b5991d53fc0e..74aca22dc726 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -118,7 +118,12 @@ static inline void kvm_sched_clock_init(bool stable)
 static unsigned long kvm_get_tsc_khz(void)
 {
 	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
-	return pvclock_tsc_khz(this_cpu_pvti());
+
+	/*
+	 * If KVM advertises the frequency directly in CPUID, use that
+	 * instead of reverse-calculating it from the KVM clock data.
+	 */
+	return kvm_para_tsc_khz() ? : pvclock_tsc_khz(this_cpu_pvti());
 }
 
 static void __init kvm_get_preset_lpj(void)
-- 
2.51.0


  parent reply	other threads:[~2026-05-09 22:48 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-09 22:46 [PATCH v4] 00/30] Cleaning up the KVM clock mess David Woodhouse
2026-05-09 22:46 ` [PATCH v4 01/30] KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 02/30] KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force David Woodhouse
2026-05-09 22:46 ` [PATCH v4 03/30] UAPI: x86: Move pvclock-abi to UAPI for x86 platforms David Woodhouse
2026-05-09 22:46 ` [PATCH v4 04/30] KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration David Woodhouse
2026-05-09 22:46 ` [PATCH v4 05/30] KVM: selftests: Add KVM/PV clock selftest to prove timer correction David Woodhouse
2026-05-09 22:46 ` [PATCH v4 06/30] KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC David Woodhouse
2026-05-09 22:46 ` [PATCH v4 07/30] KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration David Woodhouse
2026-05-09 22:46 ` [PATCH v4 08/30] KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host David Woodhouse
2026-05-09 22:46 ` [PATCH v4 09/30] KVM: x86: WARN if kvm_get_walltime_and_clockread() fails unexpectedly David Woodhouse
2026-05-09 22:46 ` [PATCH v4 10/30] KVM: x86: Fold __get_kvmclock() into get_kvmclock() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 11/30] KVM: x86: Add WARN and restructure get_kvmclock() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 12/30] KVM: x86: Use get_kvmclock_base_ns() as fallback in get_kvmclock() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 13/30] KVM: x86: Fix KVM clock precision in get_kvmclock() with TSC scaling David Woodhouse
2026-05-09 22:46 ` [PATCH v4 14/30] KVM: x86: Use get_kvmclock() in kvm_get_wall_clock_epoch() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 15/30] KVM: x86: Fix compute_guest_tsc() to handle negative time deltas David Woodhouse
2026-05-09 22:46 ` [PATCH v4 16/30] KVM: x86: Restructure kvm_guest_time_update() for TSC upscaling David Woodhouse
2026-05-09 22:46 ` [PATCH v4 17/30] KVM: x86: Simplify and comment kvm_get_time_scale() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 18/30] KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 19/30] KVM: x86: Improve synchronization in kvm_synchronize_tsc() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 20/30] KVM: x86: Kill last_tsc_{nsec,write,offset} fields David Woodhouse
2026-05-09 22:46 ` [PATCH v4 21/30] KVM: x86: Replace nr_vcpus_matched_tsc count with all_vcpus_matched_tsc bool David Woodhouse
2026-05-09 22:46 ` [PATCH v4 22/30] KVM: x86: Allow KVM master clock mode when TSCs are offset from each other David Woodhouse
2026-05-09 22:46 ` [PATCH v4 23/30] KVM: x86: Factor out kvm_use_master_clock() David Woodhouse
2026-05-09 22:46 ` [PATCH v4 24/30] KVM: x86: Avoid gratuitous global clock updates David Woodhouse
2026-05-09 22:46 ` [PATCH v4 25/30] KVM: x86/xen: Prevent runstate times from becoming negative David Woodhouse
2026-05-09 22:46 ` [PATCH v4 26/30] KVM: x86: Avoid redundant masterclock updates from multiple vCPUs David Woodhouse
2026-05-09 22:46 ` [PATCH v4 27/30] KVM: x86: Add KVM_VCPU_TSC_EFFECTIVE_FREQ attribute David Woodhouse
2026-05-09 22:46 ` [PATCH v4 28/30] KVM: x86: Remove runtime Xen TSC frequency CPUID update David Woodhouse
2026-05-09 22:46 ` David Woodhouse [this message]
2026-05-09 22:46 ` [PATCH v4 30/30] x86/xen: Obtain TSC frequency from CPUID if present David Woodhouse
2026-05-10 20:56 ` [PATCH v4 33/30] KVM: selftests: Add Xen runstate migration test David Woodhouse
2026-05-10 20:58 ` [PATCH v4 31/30] KVM: selftests: Add Xen/generic CPUID timing leaf test David Woodhouse
2026-05-10 21:05 ` [PATCH v4 32/30] KVM: x86: Re-synchronize TSC after KVM_SET_TSC_KHZ David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260509224824.3264567-30-dwmw2@infradead.org \
    --to=dwmw2@infradead.org \
    --cc=Sascha.Bischoff@arm.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dongli.zhang@oracle.com \
    --cc=hpa@zytor.com \
    --cc=jalliste@amazon.com \
    --cc=jgross@suse.com \
    --cc=jic23@kernel.org \
    --cc=joe.jin@oracle.com \
    --cc=joey.gouly@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=skhan@linuxfoundation.org \
    --cc=tglx@kernel.org \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox