From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, Kiryl Shutsemau <kas@kernel.org>,
Sean Christopherson <seanjc@google.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
Long Li <longli@microsoft.com>,
Ajay Kaher <ajay.kaher@broadcom.com>,
Alexey Makhalov <alexey.makhalov@broadcom.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Juergen Gross <jgross@suse.com>,
Daniel Lezcano <daniel.lezcano@kernel.org>,
John Stultz <jstultz@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@broadcom.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Stephen Boyd <sboyd@kernel.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-coco@lists.linux.dev, linux-hyperv@vger.kernel.org,
virtualization@lists.linux.dev, xen-devel@lists.xenproject.org,
David Woodhouse <dwmw@amazon.co.uk>,
Tom Lendacky <thomas.lendacky@amd.com>,
Nikunj A Dadhania <nikunj@amd.com>,
David Woodhouse <dwmw2@infradead.org>,
Michael Kelley <mhklinux@outlook.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH v4 16/47] x86/kvm: Obtain TSC frequency from PV CPUID if present
Date: Fri, 29 May 2026 07:44:03 -0700 [thread overview]
Message-ID: <20260529144435.704127-17-seanjc@google.com> (raw)
In-Reply-To: <20260529144435.704127-1-seanjc@google.com>
From: David Woodhouse <dwmw@amazon.co.uk>
In https://lkml.org/lkml/2008/10/1/246 a proposal was made for generic
CPUID conventions across hypervisors. It was mostly shot down in flames,
but the leaf at 0x40000010 containing timing information didn't die.
It's used by XNU and FreeBSD guests under all hypervisors¹² to determine
the TSC frequency, and also exposed by the EC2 Nitro hypervisor (as
well as, presumably, VMware). FreeBSD's Bhyve is probably just about
to start exposing it too.
Use it under KVM to obtain the TSC frequency more accurately, instead of
reverse-calculating the frequency from the mul/shift values in the KVM
clock. Use the information to get the CPU frequency as well (kvmclock
feeds in kvm_get_tsc_khz() for both TSC and CPU calibration), as the info
from CPUID is superior in every way; whether or not kvmclock should be
overriding CPU calibration in the first place is an entirely different
question.
Use the info from CPUID even if the user explicitly disables kvmclock, or
if it's unsupported. The PV CPUID leaf has no dependency on kvmclock, and
is in fact more useful if kvmclock is disabled since the kernel won't be
able to use kvmclock to derive a derive the TSC frequency.
Before:
[ 0.000020] tsc: Detected 2900.014 MHz processor
After:
[ 0.000020] tsc: Detected 2900.015 MHz processor
$ cpuid -1 -l 0x40000010
CPU:
hypervisor generic timing information (0x40000010):
TSC frequency (Hz) = 2900015
bus frequency (Hz) = 1000000
Note! *Independently* query for non-null get_{cpu,tsc}_khz() overrides so
that kvmclock doesn't clobber x86_init.hyper.get_cpu_khz() if/when KVM adds
support for getting the CPU frequency separately from the TSC frequency.
¹ https://github.com/apple/darwin-xnu/blob/main/osfmk/i386/cpuid.c
² https://github.com/freebsd/freebsd-src/commit/4a432614f68
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kernel/kvm.c | 33 +++++++++++++++++++++++++++++++++
arch/x86/kernel/kvmclock.c | 6 ++++--
2 files changed, 37 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index dcef84da304b..909d3e5e5bcd 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -49,6 +49,8 @@
#include <asm/svm.h>
#include <asm/e820/api.h>
+static unsigned int kvm_tsc_khz_cpuid __initdata;
+
DEFINE_STATIC_KEY_FALSE_RO(kvm_async_pf_enabled);
static int kvmapf = 1;
@@ -911,6 +913,21 @@ bool kvm_para_available(void)
}
EXPORT_SYMBOL_GPL(kvm_para_available);
+static u32 __init kvm_cpuid_timing_info_leaf(void)
+{
+ u32 base = kvm_cpuid_base();
+
+ if (!base || cpuid_eax(base) < (base | KVM_CPUID_TIMING_INFO))
+ return 0;
+
+ return base | KVM_CPUID_TIMING_INFO;
+}
+
+static unsigned int __init kvm_get_tsc_khz(void)
+{
+ return kvm_tsc_khz_cpuid;
+}
+
unsigned int kvm_arch_para_features(void)
{
return cpuid_eax(kvm_cpuid_base() | KVM_CPUID_FEATURES);
@@ -960,6 +977,7 @@ static void __init kvm_init_platform(void)
.mask_lo = (u32)(~(SZ_4G - tolud - 1)) | MTRR_PHYSMASK_V,
.mask_hi = (BIT_ULL(boot_cpu_data.x86_phys_bits) - 1) >> 32,
};
+ u32 timing_info_leaf;
if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL)) {
@@ -1007,6 +1025,21 @@ static void __init kvm_init_platform(void)
wrmsrq(MSR_KVM_MIGRATION_CONTROL,
KVM_MIGRATION_READY);
}
+
+ /*
+ * If KVM advertises the frequency directly in CPUID, use that instead
+ * of reverse-calculating it from the KVM clock data, or worse, trying
+ * to calibratate the TSC using an emulated device.
+ */
+ timing_info_leaf = kvm_cpuid_timing_info_leaf();
+ if (timing_info_leaf) {
+ kvm_tsc_khz_cpuid = cpuid_eax(timing_info_leaf);
+ if (kvm_tsc_khz_cpuid) {
+ x86_init.hyper.get_tsc_khz = kvm_get_tsc_khz;
+ x86_init.hyper.get_cpu_khz = kvm_get_tsc_khz;
+ }
+ }
+
kvmclock_init();
x86_platform.apic_post_init = kvm_apic_init;
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index c4a782a0c903..404f60741aa8 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -320,8 +320,10 @@ void __init kvmclock_init(void)
flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
- x86_init.hyper.get_tsc_khz = kvmclock_get_tsc_khz;
- x86_init.hyper.get_cpu_khz = kvmclock_get_tsc_khz;
+ if (!x86_init.hyper.get_tsc_khz)
+ x86_init.hyper.get_tsc_khz = kvmclock_get_tsc_khz;
+ if (!x86_init.hyper.get_cpu_khz)
+ x86_init.hyper.get_cpu_khz = kvmclock_get_tsc_khz;
x86_platform.get_wallclock = kvm_get_wallclock;
x86_platform.set_wallclock = kvm_set_wallclock;
#ifdef CONFIG_X86_LOCAL_APIC
--
2.54.0.823.g6e5bcc1fc9-goog
next prev parent reply other threads:[~2026-05-29 14:45 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 14:43 [PATCH v4 00/47] x86: Try to wrangle PV clocks vs. TSC Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 01/47] x86/tsc: Never re-calibrate TSC frequency if its exact timing is known Sean Christopherson
2026-05-30 3:07 ` Borislav Petkov
2026-05-29 14:43 ` [PATCH v4 02/47] x86/tsc: Add a standalone helpers for getting TSC info from CPUID.0x15 Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 03/47] x86/sev: Mark TSC as reliable when configuring Secure TSC Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 04/47] x86/sev: Don't override CPU frequency calibration for SNP's " Sean Christopherson
2026-05-29 15:44 ` sashiko-bot
2026-05-29 14:43 ` [PATCH v4 05/47] x86/sev: Move check for SNP Secure TSC support to tsc_early_init() Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 06/47] x86/sev: Shove SNP's secure/trusted TSC frequency directly into "calibration" Sean Christopherson
2026-05-29 16:14 ` sashiko-bot
2026-05-29 16:23 ` Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 07/47] x86/tdx: Force TSC frequency with CPUID-based info provided by the TDX-Module Sean Christopherson
2026-05-29 16:21 ` sashiko-bot
2026-05-29 16:59 ` Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 08/47] x86/tsc: Add dedicated hypervisor hooks for getting known TSC/CPU frequencies Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 09/47] x86/acrn: Mark TSC frequency as known when using ACRN for calibration Sean Christopherson
2026-05-29 16:40 ` sashiko-bot
2026-05-29 17:01 ` Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 10/47] x86/tsc: Consolidate forcing of X86_FEATURE_TSC_KNOWN_FREQ for PV code Sean Christopherson
2026-05-29 19:01 ` sashiko-bot
2026-05-29 14:43 ` [PATCH v4 11/47] x86/tsc: Kill off x86_platform_ops.calibrate_{cpu,tsc}() hooks Sean Christopherson
2026-05-29 14:43 ` [PATCH v4 12/47] x86/tsc: Rename pit_hpet_ptimer_calibrate_cpu() => native_calibrate_cpu_late() Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 13/47] x86/tsc: Fold native_calibrate_cpu() into recalibrate_cpu_khz() Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 14/47] x86/kvmclock: Rename kvm_get_tsc_khz() to kvmclock_get_tsc_khz() Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 15/47] KVM: x86: Officially define CPUID 0x40000010 as PV Timing Info (TSC and Bus) Sean Christopherson
2026-05-29 14:44 ` Sean Christopherson [this message]
2026-05-29 14:44 ` [PATCH v4 17/47] x86/kvm: Mark TSC as reliable when it's constant and nonstop Sean Christopherson
2026-05-29 18:12 ` sashiko-bot
2026-05-29 18:57 ` Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 18/47] x86/kvm: Get local APIC bus frequency from PV CPUID Timing Info Sean Christopherson
2026-05-29 18:12 ` sashiko-bot
2026-05-29 18:24 ` Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 19/47] x86/tsc: Add standalone helper for getting CPU frequency from CPUID Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 20/47] x86/kvm: Get CPU base frequency from CPUID when it's available Sean Christopherson
2026-05-30 6:24 ` sashiko-bot
2026-05-29 14:44 ` [PATCH v4 21/47] x86/xen: Obtain TSC frequency from CPUID if present Sean Christopherson
2026-05-30 6:35 ` sashiko-bot
2026-05-29 14:44 ` [PATCH v4 22/47] clocksource: hyper-v: Register sched_clock save/restore iff it's necessary Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 23/47] clocksource: hyper-v: Drop wrappers to sched_clock save/restore helpers Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 24/47] clocksource: hyper-v: Don't save/restore TSC offset when using HV sched_clock Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 25/47] x86/kvmclock: Setup kvmclock for secondary CPUs iff CONFIG_SMP=y Sean Christopherson
2026-05-29 14:44 ` [PATCH v4 26/47] x86/kvm: Don't disable kvmclock on BSP in syscore_suspend() Sean Christopherson
2026-05-30 7:08 ` sashiko-bot
2026-05-29 15:06 ` [PATCH v4 27/47] x86/paravirt: Remove unnecessary PARAVIRT=n stub for paravirt_set_sched_clock() Sean Christopherson
2026-05-29 15:07 ` [PATCH v4 28/47] x86/paravirt: Move handling of unstable PV clocks into paravirt_set_sched_clock() Sean Christopherson
2026-05-29 15:07 ` [PATCH v4 29/47] x86/kvmclock: Move sched_clock save/restore helpers up in kvmclock.c Sean Christopherson
2026-05-29 15:07 ` [PATCH v4 30/47] x86/xen/time: NOP-ify x86_platform's sched_clock save/restore hooks Sean Christopherson
2026-05-29 15:07 ` [PATCH v4 31/47] x86/vmware: NOP-ify save/restore hooks when using VMware's sched_clock Sean Christopherson
2026-05-29 15:07 ` [PATCH v4 32/47] x86/tsc: WARN if TSC sched_clock save/restore used with PV sched_clock Sean Christopherson
2026-05-29 15:07 ` [PATCH v4 33/47] x86/paravirt: Pass sched_clock save/restore helpers during registration Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 34/47] x86/kvmclock: Move kvm_sched_clock_init() down in kvmclock.c Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 35/47] x86/xen/time: Mark xen_setup_vsyscall_time_info() as __init Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 36/47] x86/pvclock: Mark setup helpers and related various as __init/__ro_after_init Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 37/47] x86/pvclock: WARN if pvclock's valid_flags are overwritten Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 38/47] x86/kvmclock: Refactor handling of PVCLOCK_TSC_STABLE_BIT during kvmclock_init() Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 39/47] timekeeping: Resume clocksources before reading persistent clock Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 40/47] x86/kvmclock: Hook clocksource.suspend/resume when kvmclock isn't sched_clock Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 41/47] x86/kvmclock: WARN if wall clock is read while kvmclock is suspended Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 42/47] x86/paravirt: Mark __paravirt_set_sched_clock() as __init Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 43/47] x86/paravirt: Plumb a return code into __paravirt_set_sched_clock() Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 44/47] x86/paravirt: Don't use a PV sched_clock in CoCo guests with trusted TSC Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 45/47] x86/kvmclock: Use TSC for sched_clock if it's constant and non-stop Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 46/47] x86/kvmclock: Plumb in AP-online and BSP-resume to kvmlock, for documentation Sean Christopherson
2026-05-29 15:08 ` [PATCH v4 47/47] x86/paravirt: Move using_native_sched_clock() stub into timer.h Sean Christopherson
2026-05-29 15:10 ` [PATCH v4 00/47] x86: Try to wrangle PV clocks vs. TSC Sean Christopherson
2026-05-29 15:17 ` Jürgen Groß
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260529144435.704127-17-seanjc@google.com \
--to=seanjc@google.com \
--cc=ajay.kaher@broadcom.com \
--cc=alexey.makhalov@broadcom.com \
--cc=bcm-kernel-feedback-list@broadcom.com \
--cc=boris.ostrovsky@oracle.com \
--cc=bp@alien8.de \
--cc=daniel.lezcano@kernel.org \
--cc=dave.hansen@linux.intel.com \
--cc=decui@microsoft.com \
--cc=dwmw2@infradead.org \
--cc=dwmw@amazon.co.uk \
--cc=haiyangz@microsoft.com \
--cc=hpa@zytor.com \
--cc=jan.kiszka@siemens.com \
--cc=jgross@suse.com \
--cc=jstultz@google.com \
--cc=kas@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=kys@microsoft.com \
--cc=linux-coco@lists.linux.dev \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longli@microsoft.com \
--cc=luto@kernel.org \
--cc=mhklinux@outlook.com \
--cc=mingo@redhat.com \
--cc=nikunj@amd.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=sboyd@kernel.org \
--cc=tglx@kernel.org \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=virtualization@lists.linux.dev \
--cc=vkuznets@redhat.com \
--cc=wei.liu@kernel.org \
--cc=x86@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox