From: Sean Christopherson <seanjc@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
graf@amazon.de, Ajay Kaher <ajay.kaher@broadcom.com>,
Alexey Makhalov <alexey.makhalov@broadcom.com>,
Colin Percival <cperciva@tarsnap.com>,
Zack Rusin <zack.rusin@broadcom.com>,
Doug Covelli <doug.covelli@broadcom.com>
Subject: Re: [PATCH v2 2/3] KVM: x86: Provide TSC frequency in "generic" timing infomation CPUID leaf
Date: Tue, 16 Dec 2025 12:27:21 -0800
Message-ID: <aUHAqVLlIU_OwESM@google.com>
In-Reply-To: <20250816101308.2594298-3-dwmw2@infradead.org>
+Doug and Zack
VMware folks, TL;DR question for you:
Does VMware report TSC and APIC bus frequency in CPUID 0x40000010.{EAX,EBX},
or at the very least pinky swear not to use those outputs for anything else?
On Sat, Aug 16, 2025, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> In https://lkml.org/lkml/2008/10/1/246 a proposal was made for generic
> CPUID leaves, of which only 0x40000010 was defined, to contain the TSC
> and local APIC frequencies. The proposal from VMware was mostly shot
> down in flames, *but* XNU does unconditionally assume that this leaf
> contains the frequency information, if it's present on any hypervisor:
> https://github.com/apple/darwin-xnu/blob/main/osfmk/i386/cpuid.c
>
> So does FreeBSD: https://github.com/freebsd/freebsd-src/commit/4a432614f68
For me, the more convincing argument is following the breadcrumbs from the
changelog for the above commit
: This speeds up the boot process by 100 ms in EC2 and other systems,
: by allowing the early calibration DELAY to be skipped.
back to QEMU commit 9954a1582e ("x86-KVM: Supply TSC and APIC clock rates to guest
like VMWare"), with an assumption that EC2 enables vmware-cpuid-freq. I.e. the
de facto reference VMM for KVM (QEMU) has utilized CPUID 0x40000010 in this way
for almost 9 years.
> So at this point it would be daft for a hypervisor to expose 0x40000010
> for any *other* content.
My only hesitation is that VMware _does_ put other content in 0x40000010. From
arch/x86/kernel/cpu/vmware.c:
	static u8 __init vmware_select_hypercall(void)
	{
		int eax, ebx, ecx, edx;
		cpuid(CPUID_VMWARE_FEATURES_LEAF, &eax, &ebx, &ecx, &edx);
		return (ecx & (CPUID_VMWARE_FEATURES_ECX_VMMCALL |
			       CPUID_VMWARE_FEATURES_ECX_VMCALL));
	}
And oddly, Linux doesn't use CPUID to get the TSC frequency on VMware:
	eax = vmware_hypercall3(VMWARE_CMD_GETHZ, UINT_MAX, &ebx, &ecx);
	if (ebx != UINT_MAX) {
		lpj = tsc_khz = eax | (((u64)ebx) << 32);
		do_div(tsc_khz, 1000);
		WARN_ON(tsc_khz >> 32);
		pr_info("TSC freq read from hypervisor : %lu.%03lu MHz\n",
			(unsigned long) tsc_khz / 1000,
			(unsigned long) tsc_khz % 1000);
		if (!preset_lpj) {
			do_div(lpj, HZ);
			preset_lpj = lpj;
		}
		vmware_tsc_khz = tsc_khz;
		tsc_register_calibration_routines(vmware_get_tsc_khz,
						  vmware_get_tsc_khz,
						  TSC_FREQ_KNOWN_AND_RELIABLE);
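
To make the arithmetic in the snippet above concrete, here is a standalone
userspace sketch. It is not kernel code: struct tsc_info and
tsc_info_from_gethz() are hypothetical names, plain division stands in for
do_div(), and the tick rate is passed as a parameter because the kernel's HZ
depends on the build configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two values the kernel snippet derives. */
struct tsc_info {
	uint64_t khz;	/* TSC frequency in kHz */
	uint64_t lpj;	/* TSC cycles per jiffy, i.e. "loops per jiffy" */
};

/*
 * Combine the two 32-bit halves returned by the hypercall into a 64-bit
 * frequency in Hz, then derive kHz and loops-per-jiffy from it.  The 'hz'
 * parameter is an assumption standing in for the kernel's HZ constant.
 */
static struct tsc_info
tsc_info_from_gethz(uint32_t eax, uint32_t ebx, unsigned int hz)
{
	uint64_t tsc_hz = eax | ((uint64_t)ebx << 32);
	struct tsc_info info;

	assert(hz != 0);
	info.khz = tsc_hz / 1000;	/* mirrors do_div(tsc_khz, 1000) */
	info.lpj = tsc_hz / hz;		/* mirrors do_div(lpj, HZ) */
	return info;
}
```

For example, a 2.5 GHz TSC reported as EAX = 2500000000, EBX = 0 with a tick
rate of 250 gives 2500000 kHz and 10000000 loops per jiffy.
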
However, VMware appears to deliberately avoid using EAX and EBX, and the above
FreeBSD commit (and current code) is broken if VMware does NOT populate CPUID
0x40000010 with at least the TSC frequency. Because FreeBSD prioritizes getting
the TSC frequency from CPUID:
	if (tsc_freq_cpuid_vm()) {
		if (bootverbose)
			printf(
		    "Early TSC frequency %juHz derived from hypervisor CPUID\n",
			    (uintmax_t)tsc_freq);
	} else if (vm_guest == VM_GUEST_VMWARE) {
		tsc_freq_vmware();
		if (bootverbose)
			printf(
		    "Early TSC frequency %juHz derived from VMWare hypercall\n",
			    (uintmax_t)tsc_freq);
	}
where tsc_freq_cpuid_vm() only checks if 0x40000010 is available, not if
0x40000010.EAX contains a sane, non-zero frequency.
	static int
	tsc_freq_cpuid_vm(void)
	{
		u_int regs[4];
		if (vm_guest == VM_GUEST_NO)
			return (false);
		if (hv_high < 0x40000010)
			return (false);
		do_cpuid(0x40000010, regs);
		tsc_freq = (uint64_t)(regs[0]) * 1000;
		tsc_early_calib_exact = 1;
		return (true);
	}
I.e. if VMware isn't populating 0x40000010.EAX with the TSC frequency, then I
would think FreeBSD would be getting bug reports when running on VMware, which
AFAICT isn't the case.
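
For illustration only, a more defensive version of that check could reject a
zero EAX before trusting the leaf, which would let the caller fall through to
the VMware hypercall path. The below is a rough sketch, not FreeBSD's actual
code: tsc_freq_from_timing_leaf() and HV_TIMING_LEAF are made-up names, the
register values are passed in so the logic can be exercised without executing
CPUID, and EAX is assumed to hold the TSC frequency in kHz as in the snippet
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HV_TIMING_LEAF	0x40000010u	/* hypothetical name for the leaf */

/*
 * Hypothetical helper: 'hv_high' is the highest hypervisor CPUID leaf and
 * 'regs' holds EAX..EDX as already read from 0x40000010.  Returns true and
 * stores the frequency in Hz only if the leaf exists and EAX is non-zero.
 */
static bool
tsc_freq_from_timing_leaf(uint32_t hv_high, const uint32_t regs[4],
    uint64_t *freq_hz)
{
	assert(regs != NULL && freq_hz != NULL);

	if (hv_high < HV_TIMING_LEAF)
		return (false);
	/* Leaf is present, but a zero EAX carries no usable frequency. */
	if (regs[0] == 0)
		return (false);
	*freq_hz = (uint64_t)regs[0] * 1000;	/* kHz -> Hz */
	return (true);
}
```

With a check along these lines, a hypervisor that exposes 0x40000010 but
leaves EAX as zero would no longer cause the hypercall fallback to be skipped.
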
So jumping back to my questions for the VMware folks, if VMware enumerates timing
information in CPUID 0x40000010.{EAX,EBX}, or at least doesn't use those outputs
for other purposes, then I 100% agree that reserving CPUID 0x40000010 for timing
information in KVM's PV CPUID leaves is a no-brainer. Even if the answer to both
is "no", I think it still makes sense to carve out 0x40000010; it'll just require
a bit more care and some different context.