From: Sean Christopherson <seanjc@google.com>
To: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Juergen Gross <jgross@suse.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
Ajay Kaher <ajay.kaher@broadcom.com>,
Alexey Makhalov <alexey.amakhalov@broadcom.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
virtualization@lists.linux.dev, linux-hyperv@vger.kernel.org,
jailhouse-dev@googlegroups.com, kvm@vger.kernel.org,
xen-devel@lists.xenproject.org,
Sean Christopherson <seanjc@google.com>,
Nikunj A Dadhania <nikunj@amd.com>,
Tom Lendacky <thomas.lendacky@amd.com>
Subject: [PATCH 08/16] x86/tsc: Pass KNOWN_FREQ and RELIABLE as params to registration
Date: Fri, 31 Jan 2025 18:17:10 -0800 [thread overview]
Message-ID: <20250201021718.699411-9-seanjc@google.com> (raw)
In-Reply-To: <20250201021718.699411-1-seanjc@google.com>
Add a "tsc_properties" set of flags and use it to annotate whether the
TSC operates at a known and/or reliable frequency when registering a
paravirtual TSC calibration routine. Currently, each PV flow manually
sets the associated feature flags, but often in haphazard fashion that
makes it difficult for unfamiliar readers to see the properties of the
TSC when running under a particular hypervisor.
The other, bigger issue with manually setting the feature flags is that
it decouples the flags from the calibration routine. E.g. in theory, PV
code could mark the TSC as having a known frequency, but then have its
PV calibration discarded in favor of a method that doesn't use that known
frequency. Passing the TSC properties along with the calibration routine
will allow adding sanity checks to guard against replacing a "better"
calibration routine with a "worse" routine.
As a bonus, the flags also give developers working on new PV code a heads
up that they should at least mark the TSC as having a known frequency.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/coco/sev/core.c | 6 ++----
arch/x86/coco/tdx/tdx.c | 7 ++-----
arch/x86/include/asm/tsc.h | 8 +++++++-
arch/x86/kernel/cpu/acrn.c | 4 ++--
arch/x86/kernel/cpu/mshyperv.c | 10 +++++++---
arch/x86/kernel/cpu/vmware.c | 7 ++++---
arch/x86/kernel/jailhouse.c | 4 ++--
arch/x86/kernel/kvmclock.c | 4 ++--
arch/x86/kernel/tsc.c | 8 +++++++-
arch/x86/xen/time.c | 4 ++--
10 files changed, 37 insertions(+), 25 deletions(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index dab386f782ce..29dd50552715 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -3284,12 +3284,10 @@ void __init snp_secure_tsc_init(void)
{
unsigned long long tsc_freq_mhz;
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
- setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
-
rdmsrl(MSR_AMD64_GUEST_TSC_FREQ, tsc_freq_mhz);
snp_tsc_freq_khz = (unsigned long)(tsc_freq_mhz * 1000);
tsc_register_calibration_routines(securetsc_get_tsc_khz,
- securetsc_get_tsc_khz);
+ securetsc_get_tsc_khz,
+ TSC_FREQ_KNOWN_AND_RELIABLE);
}
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 9d95dc713331..b1e3cca091b3 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -1135,14 +1135,11 @@ static unsigned long tdx_get_tsc_khz(void)
void __init tdx_tsc_init(void)
{
- /* TSC is the only reliable clock in TDX guest */
- setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
-
/*
* Override the PV calibration routines (if set) with more trustworthy
* CPUID-based calibration. The TDX module emulates CPUID, whereas any
* PV information is provided by the hypervisor.
*/
- tsc_register_calibration_routines(tdx_get_tsc_khz, NULL);
+ tsc_register_calibration_routines(tdx_get_tsc_khz, NULL,
+ TSC_FREQ_KNOWN_AND_RELIABLE);
}
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 82a6cc27cafb..e99966f10594 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -88,8 +88,14 @@ static inline int cpuid_get_cpu_freq(unsigned int *cpu_khz)
extern void tsc_early_init(void);
extern void tsc_init(void);
#if defined(CONFIG_HYPERVISOR_GUEST) || defined(CONFIG_AMD_MEM_ENCRYPT)
+enum tsc_properties {
+ TSC_FREQUENCY_KNOWN = BIT(0),
+ TSC_RELIABLE = BIT(1),
+ TSC_FREQ_KNOWN_AND_RELIABLE = TSC_FREQUENCY_KNOWN | TSC_RELIABLE,
+};
extern void tsc_register_calibration_routines(unsigned long (*calibrate_tsc)(void),
- unsigned long (*calibrate_cpu)(void));
+ unsigned long (*calibrate_cpu)(void),
+ enum tsc_properties properties);
#endif
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index 2da3de4d470e..4f2f4f7ec334 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -29,9 +29,9 @@ static void __init acrn_init_platform(void)
/* Install system interrupt handler for ACRN hypervisor callback */
sysvec_install(HYPERVISOR_CALLBACK_VECTOR, sysvec_acrn_hv_callback);
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
tsc_register_calibration_routines(acrn_get_tsc_khz,
- acrn_get_tsc_khz);
+ acrn_get_tsc_khz,
+ TSC_FREQUENCY_KNOWN);
}
static bool acrn_x2apic_available(void)
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index aa60491bf738..607a3c51eddf 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -478,8 +478,13 @@ static void __init ms_hyperv_init_platform(void)
if (ms_hyperv.features & HV_ACCESS_FREQUENCY_MSRS &&
ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE) {
- tsc_register_calibration_routines(hv_get_tsc_khz, hv_get_tsc_khz);
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+ enum tsc_properties tsc_properties = TSC_FREQUENCY_KNOWN;
+
+ if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
+ tsc_properties = TSC_FREQ_KNOWN_AND_RELIABLE;
+
+ tsc_register_calibration_routines(hv_get_tsc_khz, hv_get_tsc_khz,
+ tsc_properties);
}
if (ms_hyperv.priv_high & HV_ISOLATION) {
@@ -582,7 +587,6 @@ static void __init ms_hyperv_init_platform(void)
* is called.
*/
wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL, HV_EXPOSE_INVARIANT_TSC);
- setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
}
/*
diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index d6f079a75f05..6e4a2053857c 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -385,10 +385,10 @@ static void __init vmware_paravirt_ops_setup(void)
*/
static void __init vmware_set_capabilities(void)
{
+ /* TSC is non-stop and reliable even if the frequency isn't known. */
setup_force_cpu_cap(X86_FEATURE_CONSTANT_TSC);
setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
- if (vmware_tsc_khz)
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+
if (vmware_hypercall_mode == CPUID_VMWARE_FEATURES_ECX_VMCALL)
setup_force_cpu_cap(X86_FEATURE_VMCALL);
else if (vmware_hypercall_mode == CPUID_VMWARE_FEATURES_ECX_VMMCALL)
@@ -417,7 +417,8 @@ static void __init vmware_platform_setup(void)
vmware_tsc_khz = tsc_khz;
tsc_register_calibration_routines(vmware_get_tsc_khz,
- vmware_get_tsc_khz);
+ vmware_get_tsc_khz,
+ TSC_FREQ_KNOWN_AND_RELIABLE);
#ifdef CONFIG_X86_LOCAL_APIC
/* Skip lapic calibration since we know the bus frequency. */
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index b0a053692161..d73a4d0fb118 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -218,7 +218,8 @@ static void __init jailhouse_init_platform(void)
machine_ops.emergency_restart = jailhouse_no_restart;
- tsc_register_calibration_routines(jailhouse_get_tsc, jailhouse_get_tsc);
+ tsc_register_calibration_routines(jailhouse_get_tsc, jailhouse_get_tsc,
+ TSC_FREQUENCY_KNOWN);
while (pa_data) {
mapping = early_memremap(pa_data, sizeof(header));
@@ -256,7 +257,6 @@ static void __init jailhouse_init_platform(void)
pr_debug("Jailhouse: PM-Timer IO Port: %#x\n", pmtmr_ioport);
precalibrated_tsc_khz = setup_data.v1.tsc_khz;
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
pci_probe = 0;
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index b898b95a7d50..b41ac7f27b9f 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -116,7 +116,6 @@ static inline void kvm_sched_clock_init(bool stable)
*/
static unsigned long kvm_get_tsc_khz(void)
{
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
return pvclock_tsc_khz(this_cpu_pvti());
}
@@ -320,7 +319,8 @@ void __init kvmclock_init(void)
flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
- tsc_register_calibration_routines(kvm_get_tsc_khz, kvm_get_tsc_khz);
+ tsc_register_calibration_routines(kvm_get_tsc_khz, kvm_get_tsc_khz,
+ TSC_FREQUENCY_KNOWN);
x86_platform.get_wallclock = kvm_get_wallclock;
x86_platform.set_wallclock = kvm_set_wallclock;
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 922003059101..47776f450720 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1252,11 +1252,17 @@ static void __init check_system_tsc_reliable(void)
*/
#if defined(CONFIG_HYPERVISOR_GUEST) || defined(CONFIG_AMD_MEM_ENCRYPT)
void tsc_register_calibration_routines(unsigned long (*calibrate_tsc)(void),
- unsigned long (*calibrate_cpu)(void))
+ unsigned long (*calibrate_cpu)(void),
+ enum tsc_properties properties)
{
if (WARN_ON_ONCE(!calibrate_tsc))
return;
+ if (properties & TSC_FREQUENCY_KNOWN)
+ setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+ if (properties & TSC_RELIABLE)
+ setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
+
x86_platform.calibrate_tsc = calibrate_tsc;
if (calibrate_cpu)
x86_platform.calibrate_cpu = calibrate_cpu;
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 9e2e900dc0c7..e7429f3cffc6 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -40,7 +40,6 @@ static unsigned long xen_tsc_khz(void)
struct pvclock_vcpu_time_info *info =
&HYPERVISOR_shared_info->vcpu_info[0].time;
- setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
return pvclock_tsc_khz(info);
}
@@ -566,7 +565,8 @@ static void __init xen_init_time_common(void)
static_call_update(pv_steal_clock, xen_steal_clock);
paravirt_set_sched_clock(xen_sched_clock);
- tsc_register_calibration_routines(xen_tsc_khz, NULL);
+ tsc_register_calibration_routines(xen_tsc_khz, NULL,
+ TSC_FREQUENCY_KNOWN);
x86_platform.get_wallclock = xen_get_wallclock;
}
--
2.48.1.362.g079036d154-goog
next prev parent reply other threads:[~2025-02-01 2:17 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-01 2:17 [PATCH 00/16] x86/tsc: Try to wrangle PV clocks vs. TSC Sean Christopherson
2025-02-01 2:17 ` [PATCH 01/16] x86/tsc: Add a standalone helpers for getting TSC info from CPUID.0x15 Sean Christopherson
2025-02-03 5:55 ` Nikunj A Dadhania
2025-02-03 22:03 ` Sean Christopherson
2025-02-05 22:13 ` Sean Christopherson
2025-02-11 15:01 ` Borislav Petkov
2025-02-11 17:25 ` Sean Christopherson
2025-02-11 18:40 ` Borislav Petkov
2025-02-11 19:03 ` Sean Christopherson
2025-02-01 2:17 ` [PATCH 02/16] x86/tsc: Add standalone helper for getting CPU frequency from CPUID Sean Christopherson
2025-02-01 2:17 ` [PATCH 03/16] x86/tsc: Add helper to register CPU and TSC freq calibration routines Sean Christopherson
2025-02-11 17:32 ` Borislav Petkov
2025-02-11 17:43 ` Sean Christopherson
2025-02-11 20:32 ` Borislav Petkov
2025-02-12 16:49 ` Tom Lendacky
2025-02-01 2:17 ` [PATCH 04/16] x86/sev: Mark TSC as reliable when configuring Secure TSC Sean Christopherson
2025-02-04 8:02 ` Nikunj A Dadhania
2025-02-01 2:17 ` [PATCH 05/16] x86/sev: Move check for SNP Secure TSC support to tsc_early_init() Sean Christopherson
2025-02-04 8:27 ` Nikunj A Dadhania
2025-02-01 2:17 ` [PATCH 06/16] x86/tdx: Override PV calibration routines with CPUID-based calibration Sean Christopherson
2025-02-04 10:16 ` Nikunj A Dadhania
2025-02-04 19:29 ` Sean Christopherson
2025-02-05 3:56 ` Nikunj A Dadhania
2025-02-01 2:17 ` [PATCH 07/16] x86/acrn: Mark TSC frequency as known when using ACRN for calibration Sean Christopherson
2025-02-01 2:17 ` Sean Christopherson [this message]
2025-02-03 14:48 ` [PATCH 08/16] x86/tsc: Pass KNOWN_FREQ and RELIABLE as params to registration Tom Lendacky
2025-02-03 19:52 ` Sean Christopherson
2025-02-01 2:17 ` [PATCH 09/16] x86/tsc: Rejects attempts to override TSC calibration with lesser routine Sean Christopherson
2025-02-01 2:17 ` [PATCH 10/16] x86/paravirt: Move handling of unstable PV clocks into paravirt_set_sched_clock() Sean Christopherson
2025-02-01 2:17 ` [PATCH 11/16] x86/paravirt: Don't use a PV sched_clock in CoCo guests with trusted TSC Sean Christopherson
2025-02-01 2:17 ` [PATCH 12/16] x86/kvmclock: Mark TSC as reliable when it's constant and nonstop Sean Christopherson
2025-02-01 2:17 ` [PATCH 13/16] x86/kvmclock: Get CPU base frequency from CPUID when it's available Sean Christopherson
2025-02-01 2:17 ` [PATCH 14/16] x86/kvmclock: Get TSC frequency from CPUID when its available Sean Christopherson
2025-02-01 2:17 ` [PATCH 15/16] x86/kvmclock: Stuff local APIC bus period when core crystal freq comes from CPUID Sean Christopherson
2025-02-01 2:17 ` [PATCH 16/16] x86/kvmclock: Use TSC for sched_clock if it's constant and non-stop Sean Christopherson
2025-02-07 17:23 ` Sean Christopherson
2025-02-08 18:03 ` Michael Kelley
2025-02-10 16:21 ` Sean Christopherson
2025-02-12 16:44 ` Michael Kelley
2025-02-12 22:55 ` Sean Christopherson
2025-02-11 14:39 ` [PATCH 00/16] x86/tsc: Try to wrangle PV clocks vs. TSC Borislav Petkov
2025-02-11 16:28 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250201021718.699411-9-seanjc@google.com \
--to=seanjc@google.com \
--cc=ajay.kaher@broadcom.com \
--cc=alexey.amakhalov@broadcom.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=decui@microsoft.com \
--cc=haiyangz@microsoft.com \
--cc=jailhouse-dev@googlegroups.com \
--cc=jan.kiszka@siemens.com \
--cc=jgross@suse.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=kys@microsoft.com \
--cc=linux-coco@lists.linux.dev \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=nikunj@amd.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=virtualization@lists.linux.dev \
--cc=wei.liu@kernel.org \
--cc=x86@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).