All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Kiryl Shutsemau <kas@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 "K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,  Dexuan Cui <decui@microsoft.com>,
	Long Li <longli@microsoft.com>,
	 Ajay Kaher <ajay.kaher@broadcom.com>,
	Alexey Makhalov <alexey.makhalov@broadcom.com>,
	 Jan Kiszka <jan.kiszka@siemens.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	 Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Juergen Gross <jgross@suse.com>,
	 Daniel Lezcano <daniel.lezcano@kernel.org>,
	Thomas Gleixner <tglx@kernel.org>,
	 John Stultz <jstultz@google.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	 Vitaly Kuznetsov <vkuznets@redhat.com>,
	 Broadcom internal kernel review list
	<bcm-kernel-feedback-list@broadcom.com>,
	 Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Stephen Boyd <sboyd@kernel.org>,
	x86@kernel.org,  linux-coco@lists.linux.dev, kvm@vger.kernel.org,
	linux-hyperv@vger.kernel.org,  virtualization@lists.linux.dev,
	linux-kernel@vger.kernel.org,  xen-devel@lists.xenproject.org,
	Michael Kelley <mhklinux@outlook.com>,
	 Tom Lendacky <thomas.lendacky@amd.com>,
	Nikunj A Dadhania <nikunj@amd.com>,
	 Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH v3 02/41] x86/tsc: Add helper to register CPU and TSC freq calibration routines
Date: Thu, 21 May 2026 13:53:35 -0700	[thread overview]
Message-ID: <ag9wz3RiJOtVZrK0@google.com> (raw)
In-Reply-To: <44e0d60548d317fd59895f18bd17220dfb2f834b.camel@infradead.org>

On Wed, May 20, 2026, David Woodhouse wrote:
> On Fri, 2026-05-15 at 12:19 -0700, Sean Christopherson wrote:
> > Add a helper to register non-native, i.e. PV and CoCo, CPU and TSC
> > frequency calibration routines.  This will allow consolidating handling
> > of common TSC properties that are forced by hypervisor (PV routines),
> > and will also allow adding sanity checks to guard against overriding a
> > TSC calibration routine with a routine that is less robust/trusted.
> > 
> > Make the CPU calibration routine optional, as Xen (very sanely) doesn't
> > assume the CPU runs as the same frequency as the TSC.
> > 
> > Wrap the helper in an #ifdef to document that the kernel overrides
> > the native routines when running as a VM, and to guard against unwanted
> > usage.  Add a TODO to call out that AMD_MEM_ENCRYPT is a mess and doesn't
> > depend on HYPERVISOR_GUEST because it gates both guest and host code.
> > 
> > No functional change intended.
> > 
> > Reviewed-by: Michael Kelley <mhklinux@outlook.com>
> > Tested-by: Michael Kelley <mhklinux@outlook.com>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> 
> Mildly concerned that we might want to support multiple options — does
> it have CPUID 0x15? Does it have 0x40000x10? Does it have a pvclock?
> There are various permutations of those which are perhaps best handled
> by *trying* each one, in some order, and populating a struct with the
> answers?
> 
> But on the basis that perfect is the enemy of good,

This has been bothering me too.

Aha!  AHA!  Idea.

... 4 hours later ...

Mhahahaahah, victory is mine!!!!

TL;DR: Overriding x86_platform_ops hooks is dumb.

To your point about making an informed decision, that's essentialy what this series
is already doing, just in a very roundabout way:

  1. x86_platform.calibrate_{cpu,tsc}() are initialized to "native" versions
  2. Hypervisor init code runs and conditionally overrides calibrate_{cpu,tsc}()
  3. CoCo init code runs and conditionally overrides calibrate_{cpu,tsc}()

So the ordering you want is already there, as is "trying" each source to some
extent, in the form of steps #2 and #3 overriding the hooks if and only if their
source of information is valid.  For all intents and purposes, the hardening I
was adding by formalizing the calibration overrides was to enforce the above ordering.

But that's obviously all but impossible to follow, _and_ it's pointless.

For every PV case, including TDX and SNP, "calibration" is simply information
retrieval, i.e. it never changes (barring broken hypervisors/firmware), and the
information is always available during early boot.

Contrast that with the pre-CPUID CPU frequency calibration, where the frequency
might change, the kernel is making a best guest based on other timekeeping sources,
and not all timekeeping sources are available during early boot.

And so overriding x86_platform.calibrate_{cpu,tsc}() for PV code is completely
unecessary, because steps #2 and #3 already know the frequency when they override
the hooks, and "success" is guaranteed, i.e. the kernel won't have to switch to a
"late" calibration flow.

If we provide x86_hyper_init hooks:

	unsigned int (*get_tsc_khz)(void);
	unsigned int (*get_cpu_khz)(void);

then we can kill off x86_platform.calibrate_{cpu,tsc}() entirely, explicitly
define the preferred ordering (user-forced => CoCo => Hypervisor => native), and
depup some of the hypervisor code.

E.g. this is what I've got for the early flow.  Testing now. 

  void __init tsc_early_init(void)
  {
	unsigned int known_cpu_khz = 0, known_tsc_khz = 0;

	if (!boot_cpu_has(X86_FEATURE_TSC))
		return;
	/* Don't change UV TSC multi-chassis synchronization */
	if (is_early_uv_system())
		return;

	if (x86_init.hyper.get_cpu_khz)
		known_cpu_khz = x86_init.hyper.get_cpu_khz();

	if (tsc_early_khz)
		known_tsc_khz = tsc_early_khz;
	else if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
		known_tsc_khz = snp_secure_tsc_init();
	else if (boot_cpu_has(X86_FEATURE_TDX_GUEST))
		known_tsc_khz = tdx_tsc_init();

	/*
	 * If the TSC frequency is still unknown, i.e. not provided by the user
	 * or by trusted firmware, try to get it from the hypervisor (which is
	 * untrusted when running as a CoCo guest).
	 */
	if (!known_tsc_khz && x86_init.hyper.get_tsc_khz)
		known_tsc_khz = x86_init.hyper.get_tsc_khz();

	if (known_tsc_khz)
		setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);

	if (!determine_cpu_tsc_frequencies(true, known_cpu_khz, known_tsc_khz))
		return;
	tsc_enable_sched_clock();
  }

  parent reply	other threads:[~2026-05-21 20:53 UTC|newest]

Thread overview: 138+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 19:19 [PATCH v3 00/41] x86: Try to wrangle PV clocks vs. TSC Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 01/41] x86/tsc: Add a standalone helpers for getting TSC info from CPUID.0x15 Sean Christopherson
2026-05-20 18:59   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 02/41] x86/tsc: Add helper to register CPU and TSC freq calibration routines Sean Christopherson
2026-05-15 20:06   ` sashiko-bot
2026-05-18 21:59   ` Woodhouse, David
2026-05-20 17:56     ` Sean Christopherson
2026-05-20 19:04   ` David Woodhouse
2026-05-20 20:44     ` Sean Christopherson
2026-05-20 21:19       ` David Woodhouse
2026-05-20 21:33         ` Sean Christopherson
2026-05-20 21:44           ` David Woodhouse
2026-05-21 20:53     ` Sean Christopherson [this message]
2026-05-21 21:01       ` David Woodhouse
2026-05-21 21:17         ` Sean Christopherson
2026-05-21 21:37           ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 03/41] x86/sev: Mark TSC as reliable when configuring Secure TSC Sean Christopherson
2026-05-20 19:06   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 04/41] x86/sev: Move check for SNP Secure TSC support to tsc_early_init() Sean Christopherson
2026-05-20 19:16   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 05/41] x86/tdx: Override PV calibration routines with CPUID-based calibration Sean Christopherson
2026-05-20 19:52   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 06/41] x86/acrn: Mark TSC frequency as known when using ACRN for calibration Sean Christopherson
2026-05-20 20:01   ` David Woodhouse
2026-05-20 21:02     ` Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 07/41] clocksource: hyper-v: Register sched_clock save/restore iff it's necessary Sean Christopherson
2026-05-20 21:27   ` David Woodhouse
2026-05-27 22:49   ` Wei Liu
2026-05-15 19:19 ` [PATCH v3 08/41] clocksource: hyper-v: Drop wrappers to sched_clock save/restore helpers Sean Christopherson
2026-05-20 21:29   ` David Woodhouse
2026-05-27 22:50   ` Wei Liu
2026-05-15 19:19 ` [PATCH v3 09/41] clocksource: hyper-v: Don't save/restore TSC offset when using HV sched_clock Sean Christopherson
2026-05-20 21:30   ` David Woodhouse
2026-05-27 22:50   ` Wei Liu
2026-05-15 19:19 ` [PATCH v3 10/41] x86/kvmclock: Setup kvmclock for secondary CPUs iff CONFIG_SMP=y Sean Christopherson
2026-05-20 21:45   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 11/41] x86/kvm: Don't disable kvmclock on BSP in syscore_suspend() Sean Christopherson
2026-05-15 20:34   ` sashiko-bot
2026-05-15 22:29     ` Sean Christopherson
2026-05-20 21:51   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 12/41] x86/paravirt: Remove unnecessary PARAVIRT=n stub for paravirt_set_sched_clock() Sean Christopherson
2026-05-20 21:53   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 13/41] x86/paravirt: Move handling of unstable PV clocks into paravirt_set_sched_clock() Sean Christopherson
2026-05-20 21:57   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 14/41] x86/kvmclock: Move sched_clock save/restore helpers up in kvmclock.c Sean Christopherson
2026-05-20 21:59   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 15/41] x86/xen/time: Nullify x86_platform's sched_clock save/restore hooks Sean Christopherson
2026-05-15 19:48   ` sashiko-bot
2026-05-15 22:43     ` Sean Christopherson
2026-05-20 22:11   ` David Woodhouse
2026-05-20 22:54     ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 16/41] x86/vmware: Nullify save/restore hooks when using VMware's sched_clock Sean Christopherson
2026-05-15 19:42   ` sashiko-bot
2026-05-20 22:15   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 17/41] x86/tsc: WARN if TSC sched_clock save/restore used with PV sched_clock Sean Christopherson
2026-05-15 19:55   ` sashiko-bot
2026-05-20 22:27   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 18/41] x86/paravirt: Pass sched_clock save/restore helpers during registration Sean Christopherson
2026-05-15 19:56   ` sashiko-bot
2026-05-20 22:35   ` Woodhouse, David
2026-05-15 19:19 ` [PATCH v3 19/41] x86/kvmclock: Move kvm_sched_clock_init() down in kvmclock.c Sean Christopherson
2026-05-20 22:39   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 20/41] x86/xen/time: Mark xen_setup_vsyscall_time_info() as __init Sean Christopherson
2026-05-20 22:40   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 21/41] x86/pvclock: Mark setup helpers and related various as __init/__ro_after_init Sean Christopherson
2026-05-20 22:43   ` Woodhouse, David
2026-05-15 19:19 ` [PATCH v3 22/41] x86/pvclock: WARN if pvclock's valid_flags are overwritten Sean Christopherson
2026-05-20 22:44   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 23/41] x86/kvmclock: Refactor handling of PVCLOCK_TSC_STABLE_BIT during kvmclock_init() Sean Christopherson
2026-05-20 22:46   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 24/41] timekeeping: Resume clocksources before reading persistent clock Sean Christopherson
2026-05-20 22:52   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 25/41] x86/kvmclock: Hook clocksource.suspend/resume when kvmclock isn't sched_clock Sean Christopherson
2026-05-20 23:01   ` David Woodhouse
2026-05-20 23:06   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 26/41] x86/kvmclock: WARN if wall clock is read while kvmclock is suspended Sean Christopherson
2026-05-20 23:19   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 27/41] x86/kvmclock: Enable kvmclock on APs during onlining if kvmclock isn't sched_clock Sean Christopherson
2026-05-15 19:47   ` sashiko-bot
2026-05-18 23:04     ` Sean Christopherson
2026-05-20 23:27   ` David Woodhouse
2026-05-21 12:59     ` Sean Christopherson
2026-05-21 13:10       ` Peter Zijlstra
2026-05-21 13:38         ` Sean Christopherson
2026-05-21 14:13           ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 28/41] x86/paravirt: Mark __paravirt_set_sched_clock() as __init Sean Christopherson
2026-05-20 23:42   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 29/41] x86/paravirt: Plumb a return code into __paravirt_set_sched_clock() Sean Christopherson
2026-05-15 19:48   ` sashiko-bot
2026-05-18 21:14     ` Sean Christopherson
2026-05-20 23:44   ` David Woodhouse
2026-05-21 20:35     ` Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 30/41] x86/paravirt: Don't use a PV sched_clock in CoCo guests with trusted TSC Sean Christopherson
2026-05-20 23:45   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 31/41] x86/tsc: Pass KNOWN_FREQ and RELIABLE as params to registration Sean Christopherson
2026-05-15 19:45   ` sashiko-bot
2026-05-18 22:18     ` Sean Christopherson
2026-05-19  3:12       ` Michael Kelley
2026-05-20 16:40         ` Sean Christopherson
2026-05-20 19:01           ` Michael Kelley
2026-05-20 23:49   ` Woodhouse, David
2026-05-15 19:19 ` [PATCH v3 32/41] x86/tsc: Rejects attempts to override TSC calibration with lesser routine Sean Christopherson
2026-05-15 20:16   ` sashiko-bot
2026-05-18 19:17     ` Sean Christopherson
2026-05-20 23:50   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 33/41] x86/kvmclock: Mark TSC as reliable when it's constant and nonstop Sean Christopherson
2026-05-20 23:51   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 34/41] KVM: x86: Officially define CPUID 0x40000010 as PV Timing Info (TSC and Bus) Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 35/41] x86/kvmclock: Obtain TSC frequency from CPUID if present Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 36/41] x86/kvmclock: Get local APIC bus frequency from PV CPUID Timing Info Sean Christopherson
2026-05-15 19:55   ` sashiko-bot
2026-05-18 20:57     ` Sean Christopherson
2026-05-20 23:55   ` Woodhouse, David
2026-05-21 20:34     ` Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 37/41] x86/kvmclock: Use TSC for sched_clock if it's constant and non-stop Sean Christopherson
2026-05-15 20:09   ` sashiko-bot
2026-05-18 20:28     ` Sean Christopherson
2026-05-20 23:56   ` David Woodhouse
2026-05-21  9:14   ` Dongli Zhang
2026-05-21 21:01     ` Sean Christopherson
2026-05-15 19:19 ` [PATCH v3 38/41] x86/paravirt: kvmclock: Setup kvmclock early iff it's sched_clock Sean Christopherson
2026-05-20 23:59   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 39/41] x86/paravirt: Move using_native_sched_clock() stub into timer.h Sean Christopherson
2026-05-21  0:00   ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 40/41] x86/tsc: Add standalone helper for getting CPU frequency from CPUID Sean Christopherson
2026-05-15 19:51   ` sashiko-bot
2026-05-15 23:04     ` Sean Christopherson
2026-05-16  7:42   ` Paolo Bonzini
2026-05-20 18:50     ` David Woodhouse
2026-05-15 19:19 ` [PATCH v3 41/41] x86/kvmclock: Get CPU base frequency from CPUID when it's available Sean Christopherson
2026-05-15 19:59   ` sashiko-bot
2026-05-20 21:06     ` Sean Christopherson
2026-05-20 18:52   ` David Woodhouse
2026-05-20 19:06     ` Sean Christopherson
2026-05-18 21:11 ` [PATCH v3 00/41] x86: Try to wrangle PV clocks vs. TSC Sean Christopherson
2026-05-18 23:38 ` David Woodhouse
2026-05-20 17:59   ` Sean Christopherson
2026-05-20 18:30     ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ag9wz3RiJOtVZrK0@google.com \
    --to=seanjc@google.com \
    --cc=ajay.kaher@broadcom.com \
    --cc=alexey.makhalov@broadcom.com \
    --cc=bcm-kernel-feedback-list@broadcom.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=daniel.lezcano@kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=decui@microsoft.com \
    --cc=dwmw2@infradead.org \
    --cc=haiyangz@microsoft.com \
    --cc=jan.kiszka@siemens.com \
    --cc=jgross@suse.com \
    --cc=jstultz@google.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=luto@kernel.org \
    --cc=mhklinux@outlook.com \
    --cc=nikunj@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=sboyd@kernel.org \
    --cc=tglx@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=virtualization@lists.linux.dev \
    --cc=vkuznets@redhat.com \
    --cc=wei.liu@kernel.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.