kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Borislav Petkov <bp@alien8.de>
Cc: "Nikunj A. Dadhania" <nikunj@amd.com>,
	linux-kernel@vger.kernel.org, thomas.lendacky@amd.com,
	 x86@kernel.org, kvm@vger.kernel.org, mingo@redhat.com,
	tglx@linutronix.de,  dave.hansen@linux.intel.com,
	pgonda@google.com, pbonzini@redhat.com,
	 francescolavra.fl@gmail.com,
	Alexey Makhalov <alexey.makhalov@broadcom.com>,
	 Juergen Gross <jgross@suse.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [PATCH v16 12/13] x86/tsc: Switch to native sched clock
Date: Fri, 17 Jan 2025 12:59:37 -0800	[thread overview]
Message-ID: <Z4rEuTonLal7Li1O@google.com> (raw)
In-Reply-To: <20250117202848.GAZ4q9gMHorhVMfvM0@fat_crate.local>

On Fri, Jan 17, 2025, Borislav Petkov wrote:
> On Thu, Jan 16, 2025 at 08:56:25AM -0800, Sean Christopherson wrote:
> > It's only with SNP and TDX that the clocksource becomes at all interesting.
> 
> So basically you're saying, let's just go ahead and trust the TSC when the HV
> sets a bunch of CPUID bits.

Sort of.  It's not a trust thing though.  The Xen, KVM, and VMware PV clocks are
all based on TSC, i.e. we already "trust" the hypervisor to not muck with TSC.

The purpose of such PV clocks is to account for things in software, that aren't
handled in hardware.  E.g. to provide a constant counter on hardware without a
constant TSC.

The proposal here, and what kvmclock already does for clocksource, is to use the
raw TSC when the hypervisor sets bits that effectively say that the massaging of
TSC done by the PV clock isn't needed.

> But we really really trust it when the guest type is SNP+STSC or TDX since
> there the HV is out of the picture and the only one who can flub it there is
> the OEM.

Yep.  This one _is_ about trust.  Specifically, we trust the raw TSC more than
any clock that is provided by the HV.

> > CPUID 0x15 (and 0x16?) is guaranteed to be available under TDX, and Secure TSC
> > would ideally assert that the kernel doesn't switch to some other calibration
> > method too.  Not sure where to hook into that though, without bleeding TDX and
> > SNP details everywhere.
> 
> We could use the platform calibrate* function pointers and assign TDX- or
> SNP-specific ones and perhaps even define new such function ptrs. That's what
> the platform stuff is for... needs staring, ofc.
> 
> > I agree the naming is weird, but outside of the vendor checks, the VM code is
> > identical to the "native" code, so I don't know that it's worth splitting into
> > multiple functions.
> > 
> > What if we simply rename it to calibrate_tsc_from_cpuid()?
> 
> This is all wrong layering with all those different guest types having their
> own ->calibrate_tsc:
> 
> arch/x86/kernel/cpu/acrn.c:32:  x86_platform.calibrate_tsc = acrn_get_tsc_khz;
> arch/x86/kernel/cpu/mshyperv.c:424:             x86_platform.calibrate_tsc = hv_get_tsc_khz;
> arch/x86/kernel/cpu/vmware.c:419:               x86_platform.calibrate_tsc = vmware_get_tsc_khz;
> arch/x86/kernel/jailhouse.c:213:        x86_platform.calibrate_tsc              = jailhouse_get_tsc;
> arch/x86/kernel/kvmclock.c:323: x86_platform.calibrate_tsc = kvm_get_tsc_khz;
> arch/x86/kernel/tsc.c:944:      tsc_khz = x86_platform.calibrate_tsc();
> arch/x86/kernel/tsc.c:1458:                     tsc_khz = x86_platform.calibrate_tsc();
> arch/x86/kernel/x86_init.c:148: .calibrate_tsc                  = native_calibrate_tsc,
> arch/x86/xen/time.c:569:        x86_platform.calibrate_tsc = xen_tsc_khz;
> 
> What you want sounds like a redesign to me considering how you want to keep
> the KVM guest code and baremetal pretty close... Hmmm, needs staring...

It's not KVM guest code though.  The CPUID stuff is Intel's architecturally
defined behavior.  There are oodles and oodles of features that are transparently
emulated by the hypervisor according to hardware specifications.  Generally
speaking, the kernel treats those as "native", e.g. native_wrmsrl(), native_cpuid(),
etc.

What I am proposing is that, for TDX especially, instead of relying on the hypervisor
to use a paravirtual channel for communicating the TSC frequency, we rely on the
hardware-defined way of getting the frequency, because CPUID is emulated by the
trusted entity, i.e. the OEM.

Hmm, though I suppose I'm arguing against myself in that case.  If the hypervisor
provides the frequency and there are no trust issues, why would we care if the
kernel gets the frequency via CPUID or the PV channel.  It's really only TDX that
matters.  And we could handle TDX by overriding .calibrate_tsc() in tsc_init(),
same as Secure TSC.

That said, I do think it makes sense to either override the vendor and F/M/S
checks native_calibrate_tsc().  Or even better drop the initial vendor check
entirely, because both Intel and AMD have a rich history of implementing each
other's CPUID leaves.  I.e. I see no reason to ignore CPUID 0x15 just because
the CPU isn't Intel.

As for the Goldmost F/M/S check, that one is a virtualization specific thing.
The argument is that when running as a guest, any non-TSC clocksource is going
to be emulated by the hypervisor, and therefore is going to be less reliable than
TSC.  I.e. putting a watchdog on TSC does more harm than good, because what ends
up happening is the TSC gets marked unreliable because the *watchdog* is unreliable.

  reply	other threads:[~2025-01-17 20:59 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-06 12:46 [PATCH v16 00/13] Add Secure TSC support for SNP guests Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 01/13] virt: sev-guest: Remove is_vmpck_empty() helper Nikunj A Dadhania
2025-01-07 18:38   ` Tom Lendacky
2025-01-06 12:46 ` [PATCH v16 02/13] virt: sev-guest: Replace GFP_KERNEL_ACCOUNT with GFP_KERNEL Nikunj A Dadhania
2025-01-07 18:40   ` Tom Lendacky
2025-01-06 12:46 ` [PATCH v16 03/13] x86/sev: Carve out and export SNP guest messaging init routines Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 04/13] x86/sev: Relocate SNP guest messaging routines to common code Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 05/13] x86/sev: Add Secure TSC support for SNP guests Nikunj A Dadhania
2025-01-07 10:42   ` Borislav Petkov
2025-01-07 11:43     ` Nikunj A. Dadhania
2025-01-07 12:37       ` Borislav Petkov
2025-01-07 18:53         ` Tom Lendacky
2025-01-07 19:18           ` Borislav Petkov
2025-01-08  7:47             ` Nikunj A. Dadhania
2025-01-08  8:05               ` Borislav Petkov
2025-01-08  8:37                 ` Nikunj A. Dadhania
2025-01-08  8:43                   ` Borislav Petkov
2025-01-07 19:46   ` Tom Lendacky
2025-01-07 19:53     ` Borislav Petkov
2025-01-06 12:46 ` [PATCH v16 06/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests Nikunj A Dadhania
2025-01-07 20:09   ` Tom Lendacky
2025-01-06 12:46 ` [PATCH v16 07/13] x86/sev: Prevent GUEST_TSC_FREQ MSR interception " Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 08/13] x86/sev: Prevent RDTSC/RDTSCP " Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 09/13] x86/sev: Mark Secure TSC as reliable clocksource Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 10/13] x86/tsc: Switch Secure TSC guests away from kvm-clock Nikunj A Dadhania
2025-01-07 15:16   ` Borislav Petkov
2025-01-08 10:45     ` Nikunj A. Dadhania
2025-01-06 12:46 ` [PATCH v16 11/13] x86/tsc: Upgrade TSC clocksource rating for guests Nikunj A Dadhania
2025-01-07 17:51   ` Borislav Petkov
2025-01-06 12:46 ` [PATCH v16 12/13] x86/tsc: Switch to native sched clock Nikunj A Dadhania
2025-01-07 19:37   ` Borislav Petkov
2025-01-08  5:20     ` Nikunj A. Dadhania
2025-01-08  8:22       ` Borislav Petkov
2025-01-08  8:34         ` Nikunj A. Dadhania
2025-01-08 10:20           ` Nikunj A. Dadhania
2025-01-08 14:00             ` Sean Christopherson
2025-01-08 15:34               ` Borislav Petkov
2025-01-08 17:02                 ` Sean Christopherson
2025-01-08 19:53                   ` Borislav Petkov
2025-01-09  6:32                   ` Nikunj A. Dadhania
2025-01-15 21:37                     ` Sean Christopherson
2025-01-16 16:25                       ` Borislav Petkov
2025-01-16 16:56                         ` Sean Christopherson
2025-01-17 20:28                           ` Borislav Petkov
2025-01-17 20:59                             ` Sean Christopherson [this message]
2025-01-21 11:32                               ` Borislav Petkov
2025-01-21  3:59                       ` Nikunj A. Dadhania
2025-01-28  5:41                       ` Nikunj A Dadhania
2025-01-06 12:46 ` [PATCH v16 13/13] x86/sev: Allow Secure TSC feature for SNP guests Nikunj A Dadhania

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4rEuTonLal7Li1O@google.com \
    --to=seanjc@google.com \
    --cc=alexey.makhalov@broadcom.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=francescolavra.fl@gmail.com \
    --cc=jgross@suse.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nikunj@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=pgonda@google.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).