public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Vishal Annapurve <vannapurve@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Jun Nakajima <jun.nakajima@intel.com>,
	Isaku Yamahata <isaku.yamahata@intel.com>,
	Erdem Aktas <erdemaktas@google.com>,
	Sagi Shahar <sagis@google.com>,
	Nikolay Borisov <nik.borisov@suse.com>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/tdx: Override the tsc calibration for TDX VMs
Date: Fri, 13 Oct 2023 16:02:21 -0700	[thread overview]
Message-ID: <ZSnMfS9A9HgW--YE@google.com> (raw)
In-Reply-To: <ZSnIKQ4bUavAtBz6@google.com>

On Fri, Oct 13, 2023, Sean Christopherson wrote:
> On Fri, Oct 06, 2023, Vishal Annapurve wrote:
> > TSC calibration for native execution gets the TSC frequency from CPUID,
> > but also ends up setting lapic_timer_period. When using oneshot mode
> > with lapic timer, predefined value of lapic_timer_period causes lapic
> > timer calibration to be skipped with wrong multipliers set for lapic
> > timer.
> > 
> > To avoid this issue, override the TSC calibration step for TDX VMs to
> > just calculate the TSC frequency using cpuid values.
> 
> This is a hack to workaround a KVM TDX bug.  Per Intel's SDM:
> 
>   The APIC timer frequency will be the processor’s bus clock or core crystal
>   clock frequency (when TSC/core crystal clock ratio is enumerated in CPUID
>   leaf 0x15) divided by the value specified in the divide configuration register.
> 
> TDX hardcodes the core crystal frequency to 25Mhz, whereas KVM hardcodes the APIC
> bus frequency to 1Ghz.  Upstream KVM's *current* behavior is fine, because KVM
> doesn't advertise support for CPUID 0x15, i.e. doesn't announce to host userspace
> that it's safe to expose CPUID 0x15 to the guest.  Because TDX makes exposing
> CPUID 0x15 mandatory, KVM needs to be taught to correctly emulate the guest's
> APIC bus frequency, a.k.a. the TDX guest core crystal frequency of 25Mhz.
> 
> I.e. tmict_to_ns() needs to replace APIC_BUS_CYCLE_NS with some math that makes
> the guest's APIC timer actually run at 25Mhz given whatever the host APIC bus
> runs at.
> 
>   static inline u64 tmict_to_ns(struct kvm_lapic *apic, u32 tmict)
>   {
> 	return (u64)tmict * APIC_BUS_CYCLE_NS * (u64)apic->divide_count;
>   }
> 
> The existing guest code "works" because the calibration code effectively discovers
> the host APIC bus frequency.  If we really want to force calibration, then the
> best way to do that would be to add a command line option to do exactly that, not
> hack around a KVM TDX bug.
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 15f97c0abc9d..ce1cec6b3c18 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -723,7 +723,8 @@ unsigned long native_calibrate_tsc(void)
>          * lapic_timer_period here to avoid having to calibrate the APIC
>          * timer later.
>          */
> -       lapic_timer_period = crystal_khz * 1000 / HZ;
> +       if (!force_lapic_timer_calibration)
> +               lapic_timer_period = crystal_khz * 1000 / HZ;
>  #endif
>  
>         return crystal_khz * ebx_numerator / eax_denominator;
> 
> But I would be very leery of forcing calibration, as effectively calibrating to
> the *host* core crystal frequency will cause the guest APIC timer to be wrong if
> the VM is migrated to a host with a different core crystal frequency.  Relying
> on CPUID 0x15, if it's available, avoids that problem because it puts the onus on
> the hypervisor to account for the new host's frequency when emulating the APIC
> timer.  That mess exists today, but deliberate ignoring the mechanism that allows
> the host to fix the mess would be asinine IMO.

Gah, I had a brainfart.  tmict_to_ns() obviously converts TMICT to nanoseconds,
and then converts nanoseconds into whatever frequency the underlying host timer
runs at.

So it's really only TDX that is problematic, as TDX demands the APIC bus be
emulated at a frequency other than 1Ghz.

  reply	other threads:[~2023-10-13 23:02 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-06  1:12 [PATCH] x86/tdx: Override the tsc calibration for TDX VMs Vishal Annapurve
2023-10-06 10:43 ` Huang, Kai
2023-10-06 15:29   ` Vishal Annapurve
2023-10-06 14:02 ` Dave Hansen
2023-10-06 15:27   ` Vishal Annapurve
2023-10-06 15:43     ` Dave Hansen
2023-10-13 22:43 ` Sean Christopherson
2023-10-13 23:02   ` Sean Christopherson [this message]
2023-10-21  1:54     ` Vishal Annapurve

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSnMfS9A9HgW--YE@google.com \
    --to=seanjc@google.com \
    --cc=Jason@zx2c4.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=erdemaktas@google.com \
    --cc=hpa@zytor.com \
    --cc=isaku.yamahata@intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nik.borisov@suse.com \
    --cc=peterz@infradead.org \
    --cc=sagis@google.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=tglx@linutronix.de \
    --cc=vannapurve@google.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox