From: Sean Christopherson <seanjc@google.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Thomas Lefebvre <thomas.lefebvre3@gmail.com>,
	pbonzini@redhat.com, kvm@vger.kernel.org,
	 linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
	 Michael Kelley <mhklinux@outlook.com>
Subject: Re: [BUG] KVM: x86: kvmclock jumps ~253 years on Hyper-V nested virt due to cross-CPU raw TSC inconsistency
Date: Tue, 7 Apr 2026 09:44:51 -0700
Message-ID: <adU0g-trGz8CRjeM@google.com>
In-Reply-To: <adU0LAW1h8q9HsGu@google.com>

On Tue, Apr 07, 2026, Sean Christopherson wrote:
> +Michael

Let's try that again.  Email address #1 bounced.

> On Tue, Apr 07, 2026, Vitaly Kuznetsov wrote:
> > Thomas Lefebvre <thomas.lefebvre3@gmail.com> writes:
> > > Under Hyper-V, raw RDTSC values are not consistent across vCPUs.
> > > The hypervisor corrects them only through the TSC page scale/offset.
> > > If pvclock_update_vm_gtod_copy() runs on CPU 0 and __get_kvmclock()
> > > later runs on CPU 1 where the raw TSC is lower, the unsigned
> > > subtraction wraps.
> > >
> > 
> > According to the TLFS, reference TSC page is partition wide:
> > 
> > "The hypervisor provides a partition-wide virtual reference TSC page
> > which is overlaid on the partition’s GPA space. A partition’s reference
> > time stamp counter page is accessed through the Reference TSC MSR."
> > 
> > so if as you say RAW rdtsc value is inconsistent across vCPUs, I can
> > hardly see how we can use this time source at all, even without
> > KVM. scale/offset are the same for all vCPUs.
> > 
> > I think the fix here is to avoid setting up Hyper-V TSC page clocksource
> > in L1. Unfortunately, with unsynchronized TSCs this will leave us the
> > only choice for a sane clocksource: raw HV_X64_MSR_TIME_REF_COUNT MSR
> > reads.
> 
> This feels like either a Hyper-V bug or a Linux-as-a-guest bug.  For "Reference
> Counter"[1]:
> 
>   The hypervisor maintains a per-partition reference time counter. It has the
>   characteristic that successive accesses to it return strictly monotonically
>   increasing (time) values as seen by any and all virtual processors of a
>   partition. Furthermore, the reference counter is rate constant and unaffected
>   by processor or bus speed transitions or deep processor power savings states. A
>   partition’s reference time counter is initialized to zero when the partition is
>   created. The reference counter for all partitions count at the same rate, but
>   at any time, their absolute values will typically differ because partitions
>   will have different creation times.
>   
>   The reference counter continues to count up as long as at least one virtual
>   processor is not explicitly suspended.
> 
> 
> And then "Partition Reference Time Enlightenment"[2]:
> 
>   The partition reference time enlightenment presents a reference time source to
>   a partition which does not require an intercept into the hypervisor. This
>   enlightenment is available only when the underlying platform provides support
>   of an invariant processor Time Stamp Counter (TSC), or iTSC. In such platforms,
>   the processor TSC frequency remains constant irrespective of changes in the
>   processor’s clock frequency due to the use of power management states such as
>   ACPI processor performance states, processor idle sleep states (ACPI C-states),
>   etc.
> 
>   The partition reference time enlightenment uses a virtual TSC value, an offset
>   and a multiplier to enable a guest partition to compute the normalized
>   reference time since partition creation, in 100nS units. The mechanism also
>   allows a guest partition to atomically compute the reference time when the
>   guest partition is migrated to a platform with a different TSC rate, and
>   provides a fallback mechanism to support migration to platforms without the
>   constant rate TSC feature.
> 
> My read of "Partition Reference Time Enlightenment" is that it should only be
> advertised if the TSC is synchronized and constant.  I can't figure out where
> that feature is actually advertised though, because IIUC it's not the same as
> HV_ACCESS_TSC_INVARIANT, which says that the virtual TSC is guaranteed to be
> invariant even across live migration.  And it's not HV_MSR_REFERENCE_TSC_AVAILABLE,
> because I'm pretty sure that just says HV_MSR_REFERENCE_TSC is available.
> 
> Michael, help?
> 
> [1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/timers#reference-counter
> [2] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/timers#partition-reference-time-enlightenment
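
As a minimal user-space sketch (not kernel code) of the two pieces discussed
above: the TLFS reference-time computation from the TSC page's scale and
offset, and the unsigned delta that wraps when a later raw TSC read is lower
than an earlier snapshot.  The helper names hv_ref_time() and tsc_delta() are
hypothetical, and unsigned __int128 assumes GCC or Clang.

```c
#include <stdint.h>

/*
 * Reference time in 100ns units from the TSC page fields, following the
 * TLFS formula: ((tsc * scale) >> 64) + offset.  The 128-bit intermediate
 * relies on a GCC/Clang extension (assumption, not portable C).
 */
static uint64_t hv_ref_time(uint64_t tsc, uint64_t scale, int64_t offset)
{
	return (uint64_t)(((unsigned __int128)tsc * scale) >> 64) +
	       (uint64_t)offset;
}

/*
 * Unsigned delta against an earlier snapshot.  If 'now' was read on a CPU
 * whose raw TSC is behind the CPU that took 'snap', the result wraps
 * modulo 2^64 instead of going negative.
 */
static uint64_t tsc_delta(uint64_t now, uint64_t snap)
{
	return now - snap;
}
```

With snap = 1000000 taken on one CPU and now = 999000 read on another,
tsc_delta() returns 2^64 - 1000 instead of a small negative number, so any
time value derived from it lands far in the future.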

Thread overview: 9+ messages
2026-04-05 22:10 [BUG] KVM: x86: kvmclock jumps ~253 years on Hyper-V nested virt due to cross-CPU raw TSC inconsistency Thomas Lefebvre
2026-04-06 14:11 ` Sean Christopherson
2026-04-07  8:23   ` Vitaly Kuznetsov
2026-04-07  8:17 ` Vitaly Kuznetsov
2026-04-07 16:43   ` Sean Christopherson
2026-04-07 16:44     ` Sean Christopherson [this message]
2026-04-07 18:37     ` Michael Kelley
2026-04-07 19:13       ` Thomas Lefebvre
2026-04-07 20:40         ` Michael Kelley
