From: David Woodhouse <dwmw2@infradead.org>
To: Dongli Zhang <dongli.zhang@oracle.com>, kvm@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, paul@xen.org,
tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
linux-kernel@vger.kernel.org, joe.jin@oracle.com
Subject: Re: [PATCH 3/3] KVM: x86: conditionally update masterclock data in pvclock_update_vm_gtod_copy()
Date: Tue, 12 May 2026 06:21:58 +0100 [thread overview]
Message-ID: <94bbb27100ac85508f00234e4ffdb619de855d7c.camel@infradead.org> (raw)
In-Reply-To: <dfc032bb-c17e-45d2-bce2-7e09d8b2b187@oracle.com>
[-- Attachment #1: Type: text/plain, Size: 3362 bytes --]
On Mon, 2026-05-11 at 17:16 -0700, Dongli Zhang wrote:
>
>
> On 5/9/26 5:22 AM, David Woodhouse wrote:
> > On Thu, 2026-01-15 at 12:22 -0800, Dongli Zhang wrote:
> > > The pvclock_update_vm_gtod_copy() function always unconditionally updates
> > > ka->master_kernel_ns and ka->master_cycle_now whenever a
> > > KVM_REQ_MASTERCLOCK_UPDATE occurs. Unfortunately, each masterclock update
> > > increases the risk of kvm-clock drift.
> > >
> > > If pvclock_update_vm_gtod_copy() is not called from
> > > vcpu_enter_guest()-->kvm_update_masterclock(), we keep the existing
> > > workflow. The argument 'forced' is introduced to tell where it is from.
> > >
> > > Otherwise, we avoid updating the masterclock if it is already
> > > active and will remain active. In such cases, updating the masterclock
> > > data is not beneficial and can instead lead to kvm-clock drift.
> > >
> > > As a result, this patch minimizes the chance of unnecessary masterclock
> > > data updates to avoid kvm-clock drift.
> > >
> > > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > > Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> >
> > Hmm... so the only caller of pvclock_update_vm_gtod_copy() that doesn't
> > set the 'force' argument is the one in kvm_update_masterclock(), so we
> > are asserting that kvm_update_masterclock() never needs to *change* the
> > masterclock origin point, if it was already set?
> >
> > The gtod notifier callback in pvclock_gtod_update_fn() also ends up
> > setting KVM_REQ_MASTERCLOCK_UPDATE, and is triggered by an actual host
> > timekeeping update (which could potentially be from a clocksource
> > change). It also hypothetically possible that the clocksource changes
> > from TSC → HPET → TSC, switching back to TSC again before the
> > masterclock update ever gets to run. Or maybe a suspend/resume?
> >
> > Are you *sure* that the optimisation is always valid...?
>
> Thank you very much!
>
> I didn't validate the scenario you mentioned. I missed that scenario because I
> assumed that most production systems nowadays use STABLE/CONSTANT/NONSTOP TSC as
> the host clocksource, although I sometimes forgot to add "clocksource=tsc
> tsc=reliable" to my AMD L1 KVM guest (acting as the hypervisor for L2 guest).
I'd love to assume that, but we do still have to cater for systems
without it (or where it goes wrong). And where userspace sets up vCPUs
with different frequencies... although I'd quite like to ban that too
:)
> I didn't follow up on this patch because I noticed another issue. I found that
> the tsc_timestamp in the PVTI can become a very large number if we simply reboot
> the guest VM. This happens because the patch stops updating the masterclock data
> when the masterclock is already active and remains active.
>
> For example:
>
> current guest TSC: 122763682
> PVTI->tsc_timestamp = 18446744073656540006
> PVTI->system_time=196515164269
>
> Although I could not reproduce any bug, that still made me feel uncomfortable.
That's just negative (-53011610). Theoretically it's OK; it's just a
consequence of using a reference point in the past. Probably just
*asking* for guest bugs though, so best avoided.
I'd like to understand how it got like that though. Did your 'reboot'
reset the guest TSC to zero but *not* its kvmclock?
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
next prev parent reply other threads:[~2026-05-12 5:22 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-15 20:22 [PATCH 0/3] KVM: x86: Mitigate kvm-clock drift caused by masterclock update Dongli Zhang
2026-01-15 20:22 ` [PATCH 1/3] KVM: x86: Fix compute_guest_tsc() to cope with negative delta Dongli Zhang
2026-01-15 20:22 ` [PATCH 2/3] KVM: x86: conditionally clear KVM_REQ_MASTERCLOCK_UPDATE at the end of KVM_SET_CLOCK Dongli Zhang
2026-05-09 20:04 ` David Woodhouse
2026-05-12 0:21 ` Dongli Zhang
2026-05-12 7:19 ` David Woodhouse
2026-01-15 20:22 ` [PATCH 3/3] KVM: x86: conditionally update masterclock data in pvclock_update_vm_gtod_copy() Dongli Zhang
2026-05-09 12:22 ` David Woodhouse
2026-05-12 0:16 ` Dongli Zhang
2026-05-12 5:21 ` David Woodhouse [this message]
2026-05-12 23:23 ` Dongli Zhang
2026-01-15 20:37 ` [PATCH 0/3] KVM: x86: Mitigate kvm-clock drift caused by masterclock update Dongli Zhang
2026-01-15 21:13 ` David Woodhouse
2026-01-16 9:31 ` Dongli Zhang
2026-01-22 5:01 ` Dongli Zhang
2026-01-24 1:31 ` David Woodhouse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=94bbb27100ac85508f00234e4ffdb619de855d7c.camel@infradead.org \
--to=dwmw2@infradead.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=dongli.zhang@oracle.com \
--cc=hpa@zytor.com \
--cc=joe.jin@oracle.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paul@xen.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=tglx@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox