The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Dongli Zhang <dongli.zhang@oracle.com>
To: David Woodhouse <dwmw2@infradead.org>, kvm@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, paul@xen.org,
	tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	linux-kernel@vger.kernel.org, joe.jin@oracle.com
Subject: Re: [PATCH 3/3] KVM: x86: conditionally update masterclock data in pvclock_update_vm_gtod_copy()
Date: Mon, 11 May 2026 17:16:21 -0700	[thread overview]
Message-ID: <dfc032bb-c17e-45d2-bce2-7e09d8b2b187@oracle.com> (raw)
In-Reply-To: <16b69905d5d14a06dbb68f29355c58e2c7d2f0d2.camel@infradead.org>



On 5/9/26 5:22 AM, David Woodhouse wrote:
> On Thu, 2026-01-15 at 12:22 -0800, Dongli Zhang wrote:
>> The pvclock_update_vm_gtod_copy() function always unconditionally updates
>> ka->master_kernel_ns and ka->master_cycle_now whenever a
>> KVM_REQ_MASTERCLOCK_UPDATE occurs. Unfortunately, each masterclock update
>> increases the risk of kvm-clock drift.
>>
>> If pvclock_update_vm_gtod_copy() is not called from
>> vcpu_enter_guest()-->kvm_update_masterclock(), we keep the existing
>> workflow. The argument 'forced' is introduced to tell where it is from.
>>
>> Otherwise, we avoid updating the masterclock if it is already
>> active and will remain active. In such cases, updating the masterclock
>> data is not beneficial and can instead lead to kvm-clock drift.
>>
>> As a result, this patch minimizes the chance of unnecessary masterclock
>> data updates to avoid kvm-clock drift.
>>
>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> 
> Hmm... so the only caller of pvclock_update_vm_gtod_copy() that doesn't
> set the 'force' argument is the one in kvm_update_masterclock(), so we
> are asserting that kvm_update_masterclock() never needs to *change* the
> masterclock origin point, if it was already set?
> 
> The gtod notifier callback in pvclock_gtod_update_fn() also ends up
> setting KVM_REQ_MASTERCLOCK_UPDATE, and is triggered by an actual host
> timekeeping update (which could potentially be from a clocksource
> change). It also hypothetically possible that the clocksource changes
> from TSC → HPET → TSC, switching back to TSC again before the
> masterclock update ever gets to run. Or maybe a suspend/resume?
> 
> Are you *sure* that the optimisation is always valid...?

Thank you very much!

I didn't validate the scenario you mentioned. I missed that scenario because I
assumed that most production systems nowadays use STABLE/CONSTANT/NONSTOP TSC as
the host clocksource, although I sometimes forgot to add "clocksource=tsc
tsc=reliable" to my AMD L1 KVM guest (acting as the hypervisor for L2 guest).

I didn't follow up on this patch because I noticed another issue. I found that
the tsc_timestamp in the PVTI can become a very large number if we simply reboot
the guest VM. This happens because the patch stops updating the masterclock data
when the masterclock is already active and remains active.

For example:

current guest TSC: 122763682
PVTI->tsc_timestamp = 18446744073656540006
PVTI->system_time=196515164269

Although I could not reproduce any bug, that still made me feel uncomfortable.

I think the patch you posted for PATCH 2/3 can fix the issue mentioned in this
patch.

https://lore.kernel.org/kvm/f37f0ae0ae1dfce3ad3c6fce653f5df34adecc0a.camel@infradead.org

I would also test with your patchset, if the unnecessary masterclock update is
avoided or minimized to one time.

https://lore.kernel.org/kvm/20260509224824.3264567-27-dwmw2@infradead.org/

Thank you very much!

Dongli Zhang

  reply	other threads:[~2026-05-12  0:16 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260115202256.119820-1-dongli.zhang@oracle.com>
     [not found] ` <20260115202256.119820-4-dongli.zhang@oracle.com>
2026-05-09 12:22   ` [PATCH 3/3] KVM: x86: conditionally update masterclock data in pvclock_update_vm_gtod_copy() David Woodhouse
2026-05-12  0:16     ` Dongli Zhang [this message]
2026-05-12  5:21       ` David Woodhouse
     [not found] ` <20260115202256.119820-3-dongli.zhang@oracle.com>
2026-05-09 20:04   ` [PATCH 2/3] KVM: x86: conditionally clear KVM_REQ_MASTERCLOCK_UPDATE at the end of KVM_SET_CLOCK David Woodhouse
2026-05-12  0:21     ` Dongli Zhang
2026-05-12  7:19       ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dfc032bb-c17e-45d2-bce2-7e09d8b2b187@oracle.com \
    --to=dongli.zhang@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=hpa@zytor.com \
    --cc=joe.jin@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox