From: Dongli Zhang <dongli.zhang@oracle.com>
To: David Woodhouse <dwmw2@infradead.org>, kvm@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, paul@xen.org,
tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
linux-kernel@vger.kernel.org, joe.jin@oracle.com
Subject: Re: [PATCH 2/3] KVM: x86: conditionally clear KVM_REQ_MASTERCLOCK_UPDATE at the end of KVM_SET_CLOCK
Date: Mon, 11 May 2026 17:21:12 -0700 [thread overview]
Message-ID: <3363ed64-75f9-4a5c-93b9-882d72bfb7f2@oracle.com> (raw)
In-Reply-To: <f37f0ae0ae1dfce3ad3c6fce653f5df34adecc0a.camel@infradead.org>
On 5/9/26 1:04 PM, David Woodhouse wrote:
> On Thu, 2026-01-15 at 12:22 -0800, Dongli Zhang wrote:
>> The KVM_SET_CLOCK command calls pvclock_update_vm_gtod_copy() to update the
>> masterclock data.
>>
>> Many vCPUs may already have KVM_REQ_MASTERCLOCK_UPDATE pending before
>> KVM_SET_CLOCK is invoked. If pvclock_update_vm_gtod_copy() decides to use
>> the masterclock, there is no need to update the masterclock multiple times
>> afterward. As noted in commit c52ffadc65e2 ("KVM: x86: Don't unnecessarily
>> force masterclock update on vCPU hotplug"), each unnecessary
>> KVM_REQ_MASTERCLOCK_UPDATE can cause the kvm-clock time to jump.
>>
>> Therefore, clear KVM_REQ_MASTERCLOCK_UPDATE for each vCPU at the end of
>> KVM_SET_CLOCK when the master clock is active. The 'tsc_write_lock' ensures
>> that only requests issued before KVM_SET_CLOCK are cleared.
>
> Hm, I think we should do this in kvm_make_mclock_inprogress_request()
> instead.
>
> Currently, things like pvclock_gtod_update_fn() and one or two others
> set KVM_REQ_MASTERCLOCK_UPDATE for all vCPUs. Not just one, because we
> don't want *any* vCPU going back into guest mode before doing the work.
>
> So they could *all* end up calling into kvm_masterclock_update() at the
> same time. I have a vague recollection there was once a mutex there,
> but now there isn't.
>
> So all vCPUs can each, in parallel, set KVM_REQ_MCLOCK_INPROGRESS on
> all other vCPUs. That's an unhandled request which just makes a vCPU
> spin (preventing it from entering the guest until the clock update
> completes).
>
> Then each vCPU tries to take the kvm->arch.tsc_write_lock in
> kvm_start_pvclock_update(), and from this point they'll be
> serialized... but every vCPU will go through the whole of
> pvclock_update_vm_gtod_copy() for itself, updating the masterclock one
> more time for each vCPU in the system?
>
> I came here to say that kvm_make_mclock_inprogress_request() should
> probably clear KVM_REQ_MASTERCLOCK_UPDATE on all other vCPUs, since
> it's about to do the work, and the other vCPUs will no longer need to.
>
> But... it's more broken than that now.
>
> I think maybe vcpu_enter_guest() should call kvm_update_masterclock()
> if *kvm_test_request(KVM_REQ_MASTERCLOCK_UPDATE)* (i.e. not *clear* the
> bit by using kvm_check_request()).
>
> And then after getting ->arch.tsc_write_lock,
> kvm_start_pvclock_update() should check if KVM_REQ_MASTERCLOCK_UPDATE
> is *still* set, and only proceed with the update if it is?
>
> I'll play with it some more... something like this perhaps?
>
> From 91e37d74ee267c722bc2c8f26754fcdfbc040e29 Mon Sep 17 00:00:00 2001
> From: David Woodhouse <dwmw@amazon.co.uk>
> Date: Sat, 9 May 2026 16:54:01 +0100
> Subject: [PATCH] KVM: x86: Avoid redundant masterclock updates from multiple
> vCPUs
>
> When a masterclock update is triggered (e.g. by the clocksource change
> notifier), KVM_REQ_MASTERCLOCK_UPDATE is set on all vCPUs. Without this
> fix, each vCPU independently processes the request and redundantly
> re-executes the entire pvclock_update_vm_gtod_copy() sequence, serialized
> only by tsc_write_lock. Each redundant re-snapshot of the master clock
> reference point introduces potential clock drift.
>
> Fix this by having __kvm_start_pvclock_update() check, after acquiring
> the lock, whether the requesting vCPU's KVM_REQ_MASTERCLOCK_UPDATE is
> still set. If another vCPU already did the update and cleared it, bail
> out. Otherwise, clear the request on all other vCPUs before proceeding.
>
> The caller in vcpu_enter_guest() now uses kvm_test_request() (non-clearing)
> since the clearing is done inside __kvm_start_pvclock_update() under the
> lock.
>
> Suggested-by: Dongli Zhang <dongli.zhang@oracle.com>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Thank you very much! Feel free to add me as Suggested-by or Reported-by.
I will also help test your large patch series.
Regarding the issue, I would like to confirm whether there can still be any
KVM_REQ_MASTERCLOCK_UPDATE after KVM_SET_CLOCK_GUEST.
Per my understanding, I would expect KVM_SET_CLOCK_GUEST to be the last point
where the masterclock can be updated (during live migration or live update).
After KVM_SET_CLOCK_GUEST, any further masterclock change may cause kvm-clock drift.
Dongli Zhang
next prev parent reply other threads:[~2026-05-12 0:21 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260115202256.119820-1-dongli.zhang@oracle.com>
[not found] ` <20260115202256.119820-4-dongli.zhang@oracle.com>
2026-05-09 12:22 ` [PATCH 3/3] KVM: x86: conditionally update masterclock data in pvclock_update_vm_gtod_copy() David Woodhouse
2026-05-12 0:16 ` Dongli Zhang
2026-05-12 5:21 ` David Woodhouse
2026-05-12 23:23 ` Dongli Zhang
[not found] ` <20260115202256.119820-3-dongli.zhang@oracle.com>
2026-05-09 20:04 ` [PATCH 2/3] KVM: x86: conditionally clear KVM_REQ_MASTERCLOCK_UPDATE at the end of KVM_SET_CLOCK David Woodhouse
2026-05-12 0:21 ` Dongli Zhang [this message]
2026-05-12 7:19 ` David Woodhouse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3363ed64-75f9-4a5c-93b9-882d72bfb7f2@oracle.com \
--to=dongli.zhang@oracle.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=dwmw2@infradead.org \
--cc=hpa@zytor.com \
--cc=joe.jin@oracle.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paul@xen.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=tglx@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox