qemu-devel.nongnu.org archive mirror
From: Dongli Zhang <dongli.zhang@oracle.com>
To: David Woodhouse <dwmw2@infradead.org>, qemu-devel@nongnu.org
Cc: kvm@vger.kernel.org
Subject: Re: Should QEMU (accel=kvm) kvm-clock/guest_tsc stop counting during downtime blackout?
Date: Thu, 25 Sep 2025 12:42:10 -0700
Message-ID: <848f7a55-7c68-445c-86fd-29530837b8f3@oracle.com>
In-Reply-To: <acca55a49bad023fad30625fc81e19ef1c3d0ed8.camel@infradead.org>



On 9/25/25 1:44 AM, David Woodhouse wrote:
> On Wed, 2025-09-24 at 13:53 -0700, Dongli Zhang wrote:
>>
>>
>> On 9/23/25 10:47 AM, David Woodhouse wrote:
>>> On Tue, 2025-09-23 at 10:25 -0700, Dongli Zhang wrote:
>>>>
>>>>
>>>> On 9/23/25 9:26 AM, David Woodhouse wrote:
>>>>> On Mon, 2025-09-22 at 12:37 -0700, Dongli Zhang wrote:
>>>>>> On 9/22/25 11:16 AM, David Woodhouse wrote:
>>>>
>>>> [snip]
>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> As demonstrated in my test, currently guest_tsc doesn't stop counting during
>>>>>>>> blackout because of the lack of "MSR_IA32_TSC put" at
>>>>>>>> kvmclock_vm_state_change(). Per my understanding, it is a bug and we may need to
>>>>>>>> fix it.
>>>>>>>>
>>>>>>>> BTW, kvmclock_vm_state_change() already utilizes KVM_SET_CLOCK to re-configure
>>>>>>>> kvm-clock before continuing the guest VM.
>>>>>
>>>>> Yeah, right now it's probably just introducing errors for a stop/start
>>>>> of the VM.
>>>>
>>>> But doesn't that help meet the expectation?
>>>>
>>>> Thanks to KVM_GET_CLOCK and KVM_SET_CLOCK, QEMU saves the clock with
>>>> KVM_GET_CLOCK when the VM is stopped, and restores it with KVM_SET_CLOCK when
>>>> the VM is continued.
>>>
>>> It saves the actual *value* of the clock. I would prefer to phrase that
>>> as "it makes the clock jump backwards to the time at which the guest
>>> was paused".
>>>
>>>> This ensures that the clock value itself does not change between stop and cont.
>>>>
>>>> However, QEMU does not adjust the TSC offset via MSR_IA32_TSC during stop.
>>>>
>>>> As a result, when execution resumes, the guest TSC suddenly jumps forward.
>>>
>>> Oh wow, that seems really broken. If we're going to make it experience
>>> a time warp, we should at least be *consistent*.
>>>
>>> So a guest which uses the TSC for timekeeping should be mostly
>>> unaffected by this and its wallclock should still be accurate. A guest
>>> which uses the KVM clock will be hosed by it.
>>>
>>> I think we should fix this so that the KVM clock is unaffected too.
>>
>> From my understanding of your reply, the kvm-clock/tsc should always be adjusted
>> whenever a QEMU VM is paused and then resumed (i.e. via stop/cont).
> 
> I think I agree, except I still hate the way you use the word
> 'adjusted'.
> 
> If I look at my clock, and then go to sleep for a while and look at the
> clock again, nobody *adjusts* it. It just keeps running.
> 
> That's the effect we should always strive for, and that's how we should
> think about it and talk about it.
> 
> It's difficult to talk about clocks because what does it mean for a
> clock to be "unchanged"? Does it mean that it should return the same
> time value? Or that it should continue to count consistently? I would
> argue that we should *always* use language which assumes the latter.
> 
> Turning to physics for a clumsy analogy, it's about the frame of
> reference. We're all on a moving train. I look at you in the seat
> opposite me, I go to sleep for a while, and I wake up and you're still
> there. Nobody has "adjusted" your position to accommodate for the
> movement of the train while I was asleep.
> 

Thank you very much for the explanation!

I will use something like "keeps running".

> 
> 
> 
>> This applies to:
>>
>> - stop / cont
>> - savevm / loadvm
>> - live migration
>> - cpr
>>
>> It is a bug if the clock jumps backwards to the time at which the guest was paused.
>>
>> The time elapsed while the VM is paused should always be accounted for and
>> reflected in kvm-clock/tsc once the VM resumes.
> 
> In particular, in *all* but the live migration case, there should be
> basically nothing to do. No addition, no subtraction. Only restoring
> the *existing* relationships, precisely as they were before. That is
> the TSC *offset* value, and the precise TSC→kvmclock parameters, all
> bitwise *exactly* the same as before.
> 
> And the only thing that changes on live migration is that you have to
> set the TSC offset such that the guest sees the values it *would* have
> seen on the original host at any given moment in time... and doesn't
> know it was kidnapped and moved onto a different train while it was
> sleeping...?
> 
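
To make the stop/cont case concrete, here is a minimal sketch in C. It is a
simplified model, not QEMU or KVM code: struct vm_clock, its fields, and both
cont_* helpers are hypothetical names, and the TSC->kvmclock conversion uses a
plain frequency instead of the real mul/shift parameters. It compares writing
back a saved clock value at cont with restoring the existing relationships
unchanged.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/* Hypothetical, simplified model of the guest clocks (not QEMU/KVM code). */
struct vm_clock {
    int64_t  tsc_offset;    /* guest_tsc = host_tsc + tsc_offset */
    uint64_t tsc_hz;        /* guest TSC frequency (stands in for mul/shift) */
    uint64_t ref_guest_tsc; /* guest TSC at the kvmclock reference point */
    uint64_t ref_ns;        /* kvmclock value at that reference point */
};

static uint64_t guest_tsc(const struct vm_clock *c, uint64_t host_tsc)
{
    return host_tsc + (uint64_t)c->tsc_offset;
}

static uint64_t kvmclock_ns(const struct vm_clock *c, uint64_t host_tsc)
{
    uint64_t delta;

    assert(c->tsc_hz != 0);
    delta = guest_tsc(c, host_tsc) - c->ref_guest_tsc;
    /* Split the conversion to avoid overflowing 64 bits. */
    return c->ref_ns + (delta / c->tsc_hz) * NS_PER_SEC
                     + (delta % c->tsc_hz) * NS_PER_SEC / c->tsc_hz;
}

/*
 * Model of "save the clock value at stop, write it back at cont":
 * the reference point is re-anchored at the resume-time guest TSC, so the
 * kvmclock reads the saved value again while the guest TSC has moved on.
 */
static void cont_restore_value(struct vm_clock *c, uint64_t saved_ns,
                               uint64_t host_tsc_at_cont)
{
    c->ref_guest_tsc = guest_tsc(c, host_tsc_at_cont);
    c->ref_ns = saved_ns;
}

/*
 * Model of "restore the existing relationships": every parameter is put
 * back bitwise unchanged and nothing is recomputed.
 */
static void cont_restore_relationship(struct vm_clock *c,
                                      const struct vm_clock *saved)
{
    *c = *saved;
}
```

In this model, after a pause of N seconds the first helper leaves the kvmclock
N seconds behind the guest TSC, while the second keeps both clocks running
consistently.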

I see. That means only the TSC offset is re-configured, while the
tsc->kvmclock PVTI is kept exactly as it is. That is why you would like to
remove 'kvm_arch->kvmclock_offset' entirely as future work.
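
As a rough illustration of the live migration case, the destination TSC offset
could be derived as sketched below. This is only a sketch under stated
assumptions: dest_tsc_offset() and its parameters are hypothetical names (not
a QEMU or KVM interface), the two hosts are assumed to share a synchronized
wall clock, each (TSC, wall clock) pair is assumed to be sampled at the same
instant, and the guest TSC frequency is assumed to be the same on both hosts.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Hypothetical helper, for illustration only.
 *
 * src_guest_tsc / src_wall_ns: guest TSC and wall clock sampled together on
 *                              the source host.
 * dst_host_tsc / dst_wall_ns:  host TSC and wall clock sampled together on
 *                              the destination host.
 * tsc_hz:                      guest TSC frequency.
 *
 * Returns the offset that makes the guest on the destination read the TSC
 * value it would have read on the source at the same moment.
 */
static int64_t dest_tsc_offset(uint64_t src_guest_tsc, uint64_t src_wall_ns,
                               uint64_t dst_host_tsc, uint64_t dst_wall_ns,
                               uint64_t tsc_hz)
{
    uint64_t elapsed_ns, elapsed_ticks, expected_guest_tsc;

    assert(tsc_hz != 0 && dst_wall_ns >= src_wall_ns);

    elapsed_ns = dst_wall_ns - src_wall_ns;
    /* Split the conversion to avoid overflowing 64 bits. */
    elapsed_ticks = (elapsed_ns / NS_PER_SEC) * tsc_hz
                  + (elapsed_ns % NS_PER_SEC) * tsc_hz / NS_PER_SEC;
    expected_guest_tsc = src_guest_tsc + elapsed_ticks;

    /* guest_tsc = host_tsc + offset, so offset = guest_tsc - host_tsc. */
    return (int64_t)(expected_guest_tsc - dst_host_tsc);
}
```

Only the offset is recomputed here. The tsc->kvmclock parameters are not
touched, so a kvmclock derived from the guest TSC keeps running across the
migration as well.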

Thank you very much!

Dongli Zhang


