Re: [PATCH] hw/i386/kvm: Prevent guest monotonic clock jump after live migration

From: Dongli Zhang <dongli.zhang@oracle.com>
To: "Zhou, Peng Ju" <PengJu.Zhou@amd.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "Chang, HaiJun" <HaiJun.Chang@amd.com>,
	"Ma, Qing (Mark)" <Qing.Ma@amd.com>,
	"marcel.apfelbaum@gmail.com" <marcel.apfelbaum@gmail.com>,
	"richard.henderson@linaro.org" <richard.henderson@linaro.org>,
	"eduardo@habkost.net" <eduardo@habkost.net>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>
Subject: Re: [PATCH] hw/i386/kvm: Prevent guest monotonic clock jump after live migration
Date: Tue, 25 Nov 2025 12:46:08 -0800
Message-ID: <778c6a35-4b8a-42e9-8af5-6585a7facc7f@oracle.com>
In-Reply-To: <PH7PR12MB85950C4310D80D0EE4DC9894F8D1A@PH7PR12MB8595.namprd12.prod.outlook.com>

Hi Peng Ju,

On 11/24/25 6:34 PM, Zhou, Peng Ju wrote:
> Hi Dongli,
> Thanks for your reply.
> 
> As you said in https://lore.kernel.org/qemu-devel/20251009095831.46297-1-dongli.zhang@oracle.com/
> a timeout occurred in the guest after live migration because the monotonic clock jumped ahead.
> 
> Hi QEMU team,
> Could you help check the patch again?
> (I think Dongli's patch is better than mine.)
> 
> A timeout can occur in the following sequence:
> 1. Send a job to the HW and start a timer.
> 2. The HW raises an interrupt (meaning it has finished the work), but the VM is suspended before processing the interrupt.
> 3. The VM is resumed after live migration with a long downtime (possibly 20s).
> 4. The timer expires.

Regarding this scenario ...

In general, the Linux kernel relies on 'PVCLOCK_GUEST_STOPPED' to notify the
guest VM that the clock may be unreliable for a short period of time. The guest
kernel then calls pvclock_touch_watchdogs() to avoid any general timeout.
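
As a rough illustration of that mechanism, here is a minimal, self-contained
sketch of a clock read path that notices the flag, clears it, and resets the
watchdogs once. Only the PVCLOCK_GUEST_STOPPED bit value follows the pvclock
ABI; the toy_* struct, counter, and functions are hypothetical stand-ins and
not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit value as in the Linux pvclock ABI; the rest is a toy model. */
#define PVCLOCK_GUEST_STOPPED (1u << 1)

/* Hypothetical, simplified stand-in for the per-vCPU pvclock time info. */
struct toy_pvclock_info {
    uint8_t flags;            /* set by the host side when the guest was stopped */
    uint64_t system_time_ns;  /* guest-visible clock value */
};

/* Counts how many times the stub watchdog reset has run. */
static unsigned int toy_watchdog_touches;

/* Stub standing in for pvclock_touch_watchdogs(). */
static void toy_touch_watchdogs(void)
{
    toy_watchdog_touches++;
}

/*
 * Read the clock. If the host flagged that the guest was stopped,
 * acknowledge the flag and reset the watchdogs so that the time spent
 * paused is not reported as a lockup or stall.
 */
static uint64_t toy_pvclock_read(struct toy_pvclock_info *src)
{
    assert(src != NULL);

    if (src->flags & PVCLOCK_GUEST_STOPPED) {
        src->flags &= (uint8_t)~PVCLOCK_GUEST_STOPPED;
        toy_touch_watchdogs();
    }
    return src->system_time_ns;
}
```

In this model the first read after a pause touches the watchdogs exactly once,
and later reads leave them alone until the flag is set again.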

For a specific driver, the behavior depends on its implementation. Usually an
I/O timeout can be recovered from once the request finally completes.
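
One way a driver might tolerate the sequence described above is to re-check
the hardware completion state in its timeout handler before declaring a
failure. The sketch below only illustrates that idea; the toy_* types and the
hw_done field are hypothetical and do not come from any real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Possible outcomes of the hypothetical timeout handler. */
enum toy_timeout_action {
    TOY_REQ_COMPLETED,  /* work had finished; the timeout was spurious */
    TOY_REQ_TIMED_OUT   /* work is really outstanding; escalate */
};

/* Hypothetical request tracked by a driver. */
struct toy_request {
    bool hw_done;    /* hardware finished, interrupt possibly not yet handled */
    bool completed;  /* driver has completed the request */
};

/*
 * Timeout handler: if the hardware already finished the job (for example
 * because the interrupt arrived just before the VM was suspended), complete
 * the request instead of treating the expired timer as an error.
 */
static enum toy_timeout_action toy_handle_timeout(struct toy_request *req)
{
    assert(req != NULL);

    if (req->hw_done) {
        req->completed = true;
        return TOY_REQ_COMPLETED;
    }
    return TOY_REQ_TIMED_OUT;
}
```

Whether such a re-check is possible depends on the device exposing its
completion state somewhere the handler can read it.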

Some very specific scenarios may need to be resolved case by case, for example:

[PATCH RESEND 1/1] x86/smpboot: check cpu_initialized_mask first after returning
from schedule()
https://lore.kernel.org/all/20211223210343.1116-1-dongli.zhang@oracle.com/

Thank you very much!

Dongli Zhang

> 
> Thanks in advance.
> 
> 
> ---------------------------------------------------------------------- 
> BW
> Pengju Zhou
> 
> 
> 
> 
> 
>> -----Original Message-----
>> From: Dongli Zhang <dongli.zhang@oracle.com>
>> Sent: Monday, November 24, 2025 3:14 PM
>> To: Zhou, Peng Ju <PengJu.Zhou@amd.com>; qemu-devel@nongnu.org
>> Cc: Chang, HaiJun <HaiJun.Chang@amd.com>; Ma, Qing (Mark)
>> <Qing.Ma@amd.com>; marcel.apfelbaum@gmail.com;
>> richard.henderson@linaro.org; eduardo@habkost.net; pbonzini@redhat.com;
>> mst@redhat.com
>> Subject: Re: [PATCH] hw/i386/kvm: Prevent guest monotonic clock jump after live
>> migration
>>
>> Hi Peng Ju,
>>
>> On 11/20/25 12:44 AM, Peng Ju Zhou wrote:
>>> Problem
>>> After live migration, the guest monotonic clock may jump forward on the target.
>>>
>>> Cause
>>> kvmclock (the guest’s time base) is derived from host wall time and
>>> keeps advancing while the VM is paused. During STOP_COPY, QEMU reads
>>> kvmclock twice:
>>> 1) immediately after the VM is paused, and
>>> 2) when final CPU state is collected.
>>> Only the second (later) value is migrated. The gap between the two
>>> reads is roughly the downtime, so the target restores from a later
>>> time and the guest monotonic clock jumps ahead.
>>
>> According to prior discussion, the guest clock is expected to account for live migration downtime.
>>
>> https://lore.kernel.org/qemu-devel/c1ceaa4e68b9264fc1c811c1ad0b60628d7fd9cd.camel@infradead.org/
>>
>>
>> That is, the jump forward is expected during live migration.
>>
>>
>> I previously sent a QEMU patch to account for live migration downtime.
>>
>> [PATCH 1/1] target/i386/kvm: account blackout downtime for kvm-clock and guest TSC
>> https://lore.kernel.org/qemu-devel/20251009095831.46297-1-dongli.zhang@oracle.com/
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>>
>>> Fix
>>> Migrate the kvmclock value captured at pause time (the first read) so
>>> the target restores from the actual pause point.
>>>
>>> Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
>>> ---
>>>  hw/i386/kvm/clock.c | 8 +++++++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
>>> index 40aa9a32c3..cd6f7e1315 100644
>>> --- a/hw/i386/kvm/clock.c
>>> +++ b/hw/i386/kvm/clock.c
>>> @@ -43,6 +43,7 @@ struct KVMClockState {
>>>
>>>      /* whether the 'clock' value was obtained in the 'paused' state */
>>>      bool runstate_paused;
>>> +    RunState state;
>>>
>>>      /* whether machine type supports reliable KVM_GET_CLOCK */
>>>      bool mach_use_reliable_get_clock;
>>> @@ -108,7 +109,10 @@ static void kvm_update_clock(KVMClockState *s)
>>>          fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(-ret));
>>>                  abort();
>>>      }
>>> -    s->clock = data.clock;
>>> +
>>> +    if (s->state != RUN_STATE_FINISH_MIGRATE) {
>>> +        s->clock = data.clock;
>>> +    }
>>>
>>>      /* If kvm_has_adjust_clock_stable() is false, KVM_GET_CLOCK returns
>>>       * essentially CLOCK_MONOTONIC plus a guest-specific adjustment. This
>>> @@ -217,6 +221,8 @@ static void kvmclock_vm_state_change(void *opaque, bool running,
>>>           */
>>>          s->clock_valid = true;
>>>      }
>>> +
>>> +    s->state = state;
>>>  }
>>>
>>>  static void kvmclock_realize(DeviceState *dev, Error **errp)
>

Thread overview: 7+ messages
2025-11-20  8:44 [PATCH] hw/i386/kvm: Prevent guest monotonic clock jump after live migration Peng Ju Zhou
2025-11-24  6:25 ` Zhou, Peng Ju
2025-11-24  6:37 ` Zhou, Peng Ju
2025-11-27  6:18   ` Zhang, Owen(SRDC)
2025-11-24  7:13 ` Dongli Zhang
2025-11-25  2:34   ` Zhou, Peng Ju
2025-11-25 20:46     ` Dongli Zhang [this message]
