qemu-devel.nongnu.org archive mirror
From: Dongli Zhang <dongli.zhang@oracle.com>
To: "Dr. David Alan Gilbert" <dave@treblig.org>, steven.sistare@oracle.com
Cc: qemu-devel@nongnu.org
Subject: Re: Trying cpr
Date: Mon, 21 Apr 2025 08:22:25 -0700
Message-ID: <df432912-de0c-4a77-8008-0c07b23f42f0@oracle.com>
In-Reply-To: <aAZKaMkKYPlmBMcZ@gallifrey>



On 4/21/25 6:38 AM, Dr. David Alan Gilbert wrote:
> Hi Steve,
>   I've just had a go with cpr-transfer, it's quite interesting.
> I was just trying it on my (AMD) desktop.
> 
> * I was running with qemu displaying graphics, and after migration
> the source display got updated every time I moved my mouse into the
> source window; the VM was still stopped, but I guess that means
> the source GUI is still parsing the guest VRAM and displaying it.
> I'm not sure if there's any other interactions - e.g. is there any
> situation where the source GUI will try and write into the shared
> guest ram?
> 
> * Given that you pass fd's over the CPR socket, had you considered
> passing main migration fd's over it as well, that way you'd
> only need one incoming.
> 
> * The guest noticed the time skew:
>   timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
>      'kvm-clock' wd_nsec: 556248511 wd_new: 4a93129e69 wd_alst: 4a71eaf0aa mask: (all f's)
>      'tsc' cs_nsec: 514023131 cs_now: 1047f1d8489 cs_last: 10414538c1 mask: (all f's)
>      Clocksource 'tsc' skewed -42225380 ns (-42 ms) over watchdog 'kvm-clock' interval of 556248511 ns (556 ms)
>      'kvm-clock' (not 'tsc') is current clocksource

Here the guest kernel uses kvm-clock to measure the accuracy of tsc.

While there is a chance that the accuracy of tsc is broken, it is more likely
that kvm-clock's accuracy is broken.

That is, even if the TSC is still good enough, it gets marked unstable because
the kernel measures it against an inaccurate kvm-clock.
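
To make the numbers in the quoted message concrete, here is a minimal sketch
that reproduces the reported skew from the two nanosecond values in the log. It
only mirrors the arithmetic visible in the log line; the function name is
hypothetical, and the kernel's own threshold logic is not modelled.

```python
# Values copied from the quoted watchdog message (nanoseconds).
WD_NSEC = 556_248_511  # interval as measured by the watchdog, 'kvm-clock'
CS_NSEC = 514_023_131  # same interval as measured by the clocksource, 'tsc'


def skew_ns(cs_nsec: int, wd_nsec: int) -> int:
    """Signed difference between the two measurements of one interval."""
    return cs_nsec - wd_nsec


skew = skew_ns(CS_NSEC, WD_NSEC)
print(f"skew: {skew} ns ({skew / 1e6:.0f} ms) over {WD_NSEC} ns")
```

The result matches the -42225380 ns in the log. The number only says the two
clocks disagree over that interval; it does not say which of them is wrong.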

Which guest kernel version is this? Does it have the patch below? Or is this an
AMD server (where X86_FEATURE_CONSTANT_TSC isn't set by default)?

x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b50db7095fe002fa3e16605546cba66bf1b68a3e
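
One way to check the feature question from inside the guest is to look at the
flags line of /proc/cpuinfo. The sketch below is only an illustration: the
helper name and the sample text are hypothetical, and the list of flags is my
assumption about which ones are relevant, so please check it against the commit
itself.

```python
# Flag names as they appear in /proc/cpuinfo on x86 Linux.
# Assumption: these are the TSC-related flags worth checking here.
TSC_FLAGS = ("constant_tsc", "nonstop_tsc", "tsc_adjust")


def tsc_flags(cpuinfo_text: str) -> dict[str, bool]:
    """Report which TSC-related flags the first 'flags' line advertises."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return {name: name in present for name in TSC_FLAGS}
    # No 'flags' line found: report every flag as absent.
    return {name: False for name in TSC_FLAGS}


# Hypothetical sample; in a guest, pass open("/proc/cpuinfo").read() instead.
sample = "processor\t: 0\nflags\t\t: fpu tsc constant_tsc nonstop_tsc\n"
print(tsc_flags(sample))
```

With -cpu host the guest's flags depend on what the host CPU and KVM expose, so
the result on an AMD desktop may differ from what an Intel server shows.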

In addition, I assume cpr-transfer doesn't create a new KVM instance (fd).

I used to encounter a similar issue during vCPU hotplug.

KVM: x86: Don't unnecessarily force masterclock update on vCPU hotplug
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c52ffadc65e28ab461fd055e9991e8d8106a0056

David Woodhouse has a patchset related to kvmclock and live migration.

[RFC PATCH v3 00/21] Cleaning up the KVM clock mess
https://lore.kernel.org/all/20240522001817.619072-1-dwmw2@infradead.org/

Maciej also fixed a similar unstable-clock issue.

target/i386: Reset TSCs of parked vCPUs too on VM reset
https://gitlab.com/qemu-project/qemu/-/commit/3f2a05b31ee9ce2ddb6c75a9bc3f5e7f7af9a76f

Dongli Zhang

> 
>   (That was hand copied, probably with some typos - who knew the
>    GUI doesn't let you copy/paste from serial0...)
> 
> 
> The source commandline was:
> ./try/qemu-system-x86_64  -object memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/qemuram0,share=on -m 4G -machine memory-backend=ram0,aux-ram-share=on -cpu host --enable-kvm -smp 16 -drive if=virtio,file=/discs/more/images/debian-13-nocloud-amd64-daily.qcow2 -qmp stdio
> 
> The dest commandline was:
> ./try/qemu-system-x86_64 -object memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/qemuram0,share=on -m 4G -machine memory-backend=ram0,aux-ram-share=on -cpu host --enable-kvm -smp 16 -drive if=virtio,file=/discs/more/images/debian-13-nocloud-amd64-daily.qcow2 -incoming tcp:0:44444 -incoming '{"channel-type": "cpr", "addr": { "transport": "socket", "type": "unix", "path": "cpr.sock"}}'
> 
> Dave


