kvm.vger.kernel.org archive mirror
From: Oliver Upton <oupton@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>,
	kvm@vger.kernel.org, Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>
Subject: Re: [PATCH] Documentation: KVM: Describe guest TSC scaling in migration algorithm
Date: Mon, 21 Mar 2022 00:51:17 +0000
Message-ID: <YjfMBYsse95znupa@google.com>
In-Reply-To: <9afd33cb-4052-fe15-d3ae-69a14ca252b0@redhat.com>

On Sun, Mar 20, 2022 at 02:39:34PM +0100, Paolo Bonzini wrote:
> On 3/20/22 09:52, Oliver Upton wrote:
> > What do you folks think about having a new R/O vCPU attribute that
> > returns a { TOD, guest_tsc } pair? I believe that would immediately
> > satisfy the needs of upstream to implement clock-advancing live
> > migration.
> 
> I don't think this adds much.  The missing link is on the destination side,
> not the source side.

I think it'll work:

 Source:
  - Pick a vCPU and save its { TOD, guest_TSC } pair
  - Save the TSC offset of every vCPU
  - Using all of the offsets, calculate the drift of all the follower
    vCPUs from the leader vCPU (from step 1)
  - Save the TSC frequency

 Destination:
  - Restore the TSC frequency
  - Read the { TOD, guest_TSC } pair for the first vCPU
  - Compare with the source value to work out delta_guest_TSC and
    delta_TOD
  - Apply delta_guest_TSC to all vCPUs in a VM
  - If you want to account for realtime, apply guest_tsc_freq *
    delta_TOD to every vCPU in the VM
  - Account for drift between leader/follower vCPUs

Userspace has some math to do, but IMO it needs to until we have a
better mechanism for helping the guest clean up a slow migration.
It does eliminate the need for doing TSC scaling in userspace, which
I think is the trickiest piece of it all.
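
As a rough illustration of the userspace math in the steps above, here is a
minimal sketch. It is not an existing KVM interface: the ClockPair type, the
function names, the kHz unit for the guest TSC frequency, and the sign
conventions (delta_guest_TSC = source minus destination, delta_TOD =
destination minus source) are all assumptions made for the example. It also
assumes vCPU 0 is the leader.

```python
from dataclasses import dataclass

NSEC_PER_MSEC = 1_000_000  # 1 kHz is one tick per millisecond


@dataclass
class ClockPair:
    """Hypothetical { TOD, guest_TSC } snapshot of the leader vCPU."""
    tod_ns: int      # time of day, in nanoseconds
    guest_tsc: int   # leader's guest TSC read at the same instant


def follower_drift(offsets, leader=0):
    """Source side: drift of each vCPU's TSC offset from the leader's."""
    return [off - offsets[leader] for off in offsets]


def destination_offsets(src, dst, dst_leader_offset, drift,
                        guest_tsc_khz, account_realtime=True):
    """Destination side: compute a new TSC offset for every vCPU.

    src / dst are the leader's ClockPair on each side, dst_leader_offset
    is the leader's current TSC offset on the destination, and drift is
    the list produced by follower_drift() on the source.
    """
    # Ticks to add so the destination leader resumes at the source value.
    adjust = src.guest_tsc - dst.guest_tsc
    if account_realtime:
        # guest_tsc_freq * delta_TOD: ticks that elapsed during migration.
        delta_tod_ns = dst.tod_ns - src.tod_ns
        adjust += guest_tsc_khz * delta_tod_ns // NSEC_PER_MSEC
    # Re-apply each follower's drift relative to the leader.
    return [dst_leader_offset + adjust + d for d in drift]


if __name__ == "__main__":
    src = ClockPair(tod_ns=1_000_000_000, guest_tsc=5_000_000)
    dst = ClockPair(tod_ns=1_002_000_000, guest_tsc=1_000)
    drift = follower_drift([100, 150, 90])
    print(destination_offsets(src, dst, -7000, drift, 2_000_000))
```

All of the arithmetic happens in guest TSC ticks, so nothing here scales
between host and guest frequencies.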

An alternative could be to say that the VM has a single, authoritative {
TOD, guest_TSC } clock that can be read or written. Any vCPU offsets
then account for guest-induced drift in TSCs.

> To recap, the data that needs to be migrated from source to destination is
> the hostTSC+hostTOD pairing (returned by KVM_GET_CLOCK) plus one of each of
> the following:
> 
> * either guestTSCOffset or a guestTSC synced with the hostTSC
> 
> * either guestTODOffset or a guestTOD synced with the hostTOD.
> 
> * either guestTSCScale or hostTSCFreq
> 
> Right now we have guestTSCOffset as a vCPU attribute, we have guestTOD
> returned by KVM_GET_CLOCK, and we plan to have hostTSCFreq in sysfs. It's a
> bit mix-and-match, but it's already a 5-tuple that the destination can use.
> What's missing is an ioctl on the destination side that relieves userspace
> from having to do the math.

That ioctl will work fine, but userspace needs to accept all the
nastiness that ensues. If it yanks the guest too hard into the future
it'll need to pick up the pieces when the guest kernel panics.

--
Thanks,
Oliver
