From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54213) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c4nqh-0003dj-RZ for qemu-devel@nongnu.org; Thu, 10 Nov 2016 06:48:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c4nqc-0004WN-VM for qemu-devel@nongnu.org; Thu, 10 Nov 2016 06:48:55 -0500
Received: from mx1.redhat.com ([209.132.183.28]:36012) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1c4nqc-0004W0-NC for qemu-devel@nongnu.org; Thu, 10 Nov 2016 06:48:50 -0500
Date: Thu, 10 Nov 2016 09:48:28 -0200
From: Marcelo Tosatti
Message-ID: <20161110114826.GA28418@amt.cnet>
References: <20161104094322.GA16930@amt.cnet> <20161104165933.GA3027@amt.cnet> <20161107154610.GG2054@work-vm> <20161107194058.GB28327@amt.cnet> <20161107200349.GC1155@work-vm> <20161108000609.GA3689@amt.cnet> <20161108102255.GC2042@work-vm> <4c34da7d-7027-5595-012a-61ab6937f8e3@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4c34da7d-7027-5595-012a-61ab6937f8e3@redhat.com>
Subject: Re: [Qemu-devel] [QEMU PATCH v2] kvmclock: advance clock by time window between vm_stop and pre_save
To: Paolo Bonzini
Cc: "Dr. David Alan Gilbert", kvm@vger.kernel.org, qemu-devel, Radim Krčmář, Juan Quintela, Eduardo Habkost

On Wed, Nov 09, 2016 at 05:23:50PM +0100, Paolo Bonzini wrote:
>
>
> On 08/11/2016 11:22, Dr. David Alan Gilbert wrote:
> > * Marcelo Tosatti (mtosatti@redhat.com) wrote:
> >> On Mon, Nov 07, 2016 at 08:03:50PM +0000, Dr. David Alan Gilbert wrote:
> >>> * Marcelo Tosatti (mtosatti@redhat.com) wrote:
> >>>> On Mon, Nov 07, 2016 at 03:46:11PM +0000, Dr. David Alan Gilbert wrote:
> >>>>> * Marcelo Tosatti (mtosatti@redhat.com) wrote:
> >>>>>> This patch, relative to pre-copy migration codepath,
> >>>>>> measures the time between vm_stop() and pre_save(),
> >>>>>> which includes copying the remaining RAM to destination,
> >>>>>> and advances the clock by that amount.
> >>>>>>
> >>>>>> In a VM with 5 seconds downtime, this reduces the guest
> >>>>>> clock difference on destination from 5s to 0.2s.
> >>>>>>
> >>>>>> Tested with Linux and Windows 2012 R2 guests with -cpu XXX,+hv-time.
> >>>>>
> >>>>> One thing that bothers me is that it's only this clock that's
> >>>>> getting corrected; doesn't it cause things to get upset when
> >>>>> one clock moves and the others dont?
> >>>>
> >>>> If you are correlating the clocks, then yes.
> >>>>
> >>>> Older Linux guests get upset (marking the TSC clocksource unstable
> >>>> because the watchdog checks TSC vs kvmclock), but there is a workaround for it
> >>>> in newer guests
> >>>> (kvmclock interface to notify watchdog to not complain).
> >>>>
> >>>> Note marking TSC clocksource unstable on older guests is harmless
> >>>> because kvmclock is the standard clocksource.
> >>>>
> >>>> For Windows guests, i don't know that Windows correlates between different
> >>>> clocks.
> >>>>
> >>>> That is, there is relative control as to which software reads kvmclock
> >>>> or Windows TIMER MSR, so i don't see the need to advance every clock
> >>>> exposed.
> >>>>
> >>>>> Shouldn't the pause delay be recorded somewhere architecturally
> >>>>> independent and then be a thing that kvm-clock happens to use and
> >>>>> other clocks might as well?
> >>>>
> >>>> In theory, yes. In practice, i don't see the need for this...
> >>>
> >>> It seems unlikely to me that x86 is the only one that will want
> >>> to do something similar.
> >>
> >> Can't they copy what kvmclock is doing today?
> >
> > We shouldn't have copies of code all over should we?
>
> Let's cross the bridge when we get there.
>
> For now I'm more interested in having a version of the patch that is
> clean and doesn't need advance_clock.
>
> Paolo

Destination has to run the following logic:

If (source has KVM_CAP_ADVANCE_CLOCK)
    use KVM_GET_CLOCK value
Else
    read pvclock from guest

To support migration from older QEMU versions which do not have
KVM_CAP_ADVANCE_CLOCK (or new QEMU versions running on old hosts
without KVM_CAP_ADVANCE_CLOCK).

I don't see any clean way to give that information, except changing
the migration format to pass "host: kvm_cap_advance_clock=true/false"
information.

Ideas?
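
To make the destination-side choice above concrete, here is a minimal sketch in C. It is not QEMU code: the struct, its field names, the callback type and the stub reader are all hypothetical, and it simply assumes the migration stream has been extended with a flag saying whether the source host had KVM_CAP_ADVANCE_CLOCK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state carried in the migration stream (not a QEMU struct). */
struct kvmclock_mig_state {
    bool src_has_advance_clock; /* assumed new field: source had KVM_CAP_ADVANCE_CLOCK */
    uint64_t kvm_get_clock_ns;  /* value the source read with KVM_GET_CLOCK */
};

/* Hypothetical callback that reads the pvclock value from guest memory. */
typedef uint64_t (*read_pvclock_fn)(void *opaque);

/* Pick the clock value the destination should restore. */
static uint64_t dest_pick_clock(const struct kvmclock_mig_state *s,
                                read_pvclock_fn read_pvclock, void *opaque)
{
    assert(s != NULL && read_pvclock != NULL);

    if (s->src_has_advance_clock) {
        /* Source could report a reliable value: use it directly. */
        return s->kvm_get_clock_ns;
    }
    /* Older source (or old host kernel): fall back to the guest's pvclock. */
    return read_pvclock(opaque);
}

/* Stub standing in for the real guest-memory read, for illustration only. */
static uint64_t fake_read_pvclock(void *opaque)
{
    return *(const uint64_t *)opaque;
}
```

The sketch only shows where the flag would be consulted; how the flag actually reaches the destination (new subsection, new field, or something else) is exactly the open question.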