From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51497) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c2i6A-0001cG-Ox for qemu-devel@nongnu.org; Fri, 04 Nov 2016 13:16:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c2i67-0001Si-La for qemu-devel@nongnu.org; Fri, 04 Nov 2016 13:16:14 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45578) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1c2i67-0001RX-CL for qemu-devel@nongnu.org; Fri, 04 Nov 2016 13:16:11 -0400 Date: Fri, 4 Nov 2016 18:16:06 +0100 From: Radim =?utf-8?B?S3LEjW3DocWZ?= Message-ID: <20161104171605.GE5388@potion> References: <20161104094322.GA16930@amt.cnet> <20161104152522.GC5388@potion> <1c69a083-eef0-8fa0-0e74-5a4e25a066a0@redhat.com> <20161104154827.GD5388@potion> <1217f6ce-8143-8d0e-97f6-470926438992@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1217f6ce-8143-8d0e-97f6-470926438992@redhat.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [QEMU PATCH] kvmclock: advance clock by time window between vm_stop and pre_save List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: Marcelo Tosatti , kvm@vger.kernel.org, qemu-devel , "Dr. David Alan Gilbert" , Juan Quintela , Eduardo Habkost , Roman Kagan 2016-11-04 16:57+0100, Paolo Bonzini: > On 04/11/2016 16:48, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote: >> 2016-11-04 16:33+0100, Paolo Bonzini: >>> On 04/11/2016 16:25, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote: >>>>>> =20 >>>>>> + if (s->advance_clock && s->clock + s->advance_clock > s->= clock) { >>>>>> + s->clock +=3D s->advance_clock; >>>>>> + s->advance_clock =3D 0; >>>>>> + } >>>> Can't the advance_clock added to the migrated KVMClockState instead = of >>>> passing it as another parameter? >>>> >>>> (It is sad that we can't just query KVMClockState in kvmclock_pre_sa= ve >>>> because of the Linux bug.) >>> >>> What Linux bug? The one that makes us use kvmclock_current_nsec? >>=20 >> No, the one that forced Marcelo to add the 10 minute limit to the >> advance_clock. We wouldn't need this advance_clock hack if we could >> just call KVM_GET_CLOCK like we did before 00f4d64ee76e ("kvmclock: >> clock should count only if vm is running"). >=20 > There are two cases: >=20 > - migrating a paused guest >=20 > - pausing at the end of migration >=20 > In the first case, kvmclock_vm_state_change's !running branch will see > state =3D=3D RUN_STATE_FINISH_MIGRATE && s->clock_valid. In the second > case, it will see state =3D=3D RUN_STATE_FINISH_MIGRATE && !s->clock_va= lid. I lift my case, marcelo's said that stopping the time is a feature ... (*kittens die*) > In the second case it should do nothing. Then kvmclock_pre_save will > see !s->clock_valid and call KVM_GET_CLOCK. A bonus is that, if > migration fails and migration_thread() calls vm_start(), then the guest > will not see the clock drift backwards. Yes, moving KVM_GET_CLOCK to kvmclock_pre_save() and doing nothing in kvmclock_vm_state_change() to avoid the drift on failed migration would be nicer. We'd then also get rid of the ns migration parameter. >>> It should work with 4.9-rc (well, once Linus applies my pull request)= . >>> 4.9-rc will not return ktime_get_ns for KVM_GET_CLOCK; it will return >>> the raw value from the kernel timekeeper. Oh, and this does introduce a minor bug to this patch -- the time counted by KVM_GET_CLOCK is has different frequency CLOCK_MONOTONIC. Not accounting for that is bearable. >>> I'm thinking that we should add a KVM capability for this, and skip >>> kvmclock_current_nsec if the capability is present. The first part i= s >>> trivial, so we can do it even during Linux rc period. >>=20 >> I agree, that sounds like a nice improvement. >=20 > Ok, I'll send the KVM patch then. It can even be the value *returned* > for KVM_CAP_ADJUST_CLOCK---right now it returns 1, we can return 3 (bit > 0 for KVM_CAP_ADJUST_CLOCK_RESENT and bit 1 for returning the raw > value). Any suggestion for the name? KVM_CAP_GET_CLOCK_MASTERCLOCK? to show that the time is counted differently when using master clock. Another idea would be KVM_CAP_GET_CLOCK_ACTUAL, because we should now read what the guest is seeing.