From mboxrd@z Thu Jan 1 00:00:00 1970 From: Radim =?utf-8?B?S3LEjW3DocWZ?= Subject: Re: [QEMU PATCH] kvmclock: advance clock by time window between vm_stop and pre_save Date: Fri, 4 Nov 2016 18:16:06 +0100 Message-ID: <20161104171605.GE5388@potion> References: <20161104094322.GA16930@amt.cnet> <20161104152522.GC5388@potion> <1c69a083-eef0-8fa0-0e74-5a4e25a066a0@redhat.com> <20161104154827.GD5388@potion> <1217f6ce-8143-8d0e-97f6-470926438992@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: Marcelo Tosatti , kvm@vger.kernel.org, qemu-devel , "Dr. David Alan Gilbert" , Juan Quintela , Eduardo Habkost , Roman Kagan To: Paolo Bonzini Return-path: Received: from mx1.redhat.com ([209.132.183.28]:42428 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756963AbcKDRQK (ORCPT ); Fri, 4 Nov 2016 13:16:10 -0400 Content-Disposition: inline In-Reply-To: <1217f6ce-8143-8d0e-97f6-470926438992@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: 2016-11-04 16:57+0100, Paolo Bonzini: > On 04/11/2016 16:48, Radim Krčmář wrote: >> 2016-11-04 16:33+0100, Paolo Bonzini: >>> On 04/11/2016 16:25, Radim Krčmář wrote: >>>>>> >>>>>> + if (s->advance_clock && s->clock + s->advance_clock > s->clock) { >>>>>> + s->clock += s->advance_clock; >>>>>> + s->advance_clock = 0; >>>>>> + } >>>> Can't the advance_clock added to the migrated KVMClockState instead of >>>> passing it as another parameter? >>>> >>>> (It is sad that we can't just query KVMClockState in kvmclock_pre_save >>>> because of the Linux bug.) >>> >>> What Linux bug? The one that makes us use kvmclock_current_nsec? >> >> No, the one that forced Marcelo to add the 10 minute limit to the >> advance_clock. We wouldn't need this advance_clock hack if we could >> just call KVM_GET_CLOCK like we did before 00f4d64ee76e ("kvmclock: >> clock should count only if vm is running"). > > There are two cases: > > - migrating a paused guest > > - pausing at the end of migration > > In the first case, kvmclock_vm_state_change's !running branch will see > state == RUN_STATE_FINISH_MIGRATE && s->clock_valid. In the second > case, it will see state == RUN_STATE_FINISH_MIGRATE && !s->clock_valid. I lift my case, marcelo's said that stopping the time is a feature ... (*kittens die*) > In the second case it should do nothing. Then kvmclock_pre_save will > see !s->clock_valid and call KVM_GET_CLOCK. A bonus is that, if > migration fails and migration_thread() calls vm_start(), then the guest > will not see the clock drift backwards. Yes, moving KVM_GET_CLOCK to kvmclock_pre_save() and doing nothing in kvmclock_vm_state_change() to avoid the drift on failed migration would be nicer. We'd then also get rid of the ns migration parameter. >>> It should work with 4.9-rc (well, once Linus applies my pull request). >>> 4.9-rc will not return ktime_get_ns for KVM_GET_CLOCK; it will return >>> the raw value from the kernel timekeeper. Oh, and this does introduce a minor bug to this patch -- the time counted by KVM_GET_CLOCK is has different frequency CLOCK_MONOTONIC. Not accounting for that is bearable. >>> I'm thinking that we should add a KVM capability for this, and skip >>> kvmclock_current_nsec if the capability is present. The first part is >>> trivial, so we can do it even during Linux rc period. >> >> I agree, that sounds like a nice improvement. > > Ok, I'll send the KVM patch then. It can even be the value *returned* > for KVM_CAP_ADJUST_CLOCK---right now it returns 1, we can return 3 (bit > 0 for KVM_CAP_ADJUST_CLOCK_RESENT and bit 1 for returning the raw > value). Any suggestion for the name? KVM_CAP_GET_CLOCK_MASTERCLOCK? to show that the time is counted differently when using master clock. Another idea would be KVM_CAP_GET_CLOCK_ACTUAL, because we should now read what the guest is seeing.