From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51830)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1c2gsa-0005kS-Tb
	for qemu-devel@nongnu.org; Fri, 04 Nov 2016 11:58:10 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1c2gsX-00067o-O0
	for qemu-devel@nongnu.org; Fri, 04 Nov 2016 11:58:08 -0400
Received: from mx1.redhat.com ([209.132.183.28]:38850)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <pbonzini@redhat.com>) id 1c2gsX-00066t-Iv
	for qemu-devel@nongnu.org; Fri, 04 Nov 2016 11:58:05 -0400
References: <20161104094322.GA16930@amt.cnet> <20161104152522.GC5388@potion>
	<1c69a083-eef0-8fa0-0e74-5a4e25a066a0@redhat.com>
	<20161104154827.GD5388@potion>
From: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <1217f6ce-8143-8d0e-97f6-470926438992@redhat.com>
Date: Fri, 4 Nov 2016 16:57:59 +0100
MIME-Version: 1.0
In-Reply-To: <20161104154827.GD5388@potion>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [QEMU PATCH] kvmclock: advance clock by time
 window between vm_stop and pre_save
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>, kvm@vger.kernel.org, qemu-devel <qemu-devel@nongnu.org>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Juan Quintela <quintela@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, Roman Kagan <rkagan@virtuozzo.com>


On 04/11/2016 16:48, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote:
> 2016-11-04 16:33+0100, Paolo Bonzini:
>> On 04/11/2016 16:25, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote:
>>>>> =20
>>>>> +        if (s->advance_clock && s->clock + s->advance_clock > s->c=
lock) {
>>>>> +            s->clock +=3D s->advance_clock;
>>>>> +            s->advance_clock =3D 0;
>>>>> +        }
>>> Can't the advance_clock added to the migrated KVMClockState instead o=
f
>>> passing it as another parameter?
>>>
>>> (It is sad that we can't just query KVMClockState in kvmclock_pre_sav=
e
>>>  because of the Linux bug.)
>>
>> What Linux bug?  The one that makes us use kvmclock_current_nsec?
>=20
> No, the one that forced Marcelo to add the 10 minute limit to the
> advance_clock.  We wouldn't need this advance_clock hack if we could
> just call KVM_GET_CLOCK like we did before 00f4d64ee76e ("kvmclock:
> clock should count only if vm is running").

There are two cases:

- migrating a paused guest

- pausing at the end of migration

In the first case, kvmclock_vm_state_change's !running branch will see
state =3D=3D RUN_STATE_FINISH_MIGRATE && s->clock_valid.  In the second
case, it will see state =3D=3D RUN_STATE_FINISH_MIGRATE && !s->clock_vali=
d.

In the second case it should do nothing.  Then kvmclock_pre_save will
see !s->clock_valid and call KVM_GET_CLOCK.  A bonus is that, if
migration fails and migration_thread() calls vm_start(), then the guest
will not see the clock drift backwards.

>> It should work with 4.9-rc (well, once Linus applies my pull request).
>> 4.9-rc will not return ktime_get_ns for KVM_GET_CLOCK; it will return
>> the raw value from the kernel timekeeper.
>>
>> I'm thinking that we should add a KVM capability for this, and skip
>> kvmclock_current_nsec if the capability is present.  The first part is
>> trivial, so we can do it even during Linux rc period.
>=20
> I agree, that sounds like a nice improvement.

Ok, I'll send the KVM patch then.  It can even be the value *returned*
for KVM_CAP_ADJUST_CLOCK---right now it returns 1, we can return 3 (bit
0 for KVM_CAP_ADJUST_CLOCK_RESENT and bit 1 for returning the raw
value).  Any suggestion for the name?

Paolo