From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dor Laor
Subject: Re: Timedrift in KVM guests after livemigration.
Date: Sun, 18 Apr 2010 12:22:54 +0300
Message-ID: <4BCACF6E.4010108@redhat.com>
References: <4BC6C1B1.5010206@monsternett.no> <4BCA1164.1030808@monsternett.no> <4BCA1756.6060800@msgid.tls.msk.ru> <4BCA4296.2080608@monsternett.no>
Reply-To: dlaor@redhat.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Espen Berg
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:27239 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755277Ab0DRJVj (ORCPT ); Sun, 18 Apr 2010 05:21:39 -0400
In-Reply-To: <4BCA4296.2080608@monsternett.no>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 04/18/2010 02:21 AM, Espen Berg wrote:
> On 17.04.2010 22:17, Michael Tokarev wrote:
>>>> We have three KVM hosts that support live migration between them, but
>>>> one of our problems is time drift. The three frontends have different
>>>> CPU frequencies, and the KVM guests adopt the frequency from the host
>>>> machine where they were first started.
>> What do you mean by "adopts"? Note that the CPU frequency
>> means nothing for all the modern operating systems, at least
>> since the days of common usage of MS-DOS, which relied on CPU
>> frequency for its time functions. All interesting things are
>> now done using timers instead, and timers (which, again, don't
>> depend on CPU frequency) usually work quite well.
>
> The assumption that the tick frequency was calculated from the host's
> MHz was based on the fact that greater clock frequency differences
> caused higher time drift. A 60 MHz difference caused about 24 min of
> drift; a 332 MHz difference caused about 2 h 25 min of drift.
>
>> What complicates things is that the cheapest time source that is
>> accurate enough is the TSC (time stamp counter register in
>> the CPU), but it will definitely be different on each
>> machine. For that, 0.12.3 kvm and 2.6.32 kernel (I think)
>> introduced a compensation. See for example the -tdf kvm option.
>
> Ah, nice to know. :)

There are two different things here:
The issue Espen is reporting is that the hosts have different
frequencies, and guests that rely on the tsc as a clock source will
notice that after migration. This is indeed a problem, and -tdf does not
solve it: -tdf only adds compensation for the RTC clock emulation.

What's the guest type and what's the guest's clock source?
Using the tsc directly as a clock source is not recommended because of
this migration issue (which is not solvable until we trap every rdtsc by
the guest). Using the pv kvmclock in Linux mitigates this issue, since it
exposes both the tsc and the host clock so guests can adjust themselves.

Several months ago a pvclock migration fix was added to pass the pvclock
MSR readings to the destination:
1a03675db146dfc760b3b48b3448075189f142cc

>
>>> Since this is a cluster in production, I'm not able to try the latest
>>> version either.
>> Well, that's a difficult one, no? It either works or not.
>> If you can't try anything else, why ask? :)
>
> What I tried to say was that there are many important virtual servers
> running on this cluster at the moment, so "trial and error" was not an
> option. The last time we tried 0.12.x (during the initial tests of the
> cluster) there were a lot of stability issues, crashes during migration,
> etc.
>
> Regards, Espen
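
To make the tsc frequency issue discussed in this thread concrete, here is a rough sketch of the arithmetic. It assumes a simplified model in which the guest keeps converting raw tsc ticks to time using the frequency it calibrated on the first host, with no correction after migration. The function name and the two host frequencies are made up for illustration; they are not the frequencies of Espen's machines.

```python
def guest_drift_seconds(real_elapsed_s, calibrated_hz, actual_hz):
    """Seconds the guest clock gains (+) or loses (-) versus real time.

    Simplified model: the guest converts raw TSC ticks to time using the
    frequency it calibrated at boot (calibrated_hz), while the host it
    currently runs on increments the TSC at actual_hz.
    """
    ticks = real_elapsed_s * actual_hz        # ticks the guest observes
    guest_elapsed_s = ticks / calibrated_hz   # the guest's idea of elapsed time
    return guest_elapsed_s - real_elapsed_s


# Hypothetical hosts: guest booted on a 2000 MHz host, migrated to 2060 MHz.
one_day_s = 24 * 60 * 60
drift = guest_drift_seconds(one_day_s, 2_000_000_000, 2_060_000_000)
print(f"drift after one day: {drift / 60:.1f} minutes")
```

In this model the drift grows linearly with both the elapsed time and the relative frequency difference, which would be consistent with the observation that a larger MHz difference between hosts produced a larger drift. A guest using kvmclock should not behave this way, since it can adjust against the host clock.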