From: Anthony Liguori
To: Gleb Natapov
Cc: qemu-devel@nongnu.org, Jan Kiszka, Michael Roth, Luiz Capitulino,
 Avi Kivity, Paolo Bonzini, Eric Blake
Subject: Re: [Qemu-devel] Rethinking missed tick catchup
Date: Thu, 13 Sep 2012 10:56:56 -0500
Message-ID: <87ehm5or07.fsf@codemonkey.ws>
In-Reply-To: <20120913144811.GL20907@redhat.com>
References: <87pq5r5otp.fsf@codemonkey.ws> <20120912151549.GT20907@redhat.com>
 <87y5kfrtne.fsf@codemonkey.ws> <20120913104940.GA20907@redhat.com>
 <5051DC20.4090204@redhat.com> <20120913132804.GO7767@redhat.com>
 <87r4q6xbiy.fsf@codemonkey.ws> <20120913142228.GK20907@redhat.com>
 <87boha7zyx.fsf@codemonkey.ws> <20120913144811.GL20907@redhat.com>

Gleb Natapov writes:

> On Thu, Sep 13, 2012 at 09:35:18AM -0500, Anthony Liguori wrote:
>> Gleb Natapov writes:
>>
>> > On Thu, Sep 13, 2012 at 09:06:29AM -0500, Anthony Liguori wrote:
>> >> "Daniel P. Berrange" writes:
>> >>
>> >> I think it's better for QEMU to talk to qemu-ga. We can tell when a large
>> >> period of time has passed in QEMU because we'll accumulate a large
>> >> number of missed ticks.
>> >>
>> > With RTC configured to use vm clock we will not.
>>
>> Not for host suspend. For stop and live migration, we stop vm_clock.
>> But QEMU isn't aware of host suspend so vm_clock cannot be stopped.
>>
> Hmm, true. What about hooking into suspend and doing vmstop during
> suspend.

Is suspend the only foreseeable way for this problem to happen? I don't
think it is, which is what concerns me about any approach that relies on
"hooking suspend".

Also, I don't think there is a generic way to "hook suspend".

>> >> This could happen because of stop, host suspend, live migration to a
>> >> file, etc.
>> >>
>> >> It's much easier for us to call into qemu-ga to do the time correction
>> >> whenever this event occurs than to try and have libvirt figure out when
>> >> it's necessary.
>> > And if guest does not have qemu-ga what is better inject interrupts like
>> > crazy for next 2 minutes or leave guest with incorrect time?
>>
>> Yes, at least that's fixable by the end-user. QEMU consuming 100% CPU
>> for a prolonged period of time isn't fixable.
>>
> You mean yes to "leave guest with incorrect time"? QEMU will still
> consume 100% of cpu for some time calling qemu_timer callback millions
> times. timedrift code is not the right level to fix that.

Not if we put a cap on how many interrupts we'll try to catch up.

As I mentioned previously, if we accrue more than X number of missed
ticks, we should simply declare bankruptcy and reset the counter.

When that occurs, *if* qemu-ga is present, we should ask qemu-ga to reset
the guest's clock based on reading the hardware clock via a
'guest-resync-time' command.

If it isn't, time will be off. Hopefully the guest is running NTP and
can correct itself. Otherwise, at least the admin can manually fix the
time.

Regards,

Anthony Liguori

>
> --
> Gleb.
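
To make the capping idea above concrete, here is a minimal sketch of the
"declare bankruptcy" policy as a toy model. It is not QEMU code. The
class name, the max_missed threshold (the "X" in the mail) and the resync
callback are all assumptions for illustration; the callback merely stands
in for asking qemu-ga to resynchronize the guest clock, as the proposed
'guest-resync-time' command would.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TickCatchup:
    """Toy model of capped missed-tick catch-up (hypothetical, not QEMU code)."""

    max_missed: int  # the "X" threshold from the mail; the value is an assumption
    resync: Optional[Callable[[], None]] = None  # hypothetical stand-in for a qemu-ga resync request
    missed: int = 0  # ticks still waiting to be reinjected
    out_of_sync: bool = False  # set when the backlog was dropped with no agent to fix the clock

    def account_missed(self, ticks: int) -> None:
        """Record ticks the guest did not receive."""
        self.missed += ticks
        if self.missed > self.max_missed:
            # Declare bankruptcy: drop the backlog instead of replaying it.
            self.missed = 0
            if self.resync is not None:
                self.resync()  # agent present: ask it to fix the guest clock
            else:
                self.out_of_sync = True  # left to NTP or the admin

    def reinject_one(self) -> bool:
        """Replay one missed tick if any remain; return True if one was replayed."""
        if self.missed > 0:
            self.missed -= 1
            return True
        return False
```

With a cap in place, the amount of catch-up work is bounded by max_missed
no matter how long the host was suspended, which is the point of the
proposal: a small backlog is still replayed tick by tick, while a large
one is discarded in a single step.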