From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:46483)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <avi@redhat.com>) id 1TCBy1-0008TE-1v
	for qemu-devel@nongnu.org; Thu, 13 Sep 2012 12:08:46 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <avi@redhat.com>) id 1TCBxv-0004dn-2Y
	for qemu-devel@nongnu.org; Thu, 13 Sep 2012 12:08:36 -0400
Received: from mx1.redhat.com ([209.132.183.28]:61845)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <avi@redhat.com>) id 1TCBxu-0004cg-Nt
	for qemu-devel@nongnu.org; Thu, 13 Sep 2012 12:08:30 -0400
Message-ID: <505204F4.3040300@redhat.com>
Date: Thu, 13 Sep 2012 19:08:20 +0300
From: Avi Kivity <avi@redhat.com>
MIME-Version: 1.0
References: <87pq5r5otp.fsf@codemonkey.ws> <20120912151549.GT20907@redhat.com>
	<87y5kfrtne.fsf@codemonkey.ws> <20120913104940.GA20907@redhat.com>
	<5051DC20.4090204@redhat.com> <20120913132804.GO7767@redhat.com>
	<87r4q6xbiy.fsf@codemonkey.ws> <20120913142228.GK20907@redhat.com>
	<87boha7zyx.fsf@codemonkey.ws> <20120913144811.GL20907@redhat.com>
	<87ehm5or07.fsf@codemonkey.ws>
In-Reply-To: <87ehm5or07.fsf@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Rethinking missed tick catchup
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Gleb Natapov <gleb@redhat.com>, Jan Kiszka <jan.kiszka@siemens.com>, Michael Roth <mdroth@linux.vnet.ibm.com>, qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>, Luiz Capitulino <lcapitulino@redhat.com>, Eric Blake <eblake@redhat.com>

On 09/13/2012 06:56 PM, Anthony Liguori wrote:
>>> 
>> Hmm, true. What about hooking into suspend and doing vmstop during
>> suspend. 
> 
> Is suspend the only foreseeable way for this problem to happen?  I don't
> think it is which is what concerns me about any approach that relies on
> "hooking suspend".

No, there are also SIGSTOP/SIGCONT (we can hook SIGCONT), gdb (can't
hook, but it is very rare), ENOSPC + waiting for more space to be
provisioned (already known to qemu), the qemu core accessing NFS on a
dead server, and severe swapstorms.
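
A rough sketch of the SIGCONT case, purely illustrative: SIGSTOP itself
cannot be caught, but a SIGCONT handler runs when the process is resumed,
so it can flag that a gap may have occurred. The names on_sigcont,
install_sigcont_hook and consume_resume_event are made up for this
example and are not existing qemu functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set from the signal handler, polled from the main loop. */
static volatile sig_atomic_t resumed_after_stop;

/* Only touches a sig_atomic_t flag, which is async-signal-safe. */
static void on_sigcont(int signo)
{
    (void)signo;
    resumed_after_stop = 1;
}

/* Hypothetical helper: install the SIGCONT hook; returns 0 on success. */
static int install_sigcont_hook(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigcont;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCONT, &sa, NULL);
}

/*
 * Hypothetical helper: returns 1 once per observed resume, 0 otherwise.
 * A real caller would trigger whatever time resync policy is chosen.
 */
static int consume_resume_event(void)
{
    if (!resumed_after_stop) {
        return 0;
    }
    resumed_after_stop = 0;
    return 1;
}
```

The main loop would poll consume_resume_event() and treat a hit the same
way as any other "we were stopped for a while" event.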

> Also, I don't think there is a generic way to "hook suspend".

That is what we have Lennart for.

>>> >> This could happen because of stop, host suspend, live migration to a
>>> >> file, etc.
>>> >> 
>>> >> It's much easier for us to call into qemu-ga to do the time correction
>>> >> whenever this event occurs than to try and have libvirt figure out when
>>> >> it's necessary.
>>> > And if guest does not have qemu-ga what is better inject interrupts like
>>> > crazy for next 2 minutes or leave guest with incorrect time?
>>> 
>>> Yes, at least that's fixable by the end-user.  QEMU consuming 100% CPU
>>> for a prolonged period of time isn't fixable.
>>> 
>> You mean yes to "leave guest with incorrect time"? QEMU will still
>> consume 100% of cpu for some time calling qemu_timer callback millions
>> times. timedrift code is not the right level to fix that.
> 
> Not if we put a cap on how many interrupts we'll try to catch up.
> 
> As I mentioned previously, if we accrue more than X number of missed
> ticks, we should simply declare bankruptcy and reset the counter.

If we know we're missing N ticks, we can simply pass N to the handler.
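
As a minimal sketch of that idea combined with the cap suggested above:
compute the number of elapsed periods, and either hand the whole count to
the handler in one call or, past a threshold, drop the backlog and signal
that a resync is wanted. Everything here (tick_source, tick_source_run,
count_ticks, MAX_CATCHUP_TICKS and its value) is hypothetical and not
existing qemu code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap: beyond this many missed ticks, stop catching up. */
#define MAX_CATCHUP_TICKS 1000

/* The handler receives the number of ticks that elapsed in one call. */
typedef void (*tick_handler)(uint64_t nticks, void *opaque);

typedef struct {
    int64_t period_ns;     /* tick period */
    int64_t last_tick_ns;  /* time of the last tick accounted for */
    tick_handler handler;
    void *opaque;
} tick_source;

/* Example handler: accumulate the tick count into a uint64_t. */
static void count_ticks(uint64_t nticks, void *opaque)
{
    *(uint64_t *)opaque += nticks;
}

/*
 * Deliver all ticks due at now_ns with a single handler call.
 * Returns the number of ticks passed to the handler. If the backlog
 * exceeds MAX_CATCHUP_TICKS it is dropped, one tick is delivered and
 * *resync is set so the caller can ask for a clock resync instead.
 */
static uint64_t tick_source_run(tick_source *ts, int64_t now_ns, int *resync)
{
    uint64_t n;

    assert(ts->period_ns > 0);
    *resync = 0;
    if (now_ns < ts->last_tick_ns + ts->period_ns) {
        return 0; /* nothing due yet */
    }
    n = (uint64_t)((now_ns - ts->last_tick_ns) / ts->period_ns);
    if (n > MAX_CATCHUP_TICKS) {
        ts->last_tick_ns = now_ns; /* declare bankruptcy */
        *resync = 1;
        n = 1;
    } else {
        ts->last_tick_ns += (int64_t)n * ts->period_ns;
    }
    ts->handler(n, ts->opaque);
    return n;
}
```

The point is that the cost is one callback per wakeup regardless of how
long the gap was, rather than one callback per missed tick.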

> 
> When that occurs, *if* qemu-ga is present, we should ask qemu-ga to
> reset the guest's clock based on reading the hardware clock via a
> 'guest-resync-time' command.
> 
> If it isn't, time will be off.  Hopefully the guest is running NTP and
> can correct itself.  Otherwise, at least the admin can manually fix the
> time.

There is also the fake S3 (post host resume) that can get the guest to
read its RTC.


-- 
error compiling committee.c: too many arguments to function