From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 6 Nov 2008 03:55:54 +0000
From: Jamie Lokier
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RESEND][PATCH 0/3] Fix guest time drift under heavy load.
Message-ID: <20081106035554.GB26160@shareable.org>
References: <20081029152236.14831.15193.stgit@dhcp-1-237.local> <490B59BF.3000205@codemonkey.ws> <20081102130441.GD16809@redhat.com> <4911CD42.2040803@codemonkey.ws>
In-Reply-To: <4911CD42.2040803@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: qemu-devel.nongnu.org

Anthony Liguori wrote:
> > The time drift is eliminated.  If there is a spike in load, time may
> > slow down, but after that it catches up (this happens only during
> > very high loads, though).
>
> How bad is the time drift without it?  Under workload X, we lose N
> seconds per Y hours, and with this patch, under the same workload, we
> lose M seconds per Y hours, and M << N.
In my experience, N seconds of drift (for any N) is typically a problem
with VM systems in general.

> I strongly, strongly doubt that you'll be eliminating drift 100%.

I do believe that the method of "virtual clock warping" can eliminate
drift 100% for all guest OSes, including guests which have their own
lost-tick compensators, provided there's enough host CPU _on average_
for the guest to tick, over an appropriate averaging window.

It should work even with guests which request a different scheme with
the Microsoft PV ops.

> If the host can awaken QEMU 1024 times a second and QEMU can deliver a
> timer interrupt each time, there is no need for time drift fixing.
>
> I would think that with high-res timers on the host, you would have to
> put the host under heavy load before drift began occurring.

If two guests are running at 100% CPU on a single host CPU, I suspect
you'll find each QEMU instance does _not_ get to run 1024 times per
second, even with high-res timers.  They will behave like
non-interactive processes, running alternately with long timeslices -
so even timers at 100 or 18 ticks per second won't fire reliably.
That's a reasonable scenario, and it doesn't even require any I/O or
swapping.

I've personally yet to see any VM which doesn't drift over time periods
of months (i.e. servers in VMs) unless the guest is running some kind of
regular clock-sync protocol - and NTP does not always work for them,
because NTP assumes a fairly low-jitter clock, which guests on a loaded
host don't always manage - see above.

-- Jamie
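[Editor's note: the catch-up behaviour discussed above - a "virtual
clock warping" scheme that remembers missed ticks and repays them when
the guest next gets the CPU - can be sketched as a toy simulation.
Everything below is illustrative and hypothetical, not QEMU's actual
implementation: the 1024 Hz rate matches the discussion, but the
function names and the catch-up cap of 7 extra ticks per slot are
invented for the sketch.]

```python
# Toy simulation of lost-tick catch-up ("virtual clock warping").
# Hypothetical sketch, not QEMU code: slot-by-slot model of a 1024 Hz
# guest timer whose missed ticks are re-injected at a bounded rate.

HZ = 1024  # guest timer frequency, as in the discussion above

def run(host_seconds, can_run):
    """Deliver ticks over host_seconds of host time.

    can_run(slot) says whether the guest holds the CPU in that 1/HZ
    slot.  A missed tick is added to a backlog and re-injected (at most
    7 extra ticks per slot, an arbitrary cap) once the guest runs again.
    Returns (ticks the guest saw, backlog still owed)."""
    backlog = 0
    delivered = 0
    for slot in range(host_seconds * HZ):
        if can_run(slot):
            delivered += 1                 # the regular tick for this slot
            catchup = min(backlog, 7)      # bounded catch-up per slot
            delivered += catchup
            backlog -= catchup
        else:
            backlog += 1                   # tick lost: remember it

    return delivered, backlog

# Load spike: the guest loses the CPU entirely during the 2nd second,
# then has spare capacity afterwards ("enough host CPU on average").
spike = lambda slot: not (HZ <= slot < 2 * HZ)
seen, owed = run(10, spike)
# seen == 10 * HZ and owed == 0: the 1024 ticks missed during the
# spike are repaid afterwards, so net drift over the window is zero.
```

Under a permanent overload (can_run false most of the time), the backlog
would instead grow without bound, which is the "enough host CPU on
average" proviso in the text.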