From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: kvm guest loops_per_jiffy miscalibration under host load
Date: Tue, 29 Jul 2008 11:58:33 -0300
Message-ID: <20080729145833.GA28520@dmt.cnet>
References: <20080722032510.GB1358@dmt.cnet> <48863B5C.9040203@cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Glauber Costa, kvm-devel
To: "David S. Ahern"
Return-path:
Received: from mx1.redhat.com ([66.187.233.31]:59995 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750925AbYG2O7M (ORCPT ); Tue, 29 Jul 2008 10:59:12 -0400
Content-Disposition: inline
In-Reply-To: <48863B5C.9040203@cisco.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Tue, Jul 22, 2008 at 01:56:12PM -0600, David S. Ahern wrote:
> I've been running a series of tests on RHEL3, RHEL4, and RHEL5. The
> short of it is that all of them keep time quite well with 1 vcpu. In the
> case of RHEL3 and RHEL4 time is stable for *both* the uniprocessor and
> smp kernels, again with only 1 vcpu (there's no up/smp distinction in
> the kernels for RHEL5).
>
> As soon as the number of vcpus is >1, time drifts systematically with
> the guest *leading* the host. I see this on unloaded guests and hosts
> (i.e., cpu usage on the host ~<5%). The drift is averaging around
> 0.5%-0.6% (i.e., 5 seconds gained in the guest per 1000 seconds of real
> wall time).
>
> This is very reproducible. All I am doing is installing stock RHEL3.8,
> 4.4 and 5.2, i386 versions, starting them and watching the drift with
> no time servers. In all of these recent cases the results are for the
> in-kernel PIT.

David,

You mentioned earlier problems with ntpd syncing the guest time? Can you
provide more details?

I find it _necessary_ to use the RR scheduling policy for any Linux
guest running at a static 1000Hz (no dynticks); otherwise timer
interrupts will invariably be missed.
And reinjection plus lost tick adjustment is always problematic (it will
drift one way or the other, depending on the version of Linux). With the
standard batch scheduling policy, _idle_ guests can wait up to 6-7 ms to
run in my testing (thus 6-7 lost timer events), which also means latency
can be horrible.