Date: Tue, 25 Feb 2014 11:13:46 +0100
From: Andrew Jones
To: Marcelo Tosatti
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com
Subject: Re: [PATCH] x86: kvm: fix unstable_tsc boot
Message-ID: <20140225101345.GA2292@hawk.usersys.redhat.com>
References: <1393256549-7743-1-git-send-email-drjones@redhat.com> <20140224211524.GC22025@amt.cnet>
In-Reply-To: <20140224211524.GC22025@amt.cnet>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 24, 2014 at 06:15:24PM -0300, Marcelo Tosatti wrote:
> On Mon, Feb 24, 2014 at 04:42:29PM +0100, Andrew Jones wrote:
> > When the tsc is marked unstable on the host it causes global clock
> > updates to be requested each time a vcpu is loaded, nearly halting
> > all progress on guests with a large number of vcpus.
> >
> > Fix this by requesting only a local clock update unless the vcpu
> > is migrating to another cpu.
> >
> > Signed-off-by: Andrew Jones
> > ---
> >  arch/x86/kvm/x86.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 6530019116b0d..ea716a162b4a3 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2781,15 +2781,18 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >  						vcpu->arch.last_guest_tsc);
> >  			kvm_x86_ops->write_tsc_offset(vcpu, offset);
> >  			vcpu->arch.tsc_catchup = 1;
> > +			set_bit(KVM_REQ_CLOCK_UPDATE, &vcpu->requests);
> >  		}
> > +	}
> > +
> > +	if (unlikely(vcpu->cpu != cpu)) {
> >  		/*
> >  		 * On a host with synchronized TSC, there is no need to update
> >  		 * kvmclock on vcpu->cpu migration
> >  		 */
> >  		if (!vcpu->kvm->arch.use_master_clock || vcpu->cpu == -1)
> >  			kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
> > -		if (vcpu->cpu != cpu)
> > -			kvm_migrate_timers(vcpu);
> > +		kvm_migrate_timers(vcpu);
> >  		vcpu->cpu = cpu;
> >  	}
> >
> > --
> > 1.8.1.4
>
> Consider VCPU1 not doing kvm_arch_vcpu_load (guest not executing HLT,
> not switching VCPUs, no exits to QEMU).
>
> VCPU0 doing kvm_arch_vcpu_load (guest executing HLT, say).
>
> The updates on VCPU0 must generate updates on VCPU1 as well, otherwise
> NTP correction applies to VCPU0 but not VCPU1.
>

OK. So, as we discussed off-list, we need to bound the time that vcpu
clocks are out of sync. When vcpu0 does its local clock update it may
pick up an NTP correction. We can't wait an indeterminate amount of
time for vcpu1 to pick up that correction, as the clocks will diverge
further. However, we can't request a global clock update on every vcpu
load either. The solution is to rate-limit the global clock updates.
Marcelo calculates that we should delay the global clock updates by no
more than 0.1s, as follows: assume an NTP correction c is applied to
one vcpu but not the other; then in n seconds the delta of the vcpus'
system_timestamps will be c * n.
If we assume a correction of 500ppm (worst case), then the two vcpus
will diverge c * n = 500e-6 * 0.1s = 50us in 0.1s, which is a
considerable amount.

I have a patch prepared that rate-limits global clock updates to a
maximum frequency of one per 0.1s. This patch should be dropped; I'll
send the new one soon.

drew