From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marcelo Tosatti Subject: Re: KVM: x86: workaround SuSE's 2.6.16 pvclock vs masterclock issue Date: Wed, 21 Jan 2015 23:40:38 -0200 Message-ID: <20150122014038.GA5307@amt.cnet> References: <20150120175452.GA32680@amt.cnet> <20150121140926.GA9085@potion.brq.redhat.com> <20150121141634.GB718@amt.cnet> <20150121170033.GB9085@potion.brq.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: kvm-devel , Paolo Bonzini To: Radim =?utf-8?B?S3LEjW3DocWZ?= Return-path: Received: from mx1.redhat.com ([209.132.183.28]:42003 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753971AbbAVBky (ORCPT ); Wed, 21 Jan 2015 20:40:54 -0500 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t0M1erQQ009195 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL) for ; Wed, 21 Jan 2015 20:40:53 -0500 Content-Disposition: inline In-Reply-To: <20150121170033.GB9085@potion.brq.redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On Wed, Jan 21, 2015 at 06:00:37PM +0100, Radim Kr=C4=8Dm=C3=A1=C5=99 w= rote: > 2015-01-21 12:16-0200, Marcelo Tosatti: > > On Wed, Jan 21, 2015 at 03:09:27PM +0100, Radim Kr=C4=8Dm=C3=A1=C5=99= wrote: > > > 2015-01-20 15:54-0200, Marcelo Tosatti: > > > > SuSE's 2.6.16 kernel fails to boot if the delta between tsc_tim= estamp > > > > and rdtsc is larger than a given threshold: > > > [...] > > > > Disable masterclock support (which increases said delta) in cas= e the > > > > boot vcpu does not use MSR_KVM_SYSTEM_TIME_NEW. > > >=20 > > > Why do we care about 2.6.16 bugs in upstream KVM? > >=20 > > Because people do use 2.6.16 guests. >=20 > (Those people probably won't use 3.19+ host ... > Is this patch intended for stable?) Yes. > > > The code to benefit tradeoff of this patch seems bad to me ... > >=20 > > Can you state the tradeoff and then explain why it is bad ? >=20 > Additional code needs time to understand and is a source of bugs, > yet we still include it because we want to achieve something. >=20 > I meant the tradeoff between perceived value of "something" and > acceptability of the code. (Ideally, computer programs would be a > shorter version of "Do what I want.\nEOF".) >=20 > There are three main points that made think it is bad, > 1) The bug happens because a guest expects greater precision. > I consider that as a guest problem. kvmclock never guaranteed > anything, so unmet expectations should be a recoverable error. delta =3D pvclock_data.tsc_timestamp - RDTSC Guest expects delta to be smaller than a given threshold. It does not expect greater precision. Size of delta does not affect precision. > 2) With time, the probability that 2.6.16 is used is getting lower, > while people looking at KVM's code appear. > - At what point are we going to drop 2.6.16 support? > (We shouldn't let mistakes drag us down forever ... > Or are we dooming KVM on purpose?) One of the features of virtualization is to be able to run old=20 operating systems? > 3) The patch made me ask more silly questions than it answered :) > (Why can't other software depend on previous behavior? Documentation/virtual/kvm/msr.txt: whose data will be filled in by the hypervisor periodically. Only one write, or registration, is needed for each VCPU. The i= nterval between updates of this structure is arbitrary and implementati= on-dependent. The hypervisor may update this structure at any time it sees fi= t until anything with bit0 =3D=3D 0 is written to it. > Why can't kvmclock without master clock still fail? It can, given a loaded system. > Why can't we improve the master clock?) Out of context question. > > > MSR_KVM_SYSTEM_TIME is deprecated -- we could remove it now, with > > > MSR_KVM_WALL_CLOCK (after hiding KVM_FEATURE_CLOCKSOURCE) if we w= ant > > > to support old guests. > >=20 > > What is the benefit of removing support for MSR_KVM_SYSTEM_TIME ? >=20 > The maintainability of the code increases. It would look as if we ne= ver > made the mistake with MSR_KVM_SYSTEM_TIME & MSR_KVM_WALL_CLOCK. > (I like when old code looks as if we wrote it from scratch.) >=20 > After comparing the (imperfectly evaluated) benefit of both variants, > original patch: > + 2.6.16 SUSE guests work > - MSR_KVM_SYSTEM_TIME guests don't use master clock > - KVM code is worse > removal of KVM_FEATURE_CLOCKSOURCE: > + 2.6.16 SUSE guests likely work All guests which depend on KVM_FEATURE_CLOCKSOURCE will timedrift. > + KVM code is better > - MSR_KVM_SYSTEM_TIME guests use even worse clocksource >=20 > As KVM_FEATURE_CLOCKSOURCE2 was introduced in 2010, I found the remov= al > better even without waiting for the last MSR_KVM_SYSTEM_TIME guest to > perish. >=20 > > Supporting old guests is important. >=20 > It comes at a price. > (Mutually exclusive goals are important as well.) This phrase is awkward. Overlapping goals are negative, then? (think of a large number of totally overlapping goals).