From: Radim Krčmář
Subject: Re: [PATCH] x86/kvm: fix condition to update kvm master clocks
Date: Fri, 27 May 2016 21:29:31 +0200
Message-ID: <20160527192931.GB14163@potion>
References: <1464274195-31296-1-git-send-email-rkagan@virtuozzo.com>
 <20160526201936.GA25334@potion>
 <20160527172809.GB17398@rkaganb.sw.ru>
 <20160527181139.GA18797@potion>
 <20160527184640.GC17398@rkaganb.sw.ru>
In-Reply-To: <20160527184640.GC17398@rkaganb.sw.ru>
To: Roman Kagan, kvm@vger.kernel.org, "Denis V. Lunev", Owen Hofmann,
 Paolo Bonzini, Marcelo Tosatti

2016-05-27 21:46+0300, Roman Kagan:
> On Fri, May 27, 2016 at 08:11:40PM +0200, Radim Krčmář wrote:
>> 2016-05-27 20:28+0300, Roman Kagan:
>> >> Interaction between kvm_gen_update_masterclock(), pvclock_gtod_work(),
>> >> and NTP could be a problem: kvm_gen_update_masterclock() only has to
>> >> run once per VM, but pvclock_gtod_work() calls it on every VCPU, so
>> >> frequent NTP updates on bigger guests could kill performance.
>> >
>> > Unfortunately, things are worse than that: this stuff is updated on
>> > every *tick* on the timekeeping CPU, so, as long as you keep at least
>> > one of your CPUs busy, the update rate can reach HZ. The frequency of
>> > NTP updates is unimportant; it happens without NTP updates at all.
>> >
>> > So I tend to agree that we're perhaps better off not fixing this bug and
>> > leaving the kvm_clocks to drift until we figure out how to do it with
>> > acceptable overhead.
>>
>> Yuck ... the hunk below could help a bit.
>> I haven't checked if the timekeeping code updates gtod and therefore
>> sets 'was_set' even when the resulting time hasn't changed, so we might
>> need to do more to avoid useless situations.
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a8c7ca34ee5d..37ed0a342bf1 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5802,12 +5802,15 @@ static DECLARE_WORK(pvclock_gtod_work, pvclock_gtod_update_fn);
>>  /*
>>   * Notification about pvclock gtod data update.
>>   */
>> -static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
>> +static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long was_set,
>>  			       void *priv)
>>  {
>>  	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
>>  	struct timekeeper *tk = priv;
>>  
>> +	if (!was_set)
>> +		return 0;
>> +
>>  	update_pvclock_gtod(tk);
>>  
>
> Nope, this parameter is only set when there's a step-like change in the
> time. The timekeeper itself is always updated. I guess we could
> mitigate the costs somewhat if we skipped updating the gtod copy until
> the accumulated error reaches certain limit; not sure if that's gonna
> help though.

I see, timekeeping_adjust() isn't covered, but it should not adjust
every tick, so we could propagate information about adjustments to
pvclock_gtod_notify() (rename unused to has_changed), because pvclock only
cares about change of time.

Adding another threshold is a reasonable improvement if adjustments
happen too often, but we need to fix pvclock_gtod_update_fn() in any
case.

Am I missing anything else?

Thanks.
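
For illustration only, the has_changed idea can be modelled in user space
roughly as below. This is a sketch under loud assumptions, not kernel code:
every model_* name, both struct layouts and the tick loop are hypothetical
stand-ins, and the real notifier would still have to deal with
timekeeping_adjust() and pvclock_gtod_update_fn() as discussed above. It only
shows that a consumer which skips notifications without a change does work
proportional to the number of adjustments rather than to HZ.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the timekeeper; NOT the kernel's struct. */
struct model_timekeeper {
	long long mult;		/* ns added per tick; tweaked by adjustments */
	long long wall_ns;	/* accumulated wall-clock time in ns */
};

/* Hypothetical consumer-side copy, loosely like pvclock's gtod copy. */
struct model_gtod_copy {
	long long mult;
	long long base_ns;	/* time base captured at the last refresh */
	unsigned long updates;	/* number of (expensive) refreshes done */
};

/*
 * One timer tick.  Ordinary accumulation always happens; the return value
 * says whether the clock parameters were adjusted or the time was stepped,
 * i.e. the assumed "has_changed" flag passed on to the notifier.
 */
static bool model_tick(struct model_timekeeper *tk, long long mult_delta,
		       long long step_ns)
{
	tk->wall_ns += tk->mult;
	tk->mult += mult_delta;
	tk->wall_ns += step_ns;
	return mult_delta != 0 || step_ns != 0;
}

/* Notifier: refresh the copy only when something actually changed. */
static int model_gtod_notify(struct model_gtod_copy *copy,
			     const struct model_timekeeper *tk,
			     bool has_changed)
{
	if (!has_changed)
		return 0;

	copy->mult = tk->mult;
	copy->base_ns = tk->wall_ns;
	copy->updates++;
	return 0;
}

/*
 * Run 'ticks' ticks with a +1 mult adjustment every 'adjust_every' ticks
 * (0 means never) and return how many refreshes the consumer performed.
 */
static unsigned long model_run(unsigned int ticks, unsigned int adjust_every)
{
	struct model_timekeeper tk = { .mult = 1000000, .wall_ns = 0 };
	struct model_gtod_copy copy = { 0, 0, 0 };
	unsigned int i;

	for (i = 1; i <= ticks; i++) {
		long long delta = (adjust_every && i % adjust_every == 0) ? 1 : 0;

		model_gtod_notify(&copy, &tk, model_tick(&tk, delta, 0));
	}
	return copy.updates;
}
```

In this model, model_run(1000, 100) performs 10 refreshes instead of 1000,
and model_run(1000, 0) performs none. Whether real adjustments are rare
enough for this to be sufficient without the extra threshold is exactly the
open question.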