From mboxrd@z Thu Jan 1 00:00:00 1970
From: Roman Kagan
Subject: Re: [PATCH] x86/kvm: fix condition to update kvm master clocks
Date: Thu, 9 Jun 2016 22:19:25 +0300
Message-ID: <20160609191923.GA3258@rkaganip.lan>
References: <1464274195-31296-1-git-send-email-rkagan@virtuozzo.com>
 <20160529233844.GA14374@amt.cnet>
 <20160609032710.GA13318@amt.cnet>
 <20160609120945.GB2570@rkaganb.sw.ru>
 <20160609182501.GA24024@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: , "Denis V. Lunev" , Owen Hofmann , Paolo Bonzini
To: Marcelo Tosatti
Return-path:
Received: from mail-am1on0133.outbound.protection.outlook.com ([157.56.112.133]:20713
 "EHLO emea01-am1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL)
 by vger.kernel.org with ESMTP id S1750751AbcFITdh (ORCPT );
 Thu, 9 Jun 2016 15:33:37 -0400
Content-Disposition: inline
In-Reply-To: <20160609182501.GA24024@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, Jun 09, 2016 at 03:25:02PM -0300, Marcelo Tosatti wrote:
> On Thu, Jun 09, 2016 at 03:09:46PM +0300, Roman Kagan wrote:
> > In general I'm starting to feel the shared memory clock is trying to
> > provide stronger guarantees than really useful.  E.g. there's no such
> > thing as synchronous TSC between vCPUs in a virtual machine, so every
> > guest assuming it is broken;
>
> There is, the TSC is monotonic between pCPUs:
>
> pCPU1              |   pCPU2
>
> 1. a = read tsc
> 2.                     b = read tsc.

Right, the processor guarantees that upon rdtsc completion the contents
of %rdx and %rax are ordered properly across CPUs.  However, what matters
is the comparison of the values between CPUs, which happens in the code
that follows rdtsc and is therefore subject to delays due to vCPUs being
scheduled out.  E.g.

vCPU1                    |   vCPU2
                         |
1. a = rdtsc             |
2. preemption by host    |   b = rdtsc
3. enter to guest        |   return t(b)
4. return t(a)           |

t(a) < t(b)

In particular, in the Linux guest case, in the above scenario vCPU2 would
update the timekeeper and store tsc_timestamp into it; then vCPU1 would
use its older tsc value to calculate the time and get a negative tsc
delta.

> > in reality that means that every sane guest
> > must tolerate certain violations of monotonicity when multiple CPUs are
> > used for timekeeping.  I wonder if this consideration can allow for some
> > simplification of the paravirtual clock code...
>
> I think applications can fail.

The guest kernel must take care of compensating for those violations, so
that applications see a consistent view of time.

[ I'll need some time to grok the rest of your message, as I'm still new
to the timekeeping code, so I'll reply to it in another mail. ]

Thanks,
Roman.
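
To make the preemption scenario above concrete, here is a minimal
user-space sketch in C.  It is only an illustration, not the kernel's
timekeeping code: struct timekeeper, tsc_delta(), read_clock_ns(),
update_timekeeper() and simulate_preemption() are hypothetical names, the
TSC values are made up, and one TSC tick is assumed to equal 1 ns.  It
shows how the stale value a yields a negative delta once vCPU2 has stored
b as tsc_timestamp, and one possible way (clamping) a guest could
compensate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified timekeeper (not the kernel's struct). */
struct timekeeper {
    uint64_t tsc_timestamp; /* TSC value recorded at the last update */
    uint64_t base_ns;       /* clock value at the last update, in ns */
};

/* Raw delta between a TSC sample and the stored timestamp.
 * Assumes 1 tick == 1 ns; goes negative if the sample is stale. */
int64_t tsc_delta(const struct timekeeper *tk, uint64_t tsc)
{
    return (int64_t)(tsc - tk->tsc_timestamp);
}

/* Tolerant read: a stale sample is clamped so time never goes backwards. */
uint64_t read_clock_ns(const struct timekeeper *tk, uint64_t tsc)
{
    int64_t delta = tsc_delta(tk, tsc);

    if (delta < 0)
        delta = 0;
    return tk->base_ns + (uint64_t)delta;
}

/* Advance the timekeeper to the given TSC sample. */
void update_timekeeper(struct timekeeper *tk, uint64_t tsc)
{
    tk->base_ns = read_clock_ns(tk, tsc);
    tk->tsc_timestamp = tsc;
}

/* Replays steps 1-4 of the timeline with made-up TSC values and
 * returns the raw delta vCPU1 computes from its stale sample. */
int64_t simulate_preemption(void)
{
    struct timekeeper tk = { .tsc_timestamp = 1000, .base_ns = 5000 };
    uint64_t a = 1100;          /* 1. vCPU1: a = rdtsc               */
    uint64_t b = 1500;          /* 2. vCPU1 preempted; vCPU2: b = rdtsc */

    update_timekeeper(&tk, b);  /* 3. vCPU2 updates the timekeeper   */
    return tsc_delta(&tk, a);   /* 4. vCPU1 resumes with the stale a */
}
```

With these values simulate_preemption() yields a delta of -400, while
read_clock_ns() on the same stale sample returns the timekeeper's base
time instead of a value in the past.  Clamping is just one choice made
for this sketch; it trades a briefly stalled clock for monotonicity.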