From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1750919AbdAPRjb (ORCPT ); Mon, 16 Jan 2017 12:39:31 -0500
Received: from mx1.redhat.com ([209.132.183.28]:50742 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750826AbdAPRja (ORCPT ); Mon, 16 Jan 2017 12:39:30 -0500
Date: Mon, 16 Jan 2017 15:39:12 -0200
From: Marcelo Tosatti
To: Radim Krcmar
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini,
	Richard Cochran, Miroslav Lichvar
Subject: Re: [patch 3/3] PTP: add kvm PTP driver
Message-ID: <20170116173909.GA4639@amt.cnet>
References: <20170113120131.086634482@redhat.com>
	<20170113120319.777765254@redhat.com>
	<20170113155657.GD22440@potion>
	<20170113174014.GA9310@amt.cnet>
	<20170116162653.GA32097@potion>
	<20170116165411.GA2386@potion>
	<20170116170827.GB2501@amt.cnet>
	<20170116172758.GB31452@potion>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170116172758.GB31452@potion>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.25]); Mon, 16 Jan 2017 17:39:31 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 16, 2017 at 06:27:58PM +0100, Radim Krcmar wrote:
> 2017-01-16 15:08-0200, Marcelo Tosatti:
> > On Mon, Jan 16, 2017 at 05:54:11PM +0100, Radim Krcmar wrote:
> >> 2017-01-16 17:26+0100, Radim Krcmar:
> >> > 2017-01-13 15:40-0200, Marcelo Tosatti:
> >> >> On Fri, Jan 13, 2017 at 04:56:58PM +0100, Radim Krcmar wrote:
> >> >> > 2017-01-13 10:01-0200, Marcelo Tosatti:
> >> >>> > +		version = pvclock_read_begin(src);
> >> >>> > +
> >> >>> > +		ret = kvm_hypercall2(KVM_HC_CLOCK_OFFSET,
> >> >>> > +				     clock_off_gpa,
> >> >>> > +				     KVM_CLOCK_OFFSET_WALLCLOCK);
> >> >>> > +		if (ret != 0) {
> >> >>> > +			pr_err("clock offset hypercall ret %lu\n", ret);
> >> >>> > +			spin_unlock(&kvm_ptp_lock);
> >> >>> > +			preempt_enable_notrace();
> >> >>> > +			return -EOPNOTSUPP;
> >> >>> > +		}
> >> >>> > +
> >> >>> > +		tspec.tv_sec = clock_off.sec;
> >> >>> > +		tspec.tv_nsec = clock_off.nsec;
> >> >>> > +
> >> >>> > +		delta = rdtsc_ordered() - clock_off.tsc;
> >> >>> > +
> >> >>> > +		offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
> >> >>> > +					     src->tsc_shift);
> >> >>> > +
> >> >>> > +	} while (pvclock_read_retry(src, version));
> >> >>> > +
> >> >>> > +	preempt_enable_notrace();
> >> >>> > +
> >> >>> > +	tspec.tv_nsec = tspec.tv_nsec + offset;
> >> >>> > +
> >> >>> > +	spin_unlock(&kvm_ptp_lock);
> >> >>> > +
> >> >>> > +	if (tspec.tv_nsec >= NSEC_PER_SEC) {
> >> >>> > +		u64 secs = tspec.tv_nsec;
> >> >>> > +
> >> >>> > +		tspec.tv_nsec = do_div(secs, NSEC_PER_SEC);
> >> >>> > +		tspec.tv_sec += secs;
> >> >>> > +	}
> >> >>> > +
> >> >>> > +	memcpy(ts, &tspec, sizeof(struct timespec64));
> >> >>>
> >> >>> But the whole idea is of improving the time by reading tsc a bit later
> >> >>> is just weird ... why is it better to provide
> >> >>>
> >> >>>   tsc + x, time + tsc_delta_to_time(x)
> >> >>>
> >> >>> than just
> >> >>>
> >> >>>   tsc, time
> >> >>>
> >> >>> ?
> >> >>
> >> >> Because you want to calculate the value of the host realtime clock
> >> >> at the moment of ptp_kvm_gettime.
> >> >>
> >> >> We do:
> >> >>
> >> >> 1. kvm_hypercall.
> >> >> 2. get {sec, nsec, guest_tsc}.
> >> >> 3. kvm_hypercall returns.
> >> >> 4. delay = rdtsc() - guest_tsc.
> >> >>
> >> >> Where delay is the delta (measured with the TSC) between points 2 and 4.
> >> >
> >> > I see now ... the PTP interface is just not good for our purposes.
> >>
> >> There is getcrosststamp() callback in PTP, which seems to be exactly
> >> what we want when pairing with TSC, so the pvclock delay fixup can be
> >> dropped when using it.
> >
> > What pvclock delay fixup you refer to? The "rdtsc() - clock_offset.tsc"
> > part?
>
> Yes.
>
> > You can't drop it, because if you do then your "host realtime
> > clock read" will be behind by "rdtsc() - clock_offset.tsc" TSC cycles.
>
> The TSC read will be some cycles old when the hypercall ends, but that
> doesn't matter, because we will pass {sec, nsec, guest_tsc} to PTP and
> PTP should plug them into kernel's realtime clock roughly like this:
>
>   sec/nsec + (rdtsc() - guest_tsc) * tsc_freq
>
> Adding delay to guest_tsc and sec/nsec cannot improve precision.
> (And will likely degrade it as kvmclock's frequency is incorrect.)
>
> > We want the highest precision as possible.
>
> I agree, which is why we don't want to lose precision in the delay
> guesswork because of gettime64().

Sorry, the clock difference is 10 ns now. So the guest clock is off by
_10 ns_ from the host clock.

Are you suggesting using getcrosststamp() instead, and dropping the
(rdtsc() - guest_tsc) part?

Please be more verbose.
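
To make the question concrete, here is a minimal, self-contained C sketch
of the arithmetic under discussion. It is not the driver code: struct
stamp, cycles_to_ns(), extrapolate() and shift_stamp() are hypothetical
names invented for illustration, and it assumes that whoever consumes the
{tsc, time} pair extrapolates linearly from the TSC at a known frequency.
Under that assumption, shifting both guest_tsc and sec/nsec by the same
delay leaves the extrapolated time unchanged when producer and consumer
agree on the TSC frequency, and introduces an error when they do not.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Hypothetical pairing of a TSC value with a realtime reading in ns. */
struct stamp {
	uint64_t tsc;
	int64_t ns;
};

/* Convert a TSC delta to nanoseconds at freq_hz (illustrative helper). */
static inline int64_t cycles_to_ns(uint64_t cycles, uint64_t freq_hz)
{
	/* Keep the multiplication below from overflowing 64 bits. */
	assert(freq_hz != 0 && cycles <= UINT64_MAX / NSEC_PER_SEC);
	return (int64_t)(cycles * NSEC_PER_SEC / freq_hz);
}

/* Assumed consumer side: time at tsc_now, extrapolated from the pair. */
static inline int64_t extrapolate(struct stamp s, uint64_t tsc_now,
				  uint64_t consumer_hz)
{
	return s.ns + cycles_to_ns(tsc_now - s.tsc, consumer_hz);
}

/*
 * The "delay fixup": advance both halves of the pair by x cycles, with
 * the time half scaled by the producer's idea of the TSC frequency.
 */
static inline struct stamp shift_stamp(struct stamp s, uint64_t x,
				       uint64_t producer_hz)
{
	struct stamp out = { s.tsc + x, s.ns + cycles_to_ns(x, producer_hz) };

	return out;
}
```

For instance, with a raw pair {tsc = 1000, ns = 5000}, a delay of 2000
cycles and a 1 GHz consumer, both the raw and the shifted pair extrapolate
to 14000 ns at tsc = 10000. If the producer instead scales the delay as if
the TSC ran at 2 GHz, the shifted pair yields 13000 ns.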