public inbox for linux-kernel@vger.kernel.org
From: Radim Krcmar <rkrcmar@redhat.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Cochran <richardcochran@gmail.com>,
	Miroslav Lichvar <mlichvar@redhat.com>
Subject: Re: [patch 3/3] PTP: add kvm PTP driver
Date: Mon, 16 Jan 2017 18:46:49 +0100
Message-ID: <20170116174649.GC31452@potion>
In-Reply-To: <20170116170415.GA2501@amt.cnet>

2017-01-16 15:04-0200, Marcelo Tosatti:
> On Mon, Jan 16, 2017 at 05:26:53PM +0100, Radim Krcmar wrote:
>> 2017-01-13 15:40-0200, Marcelo Tosatti:
>> > On Fri, Jan 13, 2017 at 04:56:58PM +0100, Radim Krcmar wrote:
>> > > 2017-01-13 10:01-0200, Marcelo Tosatti:
>> >> > +		version = pvclock_read_begin(src);
>> >> > +
>> >> > +		ret = kvm_hypercall2(KVM_HC_CLOCK_OFFSET,
>> >> > +				     clock_off_gpa,
>> >> > +				     KVM_CLOCK_OFFSET_WALLCLOCK);
>> >> > +		if (ret != 0) {
>> >> > +			pr_err("clock offset hypercall ret %lu\n", ret);
>> >> > +			spin_unlock(&kvm_ptp_lock);
>> >> > +			preempt_enable_notrace();
>> >> > +			return -EOPNOTSUPP;
>> >> > +		}
>> >> > +
>> >> > +		tspec.tv_sec = clock_off.sec;
>> >> > +		tspec.tv_nsec = clock_off.nsec;
>> >> > +
>> >> > +		delta = rdtsc_ordered() - clock_off.tsc;
>> >> > +
>> >> > +		offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
>> >> > +					     src->tsc_shift);
>> >> > +
>> >> > +	} while (pvclock_read_retry(src, version));
>> >> > +
>> >> > +	preempt_enable_notrace();
>> >> > +
>> >> > +	tspec.tv_nsec = tspec.tv_nsec + offset;
>> >> > +
>> >> > +	spin_unlock(&kvm_ptp_lock);
>> >> > +
>> >> > +	if (tspec.tv_nsec >= NSEC_PER_SEC) {
>> >> > +		u64 secs = tspec.tv_nsec;
>> >> > +
>> >> > +		tspec.tv_nsec = do_div(secs, NSEC_PER_SEC);
>> >> > +		tspec.tv_sec += secs;
>> >> > +	}
>> >> > +
>> >> > +	memcpy(ts, &tspec, sizeof(struct timespec64));
>> >> 
>> >> But the whole idea of improving the time by reading tsc a bit later
>> >> is just weird ... why is it better to provide
>> >> 
>> >>   tsc + x, time + tsc_delta_to_time(x)
>> >> 
>> >> than just
>> >> 
>> >>  tsc, time
>> >> 
>> >> ?
>> > 
>> > Because you want to calculate the value of the host realtime clock 
>> > at the moment of ptp_kvm_gettime.
>> > 
>> > We do:
>> > 
>> > 	1. kvm_hypercall.
>> > 	2. get {sec, nsec, guest_tsc}.
>> > 	3. kvm_hypercall returns.
>> > 	4. delay = rdtsc() - guest_tsc.
>> > 
>> > Where delay is the delta (measured with the TSC) between points 2 and 4.
>> 
>> I see now ... the PTP interface is just not good for our purposes.
>> We don't return {sec, nsec, guest_tsc}, we just return {sec, nsec} at
>> some random time in the past.  And to make it a bit more accurate, you
>> add a best-effort delta before returning, which makes sense.
> 
> Not random time in the past. We return {sec, nsec} from the host
> realtime at the moment the user ran the hypercall. 

That is what we want, but {sec, nsec} is not anchored to any running
time, so it is unavoidably late when we return it.
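
To make sure we are talking about the same correction, here is a minimal
userspace sketch of it, assuming the host hands back {sec, nsec, guest_tsc}
taken at one instant.  The names clock_snapshot, wall_time, scale_delta and
snapshot_to_now are made up for illustration; scale_delta follows what I
understand pvclock_scale_delta() to do (shift, then multiply by a 32-bit
fraction), and it relies on the GCC/Clang unsigned __int128 extension.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical snapshot from the host: wall clock plus the guest TSC
 * value sampled at the same instant. */
struct clock_snapshot {
	uint64_t sec;
	uint64_t nsec;
	uint64_t tsc;
};

struct wall_time {
	uint64_t sec;
	uint64_t nsec;
};

/* Convert a TSC delta to nanoseconds with pvclock-style parameters:
 * apply the shift, then multiply by a 32-bit fixed-point fraction.
 * Uses unsigned __int128 (compiler extension) for the wide product. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}

/* Advance the snapshot by the time that passed between the host sampling
 * it (snap->tsc) and the guest reading the TSC again (tsc_now), then
 * normalize so that nsec stays below one second. */
static struct wall_time snapshot_to_now(const struct clock_snapshot *snap,
					uint64_t tsc_now,
					uint32_t mul_frac, int shift)
{
	struct wall_time out;
	uint64_t nsec;

	assert(snap->nsec < NSEC_PER_SEC);
	assert(tsc_now >= snap->tsc);

	nsec = snap->nsec + scale_delta(tsc_now - snap->tsc, mul_frac, shift);
	out.sec = snap->sec + nsec / NSEC_PER_SEC;
	out.nsec = nsec % NSEC_PER_SEC;
	return out;
}
```

With mul_frac = 0x80000000 and shift = 0 (a 2 GHz TSC), a delta of 2000
cycles adds 1000 ns to the snapshot.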

> Since PTP is very accurate, that "a bit more" counts, yes.
> 
>> When we have to depend on pvclock, what are the advantages of not using
>> the existing pvclock API for wall clock?
>> (You mentioned some extensions.)
>> 
>>   struct pvclock_wall_clock {
>>   	u32   version;
>>   	u32   sec;
>>   	u32   nsec;
>>   } __attribute__((__packed__));
>>
>> It gives the wall clock when pvclock was 0, so you just add current
>> kvmclock and get the host wall clock.  
> 
> Well, no. For one, the TSC part of kvmclock: 
> 
> 	kvmclock-read = system_timestamp + convert-to-1GHz(rdtsc() - tsc_timestamp)
> 				           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Drifts relative to UTC. This part can be large.
> The guest's NTP is responsible for fixing
> that drift of the guest's realtime clock (talking about current setup,
> without KVM PTP driver).
> 
> Now, we want very high precision (less than 1us) for this
> driver. Very small TSC drifts on a large delta defeat the purpose.

True.
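
A rough way to put numbers on it, as a sketch: the error is the elapsed
time multiplied by the frequency error.  The helper name drift_error_ns and
the drift figures below are illustrative assumptions, not measurements from
this setup.

```c
#include <stdint.h>

/* Error in nanoseconds accumulated over elapsed_ns when the clock
 * frequency is off by drift_ppb parts per billion.  The wide product
 * uses unsigned __int128 (GCC/Clang extension) to avoid overflow. */
static uint64_t drift_error_ns(uint64_t elapsed_ns, uint64_t drift_ppb)
{
	return (uint64_t)(((unsigned __int128)elapsed_ns * drift_ppb)
			  / 1000000000ULL);
}
```

With an assumed 1 ppm (1000 ppb) frequency error, one hour of elapsed time
gives 3.6 ms of error, far above the 1us target, while a 1 ms delta gives
about 1 ns.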

>> Without a VM exit.
> 
> Performance is not the issue here. Accuracy (how far our "approximation"
> of the host realtime clock is from the actual host realtime clock) is.

Ah, ok.

>> And how often is ptp_kvm_gettime() usually called?
> 
> The PTP_SYS_OFFSET ioctl calls the following code in a loop:
> 
> struct ptp_sys_offset {
>         unsigned int n_samples; /* Desired number of measurements. */
>         unsigned int rsv[3];    /* Reserved for future use. */
>         /*
>          * Array of interleaved system/phc time stamps. The kernel
>          * will provide 2*n_samples + 1 time stamps, with the last
>          * one as a system time stamp.
>          */
>         struct ptp_clock_time ts[2 * PTP_MAX_SAMPLES + 1];
> };
> 
> #define PTP_MAX_SAMPLES 25 /* Maximum allowed offset measurement samples. */
> 
>         case PTP_SYS_OFFSET:
>                 sysoff = memdup_user((void __user *)arg, sizeof(*sysoff));
>                 if (IS_ERR(sysoff)) {
>                         err = PTR_ERR(sysoff);
>                         sysoff = NULL;
>                         break;
>                 }
>                 if (sysoff->n_samples > PTP_MAX_SAMPLES) {
>                         err = -EINVAL;
>                         break;
>                 }
>                 pct = &sysoff->ts[0];
>                 for (i = 0; i < sysoff->n_samples; i++) {
>                         getnstimeofday64(&ts);
>                         pct->sec = ts.tv_sec;
>                         pct->nsec = ts.tv_nsec;
>                         pct++;
>                         ptp->info->gettime64(ptp->info, &ts);
>                         pct->sec = ts.tv_sec;
>                         pct->nsec = ts.tv_nsec;
>                         pct++;

Hm, this loop alternates between system time and ptp time and I'd guess
that userspace then manipulates deltas of these two readings and applies
the result to system time.
I'm not even sure that pvclock delta improves the situation anymore --
it adds an offset to all gettime64() reads, but that can be computed
from multiple datapoints.  (And the delta also adds uncertainty to
results.)
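
A sketch of what I imagine userspace could do with the interleaved samples,
purely as an assumption on my side (I have not checked what Chrony actually
does): treat each PHC reading as taken at the midpoint of the two system
readings around it, and keep the sample with the narrowest bracket.  The
names offset_estimate and estimate_offset are hypothetical, and timestamps
are flattened to nanoseconds for brevity.

```c
#include <stddef.h>
#include <stdint.h>

struct offset_estimate {
	int64_t offset_ns; /* PHC time minus system time */
	int64_t delay_ns;  /* width of the bracketing system-time interval */
};

/* ts[] holds 2*n + 1 interleaved readings in nanoseconds:
 * sys, phc, sys, phc, ..., sys.
 * Returns the estimate from the sample with the smallest bracket.
 * With n == 0 the result keeps delay_ns == INT64_MAX. */
static struct offset_estimate estimate_offset(const int64_t *ts, size_t n)
{
	struct offset_estimate best = { 0, INT64_MAX };

	for (size_t i = 0; i < n; i++) {
		int64_t sys_before = ts[2 * i];
		int64_t phc = ts[2 * i + 1];
		int64_t sys_after = ts[2 * i + 2];
		int64_t delay = sys_after - sys_before;

		if (delay < best.delay_ns) {
			best.delay_ns = delay;
			/* Assume the PHC was read halfway through. */
			best.offset_ns = phc - (sys_before + delay / 2);
		}
	}
	return best;
}
```

Under that midpoint assumption, a constant latency in gettime64() is
already accounted for by the bracket, which is why I doubt the extra
pvclock delta buys much.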

>                 }
>                 getnstimeofday64(&ts);
>                 pct->sec = ts.tv_sec;
>                 pct->nsec = ts.tv_nsec;
> 
> How often that ioctl is called depends on the parameters of the Chrony
> PHC code. Initially (to determine the clock difference) Chrony should call it
> more frequently; later on it should call it less frequently.
> 
> Perhaps once every second initially (the ioctl). I'll confirm with the
> exact value for my setup and reply to this email.

Thanks.
