From: Marcelo Tosatti <mtosatti@redhat.com>
To: Radim Krcmar <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Cochran <richardcochran@gmail.com>,
Miroslav Lichvar <mlichvar@redhat.com>
Subject: Re: [patch 3/3] PTP: add kvm PTP driver
Date: Mon, 16 Jan 2017 15:04:18 -0200
Message-ID: <20170116170415.GA2501@amt.cnet>
In-Reply-To: <20170116162653.GA32097@potion>
On Mon, Jan 16, 2017 at 05:26:53PM +0100, Radim Krcmar wrote:
> 2017-01-13 15:40-0200, Marcelo Tosatti:
> > On Fri, Jan 13, 2017 at 04:56:58PM +0100, Radim Krcmar wrote:
> > > 2017-01-13 10:01-0200, Marcelo Tosatti:
> >> > + version = pvclock_read_begin(src);
> >> > +
> >> > + ret = kvm_hypercall2(KVM_HC_CLOCK_OFFSET,
> >> > + clock_off_gpa,
> >> > + KVM_CLOCK_OFFSET_WALLCLOCK);
> >> > + if (ret != 0) {
> >> > + pr_err("clock offset hypercall ret %lu\n", ret);
> >> > + spin_unlock(&kvm_ptp_lock);
> >> > + preempt_enable_notrace();
> >> > + return -EOPNOTSUPP;
> >> > + }
> >> > +
> >> > + tspec.tv_sec = clock_off.sec;
> >> > + tspec.tv_nsec = clock_off.nsec;
> >> > +
> >> > + delta = rdtsc_ordered() - clock_off.tsc;
> >> > +
> >> > + offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
> >> > + src->tsc_shift);
> >> > +
> >> > + } while (pvclock_read_retry(src, version));
> >> > +
> >> > + preempt_enable_notrace();
> >> > +
> >> > + tspec.tv_nsec = tspec.tv_nsec + offset;
> >> > +
> >> > + spin_unlock(&kvm_ptp_lock);
> >> > +
> >> > + if (tspec.tv_nsec >= NSEC_PER_SEC) {
> >> > + u64 secs = tspec.tv_nsec;
> >> > +
> >> > + tspec.tv_nsec = do_div(secs, NSEC_PER_SEC);
> >> > + tspec.tv_sec += secs;
> >> > + }
> >> > +
> >> > + memcpy(ts, &tspec, sizeof(struct timespec64));
> >>
> >> But the whole idea is of improving the time by reading tsc a bit later
> >> is just weird ... why is it better to provide
> >>
> >> tsc + x, time + tsc_delta_to_time(x)
> >>
> >> than just
> >>
> >> tsc, time
> >>
> >> ?
> >
> > Because you want to calculate the value of the host realtime clock
> > at the moment of ptp_kvm_gettime.
> >
> > We do:
> >
> > 1. kvm_hypercall.
> > 2. get {sec, nsec, guest_tsc}.
> > 3. kvm_hypercall returns.
> > 4. delay = rdtsc() - guest_tsc.
> >
> > Where delay is the delta (measured with the TSC) between points 2 and 4.
>
> I see now ... the PTP interface is just not good for our purposes.
> We don't return {sec, nsec, guest_tsc}, we just return {sec, nsec} at
> some random time in the past. And to make it a bit more accurate, you
> add a best-effort delta before returning, which makes sense.
Not a random time in the past. We return {sec, nsec} from the host
realtime clock at the moment the guest ran the hypercall, and the delta
accounts for the time that has passed since then.
Since PTP is very accurate, that "a bit more" counts, yes.
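
To make the arithmetic concrete, here is a minimal userspace sketch of the
extrapolation, not the driver code itself. The names struct host_sample,
scale_tsc_delta() and extrapolate_host_time() are hypothetical stand-ins;
the scaling mirrors the shift-then-multiply fixed-point form that
pvclock_scale_delta() uses, under the assumption that mul_frac and shift
come from the pvclock structure.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical: the values the hypercall filled in at step 2. */
struct host_sample {
    uint64_t sec;   /* host realtime seconds */
    uint64_t nsec;  /* host realtime nanoseconds */
    uint64_t tsc;   /* guest TSC read at the same moment */
};

/* Convert a TSC delta to nanoseconds: ((delta << shift) * mul_frac) >> 32.
 * The 64x32 multiply is split in two halves to avoid a 128-bit type. */
static uint64_t scale_tsc_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
    assert(shift > -64 && shift < 64);
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    return (delta >> 32) * mul_frac +
           (((delta & 0xffffffffULL) * mul_frac) >> 32);
}

/* Step 4: add the time elapsed since the sample, then normalize nsec. */
static void extrapolate_host_time(const struct host_sample *s,
                                  uint64_t tsc_now,
                                  uint32_t mul_frac, int shift,
                                  uint64_t *sec, uint64_t *nsec)
{
    uint64_t delay_ns = scale_tsc_delta(tsc_now - s->tsc, mul_frac, shift);
    uint64_t total = s->nsec + delay_ns;

    *sec = s->sec + total / NSEC_PER_SEC;
    *nsec = total % NSEC_PER_SEC;
}
```

With mul_frac = 0x80000000 and shift = 1 (one cycle per nanosecond, i.e. an
assumed 1 GHz TSC), a sample taken 3000 cycles ago is moved forward by
3000 ns, carrying into the seconds field when needed.
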
> When we have to depend on pvclock, what are the advantages of not using
> the existing pvclock API for wall clock?
> (You mentioned some extensions.)
>
> struct pvclock_wall_clock {
> u32 version;
> u32 sec;
> u32 nsec;
> } __attribute__((__packed__));
> It gives the wall clock when pvclock was 0, so you just add current
> kvmclock and get the host wall clock.
Well, no. For one, the TSC part of kvmclock:

    kvmclock-read = system_timestamp + convert-to-1GHz(rdtsc() - tsc_timestamp)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

drifts relative to UTC. This part can be large.
The guest's NTP is responsible for correcting that drift of the guest's
realtime clock (talking about the current setup, without the KVM PTP
driver).
Now, we want very high precision (less than 1us) for this
driver. Very small TSC drifts on a large delta defeat the purpose.
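
A rough worked example of why the size of the delta matters. The helper
drift_error_ns() is hypothetical, and the 10 ppm frequency error used below
is an illustrative assumption, not a measured value: the accumulated error
is simply the elapsed time multiplied by the relative frequency error.

```c
#include <assert.h>

/* Error accumulated by a clock whose frequency is off by
 * freq_error_ppm parts per million, after elapsed_ns nanoseconds. */
static double drift_error_ns(double elapsed_ns, double freq_error_ppm)
{
    assert(elapsed_ns >= 0.0);
    return elapsed_ns * freq_error_ppm / 1e6;
}
```

Under that assumption, a delta of one second since tsc_timestamp gives
drift_error_ns(1e9, 10.0), about 10000 ns, which is ten times the 1us
target. A delta of roughly one microsecond (the time between the hypercall
sample and the following rdtsc) gives drift_error_ns(1e3, 10.0), about
0.01 ns, which is negligible.
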
> Without a VM exit.
Performance is not the main concern here. Accuracy (how far our
"approximation" of the host realtime clock is from the actual host
realtime clock) is.
> And how often is ptp_kvm_gettime() usually called?
The PTP_SYS_OFFSET ioctl calls the following code in a loop:

struct ptp_sys_offset {
	unsigned int n_samples; /* Desired number of measurements. */
	unsigned int rsv[3];    /* Reserved for future use. */
	/*
	 * Array of interleaved system/phc time stamps. The kernel
	 * will provide 2*n_samples + 1 time stamps, with the last
	 * one as a system time stamp.
	 */
	struct ptp_clock_time ts[2 * PTP_MAX_SAMPLES + 1];
};

#define PTP_MAX_SAMPLES 25 /* Maximum allowed offset measurement samples. */

	case PTP_SYS_OFFSET:
		sysoff = memdup_user((void __user *)arg,
				     sizeof(*sysoff));
		if (IS_ERR(sysoff)) {
			err = PTR_ERR(sysoff);
			sysoff = NULL;
			break;
		}
		if (sysoff->n_samples > PTP_MAX_SAMPLES) {
			err = -EINVAL;
			break;
		}
		pct = &sysoff->ts[0];
		for (i = 0; i < sysoff->n_samples; i++) {
			getnstimeofday64(&ts);
			pct->sec = ts.tv_sec;
			pct->nsec = ts.tv_nsec;
			pct++;
			ptp->info->gettime64(ptp->info, &ts);
			pct->sec = ts.tv_sec;
			pct->nsec = ts.tv_nsec;
			pct++;
		}
		getnstimeofday64(&ts);
		pct->sec = ts.tv_sec;
		pct->nsec = ts.tv_nsec;

How often that ioctl is called depends on the parameters of the Chrony
PHC code. Initially (to determine the clock difference) Chrony should call
it more frequently; later on it should call it less frequently.
Perhaps once every second initially (the ioctl). I'll confirm the
exact value for my setup and reply to this email.
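
For reference, a sketch of one common way userspace can reduce the
interleaved time stamps returned by PTP_SYS_OFFSET to a single offset. This
is an assumption about typical consumers, not Chrony's exact code: pick the
PHC reading bracketed by the two closest system readings and compare it with
the midpoint of that window. The function name best_phc_offset_ns(), the
flat nanosecond array, and the sign convention (PHC minus system) are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ts_ns holds 2*n_samples + 1 values in the ioctl's order:
 * sys, phc, sys, phc, ..., sys (all converted to nanoseconds).
 * Returns phc - system for the sample with the tightest window. */
static int64_t best_phc_offset_ns(const int64_t *ts_ns, size_t n_samples,
                                  int64_t *best_window_ns)
{
    int64_t best_offset = 0;
    int64_t best_window = INT64_MAX;

    assert(n_samples > 0);
    for (size_t i = 0; i < n_samples; i++) {
        int64_t sys_before = ts_ns[2 * i];
        int64_t phc = ts_ns[2 * i + 1];
        int64_t sys_after = ts_ns[2 * i + 2];
        int64_t window = sys_after - sys_before;

        /* A shorter window bounds the PHC read time more tightly. */
        if (window < best_window) {
            best_window = window;
            best_offset = phc - (sys_before + window / 2);
        }
    }
    if (best_window_ns)
        *best_window_ns = best_window;
    return best_offset;
}
```

The window length is a bound on how uncertain the comparison is, which is
why a gettime64 implementation that takes a long or variable time (for
example because of a VM exit) directly limits the accuracy a caller can get
out of it.
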
>
> Thanks.
>
> >> Because we'll always be querying the time at tsc + y, where y >> x, and
> >> we'd likely have other problems if shifting the time base by few
> >> thousand cycles made a difference.
> >
> > Radim, i didnt get your "tsc + x", "time + tsc_delta_to_time(x)"
> > formulas above. Can you be more verbose please?
>
> x is the delta, tsc_delta_to_time() is what pvclock_scale_delta() does.
>
> I assumed that we set precise time with TSC, so the delta wouldn't
> matter, because PTP would either get {sec, nsec, guest_tsc}, or the
> same, but just shifted by delta, hence
> {sec + tsc_delta_to_time(x) / NSEC_PER_SEC,
> nsec + tsc_delta_to_time(x) % NSEC_PER_SEC,
> guest_tsc + x}.
Ah, OK. I take it you now understand the meaning of the "tsc"
part of the {sec, nsec, guest_tsc} triple.