* [PATCH 0/4] kvmclock: improve accuracy
@ 2016-02-08 15:18 Paolo Bonzini
2016-02-08 15:18 ` [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz Paolo Bonzini
` (4 more replies)
0 siblings, 5 replies; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-08 15:18 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: mtosatti
Currently kvmclock is obtaining the multiplier and shift value from
the TSC kHz. These however are less accurate than the host kernel's
clock, which includes corrections made through NTP.
These patches change kvmclock to tick at the same frequency as the
host kernel's clocksource (as obtained through the pvclock_gtod
notifier). This is precise enough that the Hyper-V clock can be
implemented on top of it.
Paolo Bonzini (4):
KVM: x86: rename argument to kvm_set_tsc_khz
KVM: x86: rewrite handling of scaled TSC for kvmclock
KVM: x86: pass kvm_get_time_scale arguments in hertz
KVM: x86: track actual TSC frequency from the timekeeper struct
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/x86.c | 73 +++++++++++++++++++++++++----------------
2 files changed, 47 insertions(+), 29 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
@ 2016-02-08 15:18 ` Paolo Bonzini
2016-02-11 15:01 ` Marcelo Tosatti
2016-02-08 15:18 ` [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock Paolo Bonzini
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-08 15:18 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: mtosatti
This refers to the desired (scaled) frequency, which is called
user_tsc_khz in the rest of the file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/kvm/x86.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index aafbcf9f9776..1ca2b44f70bb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1290,23 +1290,23 @@ static int set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
return 0;
}
-static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 this_tsc_khz)
+static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
{
u32 thresh_lo, thresh_hi;
int use_scaling = 0;
/* tsc_khz can be zero if TSC calibration fails */
- if (this_tsc_khz == 0) {
+ if (user_tsc_khz == 0) {
/* set tsc_scaling_ratio to a safe value */
vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
return -1;
}
/* Compute a scale to convert nanoseconds in TSC cycles */
- kvm_get_time_scale(this_tsc_khz, NSEC_PER_SEC / 1000,
+ kvm_get_time_scale(user_tsc_khz, NSEC_PER_SEC / 1000,
&vcpu->arch.virtual_tsc_shift,
&vcpu->arch.virtual_tsc_mult);
- vcpu->arch.virtual_tsc_khz = this_tsc_khz;
+ vcpu->arch.virtual_tsc_khz = user_tsc_khz;
/*
* Compute the variation in TSC rate which is acceptable
@@ -1316,11 +1316,11 @@ static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 this_tsc_khz)
*/
thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
- if (this_tsc_khz < thresh_lo || this_tsc_khz > thresh_hi) {
- pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", this_tsc_khz, thresh_lo, thresh_hi);
+ if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
+ pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
use_scaling = 1;
}
- return set_tsc_khz(vcpu, this_tsc_khz, use_scaling);
+ return set_tsc_khz(vcpu, user_tsc_khz, use_scaling);
}
static u64 compute_guest_tsc(struct kvm_vcpu *vcpu, s64 kernel_ns)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
2016-02-08 15:18 ` [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz Paolo Bonzini
@ 2016-02-08 15:18 ` Paolo Bonzini
2016-02-11 15:23 ` Marcelo Tosatti
2016-02-08 15:18 ` [PATCH 3/4] KVM: x86: pass kvm_get_time_scale arguments in hertz Paolo Bonzini
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-08 15:18 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: mtosatti
This is the same as before:
kvm_scale_tsc(tgt_tsc_khz)
= tgt_tsc_khz * ratio
= tgt_tsc_khz * user_tsc_khz / tsc_khz (see set_tsc_khz)
= user_tsc_khz (see kvm_guest_time_update)
= vcpu->arch.virtual_tsc_khz (see kvm_set_tsc_khz)
However, computing it through kvm_scale_tsc makes it possible
to include the NTP correction in tgt_tsc_khz, as done in the
next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/kvm/x86.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1ca2b44f70bb..6db3c219795b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1713,7 +1713,7 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
static int kvm_guest_time_update(struct kvm_vcpu *v)
{
- unsigned long flags, this_tsc_khz, tgt_tsc_khz;
+ unsigned long flags, tgt_tsc_khz;
struct kvm_vcpu_arch *vcpu = &v->arch;
struct kvm_arch *ka = &v->kvm->arch;
s64 kernel_ns;
@@ -1739,8 +1739,8 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
/* Keep irq disabled to prevent changes to the clock */
local_irq_save(flags);
- this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
- if (unlikely(this_tsc_khz == 0)) {
+ tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz);
+ if (unlikely(tgt_tsc_khz == 0)) {
local_irq_restore(flags);
kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
return 1;
@@ -1775,13 +1775,14 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
if (!vcpu->pv_time_enabled)
return 0;
- if (unlikely(vcpu->hw_tsc_khz != this_tsc_khz)) {
- tgt_tsc_khz = kvm_has_tsc_control ?
- vcpu->virtual_tsc_khz : this_tsc_khz;
+ if (kvm_has_tsc_control)
+ tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz);
+
+ if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
kvm_get_time_scale(NSEC_PER_SEC / 1000, tgt_tsc_khz,
&vcpu->hv_clock.tsc_shift,
&vcpu->hv_clock.tsc_to_system_mul);
- vcpu->hw_tsc_khz = this_tsc_khz;
+ vcpu->hw_tsc_khz = tgt_tsc_khz;
}
/* With all the info we got, fill in the values */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/4] KVM: x86: pass kvm_get_time_scale arguments in hertz
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
2016-02-08 15:18 ` [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz Paolo Bonzini
2016-02-08 15:18 ` [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock Paolo Bonzini
@ 2016-02-08 15:18 ` Paolo Bonzini
2016-02-08 15:18 ` [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct Paolo Bonzini
2016-02-16 14:00 ` [PATCH 0/4] kvmclock: improve accuracy Marcelo Tosatti
4 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-08 15:18 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: mtosatti
Prepare for improving the precision in the next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/kvm/x86.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6db3c219795b..e0bf8f10dc22 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1203,7 +1203,7 @@ static uint32_t div_frac(uint32_t dividend, uint32_t divisor)
return dividend;
}
-static void kvm_get_time_scale(uint32_t scaled_khz, uint32_t base_khz,
+static void kvm_get_time_scale(uint64_t scaled_hz, uint64_t base_hz,
s8 *pshift, u32 *pmultiplier)
{
uint64_t scaled64;
@@ -1211,8 +1211,8 @@ static void kvm_get_time_scale(uint32_t scaled_khz, uint32_t base_khz,
uint64_t tps64;
uint32_t tps32;
- tps64 = base_khz * 1000LL;
- scaled64 = scaled_khz * 1000LL;
+ tps64 = base_hz;
+ scaled64 = scaled_hz;
while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
tps64 >>= 1;
shift--;
@@ -1230,8 +1230,8 @@ static void kvm_get_time_scale(uint32_t scaled_khz, uint32_t base_khz,
*pshift = shift;
*pmultiplier = div_frac(scaled64, tps32);
- pr_debug("%s: base_khz %u => %u, shift %d, mul %u\n",
- __func__, base_khz, scaled_khz, shift, *pmultiplier);
+ pr_debug("%s: base_hz %llu => %llu, shift %d, mul %u\n",
+ __func__, base_hz, scaled_hz, shift, *pmultiplier);
}
#ifdef CONFIG_X86_64
@@ -1303,7 +1303,7 @@ static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
}
/* Compute a scale to convert nanoseconds in TSC cycles */
- kvm_get_time_scale(user_tsc_khz, NSEC_PER_SEC / 1000,
+ kvm_get_time_scale(user_tsc_khz * 1000LL, NSEC_PER_SEC,
&vcpu->arch.virtual_tsc_shift,
&vcpu->arch.virtual_tsc_mult);
vcpu->arch.virtual_tsc_khz = user_tsc_khz;
@@ -1779,7 +1779,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz);
if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
- kvm_get_time_scale(NSEC_PER_SEC / 1000, tgt_tsc_khz,
+ kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_khz * 1000LL,
&vcpu->hv_clock.tsc_shift,
&vcpu->hv_clock.tsc_to_system_mul);
vcpu->hw_tsc_khz = tgt_tsc_khz;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
` (2 preceding siblings ...)
2016-02-08 15:18 ` [PATCH 3/4] KVM: x86: pass kvm_get_time_scale arguments in hertz Paolo Bonzini
@ 2016-02-08 15:18 ` Paolo Bonzini
2016-02-09 18:41 ` Owen Hofmann
2016-02-16 13:48 ` Marcelo Tosatti
2016-02-16 14:00 ` [PATCH 0/4] kvmclock: improve accuracy Marcelo Tosatti
4 siblings, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-08 15:18 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: mtosatti
When an NTP server is running, it may adjust the time substantially
compared to the "official" frequency of the TSC. A 12 ppm change
sums up to one second per day.
This already shows up if the guest compares kvmclock with e.g. the
PM timer. It shows up even more once we add support for the Hyper-V
TSC page, because the guest expects it to be in sync with the time
reference counter; effectively the time reference counter is just a
slow path to access the same clock that is in the TSC page.
Therefore, we want kvmclock to provide the host kernel's
ktime_get_boot_ns() value, at least if the master clock is active.
To do so, reverse-compute the host's "actual" TSC frequency from
pvclock_gtod_data and return it from kvm_get_time_and_clockread.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/x86.c | 46 +++++++++++++++++++++++++++--------------
2 files changed, 33 insertions(+), 16 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7b5459982433..7dd6d55b4868 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -538,7 +538,7 @@ struct kvm_vcpu_arch {
gpa_t time;
struct pvclock_vcpu_time_info hv_clock;
- unsigned int hw_tsc_khz;
+ u64 hv_clock_tsc_hz;
struct gfn_to_hva_cache pv_time;
bool pv_time_enabled;
/* set guest stopped flag in pvclock flags field */
@@ -731,6 +731,7 @@ struct kvm_arch {
spinlock_t pvclock_gtod_sync_lock;
bool use_master_clock;
+ u64 master_tsc_hz;
u64 master_kernel_ns;
cycle_t master_cycle_now;
struct delayed_work kvmclock_update_work;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e0bf8f10dc22..c99101d4cc5e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1580,7 +1580,16 @@ static inline u64 vgettsc(cycle_t *cycle_now)
return v * gtod->clock.mult;
}
-static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
+static u64 pvclock_gtod_tsc_hz(void)
+{
+ struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
+
+ u64 gtod_tsc_hz = 1000000000ULL << gtod->clock.shift;
+ do_div(gtod_tsc_hz, gtod->clock.mult);
+ return gtod_tsc_hz;
+}
+
+static int do_monotonic_boot(s64 *t, cycle_t *cycle_now, u64 *tsc_hz)
{
struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
unsigned long seq;
@@ -1589,6 +1598,7 @@ static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
do {
seq = read_seqcount_begin(>od->seq);
+ *tsc_hz = pvclock_gtod_tsc_hz();
mode = gtod->clock.vclock_mode;
ns = gtod->nsec_base;
ns += vgettsc(cycle_now);
@@ -1601,13 +1611,14 @@ static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
}
/* returns true if host is using tsc clocksource */
-static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
+static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now,
+ u64 *tsc_hz)
{
/* checked again under seqlock below */
if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
return false;
- return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
+ return do_monotonic_boot(kernel_ns, cycle_now, tsc_hz) == VCLOCK_TSC;
}
#endif
@@ -1668,7 +1679,8 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
*/
host_tsc_clocksource = kvm_get_time_and_clockread(
&ka->master_kernel_ns,
- &ka->master_cycle_now);
+ &ka->master_cycle_now,
+ &ka->master_tsc_hz);
ka->use_master_clock = host_tsc_clocksource && vcpus_matched
&& !backwards_tsc_observed
@@ -1713,7 +1725,8 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
static int kvm_guest_time_update(struct kvm_vcpu *v)
{
- unsigned long flags, tgt_tsc_khz;
+ unsigned long flags;
+ u64 tgt_tsc_hz;
struct kvm_vcpu_arch *vcpu = &v->arch;
struct kvm_arch *ka = &v->kvm->arch;
s64 kernel_ns;
@@ -1724,6 +1737,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
kernel_ns = 0;
host_tsc = 0;
+ tgt_tsc_hz = 0;
/*
* If the host uses TSC clock, then passthrough TSC as stable
@@ -1734,20 +1748,22 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
if (use_master_clock) {
host_tsc = ka->master_cycle_now;
kernel_ns = ka->master_kernel_ns;
+ tgt_tsc_hz = ka->master_tsc_hz;
}
spin_unlock(&ka->pvclock_gtod_sync_lock);
/* Keep irq disabled to prevent changes to the clock */
local_irq_save(flags);
- tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz);
- if (unlikely(tgt_tsc_khz == 0)) {
- local_irq_restore(flags);
- kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
- return 1;
- }
if (!use_master_clock) {
host_tsc = rdtsc();
kernel_ns = get_kernel_ns();
+ tgt_tsc_hz = __this_cpu_read(cpu_tsc_khz) * 1000;
+ }
+
+ if (unlikely(tgt_tsc_hz == 0)) {
+ local_irq_restore(flags);
+ kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
+ return 1;
}
tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
@@ -1776,13 +1792,13 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
return 0;
if (kvm_has_tsc_control)
- tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz);
+ tgt_tsc_hz = kvm_scale_tsc(v, tgt_tsc_hz);
- if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
- kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_khz * 1000LL,
+ if (unlikely(vcpu->hv_clock_tsc_hz != tgt_tsc_hz)) {
+ kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_hz,
&vcpu->hv_clock.tsc_shift,
&vcpu->hv_clock.tsc_to_system_mul);
- vcpu->hw_tsc_khz = tgt_tsc_khz;
+ vcpu->hv_clock_tsc_hz = tgt_tsc_hz;
}
/* With all the info we got, fill in the values */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-08 15:18 ` [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct Paolo Bonzini
@ 2016-02-09 18:41 ` Owen Hofmann
2016-02-10 13:57 ` Paolo Bonzini
2016-02-16 13:48 ` Marcelo Tosatti
1 sibling, 1 reply; 15+ messages in thread
From: Owen Hofmann @ 2016-02-09 18:41 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: LKML, KVM General, Marcelo Tosatti
Hi,
Should this patch change the condition in pvclock_gtod_notify?
Currently it looks like we'll only request a masterclock update when
tsc is no longer a good clocksource.
On Mon, Feb 8, 2016 at 7:18 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> When an NTP server is running, it may adjust the time substantially
> compared to the "official" frequency of the TSC. A 12 ppm change
> sums up to one second per day.
>
> This already shows up if the guest compares kvmclock with e.g. the
> PM timer. It shows up even more once we add support for the Hyper-V
> TSC page, because the guest expects it to be in sync with the time
> reference counter; effectively the time reference counter is just a
> slow path to access the same clock that is in the TSC page.
>
> Therefore, we want kvmclock to provide the host kernel's
> ktime_get_boot_ns() value, at least if the master clock is active.
> To do so, reverse-compute the host's "actual" TSC frequency from
> pvclock_gtod_data and return it from kvm_get_time_and_clockread.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/include/asm/kvm_host.h | 3 ++-
> arch/x86/kvm/x86.c | 46 +++++++++++++++++++++++++++--------------
> 2 files changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 7b5459982433..7dd6d55b4868 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -538,7 +538,7 @@ struct kvm_vcpu_arch {
>
> gpa_t time;
> struct pvclock_vcpu_time_info hv_clock;
> - unsigned int hw_tsc_khz;
> + u64 hv_clock_tsc_hz;
> struct gfn_to_hva_cache pv_time;
> bool pv_time_enabled;
> /* set guest stopped flag in pvclock flags field */
> @@ -731,6 +731,7 @@ struct kvm_arch {
>
> spinlock_t pvclock_gtod_sync_lock;
> bool use_master_clock;
> + u64 master_tsc_hz;
> u64 master_kernel_ns;
> cycle_t master_cycle_now;
> struct delayed_work kvmclock_update_work;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e0bf8f10dc22..c99101d4cc5e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1580,7 +1580,16 @@ static inline u64 vgettsc(cycle_t *cycle_now)
> return v * gtod->clock.mult;
> }
>
> -static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
> +static u64 pvclock_gtod_tsc_hz(void)
> +{
> + struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
> +
> + u64 gtod_tsc_hz = 1000000000ULL << gtod->clock.shift;
> + do_div(gtod_tsc_hz, gtod->clock.mult);
> + return gtod_tsc_hz;
> +}
> +
> +static int do_monotonic_boot(s64 *t, cycle_t *cycle_now, u64 *tsc_hz)
> {
> struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
> unsigned long seq;
> @@ -1589,6 +1598,7 @@ static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
>
> do {
> seq = read_seqcount_begin(>od->seq);
> + *tsc_hz = pvclock_gtod_tsc_hz();
> mode = gtod->clock.vclock_mode;
> ns = gtod->nsec_base;
> ns += vgettsc(cycle_now);
> @@ -1601,13 +1611,14 @@ static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
> }
>
> /* returns true if host is using tsc clocksource */
> -static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
> +static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now,
> + u64 *tsc_hz)
> {
> /* checked again under seqlock below */
> if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
> return false;
>
> - return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
> + return do_monotonic_boot(kernel_ns, cycle_now, tsc_hz) == VCLOCK_TSC;
> }
> #endif
>
> @@ -1668,7 +1679,8 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
> */
> host_tsc_clocksource = kvm_get_time_and_clockread(
> &ka->master_kernel_ns,
> - &ka->master_cycle_now);
> + &ka->master_cycle_now,
> + &ka->master_tsc_hz);
>
> ka->use_master_clock = host_tsc_clocksource && vcpus_matched
> && !backwards_tsc_observed
> @@ -1713,7 +1725,8 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>
> static int kvm_guest_time_update(struct kvm_vcpu *v)
> {
> - unsigned long flags, tgt_tsc_khz;
> + unsigned long flags;
> + u64 tgt_tsc_hz;
> struct kvm_vcpu_arch *vcpu = &v->arch;
> struct kvm_arch *ka = &v->kvm->arch;
> s64 kernel_ns;
> @@ -1724,6 +1737,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
>
> kernel_ns = 0;
> host_tsc = 0;
> + tgt_tsc_hz = 0;
>
> /*
> * If the host uses TSC clock, then passthrough TSC as stable
> @@ -1734,20 +1748,22 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
> if (use_master_clock) {
> host_tsc = ka->master_cycle_now;
> kernel_ns = ka->master_kernel_ns;
> + tgt_tsc_hz = ka->master_tsc_hz;
> }
> spin_unlock(&ka->pvclock_gtod_sync_lock);
>
> /* Keep irq disabled to prevent changes to the clock */
> local_irq_save(flags);
> - tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz);
> - if (unlikely(tgt_tsc_khz == 0)) {
> - local_irq_restore(flags);
> - kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
> - return 1;
> - }
> if (!use_master_clock) {
> host_tsc = rdtsc();
> kernel_ns = get_kernel_ns();
> + tgt_tsc_hz = __this_cpu_read(cpu_tsc_khz) * 1000;
> + }
> +
> + if (unlikely(tgt_tsc_hz == 0)) {
> + local_irq_restore(flags);
> + kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
> + return 1;
> }
>
> tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
> @@ -1776,13 +1792,13 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
> return 0;
>
> if (kvm_has_tsc_control)
> - tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz);
> + tgt_tsc_hz = kvm_scale_tsc(v, tgt_tsc_hz);
>
> - if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
> - kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_khz * 1000LL,
> + if (unlikely(vcpu->hv_clock_tsc_hz != tgt_tsc_hz)) {
> + kvm_get_time_scale(NSEC_PER_SEC, tgt_tsc_hz,
> &vcpu->hv_clock.tsc_shift,
> &vcpu->hv_clock.tsc_to_system_mul);
> - vcpu->hw_tsc_khz = tgt_tsc_khz;
> + vcpu->hv_clock_tsc_hz = tgt_tsc_hz;
> }
>
> /* With all the info we got, fill in the values */
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-09 18:41 ` Owen Hofmann
@ 2016-02-10 13:57 ` Paolo Bonzini
0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-10 13:57 UTC (permalink / raw)
To: Owen Hofmann; +Cc: LKML, KVM General, Marcelo Tosatti
On 09/02/2016 19:41, Owen Hofmann wrote:
> Hi,
> Should this patch change the condition in pvclock_gtod_notify?
> Currently it looks like we'll only request a masterclock update when
> tsc is no longer a good clocksource.
Yes, you're right.
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz
2016-02-08 15:18 ` [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz Paolo Bonzini
@ 2016-02-11 15:01 ` Marcelo Tosatti
0 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2016-02-11 15:01 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm
On Mon, Feb 08, 2016 at 04:18:28PM +0100, Paolo Bonzini wrote:
> This refers to the desired (scaled) frequency, which is called
> user_tsc_khz in the rest of the file.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/kvm/x86.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index aafbcf9f9776..1ca2b44f70bb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1290,23 +1290,23 @@ static int set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
> return 0;
> }
>
> -static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 this_tsc_khz)
> +static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
> {
> u32 thresh_lo, thresh_hi;
> int use_scaling = 0;
>
> /* tsc_khz can be zero if TSC calibration fails */
> - if (this_tsc_khz == 0) {
> + if (user_tsc_khz == 0) {
> /* set tsc_scaling_ratio to a safe value */
> vcpu->arch.tsc_scaling_ratio = kvm_default_tsc_scaling_ratio;
> return -1;
> }
>
> /* Compute a scale to convert nanoseconds in TSC cycles */
> - kvm_get_time_scale(this_tsc_khz, NSEC_PER_SEC / 1000,
> + kvm_get_time_scale(user_tsc_khz, NSEC_PER_SEC / 1000,
> &vcpu->arch.virtual_tsc_shift,
> &vcpu->arch.virtual_tsc_mult);
> - vcpu->arch.virtual_tsc_khz = this_tsc_khz;
> + vcpu->arch.virtual_tsc_khz = user_tsc_khz;
>
> /*
> * Compute the variation in TSC rate which is acceptable
> @@ -1316,11 +1316,11 @@ static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 this_tsc_khz)
> */
> thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
> thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
> - if (this_tsc_khz < thresh_lo || this_tsc_khz > thresh_hi) {
> - pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", this_tsc_khz, thresh_lo, thresh_hi);
> + if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
> + pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
> use_scaling = 1;
> }
> - return set_tsc_khz(vcpu, this_tsc_khz, use_scaling);
> + return set_tsc_khz(vcpu, user_tsc_khz, use_scaling);
> }
>
> static u64 compute_guest_tsc(struct kvm_vcpu *vcpu, s64 kernel_ns)
> --
> 1.8.3.1
>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock
2016-02-08 15:18 ` [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock Paolo Bonzini
@ 2016-02-11 15:23 ` Marcelo Tosatti
0 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2016-02-11 15:23 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm
On Mon, Feb 08, 2016 at 04:18:29PM +0100, Paolo Bonzini wrote:
> This is the same as before:
>
> kvm_scale_tsc(tgt_tsc_khz)
> = tgt_tsc_khz * ratio
> = tgt_tsc_khz * user_tsc_khz / tsc_khz (see set_tsc_khz)
> = user_tsc_khz (see kvm_guest_time_update)
> = vcpu->arch.virtual_tsc_khz (see kvm_set_tsc_khz)
>
> However, computing it through kvm_scale_tsc makes it possible
> to include the NTP correction in tgt_tsc_khz, as done in the
> next patch.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/kvm/x86.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1ca2b44f70bb..6db3c219795b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1713,7 +1713,7 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>
> static int kvm_guest_time_update(struct kvm_vcpu *v)
> {
> - unsigned long flags, this_tsc_khz, tgt_tsc_khz;
> + unsigned long flags, tgt_tsc_khz;
> struct kvm_vcpu_arch *vcpu = &v->arch;
> struct kvm_arch *ka = &v->kvm->arch;
> s64 kernel_ns;
> @@ -1739,8 +1739,8 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
>
> /* Keep irq disabled to prevent changes to the clock */
> local_irq_save(flags);
> - this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
> - if (unlikely(this_tsc_khz == 0)) {
> + tgt_tsc_khz = __this_cpu_read(cpu_tsc_khz);
> + if (unlikely(tgt_tsc_khz == 0)) {
> local_irq_restore(flags);
> kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
> return 1;
> @@ -1775,13 +1775,14 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
> if (!vcpu->pv_time_enabled)
> return 0;
>
> - if (unlikely(vcpu->hw_tsc_khz != this_tsc_khz)) {
> - tgt_tsc_khz = kvm_has_tsc_control ?
> - vcpu->virtual_tsc_khz : this_tsc_khz;
> + if (kvm_has_tsc_control)
> + tgt_tsc_khz = kvm_scale_tsc(v, tgt_tsc_khz);
> +
> + if (unlikely(vcpu->hw_tsc_khz != tgt_tsc_khz)) {
> kvm_get_time_scale(NSEC_PER_SEC / 1000, tgt_tsc_khz,
> &vcpu->hv_clock.tsc_shift,
> &vcpu->hv_clock.tsc_to_system_mul);
> - vcpu->hw_tsc_khz = this_tsc_khz;
> + vcpu->hw_tsc_khz = tgt_tsc_khz;
> }
>
> /* With all the info we got, fill in the values */
> --
> 1.8.3.1
>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-08 15:18 ` [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct Paolo Bonzini
2016-02-09 18:41 ` Owen Hofmann
@ 2016-02-16 13:48 ` Marcelo Tosatti
2016-02-16 14:25 ` Marcelo Tosatti
1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2016-02-16 13:48 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm
On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
> When an NTP server is running, it may adjust the time substantially
> compared to the "official" frequency of the TSC. A 12 ppm change
> sums up to one second per day.
>
> This already shows up if the guest compares kvmclock with e.g. the
> PM timer. It shows up even more once we add support for the Hyper-V
> TSC page, because the guest expects it to be in sync with the time
> reference counter; effectively the time reference counter is just a
> slow path to access the same clock that is in the TSC page.
>
> Therefore, we want kvmclock to provide the host kernel's
> ktime_get_boot_ns() value, at least if the master clock is active.
> To do so, reverse-compute the host's "actual" TSC frequency from
> pvclock_gtod_data and return it from kvm_get_time_and_clockread.
Paolo,
You'd have to generate an update to the guest structures as well,
to reflect the new {mult,shift} values calculated by the host.
Here:
/* disable master clock if host does not trust, or does not
* use, TSC clocksource
*/
if (gtod->clock.vclock_mode != VCLOCK_TSC &&
atomic_read(&kvm_guest_has_master_clock) != 0)
queue_work(system_long_wq, &pvclock_gtod_work);
No?
At first, i'm afraid this might be heavy, so it might be interesting
to rate limit the update operation.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/4] kvmclock: improve accuracy
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
` (3 preceding siblings ...)
2016-02-08 15:18 ` [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct Paolo Bonzini
@ 2016-02-16 14:00 ` Marcelo Tosatti
4 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2016-02-16 14:00 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm
On Mon, Feb 08, 2016 at 04:18:27PM +0100, Paolo Bonzini wrote:
> Currently kvmclock is obtaining the multiplier and shift value from
> the TSC kHz. These however are less accurate than the host kernel's
> clock, which includes corrections made through NTP.
>
> These patches change kvmclock to tick at the same frequency as the
> host kernel's clocksource (as obtained through the pvclock_gtod
> notifier). This is precise enough that the Hyper-V clock can be
> implemented on top of it.
Note the current option to sync kvmclock periodically every 300 seconds:
#define KVMCLOCK_SYNC_PERIOD (300 * HZ)
static void kvmclock_sync_fn(struct work_struct *work)
{
struct delayed_work *dwork = to_delayed_work(work);
struct kvm_arch *ka = container_of(dwork, struct kvm_arch,
kvmclock_sync_work);
struct kvm *kvm = container_of(ka, struct kvm, arch);
schedule_delayed_work(&kvm->arch.kvmclock_update_work, 0);
schedule_delayed_work(&kvm->arch.kvmclock_sync_work,
KVMCLOCK_SYNC_PERIOD);
}
How large is the error when the corrections are applied every 5 minutes?
Does this match the errors you are seeing?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-16 13:48 ` Marcelo Tosatti
@ 2016-02-16 14:25 ` Marcelo Tosatti
2016-02-16 16:59 ` Paolo Bonzini
0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2016-02-16 14:25 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm
On Tue, Feb 16, 2016 at 02:48:16PM +0100, Marcelo Tosatti wrote:
> On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
> > When an NTP server is running, it may adjust the time substantially
> > compared to the "official" frequency of the TSC. A 12 ppm change
> > sums up to one second per day.
> >
> > This already shows up if the guest compares kvmclock with e.g. the
> > PM timer. It shows up even more once we add support for the Hyper-V
> > TSC page, because the guest expects it to be in sync with the time
> > reference counter; effectively the time reference counter is just a
> > slow path to access the same clock that is in the TSC page.
> >
> > Therefore, we want kvmclock to provide the host kernel's
> > ktime_get_boot_ns() value, at least if the master clock is active.
> > To do so, reverse-compute the host's "actual" TSC frequency from
> > pvclock_gtod_data and return it from kvm_get_time_and_clockread.
>
> Paolo,
>
> You'd have to generate an update to the guest structures as well,
> to reflect the new {mult,shift} values calculated by the host.
> Here:
>
> /* disable master clock if host does not trust, or does not
> * use, TSC clocksource
> */
> if (gtod->clock.vclock_mode != VCLOCK_TSC &&
> atomic_read(&kvm_guest_has_master_clock) != 0)
> queue_work(system_long_wq, &pvclock_gtod_work);
>
> No?
>
> At first, i'm afraid this might be heavy, so it might be interesting
> to rate limit the update operation.
>
Paolo,
I suppose its not sufficient:
500ppm of 300 seconds = .0005*300 = 0.15 seconds.
Should aim at avoiding time backwards event in the following situation:
T1) t1_kvmclock_read = get_nanoseconds();
/* NTP correction to kernel clock = 500ppm */
/* TSC correction via mult,shift = 0ppm */
VM-exit, update kvmclock (or Hyper-V) clock data with
new values
T2) t2_kvmclock_read = get_nanoseconds();
/* NTP correction to kernel clock = 500ppm */
/* TSC correction via mult,shift = 500ppm */
So should not allow the host clock (or system_timestamp) to diverge
from (TSC based calculation) more than the duration of the event:
VM-exit, update kvmclock (or Hyper-V) with new data.
To avoid t2_kvmclock_read < t1_kvmclock_read
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-16 14:25 ` Marcelo Tosatti
@ 2016-02-16 16:59 ` Paolo Bonzini
2016-02-19 14:12 ` Marcelo Tosatti
0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-16 16:59 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel, kvm
On 16/02/2016 15:25, Marcelo Tosatti wrote:
> On Tue, Feb 16, 2016 at 02:48:16PM +0100, Marcelo Tosatti wrote:
>> On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
>>> When an NTP server is running, it may adjust the time substantially
>>> compared to the "official" frequency of the TSC. A 12 ppm change
>>> sums up to one second per day.
>>>
>>> This already shows up if the guest compares kvmclock with e.g. the
>>> PM timer. It shows up even more once we add support for the Hyper-V
>>> TSC page, because the guest expects it to be in sync with the time
>>> reference counter; effectively the time reference counter is just a
>>> slow path to access the same clock that is in the TSC page.
>>>
>>> Therefore, we want kvmclock to provide the host kernel's
>>> ktime_get_boot_ns() value, at least if the master clock is active.
>>> To do so, reverse-compute the host's "actual" TSC frequency from
>>> pvclock_gtod_data and return it from kvm_get_time_and_clockread.
>>
>> Paolo,
>>
>> You'd have to generate an update to the guest structures as well,
>> to reflect the new {mult,shift} values calculated by the host.
>> Here:
>>
>> /* disable master clock if host does not trust, or does not
>> * use, TSC clocksource
>> */
>> if (gtod->clock.vclock_mode != VCLOCK_TSC &&
>> atomic_read(&kvm_guest_has_master_clock) != 0)
>> queue_work(system_long_wq, &pvclock_gtod_work);
>>
>> No?
>>
>> At first, i'm afraid this might be heavy, so it might be interesting
>> to rate limit the update operation.
>>
>
> Paolo,
>
> I suppose its not sufficient:
>
> 500ppm of 300 seconds = .0005*300 = 0.15 seconds.
>
> Should aim at avoiding time backwards event in the following situation:
>
>
> T1) t1_kvmclock_read = get_nanoseconds();
> /* NTP correction to kernel clock = 500ppm */
> /* TSC correction via mult,shift = 0ppm */
>
> VM-exit, update kvmclock (or Hyper-V) clock data with
> new values
>
> T2) t2_kvmclock_read = get_nanoseconds();
> /* NTP correction to kernel clock = 500ppm */
> /* TSC correction via mult,shift = 500ppm */
>
>
> So should not allow the host clock (or system_timestamp) to diverge
> from (TSC based calculation) more than the duration of the event:
>
> VM-exit, update kvmclock (or Hyper-V) with new data.
>
> To avoid t2_kvmclock_read < t1_kvmclock_read
If I don't do rate limiting, that would not be a problem I think. The
host timekeeper code should take care of updating the base timestamps
(TSC and nanoseconds) in a way that doesn't cause a clock-goes-backwards
event? I need to check how often the timekeeper updates the parameters.
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-16 16:59 ` Paolo Bonzini
@ 2016-02-19 14:12 ` Marcelo Tosatti
2016-02-19 15:53 ` Paolo Bonzini
0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2016-02-19 14:12 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm
On Tue, Feb 16, 2016 at 05:59:57PM +0100, Paolo Bonzini wrote:
>
>
> On 16/02/2016 15:25, Marcelo Tosatti wrote:
> > On Tue, Feb 16, 2016 at 02:48:16PM +0100, Marcelo Tosatti wrote:
> >> On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
> >>> When an NTP server is running, it may adjust the time substantially
> >>> compared to the "official" frequency of the TSC. A 12 ppm change
> >>> sums up to one second per day.
> >>>
> >>> This already shows up if the guest compares kvmclock with e.g. the
> >>> PM timer. It shows up even more once we add support for the Hyper-V
> >>> TSC page, because the guest expects it to be in sync with the time
> >>> reference counter; effectively the time reference counter is just a
> >>> slow path to access the same clock that is in the TSC page.
> >>>
> >>> Therefore, we want kvmclock to provide the host kernel's
> >>> ktime_get_boot_ns() value, at least if the master clock is active.
> >>> To do so, reverse-compute the host's "actual" TSC frequency from
> >>> pvclock_gtod_data and return it from kvm_get_time_and_clockread.
> >>
> >> Paolo,
> >>
> >> You'd have to generate an update to the guest structures as well,
> >> to reflect the new {mult,shift} values calculated by the host.
> >> Here:
> >>
> >> /* disable master clock if host does not trust, or does not
> >> * use, TSC clocksource
> >> */
> >> if (gtod->clock.vclock_mode != VCLOCK_TSC &&
> >> atomic_read(&kvm_guest_has_master_clock) != 0)
> >> queue_work(system_long_wq, &pvclock_gtod_work);
> >>
> >> No?
> >>
> >> At first, i'm afraid this might be heavy, so it might be interesting
> >> to rate limit the update operation.
> >>
> >
> > Paolo,
> >
> > I suppose its not sufficient:
> >
> > 500ppm of 300 seconds = .0005*300 = 0.15 seconds.
> >
> > Should aim at avoiding time backwards event in the following situation:
> >
> >
> > T1) t1_kvmclock_read = get_nanoseconds();
> > /* NTP correction to kernel clock = 500ppm */
> > /* TSC correction via mult,shift = 0ppm */
> >
> > VM-exit, update kvmclock (or Hyper-V) clock data with
> > new values
> >
> > T2) t2_kvmclock_read = get_nanoseconds();
> > /* NTP correction to kernel clock = 500ppm */
> > /* TSC correction via mult,shift = 500ppm */
> >
> >
> > So should not allow the host clock (or system_timestamp) to diverge
> > from (TSC based calculation) more than the duration of the event:
> >
> > VM-exit, update kvmclock (or Hyper-V) with new data.
> >
> > To avoid t2_kvmclock_read < t1_kvmclock_read
>
> If I don't do rate limiting, that would not be a problem I think.
Correct.
> The
> host timekeeper code should take care of updating the base timestamps
> (TSC and nanoseconds) in a way that doesn't cause a clock-goes-backwards
> event?
Yes.
> I need to check how often the timekeeper updates the parameters.
I'd assume once every tick, the function is called (the notifier).
But you can optimize that away by only updating the TSC frequency
when mult/shift are updated, which should be much rarer.
(Note this issue is also a problem for Linux based kvmclock today).
>
> Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct
2016-02-19 14:12 ` Marcelo Tosatti
@ 2016-02-19 15:53 ` Paolo Bonzini
0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2016-02-19 15:53 UTC (permalink / raw)
To: Marcelo Tosatti, KVM list, linux-kernel@vger.kernel.org
On 19/02/2016 15:12, Marcelo Tosatti wrote:
>
>> > I need to check how often the timekeeper updates the parameters.
> I'd assume once every tick, the function is called (the notifier).
>
> But you can optimize that away by only updating the TSC frequency
> when mult/shift are updated, which should be much rarer.
Yes, exactly. That would still not be rate limiting.
But worst case it can be a lot of adjustments, up to HZ per second...
Paolo
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-02-19 15:54 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
2016-02-08 15:18 ` [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz Paolo Bonzini
2016-02-11 15:01 ` Marcelo Tosatti
2016-02-08 15:18 ` [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock Paolo Bonzini
2016-02-11 15:23 ` Marcelo Tosatti
2016-02-08 15:18 ` [PATCH 3/4] KVM: x86: pass kvm_get_time_scale arguments in hertz Paolo Bonzini
2016-02-08 15:18 ` [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct Paolo Bonzini
2016-02-09 18:41 ` Owen Hofmann
2016-02-10 13:57 ` Paolo Bonzini
2016-02-16 13:48 ` Marcelo Tosatti
2016-02-16 14:25 ` Marcelo Tosatti
2016-02-16 16:59 ` Paolo Bonzini
2016-02-19 14:12 ` Marcelo Tosatti
2016-02-19 15:53 ` Paolo Bonzini
2016-02-16 14:00 ` [PATCH 0/4] kvmclock: improve accuracy Marcelo Tosatti
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).