* [PATCH 1/2] KVM: x86: reorganize pvclock_gtod_data members
From: Paolo Bonzini @ 2020-01-22 14:22 UTC
To: linux-kernel, kvm; +Cc: mtosatti, stable

We will need a copy of tk->offs_boot in the next patch.  Store it and
cleanup the struct: instead of storing tk->tkr_xxx.base with the tk->offs_boot
included, store the raw value in struct pvclock_clock and sum tk->offs_boot
in do_monotonic_raw and do_realtime.   tk->tkr_xxx.xtime_nsec also moves
to struct pvclock_clock.

While at it, fix a (usually harmless) typo in do_monotonic_raw, which
was using gtod->clock.shift instead of gtod->raw_clock.shift.

Fixes: 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw clock")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 89621025577a..1b4273cce63c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1532,6 +1532,8 @@ struct pvclock_clock {
 	u64 mask;
 	u32 mult;
 	u32 shift;
+	u64 base_cycles;
+	u64 offset;
 };
 
 struct pvclock_gtod_data {
@@ -1540,11 +1542,8 @@ struct pvclock_gtod_data {
 	struct pvclock_clock clock; /* extract of a clocksource struct */
 	struct pvclock_clock raw_clock; /* extract of a clocksource struct */
 
-	u64 boot_ns_raw;
-	u64 boot_ns;
-	u64 nsec_base;
+	ktime_t offs_boot;
 	u64 wall_time_sec;
-	u64 monotonic_raw_nsec;
 };
 
 static struct pvclock_gtod_data pvclock_gtod_data;
@@ -1552,10 +1551,6 @@ struct pvclock_gtod_data {
 static void update_pvclock_gtod(struct timekeeper *tk)
 {
 	struct pvclock_gtod_data *vdata = &pvclock_gtod_data;
-	u64 boot_ns, boot_ns_raw;
-
-	boot_ns = ktime_to_ns(ktime_add(tk->tkr_mono.base, tk->offs_boot));
-	boot_ns_raw = ktime_to_ns(ktime_add(tk->tkr_raw.base, tk->offs_boot));
 
 	write_seqcount_begin(&vdata->seq);
 
@@ -1565,20 +1560,20 @@ static void update_pvclock_gtod(struct timekeeper *tk)
 	vdata->clock.mask = tk->tkr_mono.mask;
 	vdata->clock.mult = tk->tkr_mono.mult;
 	vdata->clock.shift = tk->tkr_mono.shift;
+	vdata->clock.base_cycles = tk->tkr_mono.xtime_nsec;
+	vdata->clock.offset = tk->tkr_mono.base;
 
 	vdata->raw_clock.vclock_mode = tk->tkr_raw.clock->archdata.vclock_mode;
 	vdata->raw_clock.cycle_last = tk->tkr_raw.cycle_last;
 	vdata->raw_clock.mask = tk->tkr_raw.mask;
 	vdata->raw_clock.mult = tk->tkr_raw.mult;
 	vdata->raw_clock.shift = tk->tkr_raw.shift;
-
-	vdata->boot_ns = boot_ns;
-	vdata->nsec_base = tk->tkr_mono.xtime_nsec;
+	vdata->raw_clock.base_cycles = tk->tkr_raw.xtime_nsec;
+	vdata->raw_clock.offset = tk->tkr_raw.base;
 
 	vdata->wall_time_sec = tk->xtime_sec;
 
-	vdata->boot_ns_raw = boot_ns_raw;
-	vdata->monotonic_raw_nsec = tk->tkr_raw.xtime_nsec;
+	vdata->offs_boot = tk->offs_boot;
 
 	write_seqcount_end(&vdata->seq);
 }
@@ -2048,10 +2043,10 @@ static int do_monotonic_raw(s64 *t, u64 *tsc_timestamp)
 
 	do {
 		seq = read_seqcount_begin(&gtod->seq);
-		ns = gtod->monotonic_raw_nsec;
+		ns = gtod->raw_clock.base_cycles;
 		ns += vgettsc(&gtod->raw_clock, tsc_timestamp, &mode);
-		ns >>= gtod->clock.shift;
-		ns += gtod->boot_ns_raw;
+		ns >>= gtod->raw_clock.shift;
+		ns += ktime_to_ns(ktime_add(gtod->raw_clock.offset, gtod->offs_boot));
 	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
 	*t = ns;
 
@@ -2068,7 +2063,7 @@ static int do_realtime(struct timespec64 *ts, u64 *tsc_timestamp)
 	do {
 		seq = read_seqcount_begin(&gtod->seq);
 		ts->tv_sec = gtod->wall_time_sec;
-		ns = gtod->nsec_base;
+		ns = gtod->clock.base_cycles;
 		ns += vgettsc(&gtod->clock, tsc_timestamp, &mode);
 		ns >>= gtod->clock.shift;
 	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
-- 
1.8.3.1

* Re: [PATCH 1/2] KVM: x86: reorganize pvclock_gtod_data members
From: Vitaly Kuznetsov @ 2020-01-23 11:32 UTC
To: Paolo Bonzini; +Cc: mtosatti, stable, linux-kernel, kvm

Paolo Bonzini <pbonzini@redhat.com> writes:

> We will need a copy of tk->offs_boot in the next patch.  Store it and
> cleanup the struct: instead of storing tk->tkr_xxx.base with the tk->offs_boot
> included, store the raw value in struct pvclock_clock and sum tk->offs_boot
> in do_monotonic_raw and do_realtime.   tk->tkr_xxx.xtime_nsec also moves
> to struct pvclock_clock.
>
> While at it, fix a (usually harmless) typo in do_monotonic_raw, which
> was using gtod->clock.shift instead of gtod->raw_clock.shift.
>
> Fixes: 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw clock")
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 29 ++++++++++++-----------------
>  1 file changed, 12 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 89621025577a..1b4273cce63c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1532,6 +1532,8 @@ struct pvclock_clock {
>  	u64 mask;
>  	u32 mult;
>  	u32 shift;
> +	u64 base_cycles;
> +	u64 offset;
>  };
>  
>  struct pvclock_gtod_data {
> @@ -1540,11 +1542,8 @@ struct pvclock_gtod_data {
>  	struct pvclock_clock clock; /* extract of a clocksource struct */
>  	struct pvclock_clock raw_clock; /* extract of a clocksource struct */
>  
> -	u64 boot_ns_raw;
> -	u64 boot_ns;
> -	u64 nsec_base;
> +	ktime_t offs_boot;
>  	u64 wall_time_sec;
> -	u64 monotonic_raw_nsec;
>  };
>  
>  static struct pvclock_gtod_data pvclock_gtod_data;
> @@ -1552,10 +1551,6 @@ struct pvclock_gtod_data {
>  static void update_pvclock_gtod(struct timekeeper *tk)
>  {
>  	struct pvclock_gtod_data *vdata = &pvclock_gtod_data;
> -	u64 boot_ns, boot_ns_raw;
> -
> -	boot_ns = ktime_to_ns(ktime_add(tk->tkr_mono.base, tk->offs_boot));
> -	boot_ns_raw = ktime_to_ns(ktime_add(tk->tkr_raw.base, tk->offs_boot));
>  
>  	write_seqcount_begin(&vdata->seq);
>  
> @@ -1565,20 +1560,20 @@ static void update_pvclock_gtod(struct timekeeper *tk)
>  	vdata->clock.mask = tk->tkr_mono.mask;
>  	vdata->clock.mult = tk->tkr_mono.mult;
>  	vdata->clock.shift = tk->tkr_mono.shift;
> +	vdata->clock.base_cycles = tk->tkr_mono.xtime_nsec;
> +	vdata->clock.offset = tk->tkr_mono.base;
>  
>  	vdata->raw_clock.vclock_mode = tk->tkr_raw.clock->archdata.vclock_mode;
>  	vdata->raw_clock.cycle_last = tk->tkr_raw.cycle_last;
>  	vdata->raw_clock.mask = tk->tkr_raw.mask;
>  	vdata->raw_clock.mult = tk->tkr_raw.mult;
>  	vdata->raw_clock.shift = tk->tkr_raw.shift;
> -
> -	vdata->boot_ns = boot_ns;
> -	vdata->nsec_base = tk->tkr_mono.xtime_nsec;
> +	vdata->raw_clock.base_cycles = tk->tkr_raw.xtime_nsec;
> +	vdata->raw_clock.offset = tk->tkr_raw.base;

Likely a personal preference but the suggested naming is a bit
confusing: we use 'base_cycles' to keep 'xtime_nsec' and 'offset' to
keep ... 'base'. Not that I think that 'struct timekeeper' is perfect
but at least it is documented. Should we maybe just stick to it (and
name 'struct pvclock_clock' fields accordingly?)

>  
>  	vdata->wall_time_sec = tk->xtime_sec;
>  
> -	vdata->boot_ns_raw = boot_ns_raw;
> -	vdata->monotonic_raw_nsec = tk->tkr_raw.xtime_nsec;
> +	vdata->offs_boot = tk->offs_boot;
>  
>  	write_seqcount_end(&vdata->seq);
>  }
> @@ -2048,10 +2043,10 @@ static int do_monotonic_raw(s64 *t, u64 *tsc_timestamp)
>  
>  	do {
>  		seq = read_seqcount_begin(&gtod->seq);
> -		ns = gtod->monotonic_raw_nsec;
> +		ns = gtod->raw_clock.base_cycles;
>  		ns += vgettsc(&gtod->raw_clock, tsc_timestamp, &mode);
> -		ns >>= gtod->clock.shift;
> -		ns += gtod->boot_ns_raw;
> +		ns >>= gtod->raw_clock.shift;
> +		ns += ktime_to_ns(ktime_add(gtod->raw_clock.offset, gtod->offs_boot));
>  	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
>  	*t = ns;
>  
> @@ -2068,7 +2063,7 @@ static int do_realtime(struct timespec64 *ts, u64 *tsc_timestamp)
>  	do {
>  		seq = read_seqcount_begin(&gtod->seq);
>  		ts->tv_sec = gtod->wall_time_sec;
> -		ns = gtod->nsec_base;
> +		ns = gtod->clock.base_cycles;
>  		ns += vgettsc(&gtod->clock, tsc_timestamp, &mode);
>  		ns >>= gtod->clock.shift;
>  	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));

FWIW,

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly

* Re: [PATCH 1/2] KVM: x86: reorganize pvclock_gtod_data members
From: Paolo Bonzini @ 2020-01-23 11:35 UTC
To: Vitaly Kuznetsov; +Cc: mtosatti, stable, linux-kernel, kvm

On 23/01/20 12:32, Vitaly Kuznetsov wrote:
> Likely a personal preference but the suggested naming is a bit
> confusing: we use 'base_cycles' to keep 'xtime_nsec' and 'offset' to
> keep ... 'base'. Not that I think that 'struct timekeeper' is perfect
> but at least it is documented. Should we maybe just stick to it (and
> name 'struct pvclock_clock' fields accordingly?)
>

The problem is that xtime_nsec is not nanoseconds, and I'd really not
want to have a worse name just for consistency. :(  I chose
"base_cycles" as an incremental improvement over nsec_base, even though
that meant also changing struct timekeeper's "base" to "offset".

Paolo

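For readers following the naming discussion, here is a minimal userspace
sketch (not the kernel code) of the read path that patch 1/2 leaves in
do_monotonic_raw. It illustrates why xtime_nsec is not plain nanoseconds:
the value is added to a cycle delta already multiplied by mult, and only
the final right shift yields nanoseconds. The struct and function names
(clock_sketch, sketch_read_ns) are hypothetical, and the sketch assumes
that vgettsc() returns the masked cycle delta multiplied by mult.

```c
#include <stdint.h>

/* Hypothetical userspace mirror of the fields used from struct pvclock_clock. */
struct clock_sketch {
	uint64_t cycle_last;  /* counter value at the last timekeeper update */
	uint64_t mask;        /* counter width mask */
	uint32_t mult;        /* cycles -> (nanoseconds << shift) multiplier */
	uint32_t shift;
	uint64_t base_cycles; /* copy of tkr->xtime_nsec: nanoseconds << shift */
	int64_t offset;       /* copy of tkr->base: plain nanoseconds */
};

/*
 * Mirrors the arithmetic of the patched do_monotonic_raw():
 *   ns  = base_cycles;
 *   ns += (delta cycles) * mult;   (assumed behaviour of vgettsc)
 *   ns >>= shift;
 *   ns += offset + offs_boot;
 */
static int64_t sketch_read_ns(const struct clock_sketch *c, uint64_t now,
			      int64_t offs_boot)
{
	uint64_t ns = c->base_cycles;

	ns += ((now - c->cycle_last) & c->mask) * c->mult;
	ns >>= c->shift;
	return (int64_t)ns + c->offset + offs_boot;
}
```

With shift = 8, mult = 512 (2 ns per cycle), base_cycles = 3 << 8,
offset = 100 and offs_boot = 10, a counter that advanced by 5 cycles
gives ((768 + 2560) >> 8) + 110 = 123 ns. Using the wrong clock's shift
in the third step, as the old code did, only gives the same answer when
both clocks happen to share one shift value.
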
* [PATCH 2/2] KVM: x86: use raw clock values consistently
From: Paolo Bonzini @ 2020-01-22 14:22 UTC
To: linux-kernel, kvm; +Cc: mtosatti, stable

Commit 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw
clock") changed kvmclock to use tkr_raw instead of tkr_mono.  However,
the default kvmclock_offset for the VM was still based on the monotonic
clock and, if the raw clock drifted enough from the monotonic clock,
this could cause a negative system_time to be written to the guest's
struct pvclock.  RHEL5 does not like it and (if it boots fast enough to
observe a negative time value) it hangs.

There is another thing to be careful about: getboottime64 returns the
host boot time in tkr_mono units, and subtracting tkr_raw units will
cause the wallclock to be off if tkr_raw drifts from tkr_mono.  To
avoid this, compute the wallclock delta from the current time instead
of being clever and using getboottime64.

Fixes: 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw clock")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b4273cce63c..b5e0648580e1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1577,6 +1577,18 @@ static void update_pvclock_gtod(struct timekeeper *tk)
 
 	write_seqcount_end(&vdata->seq);
 }
+
+static s64 get_kvmclock_base_ns(void)
+{
+	/* Count up from boot time, but with the frequency of the raw clock.  */
+	return ktime_to_ns(ktime_add(ktime_get_raw(), pvclock_gtod_data.offs_boot));
+}
+#else
+static s64 get_kvmclock_base_ns(void)
+{
+	/* Master clock not used, so we can just use CLOCK_BOOTTIME.  */
+	return ktime_get_boottime_ns();
+}
 #endif
 
 void kvm_set_pending_timer(struct kvm_vcpu *vcpu)
@@ -1590,7 +1602,7 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
 	int version;
 	int r;
 	struct pvclock_wall_clock wc;
-	struct timespec64 boot;
+	u64 wall_nsec;
 
 	if (!wall_clock)
 		return;
@@ -1610,17 +1622,12 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
 	/*
 	 * The guest calculates current wall clock time by adding
 	 * system time (updated by kvm_guest_time_update below) to the
-	 * wall clock specified here.  guest system time equals host
-	 * system time for us, thus we must fill in host boot time here.
+	 * wall clock specified here.  We do the reverse here.
 	 */
-	getboottime64(&boot);
+	wall_nsec = ktime_get_real_ns() - get_kvmclock_ns(kvm);
 
-	if (kvm->arch.kvmclock_offset) {
-		struct timespec64 ts = ns_to_timespec64(kvm->arch.kvmclock_offset);
-		boot = timespec64_sub(boot, ts);
-	}
-	wc.sec = (u32)boot.tv_sec; /* overflow in 2106 guest time */
-	wc.nsec = boot.tv_nsec;
+	wc.nsec = do_div(wall_nsec, 1000000000);
+	wc.sec = (u32)wall_nsec; /* overflow in 2106 guest time */
 	wc.version = version;
 
 	kvm_write_guest(kvm, wall_clock, &wc, sizeof(wc));
@@ -1868,7 +1875,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
 
 	raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
 	offset = kvm_compute_tsc_offset(vcpu, data);
-	ns = ktime_get_boottime_ns();
+	ns = get_kvmclock_base_ns();
 	elapsed = ns - kvm->arch.last_tsc_nsec;
 
 	if (vcpu->arch.virtual_tsc_khz) {
@@ -2206,7 +2213,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
 	spin_lock(&ka->pvclock_gtod_sync_lock);
 	if (!ka->use_master_clock) {
 		spin_unlock(&ka->pvclock_gtod_sync_lock);
-		return ktime_get_boottime_ns() + ka->kvmclock_offset;
+		return get_kvmclock_base_ns() + ka->kvmclock_offset;
 	}
 
 	hv_clock.tsc_timestamp = ka->master_cycle_now;
@@ -2222,7 +2229,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
 				   &hv_clock.tsc_to_system_mul);
 		ret = __pvclock_read_cycles(&hv_clock, rdtsc());
 	} else
-		ret = ktime_get_boottime_ns() + ka->kvmclock_offset;
+		ret = get_kvmclock_base_ns() + ka->kvmclock_offset;
 
 	put_cpu();
 
@@ -2321,7 +2328,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	}
 	if (!use_master_clock) {
 		host_tsc = rdtsc();
-		kernel_ns = ktime_get_boottime_ns();
+		kernel_ns = get_kvmclock_base_ns();
 	}
 
 	tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
@@ -2361,6 +2368,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
 	vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
 	vcpu->last_guest_tsc = tsc_timestamp;
+	WARN_ON(vcpu->hv_clock.system_time < 0);
 
 	/* If the host uses TSC clocksource, then it is stable */
 	pvclock_flags = 0;
@@ -9473,7 +9481,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	mutex_init(&kvm->arch.apic_map_lock);
 	spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock);
 
-	kvm->arch.kvmclock_offset = -ktime_get_boottime_ns();
+	kvm->arch.kvmclock_offset = -get_kvmclock_base_ns();
 	pvclock_update_vm_gtod_copy(kvm);
 
 	kvm->arch.guest_can_read_msr_platform_info = true;
-- 
1.8.3.1

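The wall clock computation in kvm_write_wall_clock above can be sketched
in plain userspace C. This is only an illustration under stated
assumptions: the names wall_clock_sketch, sketch_wall_clock and
sketch_guest_wall_ns are hypothetical, and ordinary / and % stand in for
the kernel's do_div(), which divides its first argument in place and
returns the remainder.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the sec/nsec pair of struct pvclock_wall_clock. */
struct wall_clock_sketch {
	uint32_t sec;
	uint32_t nsec;
};

/*
 * Host side: wall clock base = current real time minus current kvmclock.
 * Both inputs must be sampled from clocks that tick at the same rate,
 * which is the consistency this patch is about.
 */
static struct wall_clock_sketch sketch_wall_clock(uint64_t real_ns,
						  uint64_t kvmclock_ns)
{
	uint64_t wall_nsec = real_ns - kvmclock_ns;
	struct wall_clock_sketch wc;

	wc.nsec = (uint32_t)(wall_nsec % NSEC_PER_SEC); /* do_div() remainder */
	wc.sec = (uint32_t)(wall_nsec / NSEC_PER_SEC);  /* do_div() quotient */
	return wc;
}

/* Guest side: current wall time = wall clock base + system time (kvmclock). */
static uint64_t sketch_guest_wall_ns(struct wall_clock_sketch wc,
				     uint64_t kvmclock_ns)
{
	return (uint64_t)wc.sec * NSEC_PER_SEC + wc.nsec + kvmclock_ns;
}
```

Because the host subtracts the same kvmclock value that the guest later
adds back, the guest recovers the host's real time no matter how far the
raw clock has drifted from the monotonic clock.
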
* Re: [PATCH 2/2] KVM: x86: use raw clock values consistently
From: Vitaly Kuznetsov @ 2020-01-23 13:43 UTC
To: Paolo Bonzini; +Cc: mtosatti, stable, linux-kernel, kvm

Paolo Bonzini <pbonzini@redhat.com> writes:

> Commit 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw
> clock") changed kvmclock to use tkr_raw instead of tkr_mono.  However,
> the default kvmclock_offset for the VM was still based on the monotonic
> clock and, if the raw clock drifted enough from the monotonic clock,
> this could cause a negative system_time to be written to the guest's
> struct pvclock.  RHEL5 does not like it and (if it boots fast enough to
> observe a negative time value) it hangs.
>
> There is another thing to be careful about: getboottime64 returns the
> host boot time in tkr_mono units, and subtracting tkr_raw units will
> cause the wallclock to be off if tkr_raw drifts from tkr_mono.  To
> avoid this, compute the wallclock delta from the current time instead
> of being clever and using getboottime64.
>
> Fixes: 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw clock")
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 38 +++++++++++++++++++++++---------------
>  1 file changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1b4273cce63c..b5e0648580e1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1577,6 +1577,18 @@ static void update_pvclock_gtod(struct timekeeper *tk)
>  
>  	write_seqcount_end(&vdata->seq);
>  }
> +
> +static s64 get_kvmclock_base_ns(void)
> +{
> +	/* Count up from boot time, but with the frequency of the raw clock.  */
> +	return ktime_to_ns(ktime_add(ktime_get_raw(), pvclock_gtod_data.offs_boot));
> +}
> +#else
> +static s64 get_kvmclock_base_ns(void)
> +{
> +	/* Master clock not used, so we can just use CLOCK_BOOTTIME.  */
> +	return ktime_get_boottime_ns();
> +}
>  #endif

But we could've still used the RAW+offs_boot version, right? And this is
just to basically preserve the existing behavior on !x86.

>  
>  void kvm_set_pending_timer(struct kvm_vcpu *vcpu)
> @@ -1590,7 +1602,7 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
>  	int version;
>  	int r;
>  	struct pvclock_wall_clock wc;
> -	struct timespec64 boot;
> +	u64 wall_nsec;
>  
>  	if (!wall_clock)
>  		return;
> @@ -1610,17 +1622,12 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
>  	/*
>  	 * The guest calculates current wall clock time by adding
>  	 * system time (updated by kvm_guest_time_update below) to the
> -	 * wall clock specified here.  guest system time equals host
> -	 * system time for us, thus we must fill in host boot time here.
> +	 * wall clock specified here.  We do the reverse here.
>  	 */
> -	getboottime64(&boot);
> +	wall_nsec = ktime_get_real_ns() - get_kvmclock_ns(kvm);

There are not that many hosts with more than 50 years uptime and likely
none running Linux with live kernel patching support so I bet no one will
ever see this overflowing, however, as wall_nsec is u64 and we're
dealing with kvmclock here I'd suggest to add a WARN_ON().

>  
> -	if (kvm->arch.kvmclock_offset) {
> -		struct timespec64 ts = ns_to_timespec64(kvm->arch.kvmclock_offset);
> -		boot = timespec64_sub(boot, ts);
> -	}
> -	wc.sec = (u32)boot.tv_sec; /* overflow in 2106 guest time */
> -	wc.nsec = boot.tv_nsec;
> +	wc.nsec = do_div(wall_nsec, 1000000000);
> +	wc.sec = (u32)wall_nsec; /* overflow in 2106 guest time */
>  	wc.version = version;
>  
>  	kvm_write_guest(kvm, wall_clock, &wc, sizeof(wc));
> @@ -1868,7 +1875,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  
>  	raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
>  	offset = kvm_compute_tsc_offset(vcpu, data);
> -	ns = ktime_get_boottime_ns();
> +	ns = get_kvmclock_base_ns();
>  	elapsed = ns - kvm->arch.last_tsc_nsec;
>  
>  	if (vcpu->arch.virtual_tsc_khz) {
> @@ -2206,7 +2213,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
>  	spin_lock(&ka->pvclock_gtod_sync_lock);
>  	if (!ka->use_master_clock) {
>  		spin_unlock(&ka->pvclock_gtod_sync_lock);
> -		return ktime_get_boottime_ns() + ka->kvmclock_offset;
> +		return get_kvmclock_base_ns() + ka->kvmclock_offset;
>  	}
>  
>  	hv_clock.tsc_timestamp = ka->master_cycle_now;
> @@ -2222,7 +2229,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
>  				   &hv_clock.tsc_to_system_mul);
>  		ret = __pvclock_read_cycles(&hv_clock, rdtsc());
>  	} else
> -		ret = ktime_get_boottime_ns() + ka->kvmclock_offset;
> +		ret = get_kvmclock_base_ns() + ka->kvmclock_offset;
>  
>  	put_cpu();
>  
> @@ -2321,7 +2328,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
>  	}
>  	if (!use_master_clock) {
>  		host_tsc = rdtsc();
> -		kernel_ns = ktime_get_boottime_ns();
> +		kernel_ns = get_kvmclock_base_ns();
>  	}
>  
>  	tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
> @@ -2361,6 +2368,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
>  	vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
>  	vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
>  	vcpu->last_guest_tsc = tsc_timestamp;
> +	WARN_ON(vcpu->hv_clock.system_time < 0);
>  
>  	/* If the host uses TSC clocksource, then it is stable */
>  	pvclock_flags = 0;
> @@ -9473,7 +9481,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	mutex_init(&kvm->arch.apic_map_lock);
>  	spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock);
>  
> -	kvm->arch.kvmclock_offset = -ktime_get_boottime_ns();
> +	kvm->arch.kvmclock_offset = -get_kvmclock_base_ns();
>  	pvclock_update_vm_gtod_copy(kvm);
>  
>  	kvm->arch.guest_can_read_msr_platform_info = true;

This looks correct to me but kvmclock is a glorious beast so take this
with a grain of salt)

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly

* Re: [PATCH 2/2] KVM: x86: use raw clock values consistently
From: Paolo Bonzini @ 2020-01-23 13:54 UTC
To: Vitaly Kuznetsov; +Cc: mtosatti, stable, linux-kernel, kvm

On 23/01/20 14:43, Vitaly Kuznetsov wrote:
>> +
>> +static s64 get_kvmclock_base_ns(void)
>> +{
>> +	/* Count up from boot time, but with the frequency of the raw clock.  */
>> +	return ktime_to_ns(ktime_add(ktime_get_raw(), pvclock_gtod_data.offs_boot));
>> +}
>> +#else
>> +static s64 get_kvmclock_base_ns(void)
>> +{
>> +	/* Master clock not used, so we can just use CLOCK_BOOTTIME.  */
>> +	return ktime_get_boottime_ns();
>> +}
>>  #endif
> But we could've still used the RAW+offs_boot version, right? And this is
> just to basically preserve the existing behavior on !x86.

Yes, there's no reason to restrict the pvclock_gtod notifier to x86_64.
But this is stable material so I kept it easy.

>>
>> -	getboottime64(&boot);
>> +	wall_nsec = ktime_get_real_ns() - get_kvmclock_ns(kvm);
>
> There are not that many hosts with more than 50 years uptime and likely
> none running Linux with live kernel patching support so I bet no one will
> ever see this overflowing, however, as wall_nsec is u64 and we're
> dealing with kvmclock here I'd suggest to add a WARN_ON().

You're off by a factor of 10, 2^64 nanoseconds are about 584 years
(584*365*10^9*86400). :)

Paolo

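The 584-year figure can be checked with a few lines of C. This is a
minimal sketch using 365-day years, as in the message above; the
function names are hypothetical. The signed variant is included because
hv_clock.system_time is compared against zero in the patch, and a
signed 64-bit nanosecond count covers half the unsigned range.

```c
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define SEC_PER_DAY   86400ULL
#define DAYS_PER_YEAR 365ULL

/* Nanoseconds in one 365-day year: 365 * 86400 * 10^9 = 3.1536e16. */
#define NSEC_PER_YEAR (DAYS_PER_YEAR * SEC_PER_DAY * NSEC_PER_SEC)

/* Whole years representable by an unsigned 64-bit nanosecond counter. */
static uint64_t sketch_u64_ns_range_years(void)
{
	return UINT64_MAX / NSEC_PER_YEAR;
}

/* Whole years representable by a signed 64-bit nanosecond counter. */
static uint64_t sketch_s64_ns_range_years(void)
{
	return (uint64_t)INT64_MAX / NSEC_PER_YEAR;
}
```

The unsigned range comes out at 584 whole years and the signed range at
292, so neither is reachable by host uptime in practice.
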
Thread overview: 6+ messages
     [not found] <1579702953-24184-1-git-send-email-pbonzini@redhat.com>
2020-01-22 14:22 ` [PATCH 1/2] KVM: x86: reorganize pvclock_gtod_data members Paolo Bonzini
2020-01-23 11:32   ` Vitaly Kuznetsov
2020-01-23 11:35     ` Paolo Bonzini
2020-01-22 14:22 ` [PATCH 2/2] KVM: x86: use raw clock values consistently Paolo Bonzini
2020-01-23 13:43   ` Vitaly Kuznetsov
2020-01-23 13:54     ` Paolo Bonzini