From: Marcelo Tosatti <mtosatti@redhat.com>
To: rhkernel-list@redhat.com
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Andrew Jones <drjones@redhat.com>, Bandan Das <bsd@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
stable@vger.kernel.org
Subject: [RHEL 8.3 BZ 1768622 PATCH 3/3] KVM: x86: use raw clock values consistently
Date: Mon, 18 May 2020 16:15:19 -0300 [thread overview]
Message-ID: <20200518191759.624834612@fuller.cnet> (raw)
In-Reply-To: 20200518191516.165550666@fuller.cnet
BZ: 1768622
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=28587593
commit 8171cd68806bd2fc28ef688e32fb2a3b3deb04e5
Commit 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw
clock") changed kvmclock to use tkr_raw instead of tkr_mono. However,
the default kvmclock_offset for the VM was still based on the monotonic
clock and, if the raw clock drifted enough from the monotonic clock,
this could cause a negative system_time to be written to the guest's
struct pvclock. RHEL5 does not like it and (if it boots fast enough to
observe a negative time value) it hangs.
There is another thing to be careful about: getboottime64 returns the
host boot time with tkr_mono frequency, and subtracting the tkr_raw-based
kvmclock value will cause the wallclock to be off if tkr_raw drifts
from tkr_mono. To avoid this, compute the wallclock delta from the
current time instead of being clever and using getboottime64.
Fixes: 53fafdbb8b21f ("KVM: x86: switch KVMCLOCK base to monotonic raw clock")
Cc: stable@vger.kernel.org
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Index: kernel-rhel/arch/x86/kvm/x86.c
===================================================================
--- kernel-rhel.orig/arch/x86/kvm/x86.c
+++ kernel-rhel/arch/x86/kvm/x86.c
@@ -1595,6 +1595,18 @@ static void update_pvclock_gtod(struct t
write_seqcount_end(&vdata->seq);
}
+
+static s64 get_kvmclock_base_ns(void)
+{
+ /* Count up from boot time, but with the frequency of the raw clock. */
+ return ktime_to_ns(ktime_add(ktime_get_raw(), pvclock_gtod_data.offs_boot));
+}
+#else
+static s64 get_kvmclock_base_ns(void)
+{
+ /* Master clock not used, so we can just use CLOCK_BOOTTIME. */
+ return ktime_get_boottime_ns();
+}
#endif
void kvm_set_pending_timer(struct kvm_vcpu *vcpu)
@@ -1608,7 +1620,7 @@ static void kvm_write_wall_clock(struct
int version;
int r;
struct pvclock_wall_clock wc;
- struct timespec64 boot;
+ u64 wall_nsec;
if (!wall_clock)
return;
@@ -1628,17 +1640,12 @@ static void kvm_write_wall_clock(struct
/*
* The guest calculates current wall clock time by adding
* system time (updated by kvm_guest_time_update below) to the
- * wall clock specified here. guest system time equals host
- * system time for us, thus we must fill in host boot time here.
+ * wall clock specified here. We do the reverse here.
*/
- getboottime64(&boot);
+ wall_nsec = ktime_get_real_ns() - get_kvmclock_ns(kvm);
- if (kvm->arch.kvmclock_offset) {
- struct timespec64 ts = ns_to_timespec64(kvm->arch.kvmclock_offset);
- boot = timespec64_sub(boot, ts);
- }
- wc.sec = (u32)boot.tv_sec; /* overflow in 2106 guest time */
- wc.nsec = boot.tv_nsec;
+ wc.nsec = do_div(wall_nsec, 1000000000);
+ wc.sec = (u32)wall_nsec; /* overflow in 2106 guest time */
wc.version = version;
kvm_write_guest(kvm, wall_clock, &wc, sizeof(wc));
@@ -1886,7 +1893,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu
raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
offset = kvm_compute_tsc_offset(vcpu, data);
- ns = ktime_get_boot_ns();
+ ns = get_kvmclock_base_ns();
elapsed = ns - kvm->arch.last_tsc_nsec;
if (vcpu->arch.virtual_tsc_khz) {
@@ -2224,7 +2231,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
spin_lock(&ka->pvclock_gtod_sync_lock);
if (!ka->use_master_clock) {
spin_unlock(&ka->pvclock_gtod_sync_lock);
- return ktime_get_boot_ns() + ka->kvmclock_offset;
+ return get_kvmclock_base_ns() + ka->kvmclock_offset;
}
hv_clock.tsc_timestamp = ka->master_cycle_now;
@@ -2240,7 +2247,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
&hv_clock.tsc_to_system_mul);
ret = __pvclock_read_cycles(&hv_clock, rdtsc());
} else
- ret = ktime_get_boot_ns() + ka->kvmclock_offset;
+ ret = get_kvmclock_base_ns() + ka->kvmclock_offset;
put_cpu();
@@ -2339,7 +2346,7 @@ static int kvm_guest_time_update(struct
}
if (!use_master_clock) {
host_tsc = rdtsc();
- kernel_ns = ktime_get_boot_ns();
+ kernel_ns = get_kvmclock_base_ns();
}
tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
@@ -2379,6 +2386,7 @@ static int kvm_guest_time_update(struct
vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
vcpu->last_guest_tsc = tsc_timestamp;
+ WARN_ON(vcpu->hv_clock.system_time < 0);
/* If the host uses TSC clocksource, then it is stable */
pvclock_flags = 0;
@@ -9486,7 +9494,7 @@ int kvm_arch_init_vm(struct kvm *kvm, un
mutex_init(&kvm->arch.apic_map_lock);
spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock);
- kvm->arch.kvmclock_offset = -ktime_get_boot_ns();
+ kvm->arch.kvmclock_offset = -get_kvmclock_base_ns();
pvclock_update_vm_gtod_copy(kvm);
kvm->arch.guest_can_read_msr_platform_info = true;
next parent reply other threads:[~2020-05-18 20:31 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20200518191516.165550666@fuller.cnet>
2020-05-18 19:15 ` Marcelo Tosatti [this message]
2020-05-19 8:58 ` [RHEL 8.3 BZ 1768622 PATCH 3/3] KVM: x86: use raw clock values consistently Frantisek Hrbata
2020-05-19 9:18 ` Greg KH
2020-05-19 9:53 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200518191759.624834612@fuller.cnet \
--to=mtosatti@redhat.com \
--cc=bsd@redhat.com \
--cc=drjones@redhat.com \
--cc=imammedo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=rhkernel-list@redhat.com \
--cc=stable@vger.kernel.org \
--cc=vkuznets@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).