* [Qemu-devel] kvmclock, Migration, and NTP clock jitter
@ 2015-01-15 16:39 Mohammed Gamal
  2015-01-15 17:27 ` Paolo Bonzini
From: Mohammed Gamal @ 2015-01-15 16:39 UTC
  To: qemu-devel

Hi,

I've seen some strange time behavior in some of our VMs, usually triggered
by live migration. In several VMs we have seen significant time drift after
a live migration which NTP was not able to correct.

I have not been able to reproduce that case so far. However, I did notice
that live migration introduces an increase in clock jitter values, and I am
not sure whether that has anything to do with the significant time drift.

Here is an example of a CentOS 6 guest running under qemu 1.2 before doing
a live migration:

[root@centos ~]# ntpq -pcrv
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+helium.constant 18.26.4.105      2 u   65   64  377   60.539   -0.011   0.554
-209.118.204.201 128.9.176.30     2 u   47   64  377   15.750   -1.835   0.388
*time3.chpc.utah 198.60.22.240    2 u   46   64  377   30.585    3.934   0.253
+dns2.untangle.c 216.218.254.202  2 u   21   64  377   22.196    2.345   0.740
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.6p5@1.2349-o Sat Dec 20 02:53:39 UTC 2014 (1)",
processor="x86_64", system="Linux/2.6.32-504.3.3.el6.x86_64", leap=00,
stratum=3, precision=-21, rootdelay=32.355, rootdisp=53.173,
refid=155.101.3.115,
reftime=d86264f3.444c75e7  Thu, Jan 15 2015 16:10:27.266,
clock=d86265ed.10a34c1c  Thu, Jan 15 2015 16:14:37.064, peer=3418, tc=6,
mintc=3, offset=0.000, frequency=2.863, sys_jitter=2.024,
clk_jitter=2.283, clk_wander=0.000

[root@centos ~]# ntpdc -c kerninfo
pll offset:           0 s
pll frequency:        2.863 ppm
maximum error:        0.19838 s
estimated error:      0.002282 s
status:               2001  pll nano
pll time constant:    6
precision:            1e-09 s
frequency tolerance:  500 ppm

Immediately after the live migration, you can see that there is an increase
in the jitter values:

[root@centos ~]# ntpq -pcrv
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-helium.constant 18.26.4.105      2 u   59   64  377   60.556   -0.916  31.921
+209.118.204.201 128.9.176.30     2 u   38   64  377   15.717   28.879  12.220
+time3.chpc.utah 132.163.4.103    2 u   45   64  353   30.639    3.240  26.975
*dns2.untangle.c 216.218.254.202  2 u   17   64  377   22.248   33.039  11.791
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.6p5@1.2349-o Sat Dec 20 02:53:39 UTC 2014 (1)",
processor="x86_64", system="Linux/2.6.32-504.3.3.el6.x86_64", leap=00,
stratum=3, precision=-21, rootdelay=25.086, rootdisp=83.736,
refid=74.123.29.4,
reftime=d8626838.47529689  Thu, Jan 15 2015 16:24:24.278,
clock=d8626849.4920018a  Thu, Jan 15 2015 16:24:41.285, peer=3419, tc=6,
mintc=3, offset=24.118, frequency=11.560, sys_jitter=15.145,
clk_jitter=8.056, clk_wander=2.757

[root@centos ~]# ntpdc -c kerninfo
pll offset:           0.0211957 s
pll frequency:        11.560 ppm
maximum error:        0.112523 s
estimated error:      0.008055 s
status:               2001  pll nano
pll time constant:    6
precision:            1e-09 s
frequency tolerance:  500 ppm


The increase in the jitter and offset values here is well within the 500 ppm
frequency tolerance limit, and is therefore easily corrected by subsequent
NTP clock sync events. However, some live migrations cause much larger jitter
and offset jumps, which cannot be corrected by NTP and cause the time to go
way off. Any idea why this is the case?
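
Just to put the 500 ppm figure in perspective (a back-of-the-envelope sketch
with a made-up offset, not taken from the output above): when ntpd has to slew
rather than step, the tolerance caps the correction rate at 500 us per second
of real time, so even a modest offset takes a long while to slew away:

#include <stdio.h>

/* Back-of-the-envelope sketch: with ntpd's slew rate capped at 500 ppm,
 * removing an offset of offset_s seconds takes at least
 * offset_s / 500e-6 seconds. The offset below is made up. */
int main(void)
{
    double max_slew_ppm = 500.0;   /* ntpd frequency tolerance          */
    double offset_s = 0.5;         /* example post-migration offset (s) */
    double min_slew_s = offset_s / (max_slew_ppm * 1e-6);

    printf("slewing %.3f s away takes at least %.0f s (~%.1f min)\n",
           offset_s, min_slew_s, min_slew_s / 60.0);
    return 0;
}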

Regards,
Mohammed


* Re: [Qemu-devel] kvmclock, Migration, and NTP clock jitter
  2015-01-15 16:39 [Qemu-devel] kvmclock, Migration, and NTP clock jitter Mohammed Gamal
@ 2015-01-15 17:27 ` Paolo Bonzini
  2015-01-16 10:21   ` Mohammed Gamal
From: Paolo Bonzini @ 2015-01-15 17:27 UTC
  To: Mohammed Gamal, qemu-devel



On 15/01/2015 17:39, Mohammed Gamal wrote:
> The increase in the jitter and offset values here is well within the 500 ppm
> frequency tolerance limit, and is therefore easily corrected by subsequent
> NTP clock sync events. However, some live migrations cause much larger
> jitter and offset jumps, which cannot be corrected by NTP and cause the
> time to go way off. Any idea why this is the case?

It might be fixed in QEMU 2.2.

See https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg01239.html

Paolo


* Re: [Qemu-devel] kvmclock, Migration, and NTP clock jitter
  2015-01-15 17:27 ` Paolo Bonzini
@ 2015-01-16 10:21   ` Mohammed Gamal
  2015-01-21 10:20     ` Mohammed Gamal
From: Mohammed Gamal @ 2015-01-16 10:21 UTC
  To: Paolo Bonzini, qemu-devel

On Thu, Jan 15, 2015 at 06:27:54PM +0100, Paolo Bonzini wrote:
> 
> 
> On 15/01/2015 17:39, Mohammed Gamal wrote:
> > The increase in the jitter and offset values here is well within the 500 ppm
> > frequency tolerance limit, and is therefore easily corrected by subsequent
> > NTP clock sync events. However, some live migrations cause much larger
> > jitter and offset jumps, which cannot be corrected by NTP and cause the
> > time to go way off. Any idea why this is the case?
> 
> It might be fixed in QEMU 2.2.
> 
> See https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg01239.html
> 
> Paolo

Hi Paolo,

I did try to backport these patches to qemu 1.2. However, migrations then
resulted in *higher* jitter and offset values (on the order of 100+ ppm).
I am not sure whether I've done the backporting correctly, though. Here are
my patches on top of the qemu 1.2 stable tree.

[-- Attachment #2: backport.patch (text/x-diff) --]

diff --git a/cpus.c b/cpus.c
index 29aced5..e079ee5 100644
--- a/cpus.c
+++ b/cpus.c
@@ -187,6 +187,15 @@ void cpu_disable_ticks(void)
     }
 }
 
+void cpu_clean_all_dirty(void)
+{
+    CPUArchState *cpu;
+
+    for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
+        cpu_clean_state(cpu);
+    }
+}
+
 /* Correlation between real and virtual time is always going to be
    fairly approximate, so ignore small variation.
    When the guest is idle real and virtual time will be aligned in
diff --git a/cpus.h b/cpus.h
index 3fc1a4a..1ff166b 100644
--- a/cpus.h
+++ b/cpus.h
@@ -12,6 +12,7 @@ void unplug_vcpu(void *p);
 void cpu_synchronize_all_states(void);
 void cpu_synchronize_all_post_reset(void);
 void cpu_synchronize_all_post_init(void);
+void cpu_clean_all_dirty(void);
 
 void qtest_clock_warp(int64_t dest);
 
diff --git a/hw/kvm/clock.c b/hw/kvm/clock.c
index 824b978..b2bdda4 100644
--- a/hw/kvm/clock.c
+++ b/hw/kvm/clock.c
@@ -16,6 +16,8 @@
 #include "qemu-common.h"
 #include "sysemu.h"
 #include "kvm.h"
+#include "host-utils.h"
+#include "cpus.h"
 #include "hw/sysbus.h"
 #include "hw/kvm/clock.h"
 
@@ -28,6 +30,46 @@ typedef struct KVMClockState {
     bool clock_valid;
 } KVMClockState;
 
+struct pvclock_vcpu_time_info {
+    uint32_t   version;
+    uint32_t   pad0;
+    uint64_t   tsc_timestamp;
+    uint64_t   system_time;
+    uint32_t   tsc_to_system_mul;
+    int8_t     tsc_shift;
+    uint8_t    flags;
+    uint8_t    pad[2];
+} __attribute__((__packed__)); /* 32 bytes */
+
+static uint64_t kvmclock_current_nsec(KVMClockState *s)
+{
+    CPUArchState *env = first_cpu;
+    uint64_t migration_tsc = env->tsc;
+    struct pvclock_vcpu_time_info time;
+    uint64_t delta;
+    uint64_t nsec_lo;
+    uint64_t nsec_hi;
+    uint64_t nsec;
+
+    if (!(env->system_time_msr & 1ULL)) {
+        /* KVM clock not active */
+        return 0;
+    }
+    cpu_physical_memory_read((env->system_time_msr & ~1ULL), &time, sizeof(time));
+
+    assert(time.tsc_timestamp <= migration_tsc);
+    delta = migration_tsc - time.tsc_timestamp;
+    if (time.tsc_shift < 0) {
+        delta >>= -time.tsc_shift;
+    } else {
+        delta <<= time.tsc_shift;
+    }
+
+    mulu64(&nsec_lo, &nsec_hi, delta, time.tsc_to_system_mul);
+    nsec = (nsec_lo >> 32) | (nsec_hi << 32);
+    return nsec + time.system_time;
+}
+
 static void kvmclock_pre_save(void *opaque)
 {
     KVMClockState *s = opaque;
@@ -37,6 +79,23 @@ static void kvmclock_pre_save(void *opaque)
     if (s->clock_valid) {
         return;
     }
+
+    cpu_synchronize_all_states();
+    /* In theory, the cpu_synchronize_all_states() call above wouldn't
+     * affect the rest of the code, as the VCPU state inside CPUArchState
+     * is supposed to always match the VCPU state on the kernel side.
+     *
+     * In practice, calling cpu_synchronize_state() too soon will load the
+     * kernel-side APIC state into X86CPU.apic_state too early, APIC state
+     * won't be reloaded later because CPUState.vcpu_dirty==true, and
+     * outdated APIC state may be migrated to another host.
+     *
+     * The real fix would be to make sure outdated APIC state is read
+     * from the kernel again when necessary. While this is not fixed, we
+     * need the cpu_clean_all_dirty() call below.
+     */
+    cpu_clean_all_dirty();
+
     ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
     if (ret < 0) {
         fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
@@ -55,6 +114,12 @@ static int kvmclock_post_load(void *opaque, int version_id)
 {
     KVMClockState *s = opaque;
     struct kvm_clock_data data;
+    uint64_t time_at_migration = kvmclock_current_nsec(s);
+
+    /* We can't rely on the migrated clock value, just discard it */
+    if (time_at_migration) {
+        s->clock = time_at_migration;
+    }
 
     data.clock = s->clock;
     data.flags = 0;
diff --git a/kvm-all.c b/kvm-all.c
index cd2ccbe..692944e 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1547,6 +1547,11 @@ void kvm_cpu_synchronize_post_init(CPUArchState *env)
     env->kvm_vcpu_dirty = 0;
 }
 
+void kvm_cpu_clean_state(CPUArchState *env)
+{
+    env->kvm_vcpu_dirty = false;
+}
+
 int kvm_cpu_exec(CPUArchState *env)
 {
     struct kvm_run *run = env->kvm_run;
diff --git a/kvm.h b/kvm.h
index 2a68a52..92a17d8 100644
--- a/kvm.h
+++ b/kvm.h
@@ -234,6 +234,7 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
 void kvm_cpu_synchronize_state(CPUArchState *env);
 void kvm_cpu_synchronize_post_reset(CPUArchState *env);
 void kvm_cpu_synchronize_post_init(CPUArchState *env);
+void kvm_cpu_clean_state(CPUArchState *cpu);
 
 /* generic hooks - to be moved/refactored once there are more users */
 
@@ -258,6 +259,12 @@ static inline void cpu_synchronize_post_init(CPUArchState *env)
     }
 }
 
+static inline void cpu_clean_state(CPUArchState *env)
+{
+    if (kvm_enabled()) {
+        kvm_cpu_clean_state(env);
+    }
+}
 
 #if !defined(CONFIG_USER_ONLY)
 int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr,
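
For reference, the conversion in kvmclock_current_nsec() above follows the
usual pvclock scaling: the TSC delta is pre-shifted by tsc_shift, multiplied
by tsc_to_system_mul, and divided by 2^32 to get nanoseconds, then added to
system_time. A minimal standalone sketch of the same arithmetic (hypothetical
helper name and made-up sample values, assumes a compiler with __int128):

#include <stdint.h>
#include <stdio.h>

/* Standalone illustration of the pvclock scaling used above: guest time is
 * system_time plus the TSC delta, pre-shifted by tsc_shift and then scaled
 * by tsc_to_system_mul / 2^32 to get nanoseconds. */
static uint64_t pvclock_tsc_to_nsec(uint64_t tsc, uint64_t tsc_timestamp,
                                    uint64_t system_time,
                                    uint32_t tsc_to_system_mul,
                                    int8_t tsc_shift)
{
    uint64_t delta = tsc - tsc_timestamp;

    if (tsc_shift < 0) {
        delta >>= -tsc_shift;
    } else {
        delta <<= tsc_shift;
    }

    /* 64x32-bit multiply, keep bits [95:32], i.e. divide the product by
     * 2^32 (what the patch does with mulu64() plus the lo/hi recombination). */
    return system_time +
           (uint64_t)(((unsigned __int128)delta * tsc_to_system_mul) >> 32);
}

int main(void)
{
    /* A 2.5 GHz TSC is 0.4 ns per cycle, so tsc_to_system_mul is roughly
     * 0.4 * 2^32; a delta of 2.5e9 cycles should come out near one second. */
    uint32_t mul = (uint32_t)(0.4 * 4294967296.0);
    uint64_t nsec = pvclock_tsc_to_nsec(2500000000ULL, 0, 0, mul, 0);

    printf("%llu ns\n", (unsigned long long)nsec);
    return 0;
}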


* Re: [Qemu-devel] kvmclock, Migration, and NTP clock jitter
  2015-01-16 10:21   ` Mohammed Gamal
@ 2015-01-21 10:20     ` Mohammed Gamal
From: Mohammed Gamal @ 2015-01-21 10:20 UTC
  To: Paolo Bonzini, qemu-devel

On Fri, Jan 16, 2015 at 11:21 AM, Mohammed Gamal
<mohammed.gamal@profitbricks.com> wrote:

> On Thu, Jan 15, 2015 at 06:27:54PM +0100, Paolo Bonzini wrote:
> >
> >
> > On 15/01/2015 17:39, Mohammed Gamal wrote:
> > > The increase in the jitter and offset values here is well within the
> > > 500 ppm frequency tolerance limit, and is therefore easily corrected by
> > > subsequent NTP clock sync events. However, some live migrations cause
> > > much larger jitter and offset jumps, which cannot be corrected by NTP
> > > and cause the time to go way off. Any idea why this is the case?
> >
> > It might be fixed in QEMU 2.2.
> >
> > See https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg01239.html
> >
> > Paolo
>
> Hi Paolo,
>
> I did try to backport these patches to qemu 1.2. However, migrations then
> resulted in *higher* jitter and offset values (on the order of 100+ ppm).
> I am not sure whether I've done the backporting correctly, though. Here
> are my patches on top of the qemu 1.2 stable tree.
>

Anyone?
