All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Time-related fixes for migration
@ 2014-03-30  3:05 Boris Ostrovsky
  2014-03-30  3:05 ` [PATCH 1/2] libxl: Set guest parameters from config file during a restore Boris Ostrovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Boris Ostrovsky @ 2014-03-30  3:05 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, suravee.suthikulpanit, stefano.stabellini, eddie.dong,
	ian.jackson, jbeulich, jun.nakajima, boris.ostrovsky,
	ian.campbell

Two patches to address time-related problem that we discovered during
migration testing.

* The first patch loads HVM parameters from configuration file during restore.
To fix the actual problem that we saw only timer_mode needed to be restored but
it seems to me that other parameters are needed as well since at least some of
them are used at runtime.

The bug can be demonstrated with a Solaris guest but I haven't been able to
trigger it with Linux. Possibly because Solaris' gethrtime() routine (which is
what the test was using) is a trap to kernel's hrtimer which reports global
time and performs some adjustments to per-CPU clock.

* The second patch keeps TSCs synchronized across VPCUs after save/restore.
Currently TSC values diverge after migration because during both save and restore
we calculate them separately for each VCPU and base each calculation on
newly-read host's TSC.

The problem can be easily demonstrated with this program for a 2-VCPU guest
(I am assuming here invariant TSC so, for example, tsc_mode="always_emulate" (*)):

int
main(int argc, char* argv[])
{

  unsigned long long h = 0LL;
  int proc = 0;
  cpu_set_t set;

  for(;;) {
    unsigned long long n = __native_read_tsc();
    if(h && n < h)
        printf("prev 0x%llx cur 0x%llx\n", h, n);
    CPU_ZERO(&set);
    proc = (proc + 1) & 1;
    CPU_SET(proc, &set);
    if (sched_setaffinity(0, sizeof(cpu_set_t), &set)) {
        perror("sched_setaffinity");
        exit(1);
    }

    h = n;
  }
}


(*) Which brings up another observation: when we are in default tsc_mode we
start off with vtsc=0 and thus clear TSC_Invariant bit in guest's CPUID.
After migration vtsc is 1 and TSC_Invariant bit is set. So the guest may observe
different values of CPUID. Which technically reflects the fact that TSC became
"safe" but I think potentially may be problematic to some guests.


Boris Ostrovsky (2):
  libxl: Set guest parameters from config file during a restore
  x86/HVM: Use fixed TSC value when saving or restoring domain

 tools/libxl/libxl_dom.c       |   51 +++++++++++++++++++++++++----------------
 xen/arch/x86/hvm/hvm.c        |   18 ++++++++++-----
 xen/arch/x86/hvm/save.c       |   36 +++++++++++++++++++++--------
 xen/arch/x86/hvm/svm/svm.c    |    4 ++--
 xen/arch/x86/hvm/vmx/vmx.c    |    4 ++--
 xen/arch/x86/hvm/vpt.c        |   16 ++++++++-----
 xen/arch/x86/time.c           |    7 ++++--
 xen/common/hvm/save.c         |    5 ++++
 xen/include/asm-x86/domain.h  |    2 ++
 xen/include/asm-x86/hvm/hvm.h |    9 +++++---
 xen/include/xen/hvm/save.h    |    2 ++
 xen/include/xen/time.h        |    3 ++-
 12 files changed, 105 insertions(+), 52 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] libxl: Set guest parameters from config file during a restore
  2014-03-30  3:05 [PATCH 0/2] Time-related fixes for migration Boris Ostrovsky
@ 2014-03-30  3:05 ` Boris Ostrovsky
  2014-04-01 10:37   ` Ian Campbell
  2014-03-30  3:05 ` [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain Boris Ostrovsky
  2014-03-31 14:41 ` [PATCH 0/2] Time-related fixes for migration Tian, Kevin
  2 siblings, 1 reply; 14+ messages in thread
From: Boris Ostrovsky @ 2014-03-30  3:05 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, suravee.suthikulpanit, stefano.stabellini, eddie.dong,
	ian.jackson, jbeulich, jun.nakajima, boris.ostrovsky,
	ian.campbell

Guest's configuration parameters (e.g. timer_mode) are used by the hypervisor
during runtime. We should therefore set them during restore.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 tools/libxl/libxl_dom.c |   51 ++++++++++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 55f74b2..368de39 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -201,6 +201,32 @@ static int numa_place_domain(libxl__gc *gc, uint32_t domid,
     return rc;
 }
 
+static unsigned long timer_mode(const libxl_domain_build_info *info)
+{
+    const libxl_timer_mode mode = info->u.hvm.timer_mode;
+    assert(mode >= LIBXL_TIMER_MODE_DELAY_FOR_MISSED_TICKS &&
+           mode <= LIBXL_TIMER_MODE_ONE_MISSED_TICK_PENDING);
+    return ((unsigned long)mode);
+}
+
+static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
+                                libxl_domain_build_info *const info)
+{
+    xc_set_hvm_param(handle, domid, HVM_PARAM_PAE_ENABLED,
+                    libxl_defbool_val(info->u.hvm.pae));
+#if defined(__i386__) || defined(__x86_64__)
+    xc_set_hvm_param(handle, domid, HVM_PARAM_VIRIDIAN,
+                    libxl_defbool_val(info->u.hvm.viridian));
+    xc_set_hvm_param(handle, domid, HVM_PARAM_HPET_ENABLED,
+                    libxl_defbool_val(info->u.hvm.hpet));
+#endif
+    xc_set_hvm_param(handle, domid, HVM_PARAM_TIMER_MODE, timer_mode(info));
+    xc_set_hvm_param(handle, domid, HVM_PARAM_VPT_ALIGN,
+                    libxl_defbool_val(info->u.hvm.vpt_align));
+    xc_set_hvm_param(handle, domid, HVM_PARAM_NESTEDHVM,
+                    libxl_defbool_val(info->u.hvm.nested_hvm));
+}
+
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
               libxl_domain_config *d_config, libxl__domain_build_state *state)
 {
@@ -255,6 +281,9 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     state->console_port = xc_evtchn_alloc_unbound(ctx->xch, domid, state->console_domid);
     state->vm_generationid_addr = 0;
 
+    if (info->type == LIBXL_DOMAIN_TYPE_HVM)
+        hvm_set_conf_params(ctx->xch, domid, info);
+
     rc = libxl__arch_domain_create(gc, d_config, domid);
 
     return rc;
@@ -449,13 +478,6 @@ out:
     return ret == 0 ? 0 : ERROR_FAIL;
 }
 
-static unsigned long timer_mode(const libxl_domain_build_info *info)
-{
-    const libxl_timer_mode mode = info->u.hvm.timer_mode;
-    assert(mode >= LIBXL_TIMER_MODE_DELAY_FOR_MISSED_TICKS &&
-           mode <= LIBXL_TIMER_MODE_ONE_MISSED_TICK_PENDING);
-    return ((unsigned long)mode);
-}
 static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
                                 libxl_domain_build_info *info,
                                 int store_evtchn, unsigned long *store_mfn,
@@ -484,22 +506,11 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
 
     xc_get_hvm_param(handle, domid, HVM_PARAM_STORE_PFN, store_mfn);
     xc_get_hvm_param(handle, domid, HVM_PARAM_CONSOLE_PFN, console_mfn);
-    xc_set_hvm_param(handle, domid, HVM_PARAM_PAE_ENABLED,
-                     libxl_defbool_val(info->u.hvm.pae));
-#if defined(__i386__) || defined(__x86_64__)
-    xc_set_hvm_param(handle, domid, HVM_PARAM_VIRIDIAN,
-                     libxl_defbool_val(info->u.hvm.viridian));
-    xc_set_hvm_param(handle, domid, HVM_PARAM_HPET_ENABLED,
-                     libxl_defbool_val(info->u.hvm.hpet));
-#endif
-    xc_set_hvm_param(handle, domid, HVM_PARAM_TIMER_MODE, timer_mode(info));
-    xc_set_hvm_param(handle, domid, HVM_PARAM_VPT_ALIGN,
-                     libxl_defbool_val(info->u.hvm.vpt_align));
-    xc_set_hvm_param(handle, domid, HVM_PARAM_NESTEDHVM,
-                     libxl_defbool_val(info->u.hvm.nested_hvm));
     xc_set_hvm_param(handle, domid, HVM_PARAM_STORE_EVTCHN, store_evtchn);
     xc_set_hvm_param(handle, domid, HVM_PARAM_CONSOLE_EVTCHN, console_evtchn);
 
+    hvm_set_conf_params(handle, domid, info);
+
     xc_dom_gnttab_hvm_seed(handle, domid, *console_mfn, *store_mfn, console_domid, store_domid);
     return 0;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-30  3:05 [PATCH 0/2] Time-related fixes for migration Boris Ostrovsky
  2014-03-30  3:05 ` [PATCH 1/2] libxl: Set guest parameters from config file during a restore Boris Ostrovsky
@ 2014-03-30  3:05 ` Boris Ostrovsky
  2014-03-31  9:51   ` Jan Beulich
  2014-03-31 15:29   ` Tian, Kevin
  2014-03-31 14:41 ` [PATCH 0/2] Time-related fixes for migration Tian, Kevin
  2 siblings, 2 replies; 14+ messages in thread
From: Boris Ostrovsky @ 2014-03-30  3:05 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, suravee.suthikulpanit, stefano.stabellini, eddie.dong,
	ian.jackson, jbeulich, jun.nakajima, boris.ostrovsky,
	ian.campbell

When a domain is saved each VCPU's TSC value needs to be preserved. To get it we
use hvm_get_guest_tsc(). This routine (either itself or via get_s_time() which
it may call) calculates VCPU's TSC based on current host's TSC value (by doing a
rdtscll()). Since this is performed for each VCPU separately we end up with
un-synchronized TSCs.

Similarly, during a restore each VCPU is assigned its TSC based on host's current
tick, causing virtual TSCs to diverge further.

With this, we can easily get into situation where a guest may see time going
backwards.

Instead of reading new TSC value for each VCPU when saving/restoring it we should
use the same value across all VCPUs.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/hvm.c        |   18 ++++++++++++------
 xen/arch/x86/hvm/save.c       |   36 ++++++++++++++++++++++++++----------
 xen/arch/x86/hvm/svm/svm.c    |    4 ++--
 xen/arch/x86/hvm/vmx/vmx.c    |    4 ++--
 xen/arch/x86/hvm/vpt.c        |   16 ++++++++++------
 xen/arch/x86/time.c           |    7 +++++--
 xen/common/hvm/save.c         |    5 +++++
 xen/include/asm-x86/domain.h  |    2 ++
 xen/include/asm-x86/hvm/hvm.h |    9 ++++++---
 xen/include/xen/hvm/save.h    |    2 ++
 xen/include/xen/time.h        |    3 ++-
 11 files changed, 74 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ae24211..98de16a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -248,19 +248,22 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
     return 1;
 }
 
-void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc)
+void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
 {
     uint64_t tsc;
     uint64_t delta_tsc;
 
     if ( v->domain->arch.vtsc )
     {
-        tsc = hvm_get_guest_time(v);
+        tsc = hvm_get_guest_time_fixed(v, at_tsc);
         tsc = gtime_to_gtsc(v->domain, tsc);
     }
     else
     {
-        rdtscll(tsc);
+        if ( at_tsc )
+            tsc = at_tsc;
+        else
+            rdtscll(tsc);
     }
 
     delta_tsc = guest_tsc - tsc;
@@ -279,19 +282,22 @@ void hvm_set_guest_tsc_adjust(struct vcpu *v, u64 tsc_adjust)
     v->arch.hvm_vcpu.msr_tsc_adjust = tsc_adjust;
 }
 
-u64 hvm_get_guest_tsc(struct vcpu *v)
+u64 hvm_get_guest_tsc_fixed(struct vcpu *v, uint64_t at_tsc)
 {
     uint64_t tsc;
 
     if ( v->domain->arch.vtsc )
     {
-        tsc = hvm_get_guest_time(v);
+        tsc = hvm_get_guest_time_fixed(v, at_tsc);
         tsc = gtime_to_gtsc(v->domain, tsc);
         v->domain->arch.vtsc_kerncount++;
     }
     else
     {
-        rdtscll(tsc);
+        if ( at_tsc )
+            tsc = at_tsc;
+        else
+            rdtscll(tsc);
     }
 
     return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
index 066fdb2..6b0767e 100644
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -24,7 +24,7 @@
 #include <asm/hvm/support.h>
 #include <public/hvm/save.h>
 
-void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
+void arch_hvm_save(struct domain *dom, struct hvm_save_header *hdr)
 {
     uint32_t eax, ebx, ecx, edx;
 
@@ -33,24 +33,32 @@ void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
     hdr->cpuid = eax;
 
     /* Save guest's preferred TSC. */
-    hdr->gtsc_khz = d->arch.tsc_khz;
+   hdr->gtsc_khz = dom->arch.tsc_khz;
+
+   /* Time when saving started */
+   rdtscll(dom->arch.chkpt_tsc);
+}
+
+void arch_hvm_save_done(struct domain *dom)
+{
+    dom->arch.chkpt_tsc = 0;
 }
 
-int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
+int arch_hvm_load(struct domain *dom, struct hvm_save_header *hdr)
 {
     uint32_t eax, ebx, ecx, edx;
 
     if ( hdr->magic != HVM_FILE_MAGIC )
     {
         printk(XENLOG_G_ERR "HVM%d restore: bad magic number %#"PRIx32"\n",
-               d->domain_id, hdr->magic);
+               dom->domain_id, hdr->magic);
         return -1;
     }
 
     if ( hdr->version != HVM_FILE_VERSION )
     {
         printk(XENLOG_G_ERR "HVM%d restore: unsupported version %u\n",
-               d->domain_id, hdr->version);
+               dom->domain_id, hdr->version);
         return -1;
     }
 
@@ -59,20 +67,28 @@ int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
     if ( (hdr->cpuid & ~0x0fUL) != (eax & ~0x0fUL) )
         printk(XENLOG_G_INFO "HVM%d restore: VM saved on one CPU "
                "(%#"PRIx32") and restored on another (%#"PRIx32").\n",
-               d->domain_id, hdr->cpuid, eax);
+               dom->domain_id, hdr->cpuid, eax);
 
     /* Restore guest's preferred TSC frequency. */
     if ( hdr->gtsc_khz )
-        d->arch.tsc_khz = hdr->gtsc_khz;
-    if ( d->arch.vtsc )
-        hvm_set_rdtsc_exiting(d, 1);
+        dom->arch.tsc_khz = hdr->gtsc_khz;
+    if ( dom->arch.vtsc )
+        hvm_set_rdtsc_exiting(dom, 1);
+
+    /* Time when restore started  */
+    rdtscll(dom->arch.chkpt_tsc);
 
     /* VGA state is not saved/restored, so we nobble the cache. */
-    d->arch.hvm_domain.stdvga.cache = 0;
+    dom->arch.hvm_domain.stdvga.cache = 0;
 
     return 0;
 }
 
+void arch_hvm_load_done(struct domain *dom)
+{
+    dom->arch.chkpt_tsc = 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 4fd5376..7aa55c3 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -318,7 +318,7 @@ static void svm_save_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     data->msr_efer         = v->arch.hvm_vcpu.guest_efer;
     data->msr_flags        = -1ULL;
 
-    data->tsc = hvm_get_guest_tsc(v);
+    data->tsc = hvm_get_guest_tsc_fixed(v, v->domain->arch.chkpt_tsc);
 }
 
 
@@ -334,7 +334,7 @@ static void svm_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     v->arch.hvm_vcpu.guest_efer = data->msr_efer;
     svm_update_guest_efer(v);
 
-    hvm_set_guest_tsc(v, data->tsc);
+    hvm_set_guest_tsc_fixed(v, data->tsc, v->domain->arch.chkpt_tsc);
 }
 
 static void svm_save_vmcb_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 8395e86..f10d34c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -540,7 +540,7 @@ static void vmx_save_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     data->msr_star         = guest_state->msrs[VMX_INDEX_MSR_STAR];
     data->msr_syscall_mask = guest_state->msrs[VMX_INDEX_MSR_SYSCALL_MASK];
 
-    data->tsc = hvm_get_guest_tsc(v);
+    data->tsc = hvm_get_guest_tsc_fixed(v, v->domain->arch.chkpt_tsc);
 }
 
 static void vmx_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
@@ -556,7 +556,7 @@ static void vmx_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     v->arch.hvm_vmx.cstar     = data->msr_cstar;
     v->arch.hvm_vmx.shadow_gs = data->shadow_gs;
 
-    hvm_set_guest_tsc(v, data->tsc);
+    hvm_set_guest_tsc_fixed(v, data->tsc, v->domain->arch.chkpt_tsc);
 }
 
 
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index f7af688..38541cf 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -36,7 +36,7 @@ void hvm_init_guest_time(struct domain *d)
     pl->last_guest_time = 0;
 }
 
-u64 hvm_get_guest_time(struct vcpu *v)
+u64 hvm_get_guest_time_fixed(struct vcpu *v, u64 at_tsc)
 {
     struct pl_time *pl = &v->domain->arch.hvm_domain.pl_time;
     u64 now;
@@ -45,11 +45,15 @@ u64 hvm_get_guest_time(struct vcpu *v)
     ASSERT(is_hvm_vcpu(v));
 
     spin_lock(&pl->pl_time_lock);
-    now = get_s_time() + pl->stime_offset;
-    if ( (int64_t)(now - pl->last_guest_time) > 0 )
-        pl->last_guest_time = now;
-    else
-        now = ++pl->last_guest_time;
+    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
+
+    if ( !at_tsc )
+    {
+        if ( (int64_t)(now - pl->last_guest_time) > 0 )
+            pl->last_guest_time = now;
+        else
+            now = ++pl->last_guest_time;
+    }
     spin_unlock(&pl->pl_time_lock);
 
     return now + v->arch.hvm_vcpu.stime_offset;
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 000191b..d424c70 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -712,13 +712,16 @@ static unsigned long get_cmos_time(void)
  * System Time
  ***************************************************************************/
 
-s_time_t get_s_time(void)
+s_time_t get_s_time_fixed(u64 at_tsc)
 {
     struct cpu_time *t = &this_cpu(cpu_time);
     u64 tsc, delta;
     s_time_t now;
 
-    rdtscll(tsc);
+    if ( at_tsc )
+        tsc = at_tsc;
+    else
+        rdtscll(tsc);
     delta = tsc - t->local_tsc_stamp;
     now = t->stime_local_stamp + scale_delta(delta, &t->tsc_scale);
 
diff --git a/xen/common/hvm/save.c b/xen/common/hvm/save.c
index 6c16399..7db68af 100644
--- a/xen/common/hvm/save.c
+++ b/xen/common/hvm/save.c
@@ -186,6 +186,8 @@ int hvm_save(struct domain *d, hvm_domain_context_t *h)
         }
     }
 
+    arch_hvm_save_done(d);
+
     /* Save an end-of-file marker */
     if ( hvm_save_entry(END, 0, h, &end) != 0 )
     {
@@ -236,7 +238,10 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h)
         /* Read the typecode of the next entry  and check for the end-marker */
         desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
         if ( desc->typecode == 0 )
+        {
+            arch_hvm_load_done(d);
             return 0; 
+        }
         
         /* Find the handler for this entry */
         if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 49f7c0c..201f856 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -308,6 +308,8 @@ struct arch_domain
                                 (possibly other cases in the future */
     uint64_t vtsc_kerncount; /* for hvm, counts all vtsc */
     uint64_t vtsc_usercount; /* not used for hvm */
+    uint64_t chkpt_tsc;      /* TSC value that VCPUs use to calculate their
+                                tsc_offset value. Used during save/restore */
 
     /* Pseudophysical e820 map (XENMEM_memory_map).  */
     spinlock_t e820_lock;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index dcc3483..d80e763 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -232,12 +232,15 @@ bool_t hvm_send_assist_req(struct vcpu *v);
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat);
 int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat);
 
-void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc);
-u64 hvm_get_guest_tsc(struct vcpu *v);
+void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc);
+#define hvm_set_guest_tsc(v, t) hvm_set_guest_tsc_fixed((v), (t), 0)
+u64 hvm_get_guest_tsc_fixed(struct vcpu *v, u64 at_tsc);
+#define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed((v), 0)
 
 void hvm_init_guest_time(struct domain *d);
 void hvm_set_guest_time(struct vcpu *v, u64 guest_time);
-u64 hvm_get_guest_time(struct vcpu *v);
+u64 hvm_get_guest_time_fixed(struct vcpu *v, u64 at_tsc);
+#define hvm_get_guest_time(v) hvm_get_guest_time_fixed((v), 0)
 
 int vmsi_deliver(
     struct domain *d, int vector,
diff --git a/xen/include/xen/hvm/save.h b/xen/include/xen/hvm/save.h
index ae6f0bb..70522a9 100644
--- a/xen/include/xen/hvm/save.h
+++ b/xen/include/xen/hvm/save.h
@@ -133,6 +133,8 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h);
 /* Arch-specific definitions. */
 struct hvm_save_header;
 void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr);
+void arch_hvm_save_done(struct domain *d);
 int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr);
+void arch_hvm_load_done(struct domain *d);
 
 #endif /* __XEN_HVM_SAVE_H__ */
diff --git a/xen/include/xen/time.h b/xen/include/xen/time.h
index 95b4b91..032eb23 100644
--- a/xen/include/xen/time.h
+++ b/xen/include/xen/time.h
@@ -32,7 +32,8 @@ struct vcpu;
 typedef s64 s_time_t;
 #define PRI_stime PRId64
 
-s_time_t get_s_time(void);
+s_time_t get_s_time_fixed(u64 at_tick);
+#define get_s_time() get_s_time_fixed(0)
 unsigned long get_localtime(struct domain *d);
 uint64_t get_localtime_us(struct domain *d);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-30  3:05 ` [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain Boris Ostrovsky
@ 2014-03-31  9:51   ` Jan Beulich
  2014-03-31 14:01     ` Boris Ostrovsky
  2014-03-31 15:29   ` Tian, Kevin
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2014-03-31  9:51 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	eddie.dong, xen-devel, jun.nakajima, suravee.suthikulpanit

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -248,19 +248,22 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
>      return 1;
>  }
>  
> -void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc)
> +void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
>  {
>      uint64_t tsc;
>      uint64_t delta_tsc;
>  
>      if ( v->domain->arch.vtsc )
>      {
> -        tsc = hvm_get_guest_time(v);
> +        tsc = hvm_get_guest_time_fixed(v, at_tsc);
>          tsc = gtime_to_gtsc(v->domain, tsc);
>      }
>      else
>      {
> -        rdtscll(tsc);
> +        if ( at_tsc )
> +            tsc = at_tsc;

Please insert this as an "else if()" above the current "else".

> @@ -279,19 +282,22 @@ void hvm_set_guest_tsc_adjust(struct vcpu *v, u64 tsc_adjust)
>      v->arch.hvm_vcpu.msr_tsc_adjust = tsc_adjust;
>  }
>  
> -u64 hvm_get_guest_tsc(struct vcpu *v)
> +u64 hvm_get_guest_tsc_fixed(struct vcpu *v, uint64_t at_tsc)
>  {
>      uint64_t tsc;
>  
>      if ( v->domain->arch.vtsc )
>      {
> -        tsc = hvm_get_guest_time(v);
> +        tsc = hvm_get_guest_time_fixed(v, at_tsc);
>          tsc = gtime_to_gtsc(v->domain, tsc);
>          v->domain->arch.vtsc_kerncount++;
>      }
>      else
>      {
> -        rdtscll(tsc);
> +        if ( at_tsc )
> +            tsc = at_tsc;

Same here.

> --- a/xen/arch/x86/hvm/save.c
> +++ b/xen/arch/x86/hvm/save.c
> @@ -24,7 +24,7 @@
>  #include <asm/hvm/support.h>
>  #include <public/hvm/save.h>
>  
> -void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
> +void arch_hvm_save(struct domain *dom, struct hvm_save_header *hdr)

The change is unmotivated (afaict) and inconsistent with most other
code - we conventionally use "d" for struct domain * variables. Please
drop the change here and use "d" throughout the rest of the patch,
at once resulting in less churn and hence making it easier to review.

> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -232,12 +232,15 @@ bool_t hvm_send_assist_req(struct vcpu *v);
>  void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat);
>  int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat);
>  
> -void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc);
> -u64 hvm_get_guest_tsc(struct vcpu *v);
> +void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc);
> +#define hvm_set_guest_tsc(v, t) hvm_set_guest_tsc_fixed((v), (t), 0)
> +u64 hvm_get_guest_tsc_fixed(struct vcpu *v, u64 at_tsc);
> +#define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed((v), 0)
>  
>  void hvm_init_guest_time(struct domain *d);
>  void hvm_set_guest_time(struct vcpu *v, u64 guest_time);
> -u64 hvm_get_guest_time(struct vcpu *v);
> +u64 hvm_get_guest_time_fixed(struct vcpu *v, u64 at_tsc);
> +#define hvm_get_guest_time(v) hvm_get_guest_time_fixed((v), 0)

Pointless parentheses (several instances thereof).

> index 95b4b91..032eb23 100644
> --- a/xen/include/xen/time.h
> +++ b/xen/include/xen/time.h
> @@ -32,7 +32,8 @@ struct vcpu;
>  typedef s64 s_time_t;
>  #define PRI_stime PRId64
>  
> -s_time_t get_s_time(void);
> +s_time_t get_s_time_fixed(u64 at_tick);
> +#define get_s_time() get_s_time_fixed(0)

get_s_time(), through NOW(), has quite many users, so I'm not certain
the code bloat resulting from this is desirable. I'd suggest the function
to remain such; the compiler will be able to make it a mov+jmp.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-31  9:51   ` Jan Beulich
@ 2014-03-31 14:01     ` Boris Ostrovsky
  2014-03-31 14:08       ` Tian, Kevin
  2014-03-31 14:34       ` Jan Beulich
  0 siblings, 2 replies; 14+ messages in thread
From: Boris Ostrovsky @ 2014-03-31 14:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	eddie.dong, xen-devel, jun.nakajima, suravee.suthikulpanit

On 03/31/2014 05:51 AM, Jan Beulich wrote:
>
>> --- a/xen/arch/x86/hvm/save.c
>> +++ b/xen/arch/x86/hvm/save.c
>> @@ -24,7 +24,7 @@
>>   #include <asm/hvm/support.h>
>>   #include <public/hvm/save.h>
>>   
>> -void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
>> +void arch_hvm_save(struct domain *dom, struct hvm_save_header *hdr)
> The change is unmotivated (afaict) and inconsistent with most other
> code - we conventionally use "d" for struct domain * variables. Please
> drop the change here and use "d" throughout the rest of the patch,
> at once resulting in less churn and hence making it easier to review.

The reason for this change is subsequent
     rdtscll(dom->arch.chkpt_tsc);

which will not work as
     rdtscll(d->arch.chkpt_tsc);

The alternatives as I see are
* Declare a temporary variable for use with rdtscll
* Change rdtscll's definition to use something else instead of variable 
'd'. Say, eax and edx (hopefully this won't clash with something else)


>
>> index 95b4b91..032eb23 100644
>> --- a/xen/include/xen/time.h
>> +++ b/xen/include/xen/time.h
>> @@ -32,7 +32,8 @@ struct vcpu;
>>   typedef s64 s_time_t;
>>   #define PRI_stime PRId64
>>   
>> -s_time_t get_s_time(void);
>> +s_time_t get_s_time_fixed(u64 at_tick);
>> +#define get_s_time() get_s_time_fixed(0)
> get_s_time(), through NOW(), has quite many users, so I'm not certain
> the code bloat resulting from this is desirable. I'd suggest the function
> to remain such; the compiler will be able to make it a mov+jmp.

Sorry, not sure I understand what you are asking for.

There shouldn't be much of code size increase since get_s_time() 
currently (and get_s_time_fixed() after this patch is applied) are not 
inlines. The only increase is due to routine itself getting very 
slightly larger.

But I suspect you meant something else.

-boris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-31 14:01     ` Boris Ostrovsky
@ 2014-03-31 14:08       ` Tian, Kevin
  2014-03-31 14:34       ` Jan Beulich
  1 sibling, 0 replies; 14+ messages in thread
From: Tian, Kevin @ 2014-03-31 14:08 UTC (permalink / raw)
  To: Boris Ostrovsky, Jan Beulich
  Cc: ian.campbell@citrix.com, stefano.stabellini@eu.citrix.com,
	ian.jackson@eu.citrix.com, Dong, Eddie, xen-devel@lists.xen.org,
	Nakajima, Jun, suravee.suthikulpanit@amd.com

> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> Sent: Monday, March 31, 2014 10:01 PM
> 
> On 03/31/2014 05:51 AM, Jan Beulich wrote:
> >
> >> --- a/xen/arch/x86/hvm/save.c
> >> +++ b/xen/arch/x86/hvm/save.c
> >> @@ -24,7 +24,7 @@
> >>   #include <asm/hvm/support.h>
> >>   #include <public/hvm/save.h>
> >>
> >> -void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
> >> +void arch_hvm_save(struct domain *dom, struct hvm_save_header *hdr)
> > The change is unmotivated (afaict) and inconsistent with most other
> > code - we conventionally use "d" for struct domain * variables. Please
> > drop the change here and use "d" throughout the rest of the patch,
> > at once resulting in less churn and hence making it easier to review.
> 
> The reason for this change is subsequent
>      rdtscll(dom->arch.chkpt_tsc);
> 
> which will not work as
>      rdtscll(d->arch.chkpt_tsc);
> 
> The alternatives as I see are
> * Declare a temporary variable for use with rdtscll
> * Change rdtscll's definition to use something else instead of variable
> 'd'. Say, eax and edx (hopefully this won't clash with something else)
> 

use temp variable at least for now, which is cleaner regarding to the
purpose of this patch. rdtscll() should be a separate patch, however
imo it is not worthy of doing that.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-31 14:01     ` Boris Ostrovsky
  2014-03-31 14:08       ` Tian, Kevin
@ 2014-03-31 14:34       ` Jan Beulich
  2014-03-31 15:06         ` Boris Ostrovsky
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2014-03-31 14:34 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	eddie.dong, xen-devel, jun.nakajima, suravee.suthikulpanit

>>> On 31.03.14 at 16:01, <boris.ostrovsky@oracle.com> wrote:
> On 03/31/2014 05:51 AM, Jan Beulich wrote:
>>
>>> --- a/xen/arch/x86/hvm/save.c
>>> +++ b/xen/arch/x86/hvm/save.c
>>> @@ -24,7 +24,7 @@
>>>   #include <asm/hvm/support.h>
>>>   #include <public/hvm/save.h>
>>>   
>>> -void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
>>> +void arch_hvm_save(struct domain *dom, struct hvm_save_header *hdr)
>> The change is unmotivated (afaict) and inconsistent with most other
>> code - we conventionally use "d" for struct domain * variables. Please
>> drop the change here and use "d" throughout the rest of the patch,
>> at once resulting in less churn and hence making it easier to review.
> 
> The reason for this change is subsequent
>      rdtscll(dom->arch.chkpt_tsc);
> 
> which will not work as
>      rdtscll(d->arch.chkpt_tsc);
> 
> The alternatives as I see are
> * Declare a temporary variable for use with rdtscll
> * Change rdtscll's definition to use something else instead of variable 
> 'd'. Say, eax and edx (hopefully this won't clash with something else)

Yeah, the macro using "a" and "d" as variables is really bad, and
it's that what should be changed.

>>> index 95b4b91..032eb23 100644
>>> --- a/xen/include/xen/time.h
>>> +++ b/xen/include/xen/time.h
>>> @@ -32,7 +32,8 @@ struct vcpu;
>>>   typedef s64 s_time_t;
>>>   #define PRI_stime PRId64
>>>   
>>> -s_time_t get_s_time(void);
>>> +s_time_t get_s_time_fixed(u64 at_tick);
>>> +#define get_s_time() get_s_time_fixed(0)
>> get_s_time(), through NOW(), has quite many users, so I'm not certain
>> the code bloat resulting from this is desirable. I'd suggest the function
>> to remain such; the compiler will be able to make it a mov+jmp.
> 
> Sorry, not sure I understand what you are asking for.
> 
> There shouldn't be much of code size increase since get_s_time() 
> currently (and get_s_time_fixed() after this patch is applied) are not 
> inlines. The only increase is due to routine itself getting very 
> slightly larger.
> 
> But I suspect you meant something else.

All call sites have to zero %edi with the change in place, and I
was trying to tell you that the number of call sites of this isn't
exactly small due to the function's use via NOW(). Hence I think
you shouldn't penalize the callers and have an explicit out of line
wrapper.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Time-related fixes for migration
  2014-03-30  3:05 [PATCH 0/2] Time-related fixes for migration Boris Ostrovsky
  2014-03-30  3:05 ` [PATCH 1/2] libxl: Set guest parameters from config file during a restore Boris Ostrovsky
  2014-03-30  3:05 ` [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain Boris Ostrovsky
@ 2014-03-31 14:41 ` Tian, Kevin
  2014-03-31 15:30   ` Boris Ostrovsky
  2 siblings, 1 reply; 14+ messages in thread
From: Tian, Kevin @ 2014-03-31 14:41 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel@lists.xen.org
  Cc: suravee.suthikulpanit@amd.com, stefano.stabellini@eu.citrix.com,
	Dong, Eddie, ian.jackson@eu.citrix.com, jbeulich@suse.com,
	Nakajima, Jun, ian.campbell@citrix.com

> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> Sent: Sunday, March 30, 2014 11:06 AM
> 
> Two patches to address time-related problem that we discovered during
> migration testing.
> 
> * The first patch loads HVM parameters from configuration file during
> restore.
> To fix the actual problem that we saw only timer_mode needed to be restored
> but
> it seems to me that other parameters are needed as well since at least some
> of
> them are used at runtime.
> 
> The bug can be demonstrated with a Solaris guest but I haven't been able to
> trigger it with Linux. Possibly because Solaris' gethrtime() routine (which is
> what the test was using) is a trap to kernel's hrtimer which reports global
> time and performs some adjustments to per-CPU clock.

this looks a right thing to fix.

> 
> * The second patch keeps TSCs synchronized across VPCUs after save/restore.
> Currently TSC values diverge after migration because during both save and
> restore
> we calculate them separately for each VCPU and base each calculation on
> newly-read host's TSC.
> 
> The problem can be easily demonstrated with this program for a 2-VCPU guest
> (I am assuming here invariant TSC so, for example,
> tsc_mode="always_emulate" (*)):
> 
> int
> main(int argc, char* argv[])
> {
> 
>   unsigned long long h = 0LL;
>   int proc = 0;
>   cpu_set_t set;
> 
>   for(;;) {
>     unsigned long long n = __native_read_tsc();
>     if(h && n < h)
>         printf("prev 0x%llx cur 0x%llx\n", h, n);
>     CPU_ZERO(&set);
>     proc = (proc + 1) & 1;
>     CPU_SET(proc, &set);
>     if (sched_setaffinity(0, sizeof(cpu_set_t), &set)) {
>         perror("sched_setaffinity");
>         exit(1);
>     }
> 
>     h = n;
>   }
> }
> 

what's the backward drift range from above program? dozens of cycles?
hundreds of cycles?

> 
> (*) Which brings up another observation: when we are in default tsc_mode we
> start off with vtsc=0 and thus clear TSC_Invariant bit in guest's CPUID.
> After migration vtsc is 1 and TSC_Invariant bit is set. So the guest may observe
> different values of CPUID. Which technically reflects the fact that TSC became
> "safe" but I think potentially may be problematic to some guests.
> 
> 
> Boris Ostrovsky (2):
>   libxl: Set guest parameters from config file during a restore
>   x86/HVM: Use fixed TSC value when saving or restoring domain
> 
>  tools/libxl/libxl_dom.c       |   51
> +++++++++++++++++++++++++----------------
>  xen/arch/x86/hvm/hvm.c        |   18 ++++++++++-----
>  xen/arch/x86/hvm/save.c       |   36 +++++++++++++++++++++--------
>  xen/arch/x86/hvm/svm/svm.c    |    4 ++--
>  xen/arch/x86/hvm/vmx/vmx.c    |    4 ++--
>  xen/arch/x86/hvm/vpt.c        |   16 ++++++++-----
>  xen/arch/x86/time.c           |    7 ++++--
>  xen/common/hvm/save.c         |    5 ++++
>  xen/include/asm-x86/domain.h  |    2 ++
>  xen/include/asm-x86/hvm/hvm.h |    9 +++++---
>  xen/include/xen/hvm/save.h    |    2 ++
>  xen/include/xen/time.h        |    3 ++-
>  12 files changed, 105 insertions(+), 52 deletions(-)
> 
> --
> 1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-31 14:34       ` Jan Beulich
@ 2014-03-31 15:06         ` Boris Ostrovsky
  2014-03-31 15:09           ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Boris Ostrovsky @ 2014-03-31 15:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	eddie.dong, xen-devel, jun.nakajima, suravee.suthikulpanit

On 03/31/2014 10:34 AM, Jan Beulich wrote:
>
>>>> index 95b4b91..032eb23 100644
>>>> --- a/xen/include/xen/time.h
>>>> +++ b/xen/include/xen/time.h
>>>> @@ -32,7 +32,8 @@ struct vcpu;
>>>>    typedef s64 s_time_t;
>>>>    #define PRI_stime PRId64
>>>>    
>>>> -s_time_t get_s_time(void);
>>>> +s_time_t get_s_time_fixed(u64 at_tick);
>>>> +#define get_s_time() get_s_time_fixed(0)
>>> get_s_time(), through NOW(), has quite many users, so I'm not certain
>>> the code bloat resulting from this is desirable. I'd suggest the function
>>> to remain such; the compiler will be able to make it a mov+jmp.
>> Sorry, not sure I understand what you are asking for.
>>
>> There shouldn't be much of code size increase since get_s_time()
>> currently (and get_s_time_fixed() after this patch is applied) are not
>> inlines. The only increase is due to routine itself getting very
>> slightly larger.
>>
>> But I suspect you meant something else.
> All call sites have to zero %edi with the change in place, and I
> was trying to tell you that the number of call sites of this isn't
> exactly small due to the function's use via NOW(). Hence I think
> you shouldn't penalize the callers and have an explicit out of line
> wrapper.

OK, I understand the concern now but still confused about what you want ;-(

Two separate routines that only differ in how tsc is calculated (rdtscll 
vs. an 'in' parameter)?

-boris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-31 15:06         ` Boris Ostrovsky
@ 2014-03-31 15:09           ` Jan Beulich
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2014-03-31 15:09 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: kevin.tian, ian.campbell, stefano.stabellini, ian.jackson,
	eddie.dong, xen-devel, jun.nakajima, suravee.suthikulpanit

>>> On 31.03.14 at 17:06, <boris.ostrovsky@oracle.com> wrote:
> On 03/31/2014 10:34 AM, Jan Beulich wrote:
>>
>>>>> index 95b4b91..032eb23 100644
>>>>> --- a/xen/include/xen/time.h
>>>>> +++ b/xen/include/xen/time.h
>>>>> @@ -32,7 +32,8 @@ struct vcpu;
>>>>>    typedef s64 s_time_t;
>>>>>    #define PRI_stime PRId64
>>>>>    
>>>>> -s_time_t get_s_time(void);
>>>>> +s_time_t get_s_time_fixed(u64 at_tick);
>>>>> +#define get_s_time() get_s_time_fixed(0)
>>>> get_s_time(), through NOW(), has quite many users, so I'm not certain
>>>> the code bloat resulting from this is desirable. I'd suggest the function
>>>> to remain such; the compiler will be able to make it a mov+jmp.
>>> Sorry, not sure I understand what you are asking for.
>>>
>>> There shouldn't be much of code size increase since get_s_time()
>>> currently (and get_s_time_fixed() after this patch is applied) are not
>>> inlines. The only increase is due to routine itself getting very
>>> slightly larger.
>>>
>>> But I suspect you meant something else.
>> All call sites have to zero %edi with the change in place, and I
>> was trying to tell you that the number of call sites of this isn't
>> exactly small due to the function's use via NOW(). Hence I think
>> you shouldn't penalize the callers and have an explicit out of line
>> wrapper.
> 
> OK, I understand the concern now but still confused about what you want ;-(
> 
> Two separate routines that only differ in how tsc is calculated (rdtscll 
> vs. an 'in' parameter)?

No - I'm fine with get_s_time() being a wrapper for get_s_time_fixed(),
just that I ask it to be a proper out-of-line function rather than an
inline one or a macro.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain
  2014-03-30  3:05 ` [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain Boris Ostrovsky
  2014-03-31  9:51   ` Jan Beulich
@ 2014-03-31 15:29   ` Tian, Kevin
  1 sibling, 0 replies; 14+ messages in thread
From: Tian, Kevin @ 2014-03-31 15:29 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel@lists.xen.org
  Cc: suravee.suthikulpanit@amd.com, stefano.stabellini@eu.citrix.com,
	Dong, Eddie, ian.jackson@eu.citrix.com, jbeulich@suse.com,
	Nakajima, Jun, ian.campbell@citrix.com

> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> Sent: Sunday, March 30, 2014 11:06 AM
> 
> When a domain is saved each VCPU's TSC value needs to be preserved. To get
> it we
> use hvm_get_guest_tsc(). This routine (either itself or via get_s_time() which
> it may call) calculates VCPU's TSC based on current host's TSC value (by doing
> a
> rdtscll()). Since this is performed for each VCPU separately we end up with
> un-synchronized TSCs.
> 
> Similarly, during a restore each VCPU is assigned its TSC based on host's
> current
> tick, causing virtual TSCs to diverge further.
> 
> With this, we can easily get into situation where a guest may see time going
> backwards.
> 
> Instead of reading new TSC value for each VCPU when saving/restoring it we
> should
> use the same value across all VCPUs.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Somehow the new ***_fixed interfaces looked a bit overkilled, however I don't
have a better idea now except introducing a new hypercall to save/restore all
VCPU TSCs together, which however doesn't solve the existing versions.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Thanks
Kevin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Time-related fixes for migration
  2014-03-31 14:41 ` [PATCH 0/2] Time-related fixes for migration Tian, Kevin
@ 2014-03-31 15:30   ` Boris Ostrovsky
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Ostrovsky @ 2014-03-31 15:30 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: ian.campbell@citrix.com, stefano.stabellini@eu.citrix.com,
	Nakajima, Jun, Dong, Eddie, ian.jackson@eu.citrix.com,
	xen-devel@lists.xen.org, jbeulich@suse.com,
	suravee.suthikulpanit@amd.com

On 03/31/2014 10:41 AM, Tian, Kevin wrote:
>> * The second patch keeps TSCs synchronized across VPCUs after save/restore.
>> Currently TSC values diverge after migration because during both save and
>> restore
>> we calculate them separately for each VCPU and base each calculation on
>> newly-read host's TSC.
>>
>> The problem can be easily demonstrated with this program for a 2-VCPU guest
>> (I am assuming here invariant TSC so, for example,
>> tsc_mode="always_emulate" (*)):
>>
>> int
>> main(int argc, char* argv[])
>> {
>>
>>    unsigned long long h = 0LL;
>>    int proc = 0;
>>    cpu_set_t set;
>>
>>    for(;;) {
>>      unsigned long long n = __native_read_tsc();
>>      if(h && n < h)
>>          printf("prev 0x%llx cur 0x%llx\n", h, n);
>>      CPU_ZERO(&set);
>>      proc = (proc + 1) & 1;
>>      CPU_SET(proc, &set);
>>      if (sched_setaffinity(0, sizeof(cpu_set_t), &set)) {
>>          perror("sched_setaffinity");
>>          exit(1);
>>      }
>>
>>      h = n;
>>    }
>> }
>>
> what's the backward drift range from above program? dozens of cycles?
> hundreds of cycles?

For "raw" difference (i.e. TSC registers themselves) it's usually tens 
of thousands, sometimes more (and sometimes *much* more).

For example, here are outputs of 'xl debug-key v' before and after a 
migrate for a 2p guest:

root@haswell> xl dmesg |grep Offset
(XEN) TSC Offset = ffffff54e63f1cab
(XEN) TSC Offset = ffffff54e63f1cab
(XEN) TSC Offset = ffffff6cb0a59ea9
(XEN) TSC Offset = ffffff6cb0a566ae
root@haswell>

For guest's view, taking into account, for example, the fact that 
sched_affinity() takes quite some time, it's a few hundreds of cycles.

And obviously as you inclrease number of VCPUs (and I think guest memory 
as well) the largest difference grows further.

-boris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] libxl: Set guest parameters from config file during a restore
  2014-03-30  3:05 ` [PATCH 1/2] libxl: Set guest parameters from config file during a restore Boris Ostrovsky
@ 2014-04-01 10:37   ` Ian Campbell
  2014-04-01 13:45     ` Boris Ostrovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2014-04-01 10:37 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: kevin.tian, suravee.suthikulpanit, stefano.stabellini, eddie.dong,
	ian.jackson, xen-devel, jbeulich, jun.nakajima

On Sat, 2014-03-29 at 23:05 -0400, Boris Ostrovsky wrote:
> @@ -484,22 +506,11 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
>  
>      xc_get_hvm_param(handle, domid, HVM_PARAM_STORE_PFN, store_mfn);
>      xc_get_hvm_param(handle, domid, HVM_PARAM_CONSOLE_PFN, console_mfn);
> -    xc_set_hvm_param(handle, domid, HVM_PARAM_PAE_ENABLED,
> -                     libxl_defbool_val(info->u.hvm.pae));
> -#if defined(__i386__) || defined(__x86_64__)
> -    xc_set_hvm_param(handle, domid, HVM_PARAM_VIRIDIAN,
> -                     libxl_defbool_val(info->u.hvm.viridian));
> -    xc_set_hvm_param(handle, domid, HVM_PARAM_HPET_ENABLED,
> -                     libxl_defbool_val(info->u.hvm.hpet));
> -#endif
> -    xc_set_hvm_param(handle, domid, HVM_PARAM_TIMER_MODE, timer_mode(info));
> -    xc_set_hvm_param(handle, domid, HVM_PARAM_VPT_ALIGN,
> -                     libxl_defbool_val(info->u.hvm.vpt_align));
> -    xc_set_hvm_param(handle, domid, HVM_PARAM_NESTEDHVM,
> -                     libxl_defbool_val(info->u.hvm.nested_hvm));
>      xc_set_hvm_param(handle, domid, HVM_PARAM_STORE_EVTCHN, store_evtchn);
>      xc_set_hvm_param(handle, domid, HVM_PARAM_CONSOLE_EVTCHN, console_evtchn);
>  
> +    hvm_set_conf_params(handle, domid, info);

Can you confirm that this isn't now called twice on domain create, once
from libxl__build_pre and then again here.

Ian.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] libxl: Set guest parameters from config file during a restore
  2014-04-01 10:37   ` Ian Campbell
@ 2014-04-01 13:45     ` Boris Ostrovsky
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Ostrovsky @ 2014-04-01 13:45 UTC (permalink / raw)
  To: Ian Campbell
  Cc: kevin.tian, suravee.suthikulpanit, stefano.stabellini, eddie.dong,
	ian.jackson, xen-devel, jbeulich, jun.nakajima

On 04/01/2014 06:37 AM, Ian Campbell wrote:
> On Sat, 2014-03-29 at 23:05 -0400, Boris Ostrovsky wrote:
>> @@ -484,22 +506,11 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
>>   
>>       xc_get_hvm_param(handle, domid, HVM_PARAM_STORE_PFN, store_mfn);
>>       xc_get_hvm_param(handle, domid, HVM_PARAM_CONSOLE_PFN, console_mfn);
>> -    xc_set_hvm_param(handle, domid, HVM_PARAM_PAE_ENABLED,
>> -                     libxl_defbool_val(info->u.hvm.pae));
>> -#if defined(__i386__) || defined(__x86_64__)
>> -    xc_set_hvm_param(handle, domid, HVM_PARAM_VIRIDIAN,
>> -                     libxl_defbool_val(info->u.hvm.viridian));
>> -    xc_set_hvm_param(handle, domid, HVM_PARAM_HPET_ENABLED,
>> -                     libxl_defbool_val(info->u.hvm.hpet));
>> -#endif
>> -    xc_set_hvm_param(handle, domid, HVM_PARAM_TIMER_MODE, timer_mode(info));
>> -    xc_set_hvm_param(handle, domid, HVM_PARAM_VPT_ALIGN,
>> -                     libxl_defbool_val(info->u.hvm.vpt_align));
>> -    xc_set_hvm_param(handle, domid, HVM_PARAM_NESTEDHVM,
>> -                     libxl_defbool_val(info->u.hvm.nested_hvm));
>>       xc_set_hvm_param(handle, domid, HVM_PARAM_STORE_EVTCHN, store_evtchn);
>>       xc_set_hvm_param(handle, domid, HVM_PARAM_CONSOLE_EVTCHN, console_evtchn);
>>   
>> +    hvm_set_conf_params(handle, domid, info);
> Can you confirm that this isn't now called twice on domain create, once
> from libxl__build_pre and then again here.

No, I can't.

I thought I checked that libxl__build_pre() is not called on create and 
of course it clearly is. So this was me being rather sloppy.

I'll remove hvm_set_conf_params() call from hvm_build_set_params() and 
leave it only in libxl__build_pre().

-boris

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-04-01 13:45 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-30  3:05 [PATCH 0/2] Time-related fixes for migration Boris Ostrovsky
2014-03-30  3:05 ` [PATCH 1/2] libxl: Set guest parameters from config file during a restore Boris Ostrovsky
2014-04-01 10:37   ` Ian Campbell
2014-04-01 13:45     ` Boris Ostrovsky
2014-03-30  3:05 ` [PATCH 2/2] x86/HVM: Use fixed TSC value when saving or restoring domain Boris Ostrovsky
2014-03-31  9:51   ` Jan Beulich
2014-03-31 14:01     ` Boris Ostrovsky
2014-03-31 14:08       ` Tian, Kevin
2014-03-31 14:34       ` Jan Beulich
2014-03-31 15:06         ` Boris Ostrovsky
2014-03-31 15:09           ` Jan Beulich
2014-03-31 15:29   ` Tian, Kevin
2014-03-31 14:41 ` [PATCH 0/2] Time-related fixes for migration Tian, Kevin
2014-03-31 15:30   ` Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.