From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCH 0/2] Time-related fixes for migration
Date: Mon, 31 Mar 2014 11:30:25 -0400
Message-ID: <53398A11.5000001@oracle.com>
References: <1396148751-6918-1-git-send-email-boris.ostrovsky@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Tian, Kevin"
Cc: "ian.campbell@citrix.com" , "stefano.stabellini@eu.citrix.com" , "Nakajima, Jun" , "Dong, Eddie" , "ian.jackson@eu.citrix.com" , "xen-devel@lists.xen.org" , "jbeulich@suse.com" , "suravee.suthikulpanit@amd.com"
List-Id: xen-devel@lists.xenproject.org

On 03/31/2014 10:41 AM, Tian, Kevin wrote:
>> * The second patch keeps TSCs synchronized across VPCUs after save/restore.
>> Currently TSC values diverge after migration because during both save and
>> restore
>> we calculate them separately for each VCPU and base each calculation on
>> newly-read host's TSC.
>>
>> The problem can be easily demonstrated with this program for a 2-VCPU guest
>> (I am assuming here invariant TSC so, for example,
>> tsc_mode="always_emulate" (*)):
>>
>> int
>> main(int argc, char* argv[])
>> {
>>
>>     unsigned long long h = 0LL;
>>     int proc = 0;
>>     cpu_set_t set;
>>
>>     for(;;) {
>>         unsigned long long n = __native_read_tsc();
>>         if(h && n < h)
>>             printf("prev 0x%llx cur 0x%llx\n", h, n);
>>         CPU_ZERO(&set);
>>         proc = (proc + 1) & 1;
>>         CPU_SET(proc, &set);
>>         if (sched_setaffinity(0, sizeof(cpu_set_t), &set)) {
>>             perror("sched_setaffinity");
>>             exit(1);
>>         }
>>
>>         h = n;
>>     }
>> }
>>
> what's the backward drift range from above program? dozens of cycles?
> hundreds of cycles?

For "raw" difference (i.e. TSC registers themselves) it's usually tens
of thousands, sometimes more (and sometimes *much* more).
For example, here are outputs of 'xl debug-key v' before and after a
migrate for a 2p guest:

root@haswell> xl dmesg |grep Offset
(XEN) TSC Offset = ffffff54e63f1cab
(XEN) TSC Offset = ffffff54e63f1cab
(XEN) TSC Offset = ffffff6cb0a59ea9
(XEN) TSC Offset = ffffff6cb0a566ae
root@haswell>

For the guest's view, taking into account, for example, the fact that
sched_setaffinity() takes quite some time, it's a few hundred cycles.

And obviously as you increase the number of VCPUs (and I think guest memory
as well) the largest difference grows further.

-boris
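
As a rough check of the "raw" figure, the TSC offsets from the 'xl dmesg' output in this message can simply be subtracted. The sketch below assumes the first two Offset lines are the two VCPUs before the migrate and the last two are the same VCPUs after it; the function names are made up for illustration and are not part of Xen or of the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Absolute difference between two 64-bit TSC offsets, in cycles. */
uint64_t tsc_offset_delta(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}

/* Print the inter-VCPU offset difference before and after the migrate.
 * The values are copied from the 'xl dmesg' output in this message; the
 * before/after pairing is an assumption based on the order of the lines. */
void report_tsc_offset_deltas(void)
{
    const uint64_t before[2] = { 0xffffff54e63f1cabULL, 0xffffff54e63f1cabULL };
    const uint64_t after[2]  = { 0xffffff6cb0a59ea9ULL, 0xffffff6cb0a566aeULL };

    printf("before migrate: %" PRIu64 " cycles\n",
           tsc_offset_delta(before[0], before[1]));
    printf("after migrate:  %" PRIu64 " cycles\n",
           tsc_offset_delta(after[0], after[1]));
}
```

Calling report_tsc_offset_deltas() from main() gives 0 cycles before the migrate and 14331 (0x37fb) cycles after it for this particular run, i.e. the two VCPUs' offsets no longer match once the guest has been restored.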