From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Winchell Subject: Re: Re: Fix for get_s_time() Date: Thu, 24 Apr 2008 12:04:31 -0400 Message-ID: <4810AF8F.5030900@virtualiron.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Keir Fraser Cc: Dave Winchell , "Tian, Kevin" , "xen-devel@lists.xensource.com" , Ian Pratt , "dan.magenheimer@oracle.com" List-Id: xen-devel@lists.xenproject.org Keir Fraser wrote: >Is the issue that there is a variable delay in waiting to acquire the >platform_timer_lock? If so, you can fix that by, for example, making >read_platform_stime() the *first* thing you do inside the >local_irq_disable/enable() block in local-time_calibration(). i.e., *before* >doing the rdtsc() and get_s_time(). This avoids a variable delay between the >TSC-based time estimates and the platform-timer-based time estimate. > > yes, this is the issue. What you suggest should be fine and I am trying it now. With the locking version (and a fix to a bug I introduced) I got .0012% error on an overnight run with hpet layered on get_s_time_mono(), which is the max(prev, cur) layer on get_s_time we discussed. -Dave >If that fixes your issue I'll apply that in preference. > > -- Keir > >On 21/4/08 23:55, "Keir Fraser" wrote: > > > >>What's the race? All accesses to platform time are already done under the >>platform_timer_lock as far as I can see. >> >> -- Keir >> >>On 21/4/08 21:32, "Dave Winchell" wrote: >> >> >> >>>Keir, >>> >>>In my work on layering hpet on get_s_time, I found a problem >>>in get_s_time and related code. Because of the problem I was getting >>>large jumps in the offset between local time and ntp time. >>>These jumps were on the order of many seconds. >>> >>>The issue is the race between local_time_calibration() executing >>>on one processor and platform_time_calibration() on another. >>> >>>I have included a patch which addresses the race in >>>local_time_calibration(), cpu_frequency_change(), and >>>init_percpu_time(). >>> >>>I'm giving you this ahead of the hpet work as it affects all users >>>of get_s_time(). >>> >>>I'm confident of the fix in local_time_calibration() as I had failures there >>>before the fix and no failures after. The other two I'm less confident >>>in, so check >>>my work closely there. >>> >>>On the hpet over get_s_time() front, this fix allows me to get .0014% error. >>>This is very close to the error going to the hardware hpet each time. >>> >>>Regards, >>>Dave >>> >>> >>>diff -r a38a41de0800 xen/arch/x86/time.c >>>--- a/xen/arch/x86/time.c Wed Apr 16 16:42:47 2008 +0100 >>>+++ b/xen/arch/x86/time.c Mon Apr 21 15:19:37 2008 -0400 >>>@@ -530,6 +530,16 @@ static s_time_t read_platform_stime(void >>> >>> return stime; >>> } >>>+static s_time_t read_platform_stime_locked(void) >>>+{ >>>+ u64 count; >>>+ s_time_t stime; >>>+ >>>+ count = plt_count64 + ((plt_src.read_counter() - plt_count) & plt_mask); >>>+ stime = __read_platform_stime(count); >>>+ >>>+ return stime; >>>+} >>> >>> static void platform_time_calibration(void) >>> { >>>@@ -749,6 +759,7 @@ int cpu_frequency_change(u64 freq) >>> { >>> struct cpu_time *t = &this_cpu(cpu_time); >>> u64 curr_tsc; >>>+ unsigned long flags; >>> >>> /* Sanity check: CPU frequency allegedly dropping below 1MHz? */ >>> if ( freq < 1000000u ) >>>@@ -758,15 +769,15 @@ int cpu_frequency_change(u64 freq) >>> return -EINVAL; >>> } >>> >>>- local_irq_disable(); >>>+ spin_lock_irqsave(&platform_timer_lock, flags); >>> rdtscll(curr_tsc); >>> t->local_tsc_stamp = curr_tsc; >>>- t->stime_master_stamp = read_platform_stime(); >>>+ t->stime_master_stamp = read_platform_stime_locked(); >>> /* TSC-extrapolated time may be bogus after frequency change. */ >>> /*t->stime_local_stamp = get_s_time();*/ >>> t->stime_local_stamp = t->stime_master_stamp; >>> set_time_scale(&t->tsc_scale, freq); >>>- local_irq_enable(); >>>+ spin_unlock_irqrestore(&platform_timer_lock, flags); >>> >>> /* A full epoch should pass before we check for deviation. */ >>> set_timer(&t->calibration_timer, NOW() + EPOCH); >>>@@ -830,16 +841,18 @@ static void local_time_calibration(void >>> /* The overall calibration scale multiplier. */ >>> u32 calibration_mul_frac; >>> >>>+ unsigned long flags; >>>+ >>> prev_tsc = t->local_tsc_stamp; >>> prev_local_stime = t->stime_local_stamp; >>> prev_master_stime = t->stime_master_stamp; >>> >>> /* Disable IRQs to get 'instantaneous' current timestamps. */ >>>- local_irq_disable(); >>>+ spin_lock_irqsave(&platform_timer_lock, flags); >>> rdtscll(curr_tsc); >>> curr_local_stime = get_s_time(); >>>- curr_master_stime = read_platform_stime(); >>>- local_irq_enable(); >>>+ curr_master_stime = read_platform_stime_locked(); >>>+ spin_unlock_irqrestore(&platform_timer_lock, flags); >>> >>> #if 0 >>> printk("PRE%d: tsc=%"PRIu64" stime=%"PRIu64" master=%"PRIu64"\n", >>>@@ -944,10 +957,10 @@ void init_percpu_time(void) >>> unsigned long flags; >>> s_time_t now; >>> >>>- local_irq_save(flags); >>>+ spin_lock_irqsave(&platform_timer_lock, flags); >>> rdtscll(t->local_tsc_stamp); >>>- now = !plt_src.read_counter ? 0 : read_platform_stime(); >>>- local_irq_restore(flags); >>>+ now = !plt_src.read_counter ? 0 : read_platform_stime_locked(); >>>+ spin_unlock_irqrestore(&platform_timer_lock, flags); >>> >>> t->stime_master_stamp = now; >>> t->stime_local_stamp = now; >>> >>> >> >>_______________________________________________ >>Xen-devel mailing list >>Xen-devel@lists.xensource.com >>http://lists.xensource.com/xen-devel >> >> > > > >