From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Winchell
Subject: Re: [PATCH] Add a timer mode that disables pending missed ticks
Date: Mon, 26 Nov 2007 15:57:24 -0500
Message-ID: <474B3334.8010906@virtualiron.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: "Shan, Haitao", Dave Winchell, xen-devel@lists.xensource.com, "Dong, Eddie", "Jiang, Yunhong"
List-Id: xen-devel@lists.xenproject.org

Keir,

The accuracy data I've collected for i/o loads for the
various time protocols follows. In addition, the data
for cpu loads is shown.

The loads labeled cpu and i/o-8 are on an 8 processor AMD box.
Two guests, red hat and sles 64 bit, 8 vcpu each.
The cpu load is usex -e36 on each guest.
(usex is available at http://people.redhat.com/anderson/usex.)
i/o load is 8 instances of dd if=/dev/hda6 of=/dev/null.

The loads labeled i/o-32 are 32 instances of dd.
Also, these are run on 4 cpu AMD box.
In addition, there is an idle rh-32bit guest.
All three guests are 8vcpu.

The loads labeled i/o-4/32 are the same as i/o-32
except that the redhat-64 guest has 4 instances of dd.

Date   Duration         Protocol  sles, rhat error                                load

11/07  23 hrs 40 min    ASYNC     -4.96 sec,  +4.42 sec,   -.006%,  +.005%        cpu
11/09   3 hrs 19 min    ASYNC      -.13 sec,  +1.44 sec,   -.001%,  +.012%        cpu

11/08   2 hrs 21 min    SYNC       -.80 sec,   -.34 sec,   -.009%,  -.004%        cpu
11/08   1 hr  25 min    SYNC       -.24 sec,   -.26 sec,   -.005%,  -.005%        cpu
11/12  65 hrs 40 min    SYNC       -18  sec,    -8  sec,   -.008%,  -.003%        cpu

11/08         28 min    MIXED      -.75 sec,   -.67 sec,   -.045%,  -.040%        cpu
11/08  15 hrs 39 min    MIXED      -19. sec,  -17.4 sec,   -.034%,  -.031%        cpu

11/14  17 hrs 17 min    ASYNC      -6.1 sec,  -55.7 sec,   -.01%,   -.09%         i/o-8
11/15   2 hrs 44 min    ASYNC     -1.47 sec,  -14.0 sec,   -.015%,  -.14%         i/o-8

11/13  15 hrs 38 min    SYNC       -9.7 sec,  -12.3 sec,   -.017%,  -.022%        i/o-8
11/14         48 min    SYNC       -.46 sec,   -.48 sec,   -.017%,  -.018%        i/o-8

11/14   4 hrs  2 min    MIXED      -2.9 sec,  -4.15 sec,   -.020%,  -.029%        i/o-8
11/20  16 hrs  2 min    MIXED     -13.4 sec,  -18.1 sec,   -.023%,  -.031%        i/o-8

11/21         28 min    MIXED     -2.01 sec,   -.67 sec,   -.12%,   -.04%         i/o-32
11/21   2 hrs 25 min    SYNC       -.96 sec,   -.43 sec,   -.011%,  -.005%        i/o-32
11/21         40 min    ASYNC     -2.43 sec,  -2.77 sec,   -.10%,   -.11%         i/o-32

11/26 113 hrs 46 min    MIXED     -297. sec,    13. sec,   -.07%,    .003%        i/o-4/32
11/26   4 hrs 50 min    SYNC      -3.21 sec,   1.44 sec,   -.017%,   .01%         i/o-4/32

Overhead measurements:

Progress in terms of number of passes through a fixed system workload
on an 8 vcpu red hat with an 8 vcpu sles idle.
The workload was usex -b48.

ASYNC    167 min    145 passes    .868 passes/min
SYNC     167 min    144 passes    .862 passes/min
SYNC    1065 min    919 passes    .863 passes/min
MIXED    221 min    196 passes    .887 passes/min

Conclusions:

The only protocol which meets the .05% accuracy requirement for ntp
tracking under the loads above is the SYNC protocol. The worst case
accuracies for SYNC, MIXED, and ASYNC are .022%, .12%, and .14%,
respectively.

We could reduce the cost of the SYNC method by only scheduling the extra
wakeups if a certain number of ticks are missed.

Regards,
Dave

Keir Fraser wrote:

>On 9/11/07 19:22, "Dave Winchell" wrote:
>
>>Since I had a high error (~.03%) for the ASYNC method a couple of days ago,
>>I ran another ASYNC test. I think there may have been something
>>wrong with the code I used a couple of days ago for ASYNC. It may have been
>>missing the immediate delivery of interrupt after context switch in.
>>
>>My results indicate that either SYNC or ASYNC give acceptable accuracy,
>>each running consistently around or under .01%. MIXED has a fairly high
>>error of greater than .03%.
>>Probably too close to .05% ntp threshold for comfort.
>>I don't have an overnight run with SYNC. I plan to leave SYNC running
>>over the weekend. If you'd rather I can leave MIXED running instead.
>>
>>It may be too early to pick the protocol and I can run more overnight tests
>>next week.
>
>I'm a bit worried about any unwanted side effects of the SYNC+run_timer
>approach -- e.g., whether timer wakeups will cause higher system-wide CPU
>contention. I find it easier to think through the implications of ASYNC. I'm
>surprised that MIXED loses time, and is less accurate than ASYNC. Perhaps it
>delivers more timer interrupts than the other approaches, and each interrupt
>event causes a small accumulated error?
>
>Overall I would consider MIXED and ASYNC as favourites and if the latter is
>actually more accurate then I can simply revert the changeset that
>implemented MIXED.
>
>Perhaps rather than running more of the same workloads you could try idle
>VCPUs and I/O bound VCPUs (e.g., repeated large disc reads to /dev/null)? We
>don't have any data on workloads that aren't CPU bound, so that's really an
>obvious place to put any further effort imo.
>
> -- Keir
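
For anyone who wants to cross-check the error percentages in the accuracy data earlier in this message, here is a rough Python sketch. It assumes the percentage is simply the measured drift divided by the elapsed run time; the function names and the NTP_LIMIT_PCT constant are made up for illustration and are not part of the patch or of any Xen code. The three sample rows are copied from the data above.

```python
# Illustrative only: names here are hypothetical, not from the Xen patch.
NTP_LIMIT_PCT = 0.05  # the .05% ntp tracking requirement quoted in the thread


def drift_pct(drift_sec, hours, minutes):
    """Clock drift as a percentage of elapsed run time (assumed definition)."""
    elapsed_sec = hours * 3600 + minutes * 60
    return 100.0 * drift_sec / elapsed_sec


def within_ntp_limit(drift_sec, hours, minutes, limit_pct=NTP_LIMIT_PCT):
    """True when the absolute drift percentage is under the ntp limit."""
    return abs(drift_pct(drift_sec, hours, minutes)) < limit_pct


# (protocol, load, drift in seconds, hours, minutes), taken from the data above
rows = [
    ("SYNC", "i/o-8", -12.3, 15, 38),       # rhat guest
    ("ASYNC", "i/o-8", -55.7, 17, 17),      # rhat guest
    ("MIXED", "i/o-4/32", -297.0, 113, 46), # sles guest
]

for protocol, load, drift, hrs, mins in rows:
    pct = drift_pct(drift, hrs, mins)
    verdict = "ok" if within_ntp_limit(drift, hrs, mins) else "over limit"
    print(f"{protocol:5s} {load:9s} {pct:+.3f}%  {verdict}")
```

Under that assumption the SYNC row stays under the limit while the ASYNC and MIXED rows do not, which agrees with the conclusions drawn from the data.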