From mboxrd@z Thu Jan 1 00:00:00 1970
From: Deepak Patel
Subject: Re: [PATCH] Add a timer mode that disables pending missed ticks
Date: Wed, 30 Jan 2008 13:04:16 -0800
Message-ID: <47A0E650.3070506@oracle.com>
References: <20080129153458531.00000002384@djm-pc> <47A096CC.2070907@virtualiron.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------020507050506060802020406"
In-Reply-To: <47A096CC.2070907@virtualiron.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dave Winchell
Cc: "akira.ijuin@oracle.com", "dan.magenheimer@oracle.com", "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------020507050506060802020406
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

> Is the graph for RHEL5u1-64? (I've never tested this one.)

I do not know which graph was attached with this. But I saw this
behavior in EL4u5-32, EL4u5-64 and EL5u1-64 hvm guests when I was
running ltp tests continuously.

> What was the behaviour of the other guests running?

All pvm guests are fine. But the behavior of most of the hvm guests
was as described.

> If they had spikes, were they at the same wall time?

No. They are not at the same wall time.

> Were the other guests running ltp as well?

Yes, all 6 guests (4 hvm and 2 pvm) are running ltp continuously.

> How are you measuring skew?

I was collecting the output of "ntpdate -q" every 300 seconds
(5 minutes) and have created a graph based on that.

> Are you running ntpd?

Yes. ntp was running on all the guests.

I am investigating what causes these spikes and will let everyone
know my findings.

Thanks,
Deepak

> Anything that you can discover that would be in sync with
> the spikes would be very helpful!
> > The code that I test with is our product code, which is based > on 3.1. So it is possible that something in 3.2 other than vpt.c > is the cause. I can test with 3.2, if necessary. > > thanks, > Dave > > > > Dan Magenheimer wrote: > >> Hi Dave (Keir, see suggestion below) -- >> >> Thanks! >> >> Turning off vhpet certainly helps a lot (though see below). >> >> I wonder if timekeeping with vhpet is so bad that it should be >> turned off by default (in 3.1, 3.2, and unstable) until it is >> fixed? (I have a patch that defaults it off, can post it if >> there is agreement on the above point.) The whole point of an >> HPET is to provide more precise timekeeping and if vhpet is >> worse than vpit, it can only confuse users. Comments? >> >> >> In your testing, are you just measuring % skew over a long >> period of time? >> We are graphing the skew continuously and >> seeing periodic behavior that is unsettling, even with pit. >> See attached. Though your algorithm recovers, the "cliffs" >> could still cause real user problems. I wonder if there is >> anything that can be done to make the "recovery" more >> responsive? >> >> We are looking into what part(s) of LTP is causing the cliffs. >> >> Thanks, >> Dan >> >> >> >>> -----Original Message----- >>> From: Dave Winchell [mailto:dwinchell@virtualiron.com] >>> Sent: Monday, January 28, 2008 8:21 AM >>> To: dan.magenheimer@oracle.com >>> Cc: Keir Fraser; xen-devel@lists.xensource.com; >>> deepak.patel@oracle.com; >>> akira.ijuin@oracle.com; Dave Winchell >>> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables pending >>> missed ticks >>> >>> >>> Dan, >>> >>> I guess I'm a bit out of date calling for clock= usage. >>> Looking at linux 2.6.20.4 sources, I think you should specify >>> "clocksource=pit nohpet" on the linux guest bootline. >>> >>> You can leave the xen and dom0 bootlines as they are. >>> The xen and guest clocksources do not need to be the same. 
>>> In my tests, xen is using the hpet for its timekeeping and >>> that appears to be the default. >>> >>> When you boot the guests you should see >>> time.c: Using PIT/TSC based timekeeping. >>> on the rh4u5-64 guest, and something similar on the others. >>> >>> > (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer >>> > 14.318MHz HPET.) >>> >>> This appears to be the xen state, which is fine. >>> I was wrongly assuming that this was the guest state. >>> You might want to look in your guest logs and see what they were >>> picking >>> for a clock source. >>> >>> Regards, >>> Dave >>> >>> >>> >>> >>> Dan Magenheimer wrote: >>> >>> >>> >>>> Thanks, I hadn't realized that! No wonder we didn't see the same >>>> improvement you saw! >>>> >>>> >>>> >>>>> Try specifying clock=pit on the linux boot line... >>>>> >>>> >>>> I'm confused... do you mean "clocksource=pit" on the Xen >>> >>> command line or >>> >>> >>>> "nohpet" / "clock=pit" / "clocksource=pit" on the guest (or >>> >>> dom0?) command >>> >>> >>>> line? Or both places? Since the tests take awhile, it >>> >>> would be nice >>> >>> >>>> to get this right the first time. Do the Xen and guest >>> >>> clocksources need >>> >>> >>>> to be the same? >>>> >>>> Thanks, >>>> Dan >>>> >>>> -----Original Message----- >>>> *From:* Dave Winchell [mailto:dwinchell@virtualiron.com] >>>> *Sent:* Sunday, January 27, 2008 2:22 PM >>>> *To:* dan.magenheimer@oracle.com; Keir Fraser >>>> *Cc:* xen-devel@lists.xensource.com; deepak.patel@oracle.com; >>>> akira.ijuin@oracle.com; Dave Winchell >>>> *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode that disables >>>> pending missed ticks >>>> >>>> Hi Dan, >>>> >>>> Hpet timer does have a fairly large error, as I was trying this >>>> one recently. >>>> I don't remember what I got for error, but 1% sounds >>> >>> about right. 
>>> >>> >>>> The problem is that hpet is not built on top of vpt.c, >>> >>> the module >>> >>> >>>> Keir and I did >>>> all the recent work in, for its periodic timer needs. Try >>>> specifying clock=pit >>>> on the linux boot line. If it still picks the hpet, which it >>>> might, let me know >>>> and I'll tell you how to get around this. >>>> >>>> Regards, >>>> Dave >>>> >>>> >>>> >>>> >>>> >>> >>> -------------------------------------------------------------- >>> ---------- >>> >>> >>>> *From:* Dan Magenheimer [mailto:dan.magenheimer@oracle.com] >>>> *Sent:* Fri 1/25/2008 6:50 PM >>>> *To:* Dave Winchell; Keir Fraser >>>> *Cc:* xen-devel@lists.xensource.com; deepak.patel@oracle.com; >>>> akira.ijuin@oracle.com >>>> *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode >>> >>> that disables >>> >>> >>>> pending missed ticks >>>> >>>> Sorry for the very late followup on this but we finally >>> >>> were able >>> >>> >>>> to get our testing set up again on stable 3.1 bits and have >>>> seen some very bad results on 3.1.3-rc1, on the order of 1%. >>>> >>>> Test enviroment was a 4-socket dual core machine with 24GB of >>>> memory running six two-vcpu 2GB domains, four hvm plus two pv. >>>> All six guests were running LTP simultaneously. The four hvm >>>> guests were: RHEL5u1-64, RHEL4u5-32, RHEL5-64, and RHEL4u5-64. >>>> Timer_mode was set to 2 for 64-bit guests and 0 for >>> >>> 32-bit guests. >>> >>> >>>> All four hvm guests experienced skew around -1%, even the 32-bit >>>> guest. Less intensive testing didn't exhibit much skew at all. >>>> >>>> A representative graph is attached. >>>> >>>> Dave, I wonder if some portion of your patches didn't end up in >>>> the xen trees? >>>> >>>> (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer >>>> 14.318MHz HPET.) >>>> >>>> Thanks, >>>> Dan >>>> >>>> P.S. Many thanks to Deepak and Akira for running tests. 
>>>> >>>> > -----Original Message----- >>>> > From: xen-devel-bounces@lists.xensource.com >>>> > [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of >>>> > Dave Winchell >>>> > Sent: Wednesday, January 09, 2008 9:53 AM >>>> > To: Keir Fraser >>>> > Cc: dan.magenheimer@oracle.com; >>> >>> xen-devel@lists.xensource.com; Dave >>> >>> >>>> > Winchell >>>> > Subject: Re: [Xen-devel] [PATCH] Add a timer mode that >>>> > disables pending >>>> > missed ticks >>>> > >>>> > >>>> > Hi Keir, >>>> > >>>> > The latest change, c/s 16690, looks fine. >>>> > I agree that the code in c/s 16690 is equivalent to >>>> > the code I submitted. Also, your version is more >>>> > concise. >>>> > >>>> > The error tests confirm the equivalence. With >>> >>> overnight cpu loads, >>> >>> >>>> > the checked in version was accurate to +.048% for sles >>>> > and +.038% for red hat. My version was +.046% and +.032% in a >>>> > 2 hour test. >>>> > I don't think the difference is significant. >>>> > >>>> > i/o loads produced errors of +.01%. >>>> > >>>> > Thanks for all your efforts on this issue. >>>> > >>>> > Regards, >>>> > Dave >>>> > >>>> > >>>> > >>>> > Keir Fraser wrote: >>>> > >>>> > >Applied as c/s 16690, although the checked-in patch is >>>> > smaller. I think the >>>> > >only important fix is to pt_intr_post() and the only bit of >>>> > the patch I >>>> > >totally omitted was the change to pt_process_missed_ticks(). >>>> > I don't think >>>> > >that change can be important, but let's see what >>> >>> happens to the >>> >>> >>>> error >>>> > >percentage... >>>> > > >>>> > > -- Keir >>>> > > >>>> > >On 4/1/08 23:24, "Dave Winchell" >>> >>> wrote: >>> >>> >>>> > > >>>> > > >>>> > > >>>> > >>Hi Dan and Keir, >>>> > >> >>>> > >>Attached is a patch that fixes some issues with the >>> >>> SYNC policy >>> >>> >>>> > >>(no_missed_ticks_pending). 
>>>> > >>I have not tried to make the change the minimal one, but, >>>> > rather, just >>>> > >>ported into >>>> > >>the new code what I know to work well. The error for >>>> > >>no_missed_ticks_pending goes from >>>> > >>over 3% to .03% with this change according to my testing. >>>> > >> >>>> > >>Regards, >>>> > >>Dave >>>> > >> >>>> > >>Dan Magenheimer wrote: >>>> > >> >>>> > >> >>>> > >> >>>> > >>>Hi Dave -- >>>> > >>> >>>> > >>>Did you get your correction ported? If so, it would be >>>> > nice to see this get >>>> > >>>into 3.1.3. >>>> > >>> >>>> > >>>Note that I just did some very limited testing with >>>> > timer_mode=2(=SYNC=no >>>> > >>>missed ticks pending) >>>> > >>>on tip of xen-3.1-testing (64-bit Linux hv guest) and the >>>> > worst error I've >>>> > >>>seen so far >>>> > >>>is 0.012%. But I haven't tried any exotic loads, just LTP. >>>> > >>> >>>> > >>>Thanks, >>>> > >>>Dan >>>> > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> > >>>>-----Original Message----- >>>> > >>>>From: Dave Winchell [mailto:dwinchell@virtualiron.com] >>>> > >>>>Sent: Wednesday, December 19, 2007 12:33 PM >>>> > >>>>To: dan.magenheimer@oracle.com >>>> > >>>>Cc: Keir Fraser; Shan, Haitao; >>>> > xen-devel@lists.xensource.com; Dong, >>>> > >>>>Eddie; Jiang, Yunhong; Dave Winchell >>>> > >>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that >>>> > >>>>disables pending >>>> > >>>>missed ticks >>>> > >>>> >>>> > >>>> >>>> > >>>>Dan, >>>> > >>>> >>>> > >>>>I did some testing with the constant tsc offset >>> >>> SYNC method >>> >>> >>>> > >>>>(now called >>>> > >>>>no_missed_ticks_pending) >>>> > >>>>and found the error to be very high, much larger >>> >>> than 1 %, as >>> >>> >>>> > >>>>I recall. >>>> > >>>>I have not had a chance to submit a correction. I >>> >>> will try to >>> >>> >>>> > >>>>do it later >>>> > >>>>this week or the first week in January. 
My version of >>>> constant tsc >>>> > >>>>offset SYNC method >>>> > >>>>produces .02 % error, so I just need to port that into the >>>> > >>>>current code. >>>> > >>>> >>>> > >>>>The error you got for both of those kernels is >>> >>> what I would >>> >>> >>>> expect >>>> > >>>>for the default mode, delay_for_missed_ticks. >>>> > >>>> >>>> > >>>>I'll let Keir answer on how to set the time mode. >>>> > >>>> >>>> > >>>>Regards, >>>> > >>>>Dave >>>> > >>>> >>>> > >>>>Dan Magenheimer wrote: >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>>>Anyone make measurements on the final patch? >>>> > >>>>> >>>> > >>>>>I just ran a 64-bit RHEL5.1 pvm kernel and saw a loss of >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>about 0.2% with no load. This was xen-unstable tip today >>>> > >>>>with no options specified. 32-bit was about 0.01%. >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>>>I think I missed something... how do I run the various >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>accounting choices and which ones are known to be >>> >>> appropriate >>> >>> >>>> > >>>>for which kernels? 
>>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>>>Thanks, >>>> > >>>>>Dan >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>>>-----Original Message----- >>>> > >>>>>>From: xen-devel-bounces@lists.xensource.com >>>> > >>>> >>>>>>>>> [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of >>>>>>>>> >>>>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>Keir Fraser >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>>>>Sent: Thursday, December 06, 2007 4:57 AM >>>> > >>>>>>To: Dave Winchell >>>> > >>>>>>Cc: Shan, Haitao; xen-devel@lists.xensource.com; Dong, >>>> > Eddie; Jiang, >>>> > >>>>>>Yunhong >>>> > >>>>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that >>>> > >>>>>>disables pending >>>> > >>>>>>missed ticks >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>Please take a look at xen-unstable changeset 16545. >>>> > >>>>>> >>>> > >>>>>>-- Keir >>>> > >>>>>> >>>> > >>>>>>On 26/11/07 20:57, "Dave Winchell" >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>> wrote: >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>Keir, >>>> > >>>>>>> >>>> > >>>>>>>The accuracy data I've collected for i/o loads for the >>>> > >>>>>>>various time protocols follows. In addition, the data >>>> > >>>>>>>for cpu loads is shown. >>>> > >>>>>>> >>>> > >>>>>>>The loads labeled cpu and i/o-8 are on an 8 >>> >>> processor AMD >>> >>> >>>> box. >>>> > >>>>>>>Two guests, red hat and sles 64 bit, 8 vcpu each. >>>> > >>>>>>>The cpu load is usex -e36 on each guest. >>>> > >>>>>>>(usex is available at >>>> http://people.redhat.com/anderson/usex.) >>>> > >>>>>>>i/o load is 8 instances of dd if=/dev/hda6 >>> >>> of=/dev/null. >>> >>> >>>> > >>>>>>> >>>> > >>>>>>>The loads labeled i/o-32 are 32 instances of dd. >>>> > >>>>>>>Also, these are run on 4 cpu AMD box. >>>> > >>>>>>>In addition, there is an idle rh-32bit guest. 
>>>> > >>>>>>>All three guests are 8vcpu. >>>> > >>>>>>> >>>> > >>>>>>>The loads labeled i/o-4/32 are the same as i/o-32 >>>> > >>>>>>>except that the redhat-64 guest has 4 instances of dd. >>>> > >>>>>>> >>>> > >>>>>>>Date Duration Protocol sles, rhat error load >>>> > >>>>>>> >>>> > >>>>>>>11/07 23 hrs 40 min ASYNC -4.96 sec, +4.42 sec -.006%, >>>> > +.005% cpu >>>> > >>>>>>>11/09 3 hrs 19 min ASYNC -.13 sec, +1.44 sec, -.001%, >>>> > +.012% cpu >>>> > >>>>>>> >>>> > >>>>>>>11/08 2 hrs 21 min SYNC -.80 sec, -.34 sec, -.009%, >>>> -.004% cpu >>>> > >>>>>>>11/08 1 hr 25 min SYNC -.24 sec, -.26 sec, >>> >>> -.005%, -.005% cpu >>> >>> >>>> > >>>>>>>11/12 65 hrs 40 min SYNC -18 sec, -8 sec, >>> >>> -.008%, -.003% cpu >>> >>> >>>> > >>>>>>> >>>> > >>>>>>>11/08 28 min MIXED -.75 sec, -.67 sec -.045%, >>> >>> -.040% cpu >>> >>> >>>> > >>>>>>>11/08 15 hrs 39 min MIXED -19. sec,-17.4 sec, -.034%, >>>> > -.031% cpu >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>>11/14 17 hrs 17 min ASYNC -6.1 sec,-55.7 sec, -.01%, >>>> > -.09% i/o-8 >>>> > >>>>>>>11/15 2 hrs 44 min ASYNC -1.47 sec,-14.0 sec, -.015% >>>> > -.14% i/o-8 >>>> > >>>>>>> >>>> > >>>>>>>11/13 15 hrs 38 min SYNC -9.7 sec,-12.3 sec, -.017%, >>>> > -.022% i/o-8 >>>> > >>>>>>>11/14 48 min SYNC - .46 sec, - .48 sec, >>> >>> -.017%, -.018% i/o-8 >>> >>> >>>> > >>>>>>> >>>> > >>>>>>>11/14 4 hrs 2 min MIXED -2.9 sec, -4.15 sec, -.020%, >>>> > -.029% i/o-8 >>>> > >>>>>>>11/20 16 hrs 2 min MIXED -13.4 sec,-18.1 sec, -.023%, >>>> > -.031% i/o-8 >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>>11/21 28 min MIXED -2.01 sec, -.67 sec, -.12%, >>> >>> -.04% i/o-32 >>> >>> >>>> > >>>>>>>11/21 2 hrs 25 min SYNC -.96 sec, -.43 sec, -.011%, >>>> > -.005% i/o-32 >>>> > >>>>>>>11/21 40 min ASYNC -2.43 sec, -2.77 sec -.10%, >>> >>> -.11% i/o-32 >>> >>> >>>> > >>>>>>> >>>> > >>>>>>>11/26 113 hrs 46 min MIXED -297. sec, 13. 
sec -.07%, >>>> > .003% i/o-4/32 >>>> > >>>>>>>11/26 4 hrs 50 min SYNC -3.21 sec, 1.44 sec, -.017%, >>>> > .01% i/o-4/32 >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>>Overhead measurements: >>>> > >>>>>>> >>>> > >>>>>>>Progress in terms of number of passes through a fixed >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>system workload >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>on an 8 vcpu red hat with an 8 vcpu sles idle. >>>> > >>>>>>>The workload was usex -b48. >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>>ASYNC 167 min 145 passes .868 passes/min >>>> > >>>>>>>SYNC 167 min 144 passes .862 passes/min >>>> > >>>>>>>SYNC 1065 min 919 passes .863 passes/min >>>> > >>>>>>>MIXED 221 min 196 passes .887 passes/min >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>>Conclusions: >>>> > >>>>>>> >>>> > >>>>>>>The only protocol which meets the .05% accuracy >>>> > requirement for ntp >>>> > >>>>>>>tracking under the loads >>>> > >>>>>>>above is the SYNC protocol. The worst case >>> >>> accuracies for >>> >>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>SYNC, MIXED, >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>and ASYNC >>>> > >>>>>>>are .022%, .12%, and .14%, respectively. >>>> > >>>>>>> >>>> > >>>>>>>We could reduce the cost of the SYNC method by only >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>scheduling the extra >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>wakeups if a certain number >>>> > >>>>>>>of ticks are missed. 
>>>> > >>>>>>> >>>> > >>>>>>>Regards, >>>> > >>>>>>>Dave >>>> > >>>>>>> >>>> > >>>>>>>Keir Fraser wrote: >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>> >>>> > >>>>>>>>On 9/11/07 19:22, "Dave Winchell" >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>> wrote: >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>>>Since I had a high error (~.03%) for the >>> >>> ASYNC method a >>> >>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>couple of days ago, >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>I ran another ASYNC test. I think there may have >>>> > been something >>>> > >>>>>>>>>wrong with the code I used a couple of days ago for >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>ASYNC. It may have been >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>missing the immediate delivery of interrupt >>> >>> after context >>> >>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>switch in. >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>My results indicate that either SYNC or ASYNC give >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>acceptable accuracy, >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>each running consistently around or under >>> >>> .01%. 
MIXED has >>> >>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>a fairly high >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>error of >>>> > >>>>>>>>>greater than .03%. Probably too close to .05% ntp >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>threshold for comfort. >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>I don't have an overnight run with SYNC. I >>> >>> plan to leave >>> >>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>SYNC running >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>over the weekend. If you'd rather I can leave MIXED >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>running instead. >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>It may be too early to pick the protocol and >>> >>> I can run >>> >>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>more overnight tests >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>>next week. 
>>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>> >>>> > >>>>>>>>I'm a bit worried about any unwanted side >>> >>> effects of the >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>SYNC+run_timer >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>approach -- e.g., whether timer wakeups will >>> >>> cause higher >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>system-wide CPU >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>contention. I find it easier to think through the >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>implications of ASYNC. I'm >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>surprised that MIXED loses time, and is less >>> >>> accurate than >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>ASYNC. Perhaps it >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>delivers more timer interrupts than the other >>> >>> approaches, >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>and each interrupt >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>event causes a small accumulated error? 
>>>> > >>>>>>>> >>>> > >>>>>>>>Overall I would consider MIXED and ASYNC as >>> >>> favourites and >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>if the latter is >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>actually more accurate then I can simply revert the >>>> > changeset that >>>> > >>>>>>>>implemented MIXED. >>>> > >>>>>>>> >>>> > >>>>>>>>Perhaps rather than running more of the same >>> >>> workloads you >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>could try idle >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>VCPUs and I/O bound VCPUs (e.g., repeated >>> >>> large disc reads >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>to /dev/null)? We >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>don't have any data on workloads that aren't >>> >>> CPU bound, so >>> >>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>that's really an >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>>>>obvious place to put any further effort imo. 
>>>> > >>>>>>>> >>>> > >>>>>>>>-- Keir >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>>> >>>> > >>>>>>_______________________________________________ >>>> > >>>>>>Xen-devel mailing list >>>> > >>>>>>Xen-devel@lists.xensource.com >>>> > >>>>>>http://lists.xensource.com/xen-devel >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> > >>diff -r cfdbdca5b831 xen/arch/x86/hvm/vpt.c >>>> > >>--- a/xen/arch/x86/hvm/vpt.c Thu Dec 06 15:36:07 2007 +0000 >>>> > >>+++ b/xen/arch/x86/hvm/vpt.c Fri Jan 04 17:58:16 2008 -0500 >>>> > >>@@ -58,7 +58,7 @@ static void pt_process_missed_ticks(stru >>>> > >> >>>> > >> missed_ticks = missed_ticks / (s_time_t) >>> >>> pt->period + 1; >>> >>> >>>> > >> if ( mode_is(pt->vcpu->domain, >>> >>> no_missed_ticks_pending) ) >>> >>> >>>> > >>- pt->do_not_freeze = !pt->pending_intr_nr; >>>> > >>+ pt->do_not_freeze = 1; >>>> > >> else >>>> > >> pt->pending_intr_nr += missed_ticks; >>>> > >> pt->scheduled += missed_ticks * pt->period; >>>> > >>@@ -127,7 +127,12 @@ static void pt_timer_fn(void *data) >>>> > >> >>>> > >> pt_lock(pt); >>>> > >> >>>> > >>- pt->pending_intr_nr++; >>>> > >>+ if ( mode_is(pt->vcpu->domain, >>> >>> no_missed_ticks_pending) ) { >>> >>> >>>> > >>+ pt->pending_intr_nr = 1; >>>> > >>+ pt->do_not_freeze = 0; >>>> > >>+ } >>>> > >>+ else >>>> > >>+ pt->pending_intr_nr++; >>>> > >> >>>> > >> if ( !pt->one_shot ) >>>> > >> { >>>> > >>@@ -221,8 +226,6 @@ void pt_intr_post(struct vcpu *v, struct >>>> > >> return; >>>> > >> } >>>> > >> >>>> > >>- pt->do_not_freeze = 0; >>>> > >>- >>>> > >> if ( pt->one_shot ) >>>> > >> { >>>> > >> pt->enabled = 0; >>>> > >>@@ -235,6 +238,10 @@ void pt_intr_post(struct vcpu >>> >>> *v, 
struct >>> >>> >>>> > >> pt->last_plt_gtime = hvm_get_guest_time(v); >>>> > >> pt->pending_intr_nr = 0; /* 'collapse' all >>>> > missed ticks */ >>>> > >> } >>>> > >>+ else if ( mode_is(v->domain, no_missed_ticks_pending) ) { >>>> > >>+ pt->pending_intr_nr--; >>>> > >>+ pt->last_plt_gtime = hvm_get_guest_time(v); >>>> > >>+ } >>>> > >> else >>>> > >> { >>>> > >> pt->last_plt_gtime += pt->period_cycles; >>>> > >> >>>> > >> >>>> > > >>>> > > >>>> > > >>>> > > >>>> > >>>> > >>>> > _______________________________________________ >>>> > Xen-devel mailing list >>>> > Xen-devel@lists.xensource.com >>>> > http://lists.xensource.com/xen-devel >>>> > >>>> >>>> >>> > --------------020507050506060802020406 Content-Type: text/x-vcard; charset=utf-8; name="deepak.patel.vcf" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="deepak.patel.vcf" begin:vcard fn:Deepak Patel n:Patel;Deepak email;internet:deepak.patel@oracle.com version:2.1 end:vcard --------------020507050506060802020406 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --------------020507050506060802020406--