From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Winchell
Subject: Re: [PATCH] Add a timer mode that disables pending missed ticks
Date: Thu, 08 Nov 2007 09:57:57 -0500
Message-ID: <473323F5.4090906@virtualiron.com>
References: <47332084.8090305@virtualiron.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <47332084.8090305@virtualiron.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: "Shan, Haitao", Dave Winchell, "Dong, Eddie", xen-devel@lists.xensource.com, "Jiang, Yunhong"
List-Id: xen-devel@lists.xenproject.org

Keir,

I ran a 24 hour (23hr:40min) test. The usual setup. Protocol was ASYNC.

Errors:

sles9sp3-64:  -4.96 sec  -.0058%
rh4u4-64:     +4.42 sec  +.0052%

So, let's leave it ASYNC unless someone produces some test cases where
the error gets close to .05%. I'll do some testing here with overnight
runs or, perhaps, different loads.

thanks,
Dave

Dave Winchell wrote:

> Hi Keir,
>
> I've added comments below.
> See my next mail on some interesting performance numbers.
>
> thanks,
> Dave
>
> Keir Fraser wrote:
>
>> On 7/11/07 19:38, "Dave Winchell" wrote:
>>
>>> My feeling is that we should go full SYNC. Yes, in theory the
>>> guests should be able to handle ASYNC, but in reality it appears that
>>> some do not. Since it is easy for us to give them SYNC,
>>> let's just do it and not stress them out.
>>>
>>
>> One problem with pure SYNC is there's a fair chance you won't deliver any
>> ticks at all for a long time, if the guest only runs in short bursts (e.g.,
>> I/O bound) and happens not to be running on any tick boundary. I'm not sure
>> how much that matters.
>> It could cause time to go backwards if the time
>> extrapolation via the TSC is not perfectly accurate, or cause problems if
>> there are any assumptions that TSC delta since last tick fits in 32 bits
>> (less likely in x64 code I suppose). Anyway, my point is that only testing
>> VCPUs under full load may cause us to optimise in ways that have nasty
>> unexpected effects for other workloads.
>>
> I agree that this could be a problem. I have an idea that could give us
> full SYNC and eliminate the long periods without clock interrupts.
> In pt_process_missed_ticks() when missed_ticks > 0 set pt->run_timer = 1.
> In pt_save_timer():
>
>     list_for_each_entry ( pt, head, list )
>         if(!pt->run_timer)
>             stop_timer(&pt->timer);
>
> And in pt_timer_fn():
>
>     pt->run_timer = 0;
>
> So, for a guest that misses a tick, we will interrupt him once from the
> descheduled state and then leave him alone in the descheduled state.
>
>>> For the default mode as currently checked into unstable,
>>> 64 bit guests should run quite fast, as the missed ticks are calculated
>>> and then a bunch of additional interrupts are delivered. On the other
>>> hand, 32bit guests run very well in default mode.
>>>
>>> For the original code, before we put in the constant tsc offset business,
>>> 64bit guests run poorly and 32bit guests run very well time-wise.
>>>
>>
>> The default mode hasn't changed. Are you under the impression that
>> missed-ticks-but-no-delay-of-tsc is the default mode now? I know x64 guests
>> run badly with that because they treat every one of the missed ticks they
>> receive as a full tick.
>>
> Sorry, I was confused.
> However, the default mode will still run poorly for 64 bit guests because
> of the pending_nr's accumulated while the guest has interrupts disabled.
> As I recall, the effect is quite large, on the order of 10% error.
> I'll get you a number later today.
>
>> -- Keir
>>
>>>> Or is the lack of
>>>> synchronization of TSCs across VCPUs causing issues that you're trying to
>>>> avoid?
>>>>
>>>
>>> This does cause issues, but it's not the only contributor to poor timing.
>>> Having TSCs synchronized across vcpus will help with some of the
>>> time-going-backwards problems we have seen, I think.
>>>
>>> Regards,
>>> Dave
>>>
>>> Keir Fraser wrote:
>>>
>>>> On 7/11/07 17:29, "Keir Fraser" wrote:
>>>>
>>>>> So, you can see we send an interrupt immediately (and ASYNC) if any ticks
>>>>> have been missed, but then successive ticks are delivered 'on the beat'. A
>>>>> possible middleground? Or perhaps we should just go with SYNC after all...
>>>>>
>>>>
>>>> How do these Linux x64 guests fare with the original and default timer mode,
>>>> by the way? I would expect that time should be accounted pretty accurately
>>>> in that mode, albeit with more interrupts than you'd like. Or is the lack of
>>>> synchronisation of TSCs across VCPUs causing issues that you're trying to
>>>> avoid?
>>>>
>>>> -- Keir
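
For readers following the run_timer idea quoted earlier in this thread, here is a minimal stand-alone sketch of the three steps (set the flag in pt_process_missed_ticks(), skip stop_timer() in pt_save_timer() when the flag is set, clear the flag in pt_timer_fn()). It is not Xen code: struct pt_sketch, its fields other than run_timer, and the sketch_* function names are hypothetical. The mail does not say what re-arms or stops the timer after it fires, so the sketch assumes that a timer firing while the VCPU is descheduled is simply left unarmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a periodic timer. This is NOT
 * Xen's struct periodic_time; it models only what the proposal touches. */
struct pt_sketch {
    bool timer_armed;                     /* stands in for the state of pt->timer */
    bool run_timer;                       /* the flag proposed in the mail */
    bool vcpu_running;                    /* sketch-only: is the VCPU scheduled? */
    unsigned int fired_while_descheduled; /* sketch-only bookkeeping */
};

/* Step 1 (pt_process_missed_ticks() in the mail): a missed tick requests
 * that the timer survive the next deschedule. */
void sketch_process_missed_ticks(struct pt_sketch *pt, unsigned int missed_ticks)
{
    if (missed_ticks > 0)
        pt->run_timer = true;
}

/* Step 2 (pt_save_timer() in the mail): on deschedule, stop every timer
 * that has not requested a wake-up. */
void sketch_save_timers(struct pt_sketch *timers, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        timers[i].vcpu_running = false;
        if (!timers[i].run_timer)
            timers[i].timer_armed = false;
    }
}

/* Step 3 (pt_timer_fn() in the mail): the handler clears the request.
 * Assumption of this sketch only: a timer that fires while the VCPU is
 * descheduled is not re-armed, which gives the "interrupt once, then
 * leave him alone" behaviour described in the mail. */
void sketch_timer_fn(struct pt_sketch *pt)
{
    assert(pt->timer_armed);
    pt->run_timer = false;
    if (!pt->vcpu_running) {
        pt->fired_while_descheduled++;
        pt->timer_armed = false;
    }
}
```

Under these assumptions, a timer with a missed tick stays armed across sketch_save_timers(), fires exactly once while the VCPU is descheduled, and is then quiet; a timer with no missed ticks is stopped at deschedule, as in pure SYNC.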