From mboxrd@z Thu Jan  1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [PATCH] remove blocked time accounting from xen "clockchip"
Date: Fri, 20 Jan 2012 11:00:50 -0500
Message-ID: <20120120160050.GB3959@phenom.dumpdata.com>
References: <1318970579-6282-1-git-send-email-lersek@redhat.com>
 <4EBA8FAA020000780005FD5F@nat28.tlf.novell.com>
 <4EBC125A.70300@goop.org>
 <20120119194232.GA3728@konrad-lan>
 <4F194880020000780006DD7F@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <4F194880020000780006DD7F@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Beulich
Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Joe Jin,
 Zhenzhong Duan, Konrad Rzeszutek Wilk, Laszlo Ersek
List-Id: xen-devel@lists.xenproject.org

On Fri, Jan 20, 2012 at 09:57:04AM +0000, Jan Beulich wrote:
> >>> On 19.01.12 at 20:42, Konrad Rzeszutek Wilk wrote:
> > I finally got some time to look at them and I think they are these ones:
> >
> > git log --oneline e03b644fe68b1c6401465b02724d261538dba10f..3c404b578fab699c4708279938078d9404b255a4
> > 3c404b5 KVM guest: Add a pv_ops stub for steal time
> > c9aaa89 KVM: Steal time implementation
> > 9ddabbe KVM: KVM Steal time guest/host interface
> > 4b6b35f KVM: Add constant to represent KVM MSRs enabled bit in guest/host interface
> >
> > What is interesting is that they end up inserting a bunch of:
> >
> >
> > +	if (steal_account_process_tick())
> > +		return;
> > +
> >
> > in irqtime_account_process_tick and in account_process_tick.
>
> And this (particularly the "return" part of it) is what I have a hard
> time to understand: How can it be correct to not do any of the
> other accounting?
> After all, the function calls only
> account_steal_time(), but it's certainly going to be common that
> part of the time was stolen, and part was spent executing.
>
> Further, it's being called only from the process tick accounting

Also from 'irqtime_account_idle_ticks', which is called from
account_idle_ticks (if tsc is part of the picture), which in turn is
called from tick_nohz_idle_exit. So at the end of the idle loop the idle
time is accounted for.

> functions, but clearly part of idle or interrupt time can also be
> stolen.

It looks as if the other interrupt times, CPUTIME_SOFTIRQ and
CPUTIME_IRQ, are completely skipped - but only if there is a "steal time".

The 'steal time' from KVM is based on the host scheduler's notion of
'run_delay'. I think 'run_delay' is based purely on block I/O delay
or swap I/O delay. So if the host is not running into any of those issues,
then 'steal_account_process_tick' won't have any stolen ticks to report,
the 'if (..) return;' won't be taken, and it will continue to attribute
the other 'time' slots with appropriate values.

If we have CPU-intensive guests that are overcommitted, the guest's
/proc/schedstat won't show the delay before the host puts it on
a CPU as 'steal' time but rather as 'idle' time - I think? That seems odd.
I am probably misreading how 'run_delay' gets computed.
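
To make the control flow being questioned concrete, here is a minimal
user-space sketch of it. This is not the kernel code: struct cpu_stat,
model_steal_account_tick, model_account_process_tick, the tick_ctx enum
and the 1 ms TICK_NSEC value are all made up for illustration. It only
models "account whole stolen ticks, and if there were any, return before
touching the other buckets".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick length (1 ms, i.e. HZ=1000), for illustration only. */
#define TICK_NSEC 1000000ULL

/* Hypothetical per-CPU accounting buckets, counted in ticks. */
struct cpu_stat {
	uint64_t user, system, idle, steal;
	uint64_t prev_steal_ns;	/* stolen time already folded into 'steal' */
};

enum tick_ctx { CTX_USER, CTX_SYSTEM, CTX_IDLE };

/*
 * Model of the steal check. 'steal_clock_ns' stands in for a running
 * total of stolen nanoseconds reported by the host. Only whole ticks
 * are accounted; the remainder is carried over to the next call.
 */
static bool model_steal_account_tick(struct cpu_stat *cs,
				     uint64_t steal_clock_ns)
{
	assert(steal_clock_ns >= cs->prev_steal_ns);

	uint64_t ticks = (steal_clock_ns - cs->prev_steal_ns) / TICK_NSEC;

	cs->prev_steal_ns += ticks * TICK_NSEC;
	cs->steal += ticks;
	return ticks != 0;
}

/* Model of the per-tick accounting with the early return in place. */
static void model_account_process_tick(struct cpu_stat *cs,
				       enum tick_ctx ctx,
				       uint64_t steal_clock_ns)
{
	if (model_steal_account_tick(cs, steal_clock_ns))
		return;	/* nothing else is accounted for this tick */

	switch (ctx) {
	case CTX_USER:
		cs->user++;
		break;
	case CTX_SYSTEM:
		cs->system++;
		break;
	case CTX_IDLE:
		cs->idle++;
		break;
	}
}
```

In this model a tick during which the reported steal total grew by at
least TICK_NSEC lands only in 'steal', whatever the task was doing, and
a tick with less than one full tick of new steal is accounted exactly as
before. That is the behaviour Jan's question is about.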