From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: [RFC 4/7] change kernel accounting to include steal time
Date: Thu, 26 Aug 2010 17:47:12 -0300
Message-ID: <20100826204712.GA3773@amt.cnet>
References: <1282772597-4183-1-git-send-email-glommer@redhat.com>
 <1282772597-4183-2-git-send-email-glommer@redhat.com>
 <1282772597-4183-3-git-send-email-glommer@redhat.com>
 <1282772597-4183-4-git-send-email-glommer@redhat.com>
 <1282772597-4183-5-git-send-email-glommer@redhat.com>
 <20100826172303.GB21273@amt.cnet>
 <20100826202856.GC2985@mothafucka.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, avi@redhat.com, zamsden@redhat.com, riel@redhat.com
To: Glauber Costa
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:49252 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752084Ab0HZUr6 (ORCPT ); Thu, 26 Aug 2010 16:47:58 -0400
Received: from int-mx03.intmail.prod.int.phx2.redhat.com
 (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.16])
 by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o7QKlwDm016171
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
 for ; Thu, 26 Aug 2010 16:47:58 -0400
Content-Disposition: inline
In-Reply-To: <20100826202856.GC2985@mothafucka.localdomain>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, Aug 26, 2010 at 05:28:56PM -0300, Glauber Costa wrote:
> On Thu, Aug 26, 2010 at 02:23:03PM -0300, Marcelo Tosatti wrote:
> > On Wed, Aug 25, 2010 at 05:43:14PM -0400, Glauber Costa wrote:
> > > This patch proposes a common steal time implementation. When no
> > > steal time is accounted, we just add a branch to the current
> > > accounting code, that shouldn't add much overhead.
> > >
> > > When we do want to register steal time, we proceed as follows:
> > > - if we would account user or system time in this tick, and there is
> > > out-of-cpu time registered, we skip it altogether, and account steal
> > > time only.
> > > - if we would account user or system time in this tick, and we got the
> > > cpu for the whole slice, we proceed normally.
> > > - if we are idle in this tick, we flush out-of-cpu time to give it the
> > > chance to update whatever last-measure internal variable it may have.
> >
> > Problem of using sched notifiers is that you don't differentiate whether
> > the vcpu scheduled out by its own (via hlt emulation) or not.
> And we don't need to. If we're out because we want to, we're idle.
> And so, we don't account steal time.

Think of the program below.

> > Skipping accounting of user/system time whenever there's any stolen
> > time detected probably breaks u/s accounting on non-cpu-hog loads.
> I am willing to test some workloads you can suggest, but right now,
> (yeah, I mostly used cpu-hogs), this scheme worked better.
>
> Linux does statistical sampling for accounting anyway, so I don't see
> it getting much worse.

A "cpu hog" that sleeps 1us every 1ms.

> > I suppose steal time should be accounted separately from u/s ticks, as
> > Xen does.
> It requires us to hook somewhere else, which I deem as overcomplicated.
> Do you have any suggestion on how to make it simple?

Unfortunately no.

> Furthermore, "doing separate", is equivalent of not skipping user/system,
> if we really prefer to.
>
> > +    if (delta > 1000UL)
> > +        touch_softlockup_watchdog();
> > +
> >
> > This will break authentic soft lockup detection whenever qemu processing
> > takes more than 1s.
>
> This should be 10s. 1000UL is a typo.

Comment is still valid.
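
For reference, the three per-tick rules from the quoted patch description, and the "cpu hog that sleeps 1us every 1ms" case, can be sketched as below. This is only a rough model, not code from the patch: the names tick_kind, tick_stats, account_tick and run_user_hog are made up for illustration. It also assumes that out-of-cpu time is consumed once it has been charged or flushed, and that a voluntary sleep is recorded as out-of-cpu time (which is the concern about sched notifiers not telling the two cases apart).

```c
#include <stdint.h>

/* What the periodic tick found the vcpu doing. */
enum tick_kind { TICK_USER, TICK_SYSTEM, TICK_IDLE };

/* Per-cpu tick counters; hypothetical names, not the kernel's. */
struct tick_stats {
    uint64_t user, system, idle, steal;
};

/*
 * Apply the three rules to one tick. out_of_cpu_ns stands in for the
 * "out-of-cpu time registered" since the last tick. Assumption: it is
 * reset once it has been charged as steal or flushed on idle.
 */
static void account_tick(struct tick_stats *st, enum tick_kind kind,
                         uint64_t *out_of_cpu_ns)
{
    if (kind == TICK_IDLE) {
        /* Rule 3: idle tick, flush out-of-cpu time, charge idle. */
        *out_of_cpu_ns = 0;
        st->idle++;
        return;
    }
    if (*out_of_cpu_ns > 0) {
        /* Rule 1: any out-of-cpu time, skip user/system, charge steal. */
        *out_of_cpu_ns = 0;
        st->steal++;
        return;
    }
    /* Rule 2: had the cpu for the whole slice, account as usual. */
    if (kind == TICK_USER)
        st->user++;
    else
        st->system++;
}

/*
 * Simulate nticks ticks of a task that is always found in user mode at
 * tick time, but was off the cpu for off_ns_per_tick ns in each interval.
 */
static struct tick_stats run_user_hog(unsigned int nticks,
                                      uint64_t off_ns_per_tick)
{
    struct tick_stats st = {0, 0, 0, 0};
    uint64_t out_of_cpu_ns = 0;
    unsigned int i;

    for (i = 0; i < nticks; i++) {
        out_of_cpu_ns += off_ns_per_tick;
        account_tick(&st, TICK_USER, &out_of_cpu_ns);
    }
    return st;
}
```

Under these assumptions, and taking a 1ms tick, run_user_hog(1000, 1000) models 1us off the cpu per tick and charges all 1000 ticks to steal and none to user, even though the task ran about 99.9% of the time. run_user_hog(1000, 0) charges all 1000 ticks to user.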