From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity
Subject: Re: [RFC 4/7] change kernel accounting to include steal time
Date: Mon, 30 Aug 2010 16:15:04 +0300
Message-ID: <4C7BAED8.9090409@redhat.com>
References: <1282772597-4183-1-git-send-email-glommer@redhat.com>
 <1282772597-4183-2-git-send-email-glommer@redhat.com>
 <1282772597-4183-3-git-send-email-glommer@redhat.com>
 <1282772597-4183-4-git-send-email-glommer@redhat.com>
 <1282772597-4183-5-git-send-email-glommer@redhat.com>
 <4C7A2F88.6050807@redhat.com>
 <20100830124217.GA17084@mothafucka.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, zamsden@redhat.com, Marcelo Tosatti , riel@redhat.com
To: Glauber Costa
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:13812 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751758Ab0H3NPF (ORCPT ); Mon, 30 Aug 2010 09:15:05 -0400
Received: from int-mx08.intmail.prod.int.phx2.redhat.com
 (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.21])
 by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o7UDF5Jr022163
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
 for ; Mon, 30 Aug 2010 09:15:05 -0400
Received: from cleopatra.tlv.redhat.com (cleopatra.tlv.redhat.com [10.35.255.11])
 by int-mx08.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP
 id o7UDF4LL023964 for ; Mon, 30 Aug 2010 09:15:05 -0400
In-Reply-To: <20100830124217.GA17084@mothafucka.localdomain>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 08/30/2010 03:42 PM, Glauber Costa wrote:
> On Sun, Aug 29, 2010 at 12:59:36PM +0300, Avi Kivity wrote:
>> On 08/26/2010 12:43 AM, Glauber Costa wrote:
>>> This patch proposes a common steal time implementation. When no
>>> steal time is accounted, we just add a branch to the current
>>> accounting code, that shouldn't add much overhead.
>>>
>>> When we do want to register steal time, we proceed as follows:
>>> - if we would account user or system time in this tick, and there is
>>>   out-of-cpu time registered, we skip it altogether, and account steal
>>>   time only.
>>> - if we would account user or system time in this tick, and we got the
>>>   cpu for the whole slice, we proceed normally.
>>> - if we are idle in this tick, we flush out-of-cpu time to give it the
>>>   chance to update whatever last-measure internal variable it may have.
>>>
>>> This approach is simple, but proved to work well for my test scenarios.
>>> A UP guest on a UP host, with a cpu-hog in both guest and host, shows
>>> ~ 50 % steal time. Steal time is also accounted proportionally, if
>>> nice values are given to the host cpu-hog.
>>>
>>> A cpu-hog in the host with no load in the guest produces 0 % steal time,
>>> with 100 % idle, as one would expect.
>>>
>> The scheduler people and lkml need to be copied on this patch.
>>
>> Since s390 does steal time (I think?), can this code be shared?
> AFAIK, s390 enables CONFIG_VIRT_CPU_ACCOUNTING, so all timings
> come from the hypervisor, and statistical sampling is not involved.

Ok.  I see ppc does something similar as well (taking care of
user/kernel transitions itself).

> We could do that, if our hardware had any method to say precisely
> how much time we spent in each state, which I don't think we do.

We don't, though I'm sure everyone is wondering why we can't have cheap
accurate global clocks on x86.

> So in summary, s390 is in a totally different ifdef side.

Yes.

> Who should we copy at the scheduler side?

From MAINTAINERS:

Ingo Molnar
Peter Zijlstra

-- 
error compiling committee.c: too many arguments to function
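
For readers following the thread, the three per-tick rules quoted from the
patch description can be sketched roughly as below. This is only an
illustration, not the code from the RFC: struct cpu_times, struct
steal_source, steal_pending() and account_tick() are hypothetical names, the
"stolen" counter stands in for whatever the hypervisor would export, and
charging exactly one tick to steal (rather than the stolen amount) is an
assumption made to keep the tick total constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU buckets, counted in ticks. */
struct cpu_times {
    uint64_t user;
    uint64_t system;
    uint64_t idle;
    uint64_t steal;
};

/* What the tick would have been charged to without steal time. */
enum tick_kind { TICK_USER, TICK_SYSTEM, TICK_IDLE };

/*
 * Hypothetical stand-in for a hypervisor-provided counter: cumulative
 * out-of-cpu time for this vCPU, plus the last value the guest consumed.
 */
struct steal_source {
    uint64_t total_stolen;
    uint64_t last_seen;
};

/* Return out-of-cpu time since the last call and update the marker. */
static uint64_t steal_pending(struct steal_source *src)
{
    uint64_t delta = src->total_stolen - src->last_seen;

    src->last_seen = src->total_stolen;
    return delta;
}

/* Account one timer tick following the three rules in the description. */
static void account_tick(struct cpu_times *t, struct steal_source *src,
                         enum tick_kind kind)
{
    uint64_t stolen;

    assert(t && src);
    stolen = steal_pending(src);

    if (kind == TICK_IDLE) {
        /* Idle: the pending value is flushed (discarded), tick is idle. */
        t->idle++;
        return;
    }

    if (stolen > 0) {
        /* Out-of-cpu time registered: skip user/system, account steal.
         * Assumption: one tick is charged regardless of the amount. */
        t->steal++;
        return;
    }

    /* We had the cpu for the whole slice: proceed normally. */
    if (kind == TICK_USER)
        t->user++;
    else
        t->system++;
}
```

With this shape, an idle guest next to a busy host never accumulates steal
time, because every idle tick flushes the pending value, which matches the
"0 % steal, 100 % idle" result reported above.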