From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751838AbcESFd5 (ORCPT ); Thu, 19 May 2016 01:33:57 -0400
Received: from mx2.suse.de ([195.135.220.15]:53228 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750945AbcESFd4 (ORCPT ); Thu, 19 May 2016 01:33:56 -0400
Subject: Re: [PATCH] xen: add steal_clock support on x86
To: Boris Ostrovsky, David Vrabel,
 xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org
References: <1463573758-11441-1-git-send-email-jgross@suse.com>
 <573C804F.6020708@oracle.com> <573C81D1.9040309@suse.com>
 <573C896F.4090303@oracle.com> <573C8D4A.1000005@suse.com>
 <573C8E16.5040409@citrix.com> <573C8FEA.2070407@oracle.com>
 <573C9188.7010303@suse.com> <573C94BF.70508@oracle.com>
Cc: sstabellini@kernel.org
From: Juergen Gross
Message-ID: <573D5040.6070605@suse.com>
Date: Thu, 19 May 2016 07:33:52 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.7.0
MIME-Version: 1.0
In-Reply-To: <573C94BF.70508@oracle.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 18/05/16 18:13, Boris Ostrovsky wrote:
> On 05/18/2016 12:00 PM, Juergen Gross wrote:
>> On 18/05/16 17:53, Boris Ostrovsky wrote:
>>> On 05/18/2016 11:45 AM, David Vrabel wrote:
>>>> On 18/05/16 16:42, Juergen Gross wrote:
>>>>> On 18/05/16 17:25, Boris Ostrovsky wrote:
>>>>>> On 05/18/2016 10:53 AM, Juergen Gross wrote:
>>>>>>> On 18/05/16 16:46, Boris Ostrovsky wrote:
>>>>>>>> On 05/18/2016 08:15 AM, Juergen Gross wrote:
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> +void __init xen_time_setup_guest(void)
>>>>>>>>> +{
>>>>>>>>> +	pv_time_ops.steal_clock = xen_steal_clock;
>>>>>>>>> +
>>>>>>>>> +	static_key_slow_inc(&paravirt_steal_enabled);
>>>>>>>>> +	/*
>>>>>>>>> +	 * We can't set paravirt_steal_rq_enabled as this would require
>>>>>>>>> +	 * the capability to read another cpu's runstate info.
>>>>>>>>> +	 */
>>>>>>>>> +}
>>>>>>>> Won't we be accounting for stolen cycles twice now --- once from
>>>>>>>> steal_account_process_tick()->steal_clock() and second time from
>>>>>>>> do_stolen_accounting()?
>>>>>>> Uuh, yes.
>>>>>>>
>>>>>>> I guess I should rip do_stolen_accounting() out, too?
>>>>>> I don't think PARAVIRT_TIME_ACCOUNTING is always selected for Xen. If
>>>>> This is easy to accomplish. :-)
>>>
>>> I looked at KVM code (PARAVIRT_TIME_ACCOUNTING is not selected there
>>> either) and in their case that's presumably because stealing accounting
>>> is a CPUID bit, i.e. it might not be supported. In Xen case we always
>>> have this interface.
>> So they added it later and the default is to keep the old behavior.
>>
>>>>>> that's indeed the case then we should ifndef do_stolen_accounting(). Or
>>>>>> maybe check for paravirt_steal_enabled.
>>>>> Is this really a sensible thing to do? There is a mechanism used by KVM
>>>>> to do the stolen accounting. I think we should use it instead of having
>>>>> a second implementation doing the same thing in case the generic one
>>>>> isn't enabled.
>>>> I agree.
>>>>
>>>> Although I don't think selecting PARAVIRT_TIME_ACC' is necessary -- I
>>>> don't think it's essential (or is it?).
>>> Looks like it's useful only if paravirt_steal_rq_enabled, which we don't
>>> support yet.
>> I think the patch is still useful. It is reducing code size and
>> it is removing arch-specific Xen-hack(s). With the patch Xen's
>> solution for arm and x86 is common and the same as for KVM. Adding
>> paravirt_steal_rq_enabled later will be much easier as only one
>> function needs substantial modification.
>
> I am not arguing against having a patch that will remove
> do_stolen_accounting().
> I was responding to David's statement about
> whether we need to select CONFIG_PARAVIRT_TIME_ACCOUNTING, and I am not
> sure this is necessary since steal_account_process_tick() (that will
> take care of things that do_stolen_accounting() currently does) doesn't
> need it.

Aah, okay. That's a good reason to not add the Kconfig stuff.

> (And if it is indeed needed --- can we have Xen's Kconfig select it
> instead of "default y if XEN" ?)

I've verified that CONFIG_PARAVIRT_TIME_ACCOUNTING is _not_ needed.
I've removed it from .config and used my patch with
do_stolen_accounting() removed. In an overcommitted guest (4 vcpus on
2 physical cpus) running a parallel make, top showed nearly 50% stolen
time.


Juergen
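
As a rough sanity check on the numbers in this thread, the sketch below
models what steal percentage one would expect for a fully loaded,
overcommitted guest, and what a doubled accounting path would report.
It is a user-space illustration only, not kernel code: the function
names expected_steal_percent() and reported_steal_ns() are hypothetical,
and it assumes every vcpu is always runnable and the hypervisor shares
the physical cpus fairly.

```c
#include <assert.h>

/*
 * Hypothetical model, not part of the patch: expected steal time in
 * percent for a guest whose vcpus are all busy, assuming the
 * hypervisor shares the physical cpus fairly among them.
 */
unsigned int expected_steal_percent(unsigned int vcpus, unsigned int pcpus)
{
	assert(vcpus > 0 && pcpus > 0);

	if (vcpus <= pcpus)
		return 0;	/* no overcommit, so nothing is stolen */

	/* each vcpu runs pcpus/vcpus of the time; the rest is steal */
	return 100 * (vcpus - pcpus) / vcpus;
}

/*
 * Hypothetical model of the double accounting concern: the steal time
 * reported when the same stolen interval is added by `paths`
 * independent code paths (1 = accounted once, 2 = accounted twice).
 */
unsigned long long reported_steal_ns(unsigned long long stolen_ns,
				     unsigned int paths)
{
	return stolen_ns * paths;
}
```

Under these assumptions, expected_steal_percent(4, 2) gives 50, which
matches the "near 50%" seen in top for 4 vcpus on 2 physical cpus. A
value close to 100 in the same setup would instead hint that steal time
is being accounted by two paths at once.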