From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: [PATCH 8/9] xen: sched: allow for choosing credit2 runqueues configuration at boot
Date: Thu, 1 Oct 2015 09:46:41 +0200
Message-ID: <560CE4E1.2020405@suse.com>
References: <20150929164726.17589.96920.stgit@Solace.station> <20150929165625.17589.17838.stgit@Solace.station> <560CC91B.80308@suse.com> <1443684219.3276.175.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) id 1ZhYZf-0006B5-Ei for xen-devel@lists.xenproject.org; Thu, 01 Oct 2015 07:46:43 +0000
In-Reply-To: <1443684219.3276.175.camel@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli, xen-devel@lists.xenproject.org
Cc: George Dunlap, Uma Sharma
List-Id: xen-devel@lists.xenproject.org

On 10/01/2015 09:23 AM, Dario Faggioli wrote:
> On Thu, 2015-10-01 at 07:48 +0200, Juergen Gross wrote:
>> On 09/29/2015 06:56 PM, Dario Faggioli wrote:
>>> In fact, credit2 uses CPU topology to decide how to arrange
>>> its internal runqueues. Before this change, only 'one runqueue
>>> per socket' was allowed. However, experiments have shown that,
>>> for instance, having one runqueue per physical core improves
>>> performance, especially in case hyperthreading is available.
>>>
>>> In general, it makes sense to allow users to pick one runqueue
>>> arrangement at boot time, so that:
>>>  - more experiments can be easily performed to even better
>>>    assess and improve performance;
>>>  - one can select the best configuration for his specific
>>>    use case and/or hardware.
>>>
>>> This patch enables the above.
>>>
>>> Note that, for correctly arranging runqueues to be per-core,
>>> just checking cpu_to_core() on the host CPUs is not enough.
>>> In fact, cores (and hyperthreads) on different sockets, can
>>> have the same core (and thread) IDs! We, therefore, need to
>>> check whether the full topology of two CPUs matches, for
>>> them to be put in the same runqueue.
>>>
>>> Note also that the default (although not functional) for
>>> credit2, since now, has been per-socket runqueue. This patch
>>> leaves things that way, to avoid mixing policy and technical
>>> changes.
>>
>> I think you should think about a way to make this parameter a per
>> cpupool one instead of a system global one.
>>
> Believe it or not, I thought about this already, and yes, it is in my
> plans to make this per-cpupool. However...
>
>> As this will require some
>> extra work regarding the tools interface I'd be absolutely fine with
>> adding this at a later time, but you should have that in mind when
>> setting this up now.
>>
> ...yes, that was phase II in my mind as well.
>
> So (sorry, but just to make sure I understand), since you said you're
> fine with it coming later, are you also fine with this patch, or do you
> think some adjustments are necessary, right here, right now, because of
> that future plan?

No, I'm fine.

>
>>> --- a/xen/common/sched_credit2.c
>>> +++ b/xen/common/sched_credit2.c
>>> @@ -82,10 +82,6 @@
>
>>> @@ -194,6 +190,41 @@ static int __read_mostly opt_overload_balance_tolerance = -3;
>>> integer_param("credit2_balance_over", opt_overload_balance_tolerance);
>>>
>>> /*
>>> + * Runqueue organization.
>>> + *
>>> + * The various cpus are to be assigned each one to a runqueue, and we
>>> + * want that to happen basing on topology. At the moment, it is possible
>>> + * to choose to arrange runqueues to be:
>>> + *
>>> + * - per-core: meaning that there will be one runqueue per each physical
>>> + *             core of the host.
>>> + *             This will happen if the opt_runqueue
>>> + *             parameter is set to 'core';
>>> + *
>>> + * - per-socket: meaning that there will be one runqueue per each physical
>>> + *               socket (AKA package, which often, but not always, also
>>> + *               matches a NUMA node) of the host; This will happen if
>>> + *               the opt_runqueue parameter is set to 'socket';
>>
>> Wouldn't it be a nice idea to add "per-numa-node" as well?
>>
> I think it is.
>
>> This would make a difference for systems with:
>>
>> - multiple sockets per numa-node
>> - multiple numa-nodes per socket
>>
> Yep.
>
>> It might even be a good idea to be able to have only one runqueue in
>> small cpupools (again, this will apply only in case you have a per
>> cpupool setting instead of a global one).
>>
> And I agree on this too.
>
> TBH, I had considered these too, and I was thinking to make them happen
> in phase II as well. However, they're simple enough to be implemented
> now (as in, in v2 of this series), so I think I'll do that.

Thanks.

Juergen
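
For illustration only: the quoted commit message notes that checking
cpu_to_core() alone is not enough for per-core runqueues, because core IDs
repeat across sockets. A minimal C sketch of that check follows. It is not
the code from the patch. The names struct cpu_topo, enum rq_mode,
parse_rq_mode() and share_runqueue() are hypothetical, and the socket and
core IDs are assumed to be handed in by the caller rather than read from the
hypervisor's topology helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-CPU topology record; IDs are supplied by the caller. */
struct cpu_topo {
    unsigned int socket;  /* physical socket (package) ID */
    unsigned int core;    /* core ID, only unique within its socket */
};

/* Hypothetical stand-in for the boot-time runqueue arrangement. */
enum rq_mode { RQ_MODE_CORE, RQ_MODE_SOCKET };

/* Map the option string to a mode; returns 0 on success, -1 if unknown. */
static int parse_rq_mode(const char *s, enum rq_mode *mode)
{
    if (strcmp(s, "core") == 0)
        *mode = RQ_MODE_CORE;
    else if (strcmp(s, "socket") == 0)
        *mode = RQ_MODE_SOCKET;
    else
        return -1;
    return 0;
}

/* True if CPUs a and b belong in the same runqueue under the given mode. */
static bool share_runqueue(enum rq_mode mode,
                           const struct cpu_topo *a,
                           const struct cpu_topo *b)
{
    assert(a != NULL && b != NULL);

    /* CPUs on different sockets never share a runqueue in either mode. */
    if (a->socket != b->socket)
        return false;

    /* Per-socket: being on the same socket is sufficient. */
    if (mode == RQ_MODE_SOCKET)
        return true;

    /* Per-core: core IDs are comparable only once the sockets match. */
    return a->core == b->core;
}
```

With this check, core 1 on socket 0 and core 1 on socket 1 end up in
different runqueues in 'core' mode, which a bare core-ID comparison would
get wrong.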