From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: [PATCH 8/9] xen: sched: allow for choosing credit2 runqueues configuration at boot
Date: Thu, 1 Oct 2015 07:48:11 +0200
Message-ID: <560CC91B.80308@suse.com>
References: <20150929164726.17589.96920.stgit@Solace.station> <20150929165625.17589.17838.stgit@Solace.station>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
	by lists.xen.org with esmtp (Exim 4.72) id 1ZhWj1-00021C-9E
	for xen-devel@lists.xenproject.org; Thu, 01 Oct 2015 05:48:15 +0000
In-Reply-To: <20150929165625.17589.17838.stgit@Solace.station>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli, xen-devel@lists.xenproject.org
Cc: George Dunlap, Uma Sharma
List-Id: xen-devel@lists.xenproject.org

On 09/29/2015 06:56 PM, Dario Faggioli wrote:
> In fact, credit2 uses CPU topology to decide how to arrange
> its internal runqueues. Before this change, only 'one runqueue
> per socket' was allowed. However, experiments have shown that,
> for instance, having one runqueue per physical core improves
> performance, especially when hyperthreading is available.
>
> In general, it makes sense to allow users to pick one runqueue
> arrangement at boot time, so that:
>  - more experiments can be easily performed to even better
>    assess and improve performance;
>  - one can select the best configuration for their specific
>    use case and/or hardware.
>
> This patch enables the above.
>
> Note that, for correctly arranging runqueues to be per-core,
> just checking cpu_to_core() on the host CPUs is not enough.
> In fact, cores (and hyperthreads) on different sockets can
> have the same core (and thread) IDs! We, therefore, need to
> check whether the full topology of two CPUs matches, for
> them to be put in the same runqueue.
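
That full-topology check is the subtle part, so here is how I read it.
I'd expect the matching to be structured roughly like the sketch below
(for discussion only: it uses the existing cpu_to_socket() and
cpu_to_core() helpers, but cpus_share_runqueue() and the OPT_RUNQUEUE_*
values are made-up names, as the quoted hunk below stops before the code
doing the actual comparison):

/*
 * Sketch only, not the patch's actual code: should cpua and cpub be
 * assigned to the same runqueue?
 */
static bool_t cpus_share_runqueue(unsigned int cpua, unsigned int cpub)
{
    /*
     * Core (and thread) IDs are unique only within one socket, so
     * sockets must match before a core ID comparison means anything.
     */
    if ( cpu_to_socket(cpua) != cpu_to_socket(cpub) )
        return 0;

    /* Per-socket runqueues (the 'socket' default): same socket suffices. */
    if ( opt_runqueue == OPT_RUNQUEUE_SOCKET )
        return 1;

    /* Per-core runqueues ('core'): same socket AND same core ID. */
    return cpu_to_core(cpua) == cpu_to_core(cpub);
}

Structured like that, a per-numa-node variant (more on this below) would
just be one more check, using cpu_to_node().
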
> Note also that the default for credit2 has, so far, been per-socket
> runqueues (although that was not actually functional). This patch
> leaves things that way, to avoid mixing policy and technical changes.

I think you should think about a way to make this parameter a
per-cpupool one instead of a system-global one. As this will require
some extra work regarding the tools interface, I'd be absolutely fine
with adding this at a later time, but you should keep that in mind when
setting this up now.

>
> Signed-off-by: Dario Faggioli
> Signed-off-by: Uma Sharma
> ---
> Cc: George Dunlap
> Cc: Uma Sharma
> ---
>  docs/misc/xen-command-line.markdown |   11 +++++++
>  xen/common/sched_credit2.c          |   57 ++++++++++++++++++++++++++++++++---
>  2 files changed, 63 insertions(+), 5 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index a2e427c..71315b8 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -467,6 +467,17 @@ combination with the `low_crashinfo` command line option.
>  ### credit2\_load\_window\_shift
>  > `= <integer>`
>
> +### credit2\_runqueue
> +> `= socket | core`
> +
> +> Default: `socket`
> +
> +Specify how host CPUs are arranged in runqueues. Runqueues are kept
> +balanced with respect to the load generated by the vCPUs running on
> +them. Smaller runqueues (as in with `core`) means more accurate load
> +balancing (for instance, it will deal better with hyperthreading),
> +but also more overhead.
> +
>  ### dbgp
>  > `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 38f382e..025626f 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -82,10 +82,6 @@
>   * Credits are "reset" when the next vcpu in the runqueue is less than
>   * or equal to zero. At that point, everyone's credits are "clipped"
>   * to a small value, and a fixed credit is added to everyone.
> - *
> - * The plan is for all cores that share an L2 will share the same
> - * runqueue. At the moment, there is one global runqueue for all
> - * cores.
>   */
>
>  /*
> @@ -194,6 +190,41 @@ static int __read_mostly opt_overload_balance_tolerance = -3;
>  integer_param("credit2_balance_over", opt_overload_balance_tolerance);
>
>  /*
> + * Runqueue organization.
> + *
> + * The various cpus are to be assigned each one to a runqueue, and we
> + * want that to happen basing on topology. At the moment, it is possible
> + * to choose to arrange runqueues to be:
> + *
> + * - per-core: meaning that there will be one runqueue per each physical
> + *             core of the host. This will happen if the opt_runqueue
> + *             parameter is set to 'core';
> + *
> + * - per-socket: meaning that there will be one runqueue per each physical
> + *               socket (AKA package, which often, but not always, also
> + *               matches a NUMA node) of the host; This will happen if
> + *               the opt_runqueue parameter is set to 'socket';

Wouldn't it be a nice idea to add "per-numa-node" as well? This would
make a difference for systems with:

- multiple sockets per numa-node
- multiple numa-nodes per socket

It might even be a good idea to be able to have only one runqueue in
small cpupools (again, this will apply only in case you have a
per-cpupool setting instead of a global one).


Juergen