From mboxrd@z Thu Jan  1 00:00:00 1970
From: George Dunlap <george.dunlap@eu.citrix.com>
Subject: Re: [PATCH 1 of 6 v2] xen: sched_credit: improve
 picking up the idlal CPU for a VCPU
Date: Fri, 14 Dec 2012 19:16:27 +0000
Message-ID: <50CB7B0B.9010605@eu.citrix.com>
References: <patchbomb.1355280770@Solace>
	<bced65aa4410b0272064.1355280771@Solace>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <bced65aa4410b0272064.1355280771@Solace>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "Keir
	(Xen.org)" <keir@xen.org>, Ian Campbell <Ian.Campbell@citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 12/12/12 02:52, Dario Faggioli wrote:
> In _csched_cpu_pick() we try to select the best possible CPU for
> running a VCPU, considering the characteristics of the underlying
> hardware (i.e., how many threads, core, sockets, and how busy they
> are). What we want is "the idle execution vehicle with the most
> idling neighbours in its grouping".
>
> In order to achieve it, we select a CPU from the VCPU's affinity,
> giving preference to its current processor if possible, as the basis
> for the comparison with all the other CPUs. Problem is, to discount
> the VCPU itself when computing this "idleness" (in an attempt to be
> fair wrt its current processor), we arbitrarily and unconditionally
> consider that selected CPU as idle, even when it is not the case,
> for instance:
>   1. If the CPU is not the one where the VCPU is running (perhaps due
>      to the affinity being changed);
>   2. The CPU is where the VCPU is running, but it has other VCPUs in
>      its runq, so it won't go idle even if the VCPU in question goes.

Good catch -- thanks.  Comments below.

> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -59,6 +59,8 @@
>   #define CSCHED_VCPU(_vcpu)  ((struct csched_vcpu *) (_vcpu)->sched_priv)
>   #define CSCHED_DOM(_dom)    ((struct csched_dom *) (_dom)->sched_priv)
>   #define RUNQ(_cpu)          (&(CSCHED_PCPU(_cpu)->runq))
> +/* Is the first element of _cpu's runq its idle vcpu? */
> +#define IS_RUNQ_IDLE(_cpu)  (is_idle_vcpu(__runq_elem(RUNQ(_cpu)->next)->vcpu))
>   
>   
>   /*
> @@ -479,9 +481,14 @@ static int
>        * distinct cores first and guarantees we don't do something stupid
>        * like run two VCPUs on co-hyperthreads while there are idle cores
>        * or sockets.
> +     *
> +     * Notice that, when computing the "idleness" of cpu, we may want to
> +     * discount vc. That is, iff vc is the currently running and the only
> +     * runnable vcpu on cpu, we add cpu to the idlers.
>        */
>       cpumask_and(&idlers, &cpu_online_map, CSCHED_PRIV(ops)->idlers);
> -    cpumask_set_cpu(cpu, &idlers);
> +    if ( current_on_cpu(cpu) == vc && IS_RUNQ_IDLE(cpu) )
> +        cpumask_set_cpu(cpu, &idlers);

Why bother with this whole "current_on_cpu()" thing, when you can just 
look at vc->processor?  I.e.:

if ( cpu == vc->processor && IS_RUNQ_IDLE(cpu) )

>       cpumask_and(&cpus, &cpus, &idlers);
>       cpumask_clear_cpu(cpu, &cpus);
>   
> @@ -489,7 +496,7 @@ static int
>       {
>           cpumask_t cpu_idlers;
>           cpumask_t nxt_idlers;
> -        int nxt, weight_cpu, weight_nxt;
> +        int nxt, nr_idlers_cpu, nr_idlers_nxt;

I think Jan is right, this probably should be a separate patch.

  -George