From mboxrd@z Thu Jan  1 00:00:00 1970
From: George Dunlap <george.dunlap@eu.citrix.com>
Subject: Re: [PATCH 07 of 10 v2] libxl: optimize the calculation
 of how many VCPUs can run on a candidate
Date: Fri, 21 Dec 2012 16:00:00 +0000
Message-ID: <50D48780.70302@eu.citrix.com>
References: <patchbomb.1355944036@Solace>
	<5dc2571ae5faef87977c.1355944043@Solace>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5dc2571ae5faef87977c.1355944043@Solace>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>, Dan Magenheimer <dan.magenheimer@oracle.com>, Ian Campbell <Ian.Campbell@citrix.com>, Anil Madhavapeddy <anil@recoil.org>, Andrew Cooper <Andrew.Cooper3@citrix.com>, Juergen Gross <juergen.gross@ts.fujitsu.com>, Ian Jackson <Ian.Jackson@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jan Beulich <JBeulich@suse.com>, Daniel De Graaf <dgdegra@tycho.nsa.gov>, Matt Wilson <msw@amazon.com>
List-Id: xen-devel@lists.xenproject.org

On 19/12/12 19:07, Dario Faggioli wrote:
> For choosing the best NUMA placement candidate, we need to figure out
> how many VCPUs are runnable on each of them. That requires going through
> all the VCPUs of all the domains and checking their affinities.
>
> With this change, instead of doing the above for each candidate, we
> do it once for all, populating an array while counting. This way, when
> we later are evaluating candidates, all we need is summing up the right
> elements of the array itself.
>
> This reduces the complexity of the overall algorithm, as it moves a
> potentially expensive operation (for_each_vcpu_of_each_domain {})
> out of the core placement loop, so that it is performed only
> once instead of (potentially) tens or hundreds of times.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

You know this code best. :-)  I've looked it over and just have one 
minor suggestion:

>           for (j = 0; j < nr_dom_vcpus; j++) {
> +            /* For each vcpu of each domain, increment the elements of
> +             * the array corresponding to the nodes where the vcpu runs */
> +            libxl_bitmap_set_none(&vcpu_nodemap);
> +            libxl_for_each_set_bit(k, vinfo[j].cpumap) {
> +                int node = tinfo[k].node;

I think I might rename "vcpu_nodemap" to something that better suggests
how it fits into the algorithm -- for instance, "counted_nodemap" or
"nodes_counted" -- something that makes clear this is how we avoid
counting the same vcpu on the same node multiple times.
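
To illustrate what I mean, here is a minimal standalone sketch of the
two-pass idea as I read it. This is not the libxl code: plain integer
bitmasks stand in for libxl_bitmap, and the names count_vcpus_per_node,
candidate_nr_vcpus, cpu_to_node and vcpus_on_node are all made up for
the example. The point is only to show the role the "nodes_counted" map
plays in the counting pass.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 32   /* assumption: node map fits in a uint32_t */
#define MAX_CPUS  64   /* assumption: cpu map fits in a uint64_t */

/* Pass 1, done once: vcpus_on_node[n] ends up holding the number of
 * vcpus whose affinity includes at least one pcpu of node n. */
static void count_vcpus_per_node(const uint64_t *vcpu_cpumap, int nr_vcpus,
                                 const int *cpu_to_node, int nr_cpus,
                                 int *vcpus_on_node, int nr_nodes)
{
    assert(nr_nodes <= MAX_NODES && nr_cpus <= MAX_CPUS);
    memset(vcpus_on_node, 0, (size_t)nr_nodes * sizeof(*vcpus_on_node));

    for (int j = 0; j < nr_vcpus; j++) {
        /* Nodes already credited with this vcpu; reset for each vcpu so
         * that a vcpu is never counted twice on the same node. */
        uint32_t nodes_counted = 0;

        for (int k = 0; k < nr_cpus; k++) {
            if (!(vcpu_cpumap[j] & (UINT64_C(1) << k)))
                continue;               /* vcpu j cannot run on pcpu k */

            int node = cpu_to_node[k];
            assert(node >= 0 && node < nr_nodes);

            if (!(nodes_counted & (UINT32_C(1) << node))) {
                nodes_counted |= UINT32_C(1) << node;
                vcpus_on_node[node]++;
            }
        }
    }
}

/* Pass 2, done per candidate: just sum the precomputed per-node counts
 * for the nodes that belong to the candidate. */
static int candidate_nr_vcpus(uint32_t candidate_nodemap,
                              const int *vcpus_on_node, int nr_nodes)
{
    int total = 0;

    for (int n = 0; n < nr_nodes; n++)
        if (candidate_nodemap & (UINT32_C(1) << n))
            total += vcpus_on_node[n];
    return total;
}
```

Note that in this sketch a vcpu whose affinity spans two nodes
contributes to both per-node counts, so a candidate containing both
nodes sees it twice in the sum; whether that matters for comparing
candidates is a separate question from the naming.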

  -George