From: Wei Liu <wei.liu2@citrix.com>
To: Dario Faggioli <dfaggioli@suse.com>
Cc: xen-devel@lists.xenproject.org, Wei Liu <wei.liu2@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	George Dunlap <george.dunlap@citrix.com>
Subject: Re: [PATCH] tools: libxl/xl: run NUMA placement even when an hard-affinity is set
Date: Mon, 20 Aug 2018 11:14:28 +0100	[thread overview]
Message-ID: <20180820101428.djudte4z2wye3cz3@citrix.com> (raw)
In-Reply-To: <153452538306.14879.2645077465028661264.stgit@Palanthas.fritz.box>

On Fri, Aug 17, 2018 at 07:03:03PM +0200, Dario Faggioli wrote:
> Right now, if either a hard-affinity or a soft-affinity is explicitly
> specified in a domain's config file, automatic NUMA placement is
> skipped. However, automatic NUMA placement affects only the
> soft-affinity of the domain which is being created.
> 
> Therefore, it is ok to let it run if a hard-affinity is specified. The
> semantics will be that the best placement candidate is found while
> respecting the specified hard-affinity, i.e., using only the nodes that
> contain the pcpus in the hard-affinity mask.

The reasoning sounds plausible. I have some questions below.

> 
> This is particularly helpful if global xl pinning masks are defined, as
> made possible by commit aa67b97ed34279c43 ("xl.conf: Add global affinity
> masks"). In fact, without this commit, defining a global affinity mask
> would also mean disabling automatic placement, but that does not
> necessarily have to be the case (especially in large systems).
> 
> Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
> ---
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> ---
>  tools/libxl/libxl_dom.c |   46 ++++++++++++++++++++++++++++++++++++++++------
>  tools/xl/xl_parse.c     |    6 ++++--
>  2 files changed, 44 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index eb401cf1d6..e30e2dca9a 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -27,6 +27,8 @@
>  
>  #include "_paths.h"
>  
> +//#define DEBUG 1
> +

Stray changes here?

You can use NDEBUG instead.
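
Something like this (just a sketch, reusing the check from further down
in your patch) would avoid introducing a private DEBUG macro:

    #ifndef NDEBUG
        int j;

        /* All vcpus are expected to share the same hard affinity here. */
        for (j = 0; j < info->num_vcpu_hard_affinity; j++)
            assert(libxl_bitmap_equal(&info->vcpu_hard_affinity[0],
                                      &info->vcpu_hard_affinity[j], 0));
    #endif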

>  libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid)
>  {
>      libxl_ctx *ctx = libxl__gc_owner(gc);
> @@ -142,12 +144,13 @@ static int numa_place_domain(libxl__gc *gc, uint32_t domid,
>  {
>      int found;
>      libxl__numa_candidate candidate;
> -    libxl_bitmap cpupool_nodemap;
> +    libxl_bitmap cpumap, cpupool_nodemap, *map;
>      libxl_cpupoolinfo cpupool_info;
>      int i, cpupool, rc = 0;
>      uint64_t memkb;
>  
>      libxl__numa_candidate_init(&candidate);
> +    libxl_bitmap_init(&cpumap);
>      libxl_bitmap_init(&cpupool_nodemap);
>      libxl_cpupoolinfo_init(&cpupool_info);
>  
> @@ -162,6 +165,38 @@ static int numa_place_domain(libxl__gc *gc, uint32_t domid,
>      rc = libxl_cpupool_info(CTX, &cpupool_info, cpupool);
>      if (rc)
>          goto out;
> +    map = &cpupool_info.cpumap;
> +
> +    /*
> +     * If there's a well defined hard affinity mask (i.e., the same one for all
> +     * the vcpus), we can try to run the placement considering only the pcpus
> +     * within such mask.
> +     */
> +    if (info->num_vcpu_hard_affinity)
> +    {

Placement of "{" is wrong.

> +#ifdef DEBUG

#ifndef NDEBUG ?

> +        int j;
> +
> +        for (j = 0; j < info->num_vcpu_hard_affinity; j++)
> +            assert(libxl_bitmap_equal(&info->vcpu_hard_affinity[0],
> +                                      &info->vcpu_hard_affinity[j], 0));
> +#endif /* DEBUG */

But why should the above check be debug only? The assumption that all
vcpus share the same hard affinity doesn't seem to always hold.
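
If the masks can legitimately differ, something along these lines
(untested sketch, using only the helpers your patch already calls) looks
safer than an assert to me: narrow the placement to the hard affinity
only when all vcpus share the same mask, and keep using the cpupool's
cpumap otherwise:

    bool same_mask = true;
    int j;

    /* Only honour the hard affinity if every vcpu has the same mask. */
    for (j = 1; j < info->num_vcpu_hard_affinity; j++) {
        if (!libxl_bitmap_equal(&info->vcpu_hard_affinity[0],
                                &info->vcpu_hard_affinity[j], 0)) {
            same_mask = false;
            break;
        }
    }

    if (same_mask) {
        rc = libxl_bitmap_and(CTX, &cpumap, &info->vcpu_hard_affinity[0],
                              &cpupool_info.cpumap);
        if (rc)
            goto out;

        if (!libxl_bitmap_is_empty(&cpumap))
            map = &cpumap;
        else
            LOG(WARN, "Hard affinity completely outside of domain's cpupool?");
    }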

> +
> +        rc = libxl_bitmap_and(CTX, &cpumap, &info->vcpu_hard_affinity[0],
> +                              &cpupool_info.cpumap);
> +        if (rc)
> +            goto out;
> +
> +        /*
> +         * Hard affinity should _really_ contain cpus that are inside our
> +         * cpupool. Anyway, if it does not, log a warning and only use the
> +         * cpupool's cpus for placement.
> +         */
> +        if (!libxl_bitmap_is_empty(&cpumap))
> +            map = &cpumap;
> +        else
> +            LOG(WARN, "Hard affinity completely outside of domain's cpupool?");

Should this be an error?
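
If it should, I would expect something like this (sketch) in place of
the warning:

    if (libxl_bitmap_is_empty(&cpumap)) {
        LOG(ERROR,
            "Hard affinity and the domain's cpupool have no cpu in common");
        rc = ERROR_INVAL;
        goto out;
    }
    map = &cpumap;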

What is the expected interaction for hard affinity and cpupool?

Wei.
