From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xen.org
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
Keir Fraser <keir@xen.org>,
Ian Campbell <Ian.Campbell@citrix.com>,
Li Yechen <lccycc123@gmail.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <Andrew.Cooper3@citrix.com>,
Juergen Gross <juergen.gross@ts.fujitsu.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Jan Beulich <JBeulich@suse.com>,
Justin Weaver <jtweaver@hawaii.edu>, Matt Wilson <msw@amazon.com>,
Elena Ufimtseva <ufimtseva@gmail.com>
Subject: [PATCH v5 16/17] xl: enable specifying node-affinity in the config file
Date: Mon, 02 Dec 2013 19:29:51 +0100
Message-ID: <20131202182951.29026.71841.stgit@Solace>
In-Reply-To: <20131202180129.29026.81543.stgit@Solace>
in a similar way to how it is possible to specify vcpu-affinity.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Changes from v4:
* fix typos and rephrase docs, as suggested during review;
* more refactoring, i.e., factoring out more potential common
code, as requested during review.
Changes from v3:
* fix typos and language issues in docs and comments, as
suggested during review;
* common code for soft and hard affinity parsing factored
together, as requested during review.
Changes from v2:
* use the new libxl API. Although the implementation changed
only a little bit, I removed IanJ's Acked-by, while noting
here, as requested, that he did provide it.
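For illustration, a (hypothetical) guest config fragment using both
options could look like this; the cpu numbers are made up and are only
meant to show the syntax:

    # pin the guest's vcpus to pcpus 2 and 3 (hard affinity)
    cpus = "2,3"
    # prefer pcpus 0-3 (soft affinity); as per the docs hunk below, the
    # intersection with cpus= (here, pcpus 2-3) is what drives the
    # domain node-affinity, and hence memory allocations
    cpus_soft = "0-3"

Using the list form instead (e.g., cpus = ["2", "3"]) would, in
addition, establish one-by-one vcpu to pcpu mappings, via the stashing
logic visible in the diff.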
---
docs/man/xl.cfg.pod.5 | 23 ++++-
tools/libxl/xl_cmdimpl.c | 216 ++++++++++++++++++++++++++++++----------------
2 files changed, 161 insertions(+), 78 deletions(-)
diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 70d6d9f..cce51ae 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -144,19 +144,36 @@ run on cpu #3 of the host.
=back
If this option is not specified, no vcpu to cpu pinning is established,
-and the vcpus of the guest can run on all the cpus of the host.
+and the vcpus of the guest can run on all the cpus of the host. If this
+option is specified, the intersection of the vcpu pinning mask, provided
+here, and the soft affinity mask, provided via B<cpus_soft=> (if any),
+is utilized to compute the domain node-affinity, for driving memory
+allocations.
If we are on a NUMA machine (i.e., if the host has more than one NUMA
node) and this option is not specified, libxl automatically tries to
place the guest on the least possible number of nodes. That, however,
will not affect vcpu pinning, so the guest will still be able to run on
-all the cpus, it will just prefer the ones from the node it has been
-placed on. A heuristic approach is used for choosing the best node (or
+all the cpus. A heuristic approach is used for choosing the best node (or
set of nodes), with the goals of maximizing performance for the guest
and, at the same time, achieving efficient utilization of host cpus
and memory. See F<docs/misc/xl-numa-placement.markdown> for more
details.
+=item B<cpus_soft="CPU-LIST">
+
+Exactly as B<cpus=>, but specifies soft affinity, rather than pinning
+(hard affinity). When using the credit scheduler, this means the cpus
+on which the vcpus of the domain prefer to run.
+
+A C<CPU-LIST> is specified exactly as above, for B<cpus=>.
+
+If this option is not specified, the vcpus of the guest will not have
+any preference regarding what cpus to run on. If this option is specified,
+the intersection of the soft affinity mask, provided here, and the vcpu
+pinning, provided via B<cpus=> (if any), is utilized to compute the
+domain node-affinity, for driving memory allocations.
+
=back
=head3 CPU Scheduling
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index ed4801a..768b6da 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -76,8 +76,9 @@ xlchild children[child_max];
static const char *common_domname;
static int fd_lock = -1;
-/* Stash for specific vcpu to pcpu mappping */
+/* Stash for specific vcpu to pcpu hard and soft mapping */
static int *vcpu_to_pcpu;
+static int *vcpu_to_pcpu_soft;
static const char savefileheader_magic[32]=
"Xen saved domain, xl format\n \0 \r";
@@ -686,6 +687,92 @@ static int vcpupin_parse(char *cpu, libxl_bitmap *cpumap)
return rc;
}
+static int *parse_config_cpumap_list(XLU_ConfigList *cpus,
+ libxl_bitmap *cpumap,
+ int max_vcpus)
+{
+ int i, n_cpus = 0;
+ int *to_pcpu;
+ const char *buf;
+
+ if (libxl_cpu_bitmap_alloc(ctx, cpumap, 0)) {
+ fprintf(stderr, "Unable to allocate cpumap\n");
+ exit(1);
+ }
+
+ /* Prepare the array for single vcpu to pcpu mappings */
+ to_pcpu = xmalloc(sizeof(int) * max_vcpus);
+ memset(to_pcpu, -1, sizeof(int) * max_vcpus);
+
+ /*
+ * Idea here is to let libxl think all the domain's vcpus
+ * have cpu affinity with all the pcpus on the list. Doing
+ * that ensures memory is allocated on the proper NUMA nodes.
+ * It is then up to us, here in xl, to match each single vcpu
+ * to its pcpu (and that is why we need to stash this info in
+ * the to_pcpu array now) after the domain has been created.
+ * This way, we avoid having to pass to libxl some big array
+ * hosting the single mappings.
+ */
+ libxl_bitmap_set_none(cpumap);
+ while ((buf = xlu_cfg_get_listitem(cpus, n_cpus)) != NULL) {
+ i = atoi(buf);
+ if (!libxl_bitmap_cpu_valid(cpumap, i)) {
+ fprintf(stderr, "cpu %d illegal\n", i);
+ exit(1);
+ }
+ libxl_bitmap_set(cpumap, i);
+ if (n_cpus < max_vcpus)
+ to_pcpu[n_cpus] = i;
+ n_cpus++;
+ }
+
+ return to_pcpu;
+}
+
+static void parse_config_cpumap_string(const char *buf, libxl_bitmap *cpumap)
+{
+ char *buf2 = strdup(buf);
+
+ if (libxl_cpu_bitmap_alloc(ctx, cpumap, 0)) {
+ fprintf(stderr, "Unable to allocate cpumap\n");
+ exit(1);
+ }
+
+ libxl_bitmap_set_none(cpumap);
+ if (vcpupin_parse(buf2, cpumap))
+ exit(1);
+ free(buf2);
+}
+
+static void parse_cpu_affinity(XLU_Config *config, const char *what,
+ libxl_domain_build_info *b_info)
+{
+ XLU_ConfigList *cpus;
+ const char *buf;
+ libxl_bitmap *map;
+ int **array;
+
+ if (!strcmp(what, "cpus")) {
+ map = &b_info->cpumap;
+ array = &vcpu_to_pcpu;
+ } else if (!strcmp(what, "cpus_soft")) {
+ map = &b_info->cpumap_soft;
+ array = &vcpu_to_pcpu_soft;
+ } else
+ return;
+
+ if (!xlu_cfg_get_list (config, what, &cpus, 0, 1))
+ *array = parse_config_cpumap_list(cpus, map, b_info->max_vcpus);
+ else if (!xlu_cfg_get_string (config, what, &buf, 0))
+ parse_config_cpumap_string(buf, map);
+ else
+ return;
+
+ /* We have a hard and/or soft affinity: disable automatic placement */
+ libxl_defbool_set(&b_info->numa_placement, false);
+}
+
static void parse_config_data(const char *config_source,
const char *config_data,
int config_len,
@@ -696,7 +783,8 @@ static void parse_config_data(const char *config_source,
const char *buf;
long l;
XLU_Config *config;
- XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
+ XLU_ConfigList *vbds, *nics, *pcis;
+ XLU_ConfigList *cvfbs, *cpuids, *vtpms;
XLU_ConfigList *ioports, *irqs, *iomem;
int num_ioports, num_irqs, num_iomem;
int pci_power_mgmt = 0;
@@ -818,60 +906,9 @@ static void parse_config_data(const char *config_source,
if (!xlu_cfg_get_long (config, "maxvcpus", &l, 0))
b_info->max_vcpus = l;
- if (!xlu_cfg_get_list (config, "cpus", &cpus, 0, 1)) {
- int n_cpus = 0;
- if (libxl_cpu_bitmap_alloc(ctx, &b_info->cpumap, 0)) {
- fprintf(stderr, "Unable to allocate cpumap\n");
- exit(1);
- }
-
- /* Prepare the array for single vcpu to pcpu mappings */
- vcpu_to_pcpu = xmalloc(sizeof(int) * b_info->max_vcpus);
- memset(vcpu_to_pcpu, -1, sizeof(int) * b_info->max_vcpus);
-
- /*
- * Idea here is to let libxl think all the domain's vcpus
- * have cpu affinity with all the pcpus on the list.
- * It is then us, here in xl, that matches each single vcpu
- * to its pcpu (and that's why we need to stash such info in
- * the vcpu_to_pcpu array now) after the domain has been created.
- * Doing it like this saves the burden of passing to libxl
- * some big array hosting the single mappings. Also, using
- * the cpumap derived from the list ensures memory is being
- * allocated on the proper nodes anyway.
- */
- libxl_bitmap_set_none(&b_info->cpumap);
- while ((buf = xlu_cfg_get_listitem(cpus, n_cpus)) != NULL) {
- i = atoi(buf);
- if (!libxl_bitmap_cpu_valid(&b_info->cpumap, i)) {
- fprintf(stderr, "cpu %d illegal\n", i);
- exit(1);
- }
- libxl_bitmap_set(&b_info->cpumap, i);
- if (n_cpus < b_info->max_vcpus)
- vcpu_to_pcpu[n_cpus] = i;
- n_cpus++;
- }
-
- /* We have a cpumap, disable automatic placement */
- libxl_defbool_set(&b_info->numa_placement, false);
- }
- else if (!xlu_cfg_get_string (config, "cpus", &buf, 0)) {
- char *buf2 = strdup(buf);
-
- if (libxl_cpu_bitmap_alloc(ctx, &b_info->cpumap, 0)) {
- fprintf(stderr, "Unable to allocate cpumap\n");
- exit(1);
- }
-
- libxl_bitmap_set_none(&b_info->cpumap);
- if (vcpupin_parse(buf2, &b_info->cpumap))
- exit(1);
- free(buf2);
-
- libxl_defbool_set(&b_info->numa_placement, false);
- }
+ parse_cpu_affinity(config, "cpus", b_info);
+ parse_cpu_affinity(config, "cpus_soft", b_info);
if (!xlu_cfg_get_long (config, "memory", &l, 0)) {
b_info->max_memkb = l * 1024;
@@ -1990,6 +2027,40 @@ static void evdisable_disk_ejects(libxl_evgen_disk_eject **diskws,
}
}
+static inline int set_vcpu_to_pcpu_affinity(uint32_t domid, int *to_pcpu,
+ int max_vcpus, int soft)
+{
+ libxl_bitmap vcpu_cpumap;
+ libxl_bitmap *softmap = NULL, *hardmap = NULL;
+ int i, ret = 0;
+
+ ret = libxl_cpu_bitmap_alloc(ctx, &vcpu_cpumap, 0);
+ if (ret)
+ return -1;
+
+ if (soft)
+ softmap = &vcpu_cpumap;
+ else
+ hardmap = &vcpu_cpumap;
+
+ for (i = 0; i < max_vcpus; i++) {
+ if (to_pcpu[i] != -1) {
+ libxl_bitmap_set_none(&vcpu_cpumap);
+ libxl_bitmap_set(&vcpu_cpumap, to_pcpu[i]);
+ } else {
+ libxl_bitmap_set_any(&vcpu_cpumap);
+ }
+ if (libxl_set_vcpuaffinity(ctx, domid, i, hardmap, softmap)) {
+ fprintf(stderr, "setting affinity failed on vcpu `%d'.\n", i);
+ ret = -1;
+ break;
+ }
+ }
+ libxl_bitmap_dispose(&vcpu_cpumap);
+
+ return ret;
+}
+
static uint32_t create_domain(struct domain_create *dom_info)
{
uint32_t domid = INVALID_DOMID;
@@ -2206,31 +2277,26 @@ start:
if ( ret )
goto error_out;
- /* If single vcpu to pcpu mapping was requested, honour it */
+ /* If single vcpu pinning or soft affinity was requested, honour it */
if (vcpu_to_pcpu) {
- libxl_bitmap vcpu_cpumap;
+ ret = set_vcpu_to_pcpu_affinity(domid, vcpu_to_pcpu,
+ d_config.b_info.max_vcpus, 0);
+ free(vcpu_to_pcpu);
- ret = libxl_cpu_bitmap_alloc(ctx, &vcpu_cpumap, 0);
if (ret)
goto error_out;
- for (i = 0; i < d_config.b_info.max_vcpus; i++) {
- if (vcpu_to_pcpu[i] != -1) {
- libxl_bitmap_set_none(&vcpu_cpumap);
- libxl_bitmap_set(&vcpu_cpumap, vcpu_to_pcpu[i]);
- } else {
- libxl_bitmap_set_any(&vcpu_cpumap);
- }
- if (libxl_set_vcpuaffinity(ctx, domid, i, &vcpu_cpumap, NULL)) {
- fprintf(stderr, "setting affinity failed on vcpu `%d'.\n", i);
- libxl_bitmap_dispose(&vcpu_cpumap);
- free(vcpu_to_pcpu);
- ret = ERROR_FAIL;
- goto error_out;
- }
- }
- libxl_bitmap_dispose(&vcpu_cpumap);
- free(vcpu_to_pcpu); vcpu_to_pcpu = NULL;
+ vcpu_to_pcpu = NULL;
+ }
+ if (vcpu_to_pcpu_soft) {
+ ret = set_vcpu_to_pcpu_affinity(domid, vcpu_to_pcpu_soft,
+ d_config.b_info.max_vcpus, 1);
+ free(vcpu_to_pcpu_soft);
+
+ if (ret)
+ goto error_out;
+
+ vcpu_to_pcpu_soft = NULL;
}
ret = libxl_userdata_store(ctx, domid, "xl",
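As a side note for reviewers, here is a minimal, self-contained sketch
of the calling convention the new set_vcpu_to_pcpu_affinity() helper
relies on, i.e., the hard and soft variant of libxl_set_vcpuaffinity()
introduced earlier in this series (passing NULL for the map one does
not want to touch). The domid and cpu numbers are made up and error
handling is trimmed, so this is only an illustration, not part of the
patch:

    #include <stdio.h>
    #include <libxl.h>
    #include <libxl_utils.h>

    int main(void)
    {
        libxl_ctx *ctx = NULL;
        libxl_bitmap soft;
        uint32_t domid = 1; /* made-up domid, for illustration only */

        if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, NULL))
            return 1;

        /* Allocate a cpumap sized after the host's number of pcpus */
        if (libxl_cpu_bitmap_alloc(ctx, &soft, 0)) {
            libxl_ctx_free(ctx);
            return 1;
        }

        /* Express a preference of vcpu 0 for pcpu 3... */
        libxl_bitmap_set_none(&soft);
        libxl_bitmap_set(&soft, 3);

        /*
         * ...and pass it as the soft map only, leaving the hard map
         * NULL so any existing pinning is left alone. This mirrors
         * what set_vcpu_to_pcpu_affinity() does for the soft case.
         */
        if (libxl_set_vcpuaffinity(ctx, domid, 0, NULL, &soft))
            fprintf(stderr, "setting soft affinity failed\n");

        libxl_bitmap_dispose(&soft);
        libxl_ctx_free(ctx);
        return 0;
    }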