From: Dario Faggioli <raistlin@linux.it>
To: xen-devel@lists.xen.org
Cc: Andre Przywara <andre.przywara@amd.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: [PATCH 06 of 10 [RFC]] xl: Allow user to set or change node affinity on-line
Date: Wed, 11 Apr 2012 15:17:53 +0200
Message-ID: <64547b45cb112a35d3a2.1334150273@Solace>
In-Reply-To: <patchbomb.1334150267@Solace>

For feature parity with vcpu affinity, allow specifying node
affinity not only at domain creation time, but at run time too.

Of course, this is not going to be as effective as doing it at
creation time, since it only affects future memory allocations
without touching what is already there. However, in the future we
might want to change this, and use it as an interface for a sort
of manual "domain node migration".

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -556,6 +556,26 @@ different run state is appropriate.  Pin
 this, by ensuring certain VCPUs can only run on certain physical
 CPUs.
 
+=item B<node-affinity> I<domain-id> I<nodes>
+
+Set the NUMA node affinity for the domain, i.e., the set of NUMA
+nodes of the host from which the memory of the domain will be
+allocated. More specifically, the domain's memory will be split
+in equal (or as equal as possible) parts among all the nodes
+it is affine with. The keyword B<all> can be used to make the
+domain affine to all the NUMA nodes in the host.
+
+Normally, the NUMA node affinity of a domain is computed
+automatically from its VCPU affinity: by default, it comprises
+all the nodes to which the PCPUs the domain's VCPUs are pinned
+belong. Specifying it manually makes it possible to restrict it
+to a specific subset of the host's NUMA nodes, for improved
+locality of the domain's memory accesses. Notice, however, that
+this will not affect the memory that has already been allocated.
+To have the full amount of memory allocated on specific node(s)
+right from domain creation time, use the domain's configuration
+file instead.
+
 =item B<vncviewer> [I<OPTIONS>] I<domain-id>
 
 Attach to domain's VNC server, forking a vncviewer process.
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -442,7 +442,7 @@ void libxl_map_dispose(struct libxl_map 
     free(map->map);
 }
 
-static int libxl_map_alloc(libxl_ctx *ctx, struct libxl_map *map, int n_elems)
+int libxl_map_alloc(libxl_ctx *ctx, struct libxl_map *map, int n_elems)
 {
     int sz;
 
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -64,6 +64,7 @@ int libxl_devid_to_device_nic(libxl_ctx 
 int libxl_vdev_to_device_disk(libxl_ctx *ctx, uint32_t domid, const char *vdev,
                                libxl_device_disk *disk);
 
+int libxl_map_alloc(libxl_ctx *ctx, struct libxl_map *map, int n_elems);
 int libxl_map_test(struct libxl_map *map, int elem);
 void libxl_map_set(struct libxl_map *map, int elem);
 void libxl_map_reset(struct libxl_map *map, int elem);
@@ -79,6 +80,10 @@ static inline int libxl_map_elem_valid(s
 {
     return elem >= 0 && elem < (map->size * 8);
 }
+#define libxl_for_each_elem(v, m) for (v = 0; v < (m).size * 8; v++)
+#define libxl_for_each_set_elem(v, m) for (v = 0; v < (m).size * 8; v++) \
+                                              if (libxl_map_test(&(m), v))
+
 
 int libxl_cpumap_alloc(libxl_ctx *ctx, libxl_cpumap *cpumap);
 static inline int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu)
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -54,6 +54,7 @@ int main_config_update(int argc, char **
 int main_button_press(int argc, char **argv);
 int main_vcpupin(int argc, char **argv);
 int main_vcpuset(int argc, char **argv);
+int main_nodeaffinity(int argc, char **argv);
 int main_memmax(int argc, char **argv);
 int main_memset(int argc, char **argv);
 int main_sched_credit(int argc, char **argv);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -448,65 +448,75 @@ static void split_string_into_string_lis
     free(s);
 }
 
-static int vcpupin_parse(char *cpu, libxl_cpumap *cpumap)
+static int affinity_parse(char *str, struct libxl_map *map, int n_elems)
 {
-    libxl_cpumap exclude_cpumap;
-    uint32_t cpuida, cpuidb;
+    struct libxl_map exclude_map;
+    uint32_t stra, strb;
     char *endptr, *toka, *tokb, *saveptr = NULL;
-    int i, rc = 0, rmcpu;
-
-    if (!strcmp(cpu, "all")) {
-        libxl_cpumap_set_any(cpumap);
+    int i, rc = 0, rmelem;
+
+    if (!strcmp(str, "all")) {
+        libxl_map_set_any(map);
         return 0;
     }
 
-    if (libxl_cpumap_alloc(ctx, &exclude_cpumap)) {
-        fprintf(stderr, "Error: Failed to allocate cpumap.\n");
+    if (libxl_map_alloc(ctx, &exclude_map, n_elems)) {
+        fprintf(stderr, "Error: Failed to allocate libxl_map.\n");
         return ENOMEM;
     }
 
-    for (toka = strtok_r(cpu, ",", &saveptr); toka;
+    for (toka = strtok_r(str, ",", &saveptr); toka;
          toka = strtok_r(NULL, ",", &saveptr)) {
-        rmcpu = 0;
+        rmelem = 0;
         if (*toka == '^') {
             /* This (These) Cpu(s) will be removed from the map */
             toka++;
-            rmcpu = 1;
+            rmelem = 1;
         }
         /* Extract a valid (range of) cpu(s) */
-        cpuida = cpuidb = strtoul(toka, &endptr, 10);
+        stra = strb = strtoul(toka, &endptr, 10);
         if (endptr == toka) {
             fprintf(stderr, "Error: Invalid argument.\n");
             rc = EINVAL;
-            goto vcpp_out;
+            goto afp_out;
         }
         if (*endptr == '-') {
             tokb = endptr + 1;
-            cpuidb = strtoul(tokb, &endptr, 10);
-            if (endptr == tokb || cpuida > cpuidb) {
+            strb = strtoul(tokb, &endptr, 10);
+            if (endptr == tokb || stra > strb) {
                 fprintf(stderr, "Error: Invalid argument.\n");
                 rc = EINVAL;
-                goto vcpp_out;
+                goto afp_out;
             }
         }
-        while (cpuida <= cpuidb) {
-            rmcpu == 0 ? libxl_cpumap_set(cpumap, cpuida) :
-                         libxl_cpumap_set(&exclude_cpumap, cpuida);
-            cpuida++;
+        while (stra <= strb) {
+            rmelem == 0 ? libxl_map_set(map, stra) :
+                          libxl_map_set(&exclude_map, stra);
+            stra++;
         }
     }
 
     /* Clear all the cpus from the removal list */
-    libxl_for_each_set_cpu(i, exclude_cpumap) {
-        libxl_cpumap_reset(cpumap, i);
-    }
-
-vcpp_out:
-    libxl_cpumap_dispose(&exclude_cpumap);
+    libxl_for_each_set_elem(i, exclude_map) {
+        libxl_map_reset(map, i);
+    }
+
+afp_out:
+    libxl_map_dispose(&exclude_map);
 
     return rc;
 }
 
+static inline int vcpupin_parse(char *cpu, libxl_cpumap *cpumap)
+{
+    return affinity_parse(cpu, cpumap, libxl_get_max_cpus(ctx));
+}
+
+static inline int nodeaffinity_parse(char *nodes, libxl_nodemap *nodemap)
+{
+    return affinity_parse(nodes, nodemap, libxl_get_max_nodes(ctx));
+}
+
 static void parse_config_data(const char *configfile_filename_report,
                               const char *configfile_data,
                               int configfile_len,
@@ -3873,6 +3883,40 @@ int main_vcpuset(int argc, char **argv)
     return 0;
 }
 
+static void nodeaffinity(const char *d, char *nodes)
+{
+    libxl_nodemap nodemap;
+
+    find_domain(d);
+
+    if (libxl_nodemap_alloc(ctx, &nodemap))
+        goto nodeaffinity_out;
+
+    if (!strcmp(nodes, "all"))
+        libxl_nodemap_set_any(&nodemap);
+    else if (nodeaffinity_parse(nodes, &nodemap))
+        goto nodeaffinity_out1;
+
+    if (libxl_set_node_affinity(ctx, domid, &nodemap) == -1)
+        fprintf(stderr, "Could not set node affinity for dom `%d'.\n", domid);
+
+  nodeaffinity_out1:
+    libxl_nodemap_dispose(&nodemap);
+  nodeaffinity_out:
+    ;
+}
+
+int main_nodeaffinity(int argc, char **argv)
+{
+    int opt;
+
+    if ((opt = def_getopt(argc, argv, "", "node-affinity", 2)) != -1)
+        return opt;
+
+    nodeaffinity(argv[optind], argv[optind+1]);
+    return 0;
+}
+
 static void output_xeninfo(void)
 {
     const libxl_version_info *info;
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -195,6 +195,11 @@ struct cmd_spec cmd_table[] = {
       "Set the number of active VCPUs allowed for the domain",
       "<Domain> <vCPUs>",
     },
+    { "node-affinity",
+      &main_nodeaffinity, 0,
+      "Set the NUMA node affinity for the domain",
+      "<Domain> <Nodes|all>",
+    },
     { "list-vm",
       &main_list_vm, 0,
       "List the VMs,without DOM0",

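[Illustration only, not part of the patch: a minimal C sketch of how
the generic map macros introduced in libxl_utils.h above could be
used. It assumes, as per the earlier patches in this series, that
libxl_nodemap is an instance of struct libxl_map, that a
libxl_nodemap_set() helper exists alongside the _alloc()/_set_any()/
_dispose() ones used above, and that ctx is an initialized libxl_ctx:]

    libxl_nodemap nodemap;
    int node;

    if (libxl_nodemap_alloc(ctx, &nodemap))
        return ENOMEM;

    /* mark nodes 0 and 2 in the map
     * (libxl_nodemap_set() is assumed here) */
    libxl_nodemap_set(&nodemap, 0);
    libxl_nodemap_set(&nodemap, 2);

    /* the macro only visits the elements that are set,
     * so this prints "node 0" and "node 2" */
    libxl_for_each_set_elem(node, nodemap)
        printf("node %d\n", node);

    libxl_nodemap_dispose(&nodemap);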