From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xen.org
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
Keir Fraser <keir@xen.org>,
Ian Campbell <ian.campbell@citrix.com>,
Li Yechen <lccycc123@gmail.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <Andrew.Cooper3@citrix.com>,
Juergen Gross <juergen.gross@ts.fujitsu.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Jan Beulich <JBeulich@suse.com>,
Justin Weaver <jtweaver@hawaii.edu>, Matt Wilson <msw@amazon.com>,
Elena Ufimtseva <ufimtseva@gmail.com>
Subject: [PATCH v5 17/17] libxl: automatic NUMA placement affects soft affinity
Date: Mon, 02 Dec 2013 19:29:59 +0100
Message-ID: <20131202182959.29026.88356.stgit@Solace>
In-Reply-To: <20131202180129.29026.81543.stgit@Solace>
vCPU soft affinity and NUMA-aware scheduling do not have
to be related. However, soft affinity is how NUMA-aware
scheduling is actually implemented, and therefore, by default,
the results of automatic NUMA placement (at VM creation time)
are also used to set the soft affinity of all the vCPUs of
the domain.
Of course, this only happens if automatic NUMA placement is
enabled and actually takes place (for instance, if the user
does not specify any hard or soft affinity in the xl config
file).
This also takes care of the vice versa, i.e., automatic
placement is not triggered if the config file specifies either
a hard (the check for which was already there) or a soft (the
check for which is introduced by this commit) affinity.
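To make the interaction concrete, here are two illustrative xl
config fragments (the guest names and CPU lists below are made up,
and only the options relevant here are shown):

  # Neither "cpus=" nor "cpus_soft=" specified: on a NUMA host,
  # automatic placement runs, and the soft affinity of all the
  # vcpus is set to the pcpus of the chosen node(s).
  name = "guest-auto"
  vcpus = 4

  # A soft affinity is specified: automatic placement is not
  # triggered, and the provided mask is used as it is.
  name = "guest-soft"
  vcpus = 4
  cpus_soft = "0-3"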
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes from v3:
* rephrase comments and docs, as suggested during review.
---
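For reference, the rule by which a domain's node-affinity is derived
from hard and soft affinity (described in the xl-numa-placement.markdown
hunk below) can be sketched in a few lines of C. This is an illustrative,
self-contained sketch, not Xen or libxl code: the topology helper
pcpu_to_node() and the fixed sizes are made-up assumptions.

  #include <stdbool.h>
  #include <stdio.h>

  #define NR_CPUS  16
  #define NR_NODES 2

  /* Hypothetical topology: 8 pCPUs per node, purely for illustration. */
  static int pcpu_to_node(int cpu)
  {
      return cpu / (NR_CPUS / NR_NODES);
  }

  /*
   * A node is part of the domain's node-affinity iff at least one pCPU
   * belonging to it is present in both the hard and the soft affinity
   * masks (i.e., node-affinity follows the hard/soft intersection).
   */
  static void derive_node_affinity(const bool hard[NR_CPUS],
                                   const bool soft[NR_CPUS],
                                   bool nodes[NR_NODES])
  {
      int cpu, node;

      for (node = 0; node < NR_NODES; node++)
          nodes[node] = false;

      for (cpu = 0; cpu < NR_CPUS; cpu++)
          if (hard[cpu] && soft[cpu])
              nodes[pcpu_to_node(cpu)] = true;
  }

  int main(void)
  {
      bool hard[NR_CPUS], soft[NR_CPUS] = { false };
      bool nodes[NR_NODES];
      int i;

      for (i = 0; i < NR_CPUS; i++)
          hard[i] = true;     /* no "cpus=": hard affinity is full    */
      for (i = 0; i < 4; i++)
          soft[i] = true;     /* as if cpus_soft="0-3" had been given */

      derive_node_affinity(hard, soft, nodes);
      for (i = 0; i < NR_NODES; i++)
          printf("node %d: %s node-affinity\n",
                 i, nodes[i] ? "in" : "not in");
      return 0;
  }

With the inputs above only node 0 ends up in the node-affinity, which
matches the "only cpus_soft= is present" case documented below (hard
affinity being all pCPUs, the intersection is the soft mask itself).
---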
docs/man/xl.cfg.pod.5 | 21 +++++++++++----------
docs/misc/xl-numa-placement.markdown | 14 ++++++++++++--
tools/libxl/libxl_dom.c | 20 ++++++++++++++++++--
3 files changed, 41 insertions(+), 14 deletions(-)
diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index cce51ae..99a43a7 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -150,16 +150,6 @@ here, and the soft affinity mask, provided via B<cpus\_soft=> (if any),
is utilized to compute the domain node-affinity, for driving memory
allocations.
-If we are on a NUMA machine (i.e., if the host has more than one NUMA
-node) and this option is not specified, libxl automatically tries to
-place the guest on the least possible number of nodes. That, however,
-will not affect vcpu pinning, so the guest will still be able to run on
-all the cpus. A heuristic approach is used for choosing the best node (or
-set of nodes), with the goals of maximizing performance for the guest
-and, at the same time, achieving efficient utilization of host cpus
-and memory. See F<docs/misc/xl-numa-placement.markdown> for more
-details.
-
=item B<cpus_soft="CPU-LIST">
Exactly as B<cpus=>, but specifies soft affinity, rather than pinning
@@ -174,6 +164,17 @@ the intersection of the soft affinity mask, provided here, and the vcpu
pinning, provided via B<cpus=> (if any), is utilized to compute the
domain node-affinity, for driving memory allocations.
+If this option is not specified (and B<cpus=> is not specified either),
+libxl automatically tries to place the guest on the least possible
+number of nodes. A heuristic approach is used for choosing the best
+node (or set of nodes), with the goal of maximizing performance for
+the guest and, at the same time, achieving efficient utilization of
+host cpus and memory. In that case, the soft affinity of all the vcpus
+of the domain will be set to the pcpus belonging to the NUMA nodes
+chosen during placement.
+
+For more details, see F<docs/misc/xl-numa-placement.markdown>.
+
=back
=head3 CPU Scheduling
diff --git a/docs/misc/xl-numa-placement.markdown b/docs/misc/xl-numa-placement.markdown
index b1ed361..09ae95e 100644
--- a/docs/misc/xl-numa-placement.markdown
+++ b/docs/misc/xl-numa-placement.markdown
@@ -126,10 +126,20 @@ or Xen won't be able to guarantee the locality for their memory accesses.
That, of course, also mean the vCPUs of the domain will only be able to
execute on those same pCPUs.
+It is also possible to have a "cpus\_soft=" option in the xl config file,
+to specify the soft affinity for all the vCPUs of the domain. This affects
+the NUMA placement in the following way:
+
+ * if only "cpus\_soft=" is present, the VM's node-affinity will be equal
+ to the nodes to which the pCPUs in the soft affinity mask belong;
+ * if both "cpus\_soft=" and "cpus=" are present, the VM's node-affinity
+ will be equal to the nodes to which the pCPUs present both in hard and
+ soft affinity belong.
+
### Placing the guest automatically ###
-If no "cpus=" option is specified in the config file, libxl tries
-to figure out on its own on which node(s) the domain could fit best.
+If neither "cpus=" nor "cpus\_soft=" are present in the config file, libxl
+tries to figure out on its own on which node(s) the domain could fit best.
If it finds one (some), the domain's node affinity get set to there,
and both memory allocations and NUMA aware scheduling (for the credit
scheduler and starting from Xen 4.3) will comply with it. Starting from
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 4bfed60..ceb8643 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -222,18 +222,34 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
* some weird error manifests) the subsequent call to
* libxl_domain_set_nodeaffinity() will do the actual placement,
* whatever that turns out to be.
+ *
+ * As far as scheduling is concerned, we achieve NUMA-aware scheduling
+ * by having the results of placement affect the soft affinity of all
+ * the vcpus of the domain. Of course, we want that iff placement is
+ * enabled and actually happens, so we only change info->cpumap_soft to
+ * reflect the placement result if that is the case.
*/
if (libxl_defbool_val(info->numa_placement)) {
- if (!libxl_bitmap_is_full(&info->cpumap)) {
+ /* We require both hard and soft affinity not to be set */
+ if (!libxl_bitmap_is_full(&info->cpumap) ||
+ !libxl_bitmap_is_full(&info->cpumap_soft)) {
LOG(ERROR, "Can run NUMA placement only if no vcpu "
- "affinity is specified");
+ "(hard or soft) affinity is specified");
return ERROR_INVAL;
}
rc = numa_place_domain(gc, domid, info);
if (rc)
return rc;
+
+ /*
+ * We change the soft affinity in domain_build_info here, of course
+ * after converting the result of placement from nodes to cpus. The
+ * following call to libxl_set_vcpuaffinity_all() will do the
+ * actual updating of the domain's vcpus' soft affinity.
+ */
+ libxl_nodemap_to_cpumap(ctx, &info->nodemap, &info->cpumap_soft);
}
libxl_domain_set_nodeaffinity(ctx, domid, &info->nodemap);
libxl_set_vcpuaffinity_all(ctx, domid, info->max_vcpus, &info->cpumap,