From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xen.org
Cc: Marcus Granado <Marcus.Granado@eu.citrix.com>,
Keir Fraser <keir@xen.org>,
Ian Campbell <Ian.Campbell@citrix.com>,
Li Yechen <lccycc123@gmail.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <Andrew.Cooper3@citrix.com>,
Juergen Gross <juergen.gross@ts.fujitsu.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Jan Beulich <JBeulich@suse.com>,
Justin Weaver <jtweaver@hawaii.edu>, Matt Wilson <msw@amazon.com>,
Elena Ufimtseva <ufimtseva@gmail.com>
Subject: [PATCH v3 08/14] xen: derive NUMA node affinity from hard and soft CPU affinity
Date: Mon, 18 Nov 2013 19:17:46 +0100
Message-ID: <20131118181745.31002.73423.stgit@Solace>
In-Reply-To: <20131118175544.31002.79574.stgit@Solace>
If a domain's NUMA node-affinity (which is what controls
memory allocations) is provided by the user/toolstack, it
is left untouched. However, if the user does not say
anything, leaving it all to Xen, let's compute it in the
following way:
1. cpupool's cpus & hard-affinity & soft-affinity
2. if (1) is empty: cpupool's cpus & hard-affinity
This guarantees memory to be allocated from the narrowest
possible set of NUMA nodes, and makes it relatively easy to
set up NUMA-aware scheduling on top of soft affinity.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Changes from v2:
* the loop computing the mask is now only executed when
it really is useful, as suggested during review;
* the loop, and all the cpumask handling is optimized,
in a way similar to what was suggested during review.
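
For illustration only, the two-step rule from the description can be
modelled with plain 64-bit bitmasks. This is a minimal sketch, not the
hypervisor code: mask_t and pick_affinity() are hypothetical names that
stand in for Xen's cpumask_t handling, and bit i simply means "pcpu i".

```c
#include <stdint.h>

/* Hypothetical stand-in for Xen's cpumask_t: bit i set means pcpu i. */
typedef uint64_t mask_t;

/*
 * Pick the pcpu set that node-affinity is derived from:
 *   1. pool & hard & soft, if that intersection is non-empty;
 *   2. otherwise fall back to pool & hard.
 * "hard" and "soft" are assumed to already be the unions over all the
 * domain's vcpus.
 */
static mask_t pick_affinity(mask_t pool, mask_t hard, mask_t soft)
{
    mask_t hard_online = pool & hard;
    mask_t narrowest = hard_online & soft;

    return narrowest ? narrowest : hard_online;
}
```

With pool = pcpus 0-7 (0xff), hard = pcpus 0-3 (0x0f) and soft =
pcpus 2-5 (0x3c), the sketch picks pcpus 2-3 (0x0c). If soft does not
overlap pool & hard at all, it falls back to pcpus 0-3 (0x0f).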
---
xen/common/domain.c | 62 +++++++++++++++++++++++++++++++++------------------
1 file changed, 40 insertions(+), 22 deletions(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index d6ac4d1..721678a 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -353,17 +353,17 @@ struct domain *domain_create(
void domain_update_node_affinity(struct domain *d)
{
- cpumask_var_t cpumask;
- cpumask_var_t online_affinity;
+ cpumask_var_t dom_cpumask, dom_cpumask_soft;
+ cpumask_t *dom_affinity;
const cpumask_t *online;
struct vcpu *v;
- unsigned int node;
+ unsigned int cpu;
- if ( !zalloc_cpumask_var(&cpumask) )
+ if ( !zalloc_cpumask_var(&dom_cpumask) )
return;
- if ( !alloc_cpumask_var(&online_affinity) )
+ if ( !zalloc_cpumask_var(&dom_cpumask_soft) )
{
- free_cpumask_var(cpumask);
+ free_cpumask_var(dom_cpumask);
return;
}
@@ -371,31 +371,49 @@ void domain_update_node_affinity(struct domain *d)
spin_lock(&d->node_affinity_lock);
- for_each_vcpu ( d, v )
- {
- cpumask_and(online_affinity, v->cpu_hard_affinity, online);
- cpumask_or(cpumask, cpumask, online_affinity);
- }
-
/*
- * If d->auto_node_affinity is true, the domain's node-affinity mask
- * (d->node_affinity) is automaically computed from all the domain's
- * vcpus' vcpu-affinity masks (the union of which we have just built
- * above in cpumask). OTOH, if d->auto_node_affinity is false, we
- * must leave the node-affinity of the domain alone.
+ * If d->auto_node_affinity is true, let's compute the domain's
+ * node-affinity and update d->node_affinity accordingly. If false,
+ * just leave d->node_affinity alone.
*/
if ( d->auto_node_affinity )
{
+ /*
+ * We want the narrowest possible set of pcpus (to get the narrowest
+ * possible set of nodes). What we need is the cpumask of where the
+ * domain can run (the union of the hard affinity of all its vcpus),
+ * and the full mask of where it would prefer to run (the union of
+ * the soft affinity of all its various vcpus). Let's build them.
+ */
+ cpumask_clear(dom_cpumask);
+ cpumask_clear(dom_cpumask_soft);
+ for_each_vcpu ( d, v )
+ {
+ cpumask_or(dom_cpumask, dom_cpumask, v->cpu_hard_affinity);
+ cpumask_or(dom_cpumask_soft, dom_cpumask_soft,
+ v->cpu_soft_affinity);
+ }
+ /* Filter out non-online cpus */
+ cpumask_and(dom_cpumask, dom_cpumask, online);
+ /* And compute the intersection between hard, online and soft */
+ cpumask_and(dom_cpumask_soft, dom_cpumask_soft, dom_cpumask);
+
+ /*
+ * If not empty, the intersection of hard, soft and online is the
+ * narrowest set we want. If empty, we fall back to hard&online.
+ */
+ dom_affinity = cpumask_empty(dom_cpumask_soft) ?
+ dom_cpumask : dom_cpumask_soft;
+
nodes_clear(d->node_affinity);
- for_each_online_node ( node )
- if ( cpumask_intersects(&node_to_cpumask(node), cpumask) )
- node_set(node, d->node_affinity);
+ for_each_cpu( cpu, dom_affinity )
+ node_set(cpu_to_node(cpu), d->node_affinity);
}
spin_unlock(&d->node_affinity_lock);
- free_cpumask_var(online_affinity);
- free_cpumask_var(cpumask);
+ free_cpumask_var(dom_cpumask_soft);
+ free_cpumask_var(dom_cpumask);
}
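
As a rough illustration of the last step, the sketch below derives a
node mask by walking the pcpus set in the chosen mask and mapping each
one to its node, instead of testing every online node for an
intersection. It is a simplified model under stated assumptions: the
topology (8 pcpus, 4 per node), NR_CPUS_SKETCH, cpu_to_node_sketch()
and nodes_of() are all hypothetical and are not Xen interfaces.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 8u /* assumed: 8 pcpus, 4 per node, 2 nodes */

/* Hypothetical topology helper: pcpus 0-3 -> node 0, 4-7 -> node 1. */
static unsigned int cpu_to_node_sketch(unsigned int cpu)
{
    return cpu / 4;
}

/* Return a bitmask of the nodes hosting at least one pcpu set in cpus. */
static uint32_t nodes_of(uint64_t cpus)
{
    uint32_t nodes = 0;
    unsigned int cpu;

    for ( cpu = 0; cpu < NR_CPUS_SKETCH; cpu++ )
        if ( cpus & (UINT64_C(1) << cpu) )
            nodes |= UINT32_C(1) << cpu_to_node_sketch(cpu);

    return nodes;
}
```

Under this assumed topology, a mask of pcpus 2-3 (0x0c) maps to node 0
only, while pcpus 2-5 (0x3c) span both nodes.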