From: Dario Faggioli <raistlin@linux.it>
To: xen-devel@lists.xen.org
Cc: Andre Przywara <andre.przywara@amd.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Juergen Gross <juergen.gross@ts.fujitsu.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH 08 of 10 v3] libxl: enable automatic placement of guests on NUMA nodes
Date: Wed, 04 Jul 2012 18:41:42 +0200 [thread overview]
Message-ID: <1341420102.7083.38.camel@Solace> (raw)
In-Reply-To: <7087d3622ee2051654c9.1341418687@Solace>
[-- Attachment #1.1: Type: text/plain, Size: 4968 bytes --]
On Wed, 2012-07-04 at 18:18 +0200, Dario Faggioli wrote:
> # HG changeset patch
> # User Dario Faggioli <raistlin@linux.it>
> # Date 1341416324 -7200
> # Node ID 7087d3622ee2051654c9e78fe4829da10c2d46f1
> # Parent 6fd693e7f3bc8b4d9bd20befff2c13de5591a7c5
> libxl: enable automatic placement of guests on NUMA nodes
>
George, Ian (Jackson),
> Once we know which ones, among all the possible combinations, represents valid
> placement candidates for a domain, use some heuistics for deciding which is the
> best. For instance, smaller candidates are considered to be better, both from
> the domain's point of view (fewer memory spreading among nodes) and from the
> system as a whole point of view (fewer memoy fragmentation). In case of
> candidates of equal sizes (i.e., with the same number of nodes), the amount of
> free memory and the number of domain already assigned to their nodes are
> considered. Very often, candidates with greater amount of memory are the one
> we wants, as this is also good for keeping memory fragmentation under control.
> However, if the difference in how much free memory two candidates have, the
> number of assigned domains might be what decides which candidate wins.
>
> [...]
>
> ---
> Changes from v2:
>
> [...]
>
> * Comparison function for candidates changed so that it now provides
> total ordering, as requested during review. It is still using FP
> arithmetic, though. Also I think that just putting the difference
> between the amount of free memory and between the number of assigned
> domains of two candidates in a single formula (after normalizing and
> weighting them) is both clear and effective enough.
>
> [...]
>
I thought at what a sensible comparison function should look like, and I
also plotted some graphs while randomly generating both the amount of
free memory and number of domains. The outcome of all this is I managed
in convincing myself the solution below is both clear and understandable
as it is effective (confirmed by the test I was able to perform up to
now).
Basically, the idea is to consider both things (freemem and nr_domains),
but with freemem being 3 times more "important" than nr_domains. That is
very similar to one of the log()-based solutions proposed by Ian, but I
really think just normalizing and weighting is easier to understand,
even if quickly looking at the formula/code.
Regarding the percent penalty per each domain proposed by George on IRC,
I liked the idea a lot, but the figured out that here will always be a
number of domains (e.g., 20, if the penalty is set to 5%) starting from
which nr_domains counts more than freemem, which is the opposite of what
I want to achieve.
So, now, comparison between placement candidates happens like this:
return a.nr_nodes < b.nr_nodes ? a :
b.nr_nodes < a.nr_nodes ? b :
3*norm_diff(b.freemem, a.freemem) - norm_diff(a.nr_domains, b.nr_domains)
where:
norm_diff(x, y) := (x - y)/max(x, y)
Which removes the nasty effects of having that 10% range as in v2 of the
series. If that '3' looks too much of a magic number, I can of course
enum/#define it, or even make it configurable (although, the latter, not
for 4.2, I guess :-) ).
> +/* Subtract two values and translate the result in [0, 1] */
> +static double normalized_diff(double a, double b)
> +{
> +#define max(a, b) (a > b ? a : b)
> + if (!a && a == b)
> + return 0.0;
> + return (a - b) / max(a, b);
> +}
> +
> +/*
> + * The NUMA placement candidates are reordered according to the following
> + * heuristics:
> + * - candidates involving fewer nodes come first. In case two (or
> + * more) candidates span the same number of nodes,
> + * - the amount of free memory and the number of domains assigned to the
> + * candidates are considered. In doing that, candidates with greater
> + * amount of free memory and fewer domains assigned to them are preferred,
> + * with free memory "weighting" three times as much as number of domains.
> + */
> +static int numa_cmpf(const void *v1, const void *v2)
> +{
> + const libxl__numa_candidate *c1 = v1;
> + const libxl__numa_candidate *c2 = v2;
> +#define sign(a) a > 0 ? 1 : a < 0 ? -1 : 0
> + double freememkb_diff = normalized_diff(c2->free_memkb, c1->free_memkb);
> + double nrdomains_diff = normalized_diff(c1->nr_domains, c2->nr_domains);
> +
> + if (c1->nr_nodes != c2->nr_nodes)
> + return c1->nr_nodes - c2->nr_nodes;
> +
> + return sign(3*freememkb_diff + nrdomains_diff);
> +}
> +
What do you think?
Thanks and Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
[-- Attachment #2: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
next prev parent reply other threads:[~2012-07-04 16:41 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-04 16:17 [PATCH 00 of 10 v3] Automatic NUMA placement for xl Dario Faggioli
2012-07-04 16:18 ` [PATCH 01 of 10 v3] libxl: add a new Array type to the IDL Dario Faggioli
2012-07-04 16:18 ` [PATCH 02 of 10 v3] libxl, libxc: introduce libxl_get_numainfo() Dario Faggioli
2012-07-06 10:35 ` Ian Campbell
2012-07-04 16:18 ` [PATCH 03 of 10 v3] xl: add more NUMA information to `xl info -n' Dario Faggioli
2012-07-06 11:37 ` Ian Campbell
2012-07-06 12:00 ` Dario Faggioli
2012-07-06 12:15 ` Ian Campbell
2012-07-06 12:52 ` Dario Faggioli
2012-07-04 16:18 ` [PATCH 04 of 10 v3] libxl: rename libxl_cpumap to libxl_bitmap Dario Faggioli
2012-07-06 10:39 ` Ian Campbell
2012-07-04 16:18 ` [PATCH 05 of 10 v3] libxl: expand the libxl_bitmap API a bit Dario Faggioli
2012-07-06 10:40 ` Ian Campbell
2012-07-04 16:18 ` [PATCH 06 of 10 v3] libxl: introduce some node map helpers Dario Faggioli
2012-07-04 16:18 ` [PATCH 07 of 10 v3] libxl: explicitly check for libmath in autoconf Dario Faggioli
2012-07-04 16:44 ` Roger Pau Monne
2012-07-06 11:42 ` Ian Campbell
2012-07-06 11:54 ` Dario Faggioli
2012-07-04 16:18 ` [PATCH 08 of 10 v3] libxl: enable automatic placement of guests on NUMA nodes Dario Faggioli
2012-07-04 16:41 ` Dario Faggioli [this message]
2012-07-06 10:55 ` Ian Campbell
2012-07-06 13:03 ` Dario Faggioli
2012-07-06 13:21 ` Ian Campbell
2012-07-06 13:52 ` Dario Faggioli
2012-07-06 13:54 ` Ian Campbell
2012-07-06 11:30 ` George Dunlap
2012-07-06 13:00 ` Dario Faggioli
2012-07-06 13:05 ` George Dunlap
2012-07-06 14:35 ` Dario Faggioli
2012-07-06 14:40 ` George Dunlap
2012-07-06 16:27 ` Ian Campbell
2012-07-04 16:18 ` [PATCH 09 of 10 v3] libxl: have NUMA placement deal with cpupools Dario Faggioli
2012-07-06 12:42 ` George Dunlap
2012-07-06 13:10 ` Dario Faggioli
2012-07-06 13:27 ` George Dunlap
2012-07-06 13:32 ` Ian Campbell
2012-07-06 13:42 ` Dario Faggioli
2012-07-10 15:16 ` Dario Faggioli
2012-07-04 16:18 ` [PATCH 10 of 10 v3] Some automatic NUMA placement documentation Dario Faggioli
2012-07-06 14:08 ` George Dunlap
2012-07-06 14:26 ` George Dunlap
2012-07-06 14:37 ` Dario Faggioli
2012-07-06 11:16 ` [PATCH 00 of 10 v3] Automatic NUMA placement for xl Ian Campbell
2012-07-06 11:20 ` Ian Campbell
2012-07-06 11:22 ` Ian Campbell
2012-07-06 13:05 ` Dario Faggioli
2012-07-06 12:19 ` Ian Campbell
2012-07-08 18:32 ` Ian Campbell
2012-07-09 14:32 ` Dario Faggioli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1341420102.7083.38.camel@Solace \
--to=raistlin@linux.it \
--cc=Ian.Campbell@citrix.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=Stefano.Stabellini@eu.citrix.com \
--cc=andre.przywara@amd.com \
--cc=george.dunlap@eu.citrix.com \
--cc=juergen.gross@ts.fujitsu.com \
--cc=roger.pau@citrix.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).