From: Dario Faggioli <raistlin@linux.it>
To: xen-devel@lists.xen.org
Cc: Andre Przywara <andre.przywara@amd.com>,
    Ian Campbell <Ian.Campbell@citrix.com>,
    Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
    George Dunlap <george.dunlap@eu.citrix.com>,
    Juergen Gross <juergen.gross@ts.fujitsu.com>,
    Ian Jackson <Ian.Jackson@eu.citrix.com>,
    Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH 08 of 10 v3] libxl: enable automatic placement of guests on NUMA nodes
Date: Wed, 04 Jul 2012 18:41:42 +0200
Message-ID: <1341420102.7083.38.camel@Solace>
In-Reply-To: <7087d3622ee2051654c9.1341418687@Solace>
On Wed, 2012-07-04 at 18:18 +0200, Dario Faggioli wrote:
> # HG changeset patch
> # User Dario Faggioli <raistlin@linux.it>
> # Date 1341416324 -7200
> # Node ID 7087d3622ee2051654c9e78fe4829da10c2d46f1
> # Parent 6fd693e7f3bc8b4d9bd20befff2c13de5591a7c5
> libxl: enable automatic placement of guests on NUMA nodes
>
George, Ian (Jackson),
> Once we know which ones, among all the possible combinations, represent valid
> placement candidates for a domain, use some heuristics for deciding which is the
> best. For instance, smaller candidates are considered to be better, both from
> the domain's point of view (less memory spreading among nodes) and from the
> point of view of the system as a whole (less memory fragmentation). In case of
> candidates of equal sizes (i.e., with the same number of nodes), the amount of
> free memory and the number of domains already assigned to their nodes are
> considered. Very often, candidates with a greater amount of memory are the ones
> we want, as this is also good for keeping memory fragmentation under control.
> However, if the difference in how much free memory two candidates have is
> small, the number of assigned domains might be what decides which candidate wins.
>
> [...]
>
> ---
> Changes from v2:
>
> [...]
>
> * Comparison function for candidates changed so that it now provides
> total ordering, as requested during review. It is still using FP
> arithmetic, though. Also I think that just putting the difference
> between the amount of free memory and between the number of assigned
> domains of two candidates in a single formula (after normalizing and
> weighting them) is both clear and effective enough.
>
> [...]
>
I thought about what a sensible comparison function should look like, and I
also plotted some graphs while randomly generating both the amount of
free memory and the number of domains. The outcome of all this is that I
managed to convince myself that the solution below is as clear and
understandable as it is effective (as confirmed by the tests I have been
able to perform so far).
Basically, the idea is to consider both things (freemem and nr_domains),
but with freemem being 3 times more "important" than nr_domains. That is
very similar to one of the log()-based solutions proposed by Ian, but I
really think just normalizing and weighting is easier to understand,
even when only quickly looking at the formula/code.
Regarding the percent penalty per domain proposed by George on IRC,
I liked the idea a lot, but then figured out that there will always be a
number of domains (e.g., 20, if the penalty is set to 5%) starting from
which nr_domains counts more than freemem, which is the opposite of what
I want to achieve.
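
To see why a fixed per-domain penalty eventually dominates, here is a
minimal sketch. It assumes the penalty is applied as a percentage discount
on a candidate's free memory; that reading, and the helper name
penalized_freemem(), are assumptions made purely for illustration and are
not taken from the IRC discussion or from libxl.

```c
#include <stdint.h>

/*
 * Hypothetical scoring helper (not part of libxl): discount the free
 * memory of a candidate by penalty_pct percent for each domain already
 * assigned to it. Integer arithmetic keeps the example exact.
 */
static int64_t penalized_freemem(int64_t free_memkb, int nr_domains,
                                 int penalty_pct)
{
    /* Remaining share of the free memory, in percent; may go <= 0. */
    int64_t factor = 100 - (int64_t)penalty_pct * nr_domains;

    return free_memkb * factor / 100;
}
```

Under this assumption, with a 5% penalty a candidate hosting 20 domains
scores 0 no matter how much memory it has free, so a 16 GiB candidate with
20 domains would lose to a 1 GiB candidate with none.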
So, now, the comparison between placement candidates a and b happens like this:

    return a.nr_nodes < b.nr_nodes ? a :
           b.nr_nodes < a.nr_nodes ? b :
           3*norm_diff(b.freemem, a.freemem) + norm_diff(a.nr_domains, b.nr_domains)

where:

    norm_diff(x, y) := (x - y)/max(x, y)

and where a negative value of the last expression means a wins, while a
positive one means b wins. This removes the nasty effects of having that
10% range as in v2 of the series. If that '3' looks too much like a magic
number, I can of course enum/#define it, or even make it configurable
(although, the latter, not for 4.2, I guess :-) ).
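
As a worked example of the score for two candidates spanning the same
number of nodes, here is a minimal sketch. It follows the sign convention
of numa_cmpf() in the quoted patch (the two normalized differences are
added, and a negative result means the first candidate is preferred). The
names struct cand and cand_score() are hypothetical and exist only for
this illustration.

```c
/* Hypothetical stand-in for a placement candidate (not the libxl type). */
struct cand {
    double freemem;     /* free memory, in any consistent unit */
    double nr_domains;  /* domains already assigned to the candidate */
};

/*
 * (x - y) / max(x, y), or 0 when both values are zero. For non-negative
 * inputs the result lies in [-1, 1].
 */
static double norm_diff(double x, double y)
{
    double m = x > y ? x : y;

    return m == 0.0 ? 0.0 : (x - y) / m;
}

/* Negative: a is preferred; positive: b is preferred; zero: a tie. */
static double cand_score(struct cand a, struct cand b)
{
    return 3.0 * norm_diff(b.freemem, a.freemem)
           + norm_diff(a.nr_domains, b.nr_domains);
}
```

With a = {4096, 2} and b = {2048, 0} the score is 3*(-0.5) + 1 = -0.5, so
a wins on free memory despite hosting more domains. With a = {4096, 2} and
c = {3072, 0} it is 3*(-0.25) + 1 = 0.25, so c wins. Since each normalized
difference is within [-1, 1], the domain term can only flip the outcome
when the free memory of the two candidates differs by less than one third
of the larger value.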
> +/* Subtract two values and normalize the result into [-1, 1] */
> +static double normalized_diff(double a, double b)
> +{
> +#define max(a, b) ((a) > (b) ? (a) : (b))
> +    if (!a && a == b)
> +        return 0.0;
> +    return (a - b) / max(a, b);
> +}
> +
> +/*
> + * The NUMA placement candidates are reordered according to the following
> + * heuristics:
> + *  - candidates involving fewer nodes come first. In case two (or
> + *    more) candidates span the same number of nodes,
> + *  - the amount of free memory and the number of domains assigned to the
> + *    candidates are considered. In doing that, candidates with greater
> + *    amount of free memory and fewer domains assigned to them are preferred,
> + *    with free memory "weighting" three times as much as number of domains.
> + */
> +static int numa_cmpf(const void *v1, const void *v2)
> +{
> +    const libxl__numa_candidate *c1 = v1;
> +    const libxl__numa_candidate *c2 = v2;
> +#define sign(a) ((a) > 0 ? 1 : (a) < 0 ? -1 : 0)
> +    double freememkb_diff = normalized_diff(c2->free_memkb, c1->free_memkb);
> +    double nrdomains_diff = normalized_diff(c1->nr_domains, c2->nr_domains);
> +
> +    if (c1->nr_nodes != c2->nr_nodes)
> +        return c1->nr_nodes - c2->nr_nodes;
> +
> +    return sign(3*freememkb_diff + nrdomains_diff);
> +}
> +
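
A minimal, self-contained sketch of how a comparator like the one above
plugs into qsort(). The struct candidate type, candidate_cmp() and
sort_candidates() are hypothetical stand-ins that assume only the three
fields the quoted code reads (nr_nodes, free_memkb, nr_domains); the real
libxl__numa_candidate may well contain more.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for libxl__numa_candidate. */
struct candidate {
    int nr_nodes;
    int nr_domains;
    uint64_t free_memkb;
};

/* (a - b) / max(a, b), or 0 when both are zero (non-negative inputs). */
static double normalized_diff(double a, double b)
{
    double m = a > b ? a : b;

    return m == 0.0 ? 0.0 : (a - b) / m;
}

/* qsort() comparator: a negative result sorts v1 before v2. */
static int candidate_cmp(const void *v1, const void *v2)
{
    const struct candidate *c1 = v1;
    const struct candidate *c2 = v2;
    double score;

    /* Candidates spanning fewer nodes always come first. */
    if (c1->nr_nodes != c2->nr_nodes)
        return c1->nr_nodes < c2->nr_nodes ? -1 : 1;

    /* Free memory weighs three times as much as the number of domains. */
    score = 3.0 * normalized_diff((double)c2->free_memkb,
                                  (double)c1->free_memkb)
            + normalized_diff(c1->nr_domains, c2->nr_domains);

    return (score > 0) - (score < 0);
}

/* Sort the array in place, best candidate first. */
static void sort_candidates(struct candidate *cands, size_t n)
{
    qsort(cands, n, sizeof(*cands), candidate_cmp);
}
```

After sorting, the best candidate under these heuristics sits at index 0,
and any candidate spanning more nodes always ends up after those spanning
fewer.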
What do you think?
Thanks and Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)