xen-devel.lists.xenproject.org archive mirror
From: Dario Faggioli <raistlin@linux.it>
To: xen-devel@lists.xen.org
Cc: Andre Przywara <andre.przywara@amd.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH 08 of 10 v3] libxl: enable automatic placement of guests on NUMA nodes
Date: Wed, 04 Jul 2012 18:41:42 +0200
Message-ID: <1341420102.7083.38.camel@Solace>
In-Reply-To: <7087d3622ee2051654c9.1341418687@Solace>



On Wed, 2012-07-04 at 18:18 +0200, Dario Faggioli wrote:
> # HG changeset patch
> # User Dario Faggioli <raistlin@linux.it>
> # Date 1341416324 -7200
> # Node ID 7087d3622ee2051654c9e78fe4829da10c2d46f1
> # Parent  6fd693e7f3bc8b4d9bd20befff2c13de5591a7c5
> libxl: enable automatic placement of guests on NUMA nodes
> 
George, Ian (Jackson),

> Once we know which ones, among all the possible combinations, represent valid
> placement candidates for a domain, use some heuristics for deciding which is the
> best. For instance, smaller candidates are considered to be better, both from
> the domain's point of view (less memory spreading among nodes) and from the
> point of view of the system as a whole (less memory fragmentation). In case of
> candidates of equal size (i.e., with the same number of nodes), the amount of
> free memory and the number of domains already assigned to their nodes are
> considered. Very often, candidates with a greater amount of free memory are the
> ones we want, as this is also good for keeping memory fragmentation under
> control. However, if the difference in how much free memory two candidates have
> is small, the number of assigned domains might be what decides which candidate
> wins.
>
> [...]
>
> ---
> Changes from v2:
>
> [...]
>
>  * Comparison function for candidates changed so that it now provides
>    total ordering, as requested during review. It is still using FP
>    arithmetic, though. Also I think that just putting the difference
>    between the amount of free memory and between the number of assigned
>    domains of two candidates in a single formula (after normalizing and
>    weighting them) is both clear and effective enough.
>
> [...]
>  
I thought about what a sensible comparison function should look like, and I
also plotted some graphs while randomly generating both the amount of
free memory and the number of domains. The outcome of all this is that I
managed to convince myself that the solution below is as clear and
understandable as it is effective (as confirmed by the tests I have been
able to perform up to now).

Basically, the idea is to consider both things (freemem and nr_domains),
but with freemem being 3 times more "important" than nr_domains. That is
very similar to one of the log()-based solutions proposed by Ian, but I
really think just normalizing and weighting is easier to understand,
even when only glancing at the formula/code.

Regarding the percent penalty for each domain proposed by George on IRC,
I liked the idea a lot, but then figured out that there will always be a
number of domains (e.g., 20, if the penalty is set to 5%) starting from
which nr_domains counts more than freemem, which is the opposite of what
I want to achieve.
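
To make the "20 domains at 5%" remark concrete, here is a minimal sketch
of one way such a penalty could be applied. The exact form discussed on
IRC is not spelled out in this mail, so the linear discount below and the
name penalized_freemem() are assumptions made purely for illustration.

```c
#include <assert.h>

/*
 * Assumed form of the per-domain penalty (not taken from the patch):
 * every domain already assigned to the candidate's nodes discounts its
 * free memory by penalty_pct percent.
 */
static double penalized_freemem(double free_memkb, int nr_domains,
                                int penalty_pct)
{
    assert(free_memkb >= 0.0 && nr_domains >= 0 && penalty_pct >= 0);
    return free_memkb * (100 - penalty_pct * nr_domains) / 100.0;
}
```

Under that assumption, with a 5% penalty a candidate hosting 20 domains
ends up with a discounted free memory of zero no matter how much memory
it really has free, so from that point on nr_domains alone decides.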

So, now, comparison between placement candidates happens like this:

  return a.nr_nodes < b.nr_nodes ? a :
             b.nr_nodes < a.nr_nodes ? b :
             3*norm_diff(b.freemem, a.freemem) + norm_diff(a.nr_domains, b.nr_domains)

 where:

  norm_diff(x, y) := (x - y)/max(x, y)

This removes the nasty effects of having that 10% range as in v2 of the
series. If that '3' looks too much like a magic number, I can of course
enum/#define it, or even make it configurable (although the latter not
for 4.2, I guess :-) ).
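
For reference, here is a self-contained sketch of the comparison as a
qsort()-style function. struct candidate, candidate_cmp() and
FREEMEM_WEIGHT are hypothetical names used only for this example (the
real code works on libxl__numa_candidate). It follows the quoted C code
in adding the two normalized terms, so that a negative result means the
first candidate is the better one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for libxl__numa_candidate: only the fields the
 * comparison needs are modelled here. */
struct candidate {
    int nr_nodes;
    int nr_domains;
    uint64_t free_memkb;
};

#define FREEMEM_WEIGHT 3.0 /* the '3' from the formula, given a name */

/* (x - y) / max(x, y), which lies in [-1, 1] for non-negative inputs. */
static double norm_diff(double x, double y)
{
    double m = x > y ? x : y;

    assert(x >= 0.0 && y >= 0.0);
    return m == 0.0 ? 0.0 : (x - y) / m;
}

/* Negative: a sorts before b (a is the better candidate). */
static int candidate_cmp(const void *va, const void *vb)
{
    const struct candidate *a = va, *b = vb;
    double score;

    /* Fewer nodes always wins. */
    if (a->nr_nodes != b->nr_nodes)
        return a->nr_nodes < b->nr_nodes ? -1 : 1;

    /* More free memory and fewer domains both push the score down. */
    score = FREEMEM_WEIGHT * norm_diff((double)b->free_memkb,
                                       (double)a->free_memkb)
            + norm_diff(a->nr_domains, b->nr_domains);
    return (score > 0) - (score < 0);
}
```

As an example, take two single-node candidates, one with 1 domain and
4000000 KB free and one with 4 domains and 4200000 KB free. The free
memory term is 3 * 0.2/4.2, roughly 0.14, and the domain term is -0.75,
so the sum is negative and the first candidate wins despite having
slightly less free memory.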

> +/* Subtract two values and translate the result in [-1, 1] */
> +static double normalized_diff(double a, double b)
> +{
> +#define max(a, b) ((a) > (b) ? (a) : (b))
> +    if (!a && a == b)
> +        return 0.0;
> +    return (a - b) / max(a, b);
> +}
> +
> +/*
> + * The NUMA placement candidates are reordered according to the following
> + * heuristics:
> + *  - candidates involving fewer nodes come first. In case two (or
> + *    more) candidates span the same number of nodes,
> + *  - the amount of free memory and the number of domains assigned to the
> + *    candidates are considered. In doing that, candidates with greater
> + *    amount of free memory and fewer domains assigned to them are preferred,
> + *    with free memory "weighting" three times as much as number of domains.
> + */
> +static int numa_cmpf(const void *v1, const void *v2)
> +{
> +    const libxl__numa_candidate *c1 = v1;
> +    const libxl__numa_candidate *c2 = v2;
> +#define sign(a) ((a) > 0 ? 1 : (a) < 0 ? -1 : 0)
> +    double freememkb_diff = normalized_diff(c2->free_memkb, c1->free_memkb);
> +    double nrdomains_diff = normalized_diff(c1->nr_domains, c2->nr_domains);
> +
> +    if (c1->nr_nodes != c2->nr_nodes)
> +        return c1->nr_nodes - c2->nr_nodes;
> +
> +    return sign(3*freememkb_diff + nrdomains_diff);
> +}
> +

What do you think?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

