NUMA TODO-list for xen-devel


From: Dario Faggioli <raistlin@linux.it>
To: xen-devel <xen-devel@lists.xen.org>
Cc: Andre Przywara <andre.przywara@amd.com>,
	Anil Madhavapeddy <anil@recoil.org>,
	George Dunlap <dunlapg@gmail.com>,
	Jan Beulich <JBeulich@suse.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>
Subject: NUMA TODO-list for xen-devel
Date: Wed, 01 Aug 2012 18:16:36 +0200
Message-ID: <1343837796.4958.32.camel@Solace>



Hi everyone,

With automatic placement finally landing in xen-unstable, I started
thinking about what I could work on next, still in the field of
improving Xen's NUMA support. Well, it turned out that running out of
things to do is not an option! :-O

In fact, I can think of quite a few open issues in that area, which I'm
just braindumping here. If anyone has thoughts, ideas, feedback or
whatever, I'd be happy to serve as a collector of them. I've already
created a Wiki page to help with the tracking. You can see it here
(for now it basically replicates this e-mail):

 http://wiki.xen.org/wiki/Xen_NUMA_Roadmap

I'm putting a [D] (standing for Dario) near the points I've started
working on or looking at, and again, I'd be happy to try tracking this
too, i.e., keeping the list of "who-is-doing-what" updated, in order to
ease collaboration.

So, let's cut the talking:

    - Automatic placement at guest creation time. Basics are there and
      will be shipping with 4.2. However, a lot of other things are
      missing and/or can be improved, for instance:
[D]    * automated verification and testing of the placement;
       * benchmarks and improvements of the placement heuristic;
[D]    * choosing/building up some measure of node load (more accurate
         than just counting vcpus) onto which to rely during placement;
       * consider IONUMA during placement;
       * automatic placement of Dom0, if possible (my current series
         only affects DomU);
       * having internal Xen data structures honour the placement (e.g.,
         I've been told that right now vcpu stacks are always allocated
         on node 0... Andrew?).

[D] - NUMA aware scheduling in Xen. Don't pin vcpus on nodes' pcpus,
      just have them _prefer_ running on the nodes where their memory
      is.

[D] - Dynamic memory migration between different nodes of the host, as
      the counterpart of the NUMA-aware scheduler.

    - Virtual NUMA topology exposure to guests (a.k.a. guest-NUMA). If a
      guest ends up on more than one node, make sure it knows it's
      running on a NUMA platform (smaller than the actual host, but
      still NUMA). This interacts with some of the above points:
       * consider this during automatic placement for
         resuming/migrating domains (if they have a virtual topology,
         better not to change it);
       * consider this during memory migration (it can change the
         actual topology: should we update it on-line or disable memory
         migration?).

    - NUMA and ballooning and memory sharing. In some more detail:
       * page sharing on NUMA boxes: it's probably sane to make it
         possible to disable sharing pages across nodes;
       * ballooning and its interaction with placement (races, amount of
         memory needed and reported being different at different times,
         etc.).

    - Inter-VM dependencies and communication issues. If a workload is
      made up of more than one VM and they all share the same (NUMA)
      host, it might be best to have them share the nodes as much as
      possible, or perhaps to do exactly the opposite, depending on the
      specific characteristics of the workload itself. This might be
      considered during placement, memory migration and perhaps
      scheduling.

    - Benchmarking and performance evaluation in general. Meaning both
      agreeing on a (set of) relevant workload(s) and on how to extract
      meaningful performance data from there (and maybe how to do that
      automatically?).
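
To make the "measure of node load" point from the placement item a bit
more concrete, here is a minimal sketch in Python. It is not Xen or
libxl code: the Node fields, the node_load() formula (vcpus weighted by
an assumed average utilisation, per pcpu) and place_domain() are all
hypothetical names made up for illustration. The idea is only to show
how a load figure richer than a plain vcpu count could rank candidate
nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical per-node snapshot; none of these names come from Xen."""
    node_id: int
    free_mem_mb: int
    pcpus: int
    vcpus: int            # vcpus currently placed on this node
    avg_vcpu_util: float  # assumed mean utilisation of those vcpus, 0.0..1.0


def node_load(node: Node) -> float:
    """Assumed load measure: 'busy' vcpus per pcpu.

    Just counting vcpus would be node.vcpus; weighting by utilisation is
    one possible refinement, not the one Xen uses.
    """
    return node.vcpus * node.avg_vcpu_util / node.pcpus


def place_domain(nodes: List[Node], mem_mb: int) -> Optional[int]:
    """Pick the least loaded node that can hold the domain's memory.

    Ties on load are broken in favour of the node with more free memory.
    Returns None when no single node fits (a real heuristic would then
    have to consider sets of nodes).
    """
    candidates = [n for n in nodes if n.free_mem_mb >= mem_mb]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: (node_load(n), -n.free_mem_mb))
    return best.node_id


if __name__ == "__main__":
    host = [
        Node(0, free_mem_mb=4096, pcpus=8, vcpus=16, avg_vcpu_util=0.1),
        Node(1, free_mem_mb=4096, pcpus=8, vcpus=6, avg_vcpu_util=0.9),
        Node(2, free_mem_mb=512, pcpus=8, vcpus=0, avg_vcpu_util=0.0),
    ]
    print(place_domain(host, 2048))
```

Note that in the example host, node 0 has more vcpus than node 1 but
they are mostly idle, so a utilisation-weighted measure prefers it,
whereas plain vcpu counting would pick node 1.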

So, what do you think?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





