From: Aron Griffis <aron@hp.com>
To: Andre Przywara <andre.przywara@amd.com>
Cc: xen-devel@lists.xensource.com, "Xu, Anthony" <anthony.xu@intel.com>
Subject: Re: [RFC] Xen NUMA strategy
Date: Tue, 18 Sep 2007 10:31:29 -0400
Message-ID: <20070918143129.GC8468@fc.hp.com>
In-Reply-To: <46EA7906.2010504@amd.com>

Hi Andre,

Andre Przywara wrote:  [Fri Sep 14 2007, 08:05:26AM EDT]
> We came up with two different approaches for better NUMA support in Xen:
> 1.) Guest NUMA support: spread a guest's resources (CPUs and memory) over 
> several nodes and propagate the appropriate topology to the guest.
> The first part of this is in the patches I sent recently to the list (PV 
> support is following, bells and whistles like automatic placement will 
> follow, too.).

It seems like you are proposing two things at once here.  Let's call
these 1a and 1b:

1a. Expose NUMA topology to the guests.  This isn't the topology of
    dom0, just the topology of the domU, i.e. it is constructed by
    dom0 when starting the domain.

1b. Spread the guest over nodes.  I can't tell if you mean to do this
    automatically or by request when starting the guest.  This seems
    to be separate from 1a.
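
To make 1a concrete, here is a minimal sketch of how dom0 might
construct a virtual topology for a domU at domain-build time.  It is
not taken from the patches; the names (VNode, build_guest_topology)
and the even-split rule are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VNode:
    """One virtual NUMA node as the guest would see it (hypothetical)."""
    vnode_id: int
    vcpus: List[int]
    mem_mb: int


def build_guest_topology(num_vcpus: int, mem_mb: int, num_vnodes: int) -> List[VNode]:
    """Split vcpus and memory as evenly as possible over num_vnodes."""
    if num_vnodes < 1 or num_vcpus < num_vnodes:
        raise ValueError("need at least one vcpu per virtual node")
    vnodes = []
    next_vcpu = 0
    for i in range(num_vnodes):
        # The first (total % num_vnodes) nodes each get one extra unit.
        ncpu = num_vcpus // num_vnodes + (1 if i < num_vcpus % num_vnodes else 0)
        nmem = mem_mb // num_vnodes + (1 if i < mem_mb % num_vnodes else 0)
        vnodes.append(VNode(i, list(range(next_vcpu, next_vcpu + ncpu)), nmem))
        next_vcpu += ncpu
    return vnodes
```

Note that this only describes what the guest is told.  Where the
virtual nodes land on physical nodes is the 1b question.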

> 	***Advantages***:
> - The guest OS has better means to deal with the NUMA setup, it can more 
> easily migrate _processes_ among the nodes (Xen-HV can only migrate whole 
> domains).
> - Changes to Xen are relatively small.
> - There is no limit for the guest resources, since they can use more 
> resources than there are on one node.

The advantages above relate to 1a

> - If guests are well spread over the nodes, the system is more balanced 
> even if guests are destroyed and created later.

and this advantage relates to 1b.

> 	***Disadvantages***:
> - The guest has to support NUMA. This is not true for older guests (Win2K, 
> older Linux).
> - The guest's workload has to fit NUMA. If the guest's tasks are merely 
> parallelizable or use much shared memory, they cannot take advantage of 
> NUMA and will degrade in performance. This includes all single task 
> problems.

IMHO the disadvantages listed are just what we already have in Xen
today.  Presently no guest can see the NUMA topology, so every guest
behaves as if it had no NUMA support.  Adding NUMA topology
propagation does not create these disadvantages, it simply exposes the
weakness of the lesser operating systems.

> In general this approach seems to fit better with smaller NUMA nodes
> and larger guests.
>
> 2.) Dynamic load balancing and page migration: create guests within one 
> NUMA node and distribute all guests across the nodes. If the system becomes 
> imbalanced, migrate guests to other nodes and copy (at least part of) their 
> memory pages to the other node's local memory.

Again, this seems like a two-part proposal.

2a. Add to Xen the ability to run a guest within a node, so that CPUs
    and RAM are allocated from within the node instead of randomly
    across the system.

2b. NUMA balancing.  While this seems like a worthwhile goal, IMHO
    it's separate from the first part of the proposal.
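
A rough sketch of what the 2a placement decision could look like,
assuming we know the free CPUs and free memory per node.  The function
name, the input format and the "most free memory wins" rule are all
made up for illustration; this is not an existing Xen interface.

```python
from typing import Dict, Optional, Tuple


def pick_node(free: Dict[int, Tuple[int, int]],
              need_cpus: int,
              need_mem_mb: int) -> Optional[int]:
    """Return a node that can hold the whole guest, or None.

    free maps node id -> (free_cpus, free_mem_mb).
    """
    candidates = [
        (mem, cpus, node)
        for node, (cpus, mem) in free.items()
        if cpus >= need_cpus and mem >= need_mem_mb
    ]
    if not candidates:
        # The guest does not fit on any single node.
        return None
    # Assumed policy: prefer the node with the most free memory.
    return max(candidates)[2]
```

The None case is where a guest is bigger than any one node, so it
would have to be spread over nodes (1b) or refused.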

> 	***Advantages***:
> - No guest NUMA support necessary. Older as well as recent guests should run 
> fine.
> - Smaller guests don't have to cope with NUMA and will have 'flat' memory 
> available.
> - Guests running on separate nodes usually don't disturb each other and can 
> benefit from the higher distributed memory bandwidth.
> 	***Disadvantages***:
> - Guests are limited to the resources available on one node. This applies 
> for both the number of CPUs and the amount of memory.

Advantages and disadvantages above apply to 2a

> - Costly migration of guests. In a simple implementation we'd use live 
> migration, which requires the whole guest's memory to be copied before the 
> guest starts to run on the other node. If this whole move proves to be 
> unnecessary a few minutes later, all this was in vain. A more advanced 
> implementation would do the page migration in the background and thus can 
> avoid this problem, if only the hot pages are migrated first.

This applies to 2b

> - Integration into Xen seems to be more complicated (at least for the more 
> ungifted hackers among us).

It seems like 2a would be significantly easier than 2b.

If the mechanics of migrating between NUMA nodes are implemented in
the hypervisor, then policy and control can be implemented in dom0
userland, so none of the automatic part of this needs to be in the
hypervisor.
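
As an illustration of that split, the policy half could be as small
as the following sketch running in dom0.  It only decides what to
move; the actual move would be a request to the hypervisor, which is
not shown.  The function name, the vcpu-count load metric and the
threshold are assumptions, not an existing tool.

```python
from typing import Dict, List, Optional, Tuple


def choose_migration(nodes: List[int],
                     guests: Dict[str, Tuple[int, int]],
                     threshold: int = 2) -> Optional[Tuple[str, int, int]]:
    """Pick one guest to move from the busiest node to the idlest.

    guests maps name -> (current_node, vcpus).  Load is the sum of
    vcpus per node.  Returns (guest, src, dst) or None.
    """
    load = {n: 0 for n in nodes}
    for node, vcpus in guests.values():
        load[node] += vcpus
    src = max(load, key=load.get)
    dst = min(load, key=load.get)
    gap = load[src] - load[dst]
    if gap < threshold:
        return None
    # Moving v vcpus changes the gap to |gap - 2v|, so only guests
    # with v < gap actually reduce the imbalance.
    movable = [(v, name) for name, (node, v) in guests.items()
               if node == src and v < gap]
    if not movable:
        return None
    v, name = min(movable, key=lambda item: abs(gap - 2 * item[0]))
    return (name, src, dst)
```

A dom0 daemon could call this periodically and swap in a different
policy without touching the hypervisor.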

Thanks for getting started on this.  Getting some of this into Xen
would be great!

Aron
