From: Dario Faggioli <dario.faggioli@citrix.com>
To: Kun Cheng <chengkunck@gmail.com>
Cc: xen-devel@lists.xen.org
Subject: Re: VM Migration on a NUMA server?
Date: Fri, 31 Jul 2015 14:46:07 +0200
Message-ID: <1438346767.16912.125.camel@citrix.com>
In-Reply-To: <CAO3v1VSC3rFVCkrEVByTxwBbP28t9SE5+fwJne3hHKNfz68zaA@mail.gmail.com>


On Fri, 2015-07-31 at 02:32 +0000, Kun Cheng wrote:
> Hi all,
>
Hi,
> 
> I'm sorry for taking your time and I'd like to make an enquiry about
> the status of VM migration support on a NUMA server.
>
Status is: it's not there, and won't happen soon. I started working
on it, but then got preempted by other issues, and concentrated on
making Xen do the best it can _without_ moving the memory (e.g., with
NUMA-aware scheduling, now achieved through per-vcpu soft affinities).

Moving memory around is really, really tricky. It's probably at least
doable for HVM guests, while, for PV, I'm not even sure it can be
done! :-/

> Currently it looks like when a vm is migrated only its vcpus are moved
> to the other node but not its memory. So, is anyone trying to fix that
> issue?
>
What do you mean by "when a vm is migrated"? If soft affinity for a VM
is specified in the config file (or afterwards, but I'd recommend doing
it in the config file, if you're interested in NUMA effects), memory is
allocated from the NUMA node(s) that such affinity spans, and the Xen
scheduler (provided you're using Credit1, our default scheduler) will
try as hard as it can to schedule the vcpus of the vm on the pcpus of
that same node (or set of nodes).
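
As a rough illustration of the config-file approach, the sketch below
prints the soft-affinity line one would add to a guest config. The
cpus_soft="node:N" syntax is the one used further down in this message;
the helper name and the node number are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: emit the xl config line that gives a guest
# soft affinity with one NUMA node.
soft_affinity_line() {
    # $1 = NUMA node number (an assumption; pick the node you want)
    printf 'cpus_soft="node:%s"\n' "$1"
}

# Append the output to the guest config before "xl create", e.g.:
#   soft_affinity_line 1 >> vm.cfg
soft_affinity_line 1
```

Setting this before the guest is created matters because memory is
allocated at creation time, following the affinity in effect then.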

If it's not possible, because all those pcpus are busy, the vcpus are
allowed to run on some other pcpu, outside of the NUMA node(s) the vm
has affinity with, on the basis that some execution, even with slow
memory access, is better than no execution at all.

If you're interested in having the vcpus of the vm _only_ running on the
pcpus of the node to which the memory is attached, I'd suggest using
hard affinity instead of soft (still specifying it in the config
file).
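
For comparison, a sketch of the hard-affinity variant. In xl config
files hard affinity is set with cpus=, which should accept the same
node:N notation as cpus_soft=; the helper name and the node number are
assumptions for the example.

```shell
#!/bin/sh
# Hypothetical helper: emit the xl config line that pins all vcpus
# of a guest to the pcpus of one NUMA node (hard affinity).
hard_affinity_line() {
    # $1 = NUMA node number (an assumption for this example)
    printf 'cpus="node:%s"\n' "$1"
}

# Unlike cpus_soft=, this is a strict constraint: the vcpus never
# run outside the listed pcpus, even if those are all busy.
hard_affinity_line 1
```

Once the guest is running, `xl vcpu-list' should show the resulting
affinity of each vcpu.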

Support for soft affinity in Credit2 is being worked on. For other
schedulers (ARINC and RTDS), it's not that useful.

> If I want to do it myself, it seems like two major problems are ahead
> of me:
>  
> 1) How to specify the target node for memory migration? I'll be
> grateful if anyone can give me some hints.
>
Not sure I get it. In my mind, if we have this in place at some point,
migration will happen either:
 - automatically, upon load balancing considerations
 - manually, with dedicated libxl interfaces and an xl command

At that point, for the latter case, there will be a way of specifying a
target node, and that will most likely be an integer, or a list of
integers...

> 2) Memory Migration. Looks like it can be done by leveraging the
> existing migration related functions on Xen.
>
Mmmm... Maybe I see what you mean now. So, you want to perform a local
migration, and use that as a way of actually moving the guest to another
node, is this correct? If yes, it did work, last time I checked.

If you do it like that, it's true that you don't have any way of
specifying a target node. Therefore, what happens is, either:
 - if no soft or hard affinity is specified in the config file, the
   automatic NUMA placement code will run, and it most likely will
   choose a different node for the target vm, but not in a way that you
   can control easily
 - if any affinity is set, the vm will be re-created on the exact same
   node.

That is why a way to work around this, and actually use local migration
as a memory-migration mechanism, is to leverage `xl config-update'. In
fact, you can do as follows:

# xl create vm.cfg 'cpus_soft="node:1"'
# xl config-update <domid> 'cpus_soft="node:0"'
# <do a local migration>
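
The last two steps above can be sketched as a small dry-run script. The
wrapper names, the guest name and the use of `xl migrate <dom> localhost'
for the local migration are assumptions of this sketch, not something
the steps above prescribe.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set RUN=1 on a Xen host to really run them.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Hypothetical wrapper around the steps above.
# $1 = domain name or id, $2 = target NUMA node
move_to_node() {
    # Change the affinity stored for the running guest...
    run xl config-update "$1" "cpus_soft=\"node:$2\"" &&
    # ...then migrate it to the same host, so that it is re-created
    # and its memory is allocated again according to the new affinity.
    run xl migrate "$1" localhost
}

move_to_node myguest 0
```

By default `xl migrate' goes through ssh, so a local migration
presumably needs root to be able to ssh into localhost. Afterwards,
`xl debug-keys u' followed by `xl dmesg' should dump how much memory
each domain has on each node.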

As I said, this all worked last time I tried... Is it not working for
you? Or was it something else you were after?

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


