From: Dulloor <dulloor@gmail.com>
To: Keir Fraser <keir.fraser@eu.citrix.com>
Cc: Andre Przywara <andre.przywara@amd.com>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: [vNUMA v2][PATCH 2/8] public interface
Date: Tue, 3 Aug 2010 10:24:58 -0700
Message-ID: <AANLkTin4Rx-mhu8kRUmRh5PDubQvsZWtbm7kw_ZmqLuj@mail.gmail.com>
In-Reply-To: <C87DF9E6.1C973%keir.fraser@eu.citrix.com>
On Tue, Aug 3, 2010 at 8:52 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 03/08/2010 16:43, "Dulloor" <dulloor@gmail.com> wrote:
>
>>> I would expect guest would see nodes 0 to nr_vnodes-1, and the mnode_id
>>> could go away.
>> mnode_id maps the vnode to a particular physical node. This will be
>> used by balloon driver in
>> the VMs when the structure is passed as NUMA enlightenment to PVs and
>> PV on HVMs.
>> I have a patch ready for that (once we are done with this series).
>
> So what happens when the guest is migrated to another system with different
> physical node ids? Is that never to be supported? I'm not sure why you
> wouldn't hide the vnode-to-mnode translation in the hypervisor.
Right now, migration is not supported when a NUMA strategy is set.
This is on my TODO list (along with PoD support).
There are a few open questions regarding migration:
- What if the destination host is not NUMA, but the guest is? Do we fake
  those nodes, or should we not select such a destination host to begin
  with?
- What if the destination host is not NUMA, but the guest has asked to be
  striped across a specific number of nodes (possibly for higher
  aggregate memory bandwidth)?
- What if the guest has asked for a particular memory strategy
  (split/confined/striped), but the destination host can't guarantee it
  (because of the distribution of free memory across the nodes)?
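To make these questions concrete, here is a minimal sketch of the kind of
check a toolstack could run against a candidate destination host. All
names in it (vnuma_strategy, host_numa_info, guest_numa_req,
dest_can_honour) are hypothetical and not part of this patch series. It
also assumes that "confined" means the whole guest fits on one node and
that split/striped guests use equally sized nodes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the interface from this series. */
enum vnuma_strategy { VNUMA_CONFINED, VNUMA_SPLIT, VNUMA_STRIPED };

struct host_numa_info {
    size_t          nr_nodes;    /* 1 on a non-NUMA host */
    const uint64_t *free_pages;  /* free pages per physical node */
};

struct guest_numa_req {
    enum vnuma_strategy strategy;
    size_t              nr_nodes;     /* nodes the guest asked for */
    uint64_t            total_pages;  /* guest memory size in pages */
};

/* Count the physical nodes that have at least 'pages' free. */
static size_t nodes_with_room(const struct host_numa_info *h, uint64_t pages)
{
    size_t i, n = 0;

    for (i = 0; i < h->nr_nodes; i++)
        if (h->free_pages[i] >= pages)
            n++;
    return n;
}

/* True if the host can honour the strategy without faking any nodes. */
bool dest_can_honour(const struct host_numa_info *h,
                     const struct guest_numa_req *g)
{
    uint64_t per_node;

    if (g->nr_nodes == 0)
        return false;

    switch (g->strategy) {
    case VNUMA_CONFINED:
        /* Assumption: the whole guest must fit on one physical node. */
        return nodes_with_room(h, g->total_pages) >= 1;
    case VNUMA_SPLIT:
    case VNUMA_STRIPED:
        /* Assumption: equal share per node, rounded up. */
        per_node = (g->total_pages + g->nr_nodes - 1) / g->nr_nodes;
        return nodes_with_room(h, per_node) >= g->nr_nodes;
    }
    return false;
}
```

A non-NUMA destination shows up here as nr_nodes == 1, so any split or
striped request for more than one node fails the check. Whether to fake
nodes or reject the host in that case is exactly the policy question
above, and the sketch does not answer it.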
Once we answer these questions, we will know whether the vnode-to-mnode
translation is better exposed or hidden and, if it is exposed, whether
we could just renegotiate the translation at the destination host. I
have started working on this, but I have some other patches ready to go
that we might want to check in first: the PV/Dom0 NUMA patches and
ballooning support (see below).
The purpose of the vnode-to-mnode translation is to let enlightened
guests know where their underlying memory comes from, so that
over-provisioning features like ballooning get a chance to maintain this
distribution. This way, all the hypervisor has to do is sanity-check the
increase/exchange reservation requests from the guests, and the guest
can decide whether or not to make an exact_node_request.
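As a rough illustration of that split of responsibilities, the sketch
below shows a guest-side balloon helper that looks up the physical node
for a vnode, and a hypervisor-side sanity check on the resulting
request. The struct layouts and function names (vnuma_info, mem_request,
balloon_request, request_is_sane) and the MAX_VNODES limit are
hypothetical, chosen only for this example. They are not the structures
from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_VNODES 8  /* hypothetical limit for this sketch */

/* Hypothetical layout; not the struct from this series. */
struct vnuma_info {
    uint8_t nr_vnodes;
    uint8_t vnode_to_mnode[MAX_VNODES];
};

struct mem_request {
    uint64_t nr_pages;
    uint8_t  mnode;  /* physical node the guest asks for */
    bool     exact;  /* fail instead of falling back to another node */
};

/*
 * Guest side: build a reservation request for memory backing 'vnode'.
 * Returns false if 'vnode' is out of range.
 */
bool balloon_request(const struct vnuma_info *v, uint8_t vnode,
                     uint64_t nr_pages, bool exact,
                     struct mem_request *out)
{
    if (vnode >= v->nr_vnodes || vnode >= MAX_VNODES)
        return false;

    out->nr_pages = nr_pages;
    out->mnode = v->vnode_to_mnode[vnode];
    out->exact = exact;
    return true;
}

/*
 * Hypervisor side: an exact request is accepted only if the node is one
 * the domain is actually placed on. Non-exact requests are just hints.
 */
bool request_is_sane(const struct vnuma_info *v, const struct mem_request *r)
{
    uint8_t i;

    if (!r->exact)
        return true;

    for (i = 0; i < v->nr_vnodes && i < MAX_VNODES; i++)
        if (v->vnode_to_mnode[i] == r->mnode)
            return true;
    return false;
}
```

The point of the design is that the policy (exact or best-effort) stays
in the guest, while the hypervisor only has to validate the node id.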
Other options that would allow us to drop this translation are:
- Ballooning at your own risk: leave ballooning as it is even when
  guests use a NUMA strategy (particularly split/confined).
- Hypervisor-level policies: let Xen do its best to maintain the
  guest nodes (using the gpfn ranges in the guest nodes), which I think
  is not a clean or flexible solution.
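For the hypervisor-level option, the lookup Xen would need is roughly
the following: given a gpfn, find the guest node whose range contains it
and return the physical node behind it. This is only a sketch under the
assumption that each guest node covers one contiguous gpfn range. The
vnode_range struct and gpfn_to_mnode function are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one contiguous gpfn range per guest node. */
struct vnode_range {
    uint64_t start_gpfn;  /* inclusive */
    uint64_t end_gpfn;    /* exclusive */
    uint8_t  mnode;       /* physical node backing this range */
};

/*
 * Return the physical node backing 'gpfn', or -1 if the gpfn falls
 * into no range (for example, a hole in the guest physical map).
 */
int gpfn_to_mnode(const struct vnode_range *ranges, size_t nr, uint64_t gpfn)
{
    size_t i;

    for (i = 0; i < nr; i++)
        if (gpfn >= ranges[i].start_gpfn && gpfn < ranges[i].end_gpfn)
            return ranges[i].mnode;
    return -1;
}
```

The inflexibility shows up as soon as a guest node is not a single
contiguous range, or when the guest balloons out pages in a pattern the
hypervisor cannot anticipate.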
What I could do, though, is leave out the vnode_to_mnode translation for
now and add it along with ballooning support (if/when we decide to add
it). I would just bump the interface version at that time. That might
give us time to mull this over?
>
> -- Keir
>
>
>