From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Vrabel <david.vrabel@citrix.com>
Subject: Re: RFC: vNUMA project
Date: Tue, 11 Nov 2014 18:03:22 +0000
Message-ID: <54624F6A.40002@citrix.com>
References: <20141111173606.GC21312@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <20141111173606.GC21312@zion.uk.xensource.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu <wei.liu2@citrix.com>, xen-devel@lists.xen.org
Cc: Dario Faggioli <dario.faggioli@citrix.com>, David Vrabel <david.vrabel@citrix.com>, Jan Beulich <JBeulich@suse.com>
List-Id: xen-devel@lists.xenproject.org

On 11/11/14 17:36, Wei Liu wrote:
> # What's already implemented?
> 
> PV vNUMA support in libxl/xl and Linux kernel.

Linux doesn't have vNUMA yet, although the last set of patches I saw
looked fine and, I think, were waiting for acks from the x86 maintainers.

> # NUMA-aware ballooning
> 
> It's agreed that NUMA-aware ballooning should be achieved solely in
> hypervisor. Everything should happen under the hood without guest
> knowing vnode to pnode mapping.
> 
> As far as I can tell, existing guests (Linux and FreeBSD) use
> XENMEM_populate_physmap to balloon up. There's a hypercall
> called XENMEM_increase_reservation but it's not used
> by Linux and FreeBSD.
> 
> I can think of two options to implement NUMA-aware ballooning:
> 
> 1. Modify XENMEM_populate_physmap to take into account vNUMA hint
>    when it tries to allocate a page for guest.
[...]
> Option #1 requires less modification to guest, because guest won't
> need to switch to new hypercall. It's unclear at this point if a guest
> asks to populate a gpfn that doesn't belong to any vnode, what Xen
> should do about it. Should it be permissive or strict? 

There are XENMEMF flags to request an exact node or not -- leave it up to
the balloon driver.  The Linux balloon driver could try exact on all
nodes before falling back to permissive, or just always try inexact.

Perhaps a XENMEMF_vnode bit to indicate the node is virtual?
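
To make the two balloon driver strategies concrete, here is a rough
sketch.  The XENMEMF_node / XENMEMF_exact_node macros are reproduced as I
recall them from xen/include/public/memory.h (please check the header).
Everything else is an assumption made up for illustration: the bit
position of SKETCH_XENMEMF_vnode, the mock_populate_one() stand-in for a
one-extent XENMEM_populate_physmap call, the identity vnode-to-pnode
mapping, and the reading of "try exact first" as "exact on the page's own
node, then the same node as a hint".

```c
/* Flag encoding as recalled from xen/include/public/memory.h, copied
 * locally so the sketch builds without Xen headers. */
#define XENMEMF_node(x)             (((x) + 1) << 8)
#define XENMEMF_get_node(x)         ((((x) >> 8) - 1) & 0xffu)
#define XENMEMF_exact_node_request  (1 << 17)
#define XENMEMF_exact_node(n)       (XENMEMF_node(n) | XENMEMF_exact_node_request)

/* HYPOTHETICAL: the proposed "node is virtual" bit; position is a guess. */
#define SKETCH_XENMEMF_vnode        (1 << 18)

/* Mock hypervisor state: free pages per physical node (node 0 is empty). */
#define NR_NODES 2
static unsigned int mock_free_pages[NR_NODES] = { 0, 4 };

/*
 * Stand-in for populating a single extent.  Returns the number of
 * extents populated (0 or 1).  For simplicity the mock assumes
 * vnode N maps to pnode N and ignores the vnode bit.
 */
static int mock_populate_one(unsigned int mem_flags)
{
    unsigned int node = XENMEMF_get_node(mem_flags);
    unsigned int n;

    /* Preferred node has memory: use it. */
    if (node < NR_NODES && mock_free_pages[node] > 0) {
        mock_free_pages[node]--;
        return 1;
    }

    /* Strict request: fail rather than allocate elsewhere. */
    if (mem_flags & XENMEMF_exact_node_request)
        return 0;

    /* Permissive request: take a page from any node that has one. */
    for (n = 0; n < NR_NODES; n++) {
        if (mock_free_pages[n] > 0) {
            mock_free_pages[n]--;
            return 1;
        }
    }
    return 0;
}

enum balloon_policy {
    BALLOON_EXACT_THEN_INEXACT, /* try strict first, then fall back */
    BALLOON_ALWAYS_INEXACT      /* node is only ever a hint */
};

/* Balloon up one page that belongs to the given virtual node. */
static int balloon_up_one(unsigned int vnode, enum balloon_policy policy)
{
    unsigned int base = SKETCH_XENMEMF_vnode;

    if (policy == BALLOON_EXACT_THEN_INEXACT &&
        mock_populate_one(base | XENMEMF_exact_node(vnode)))
        return 1;

    return mock_populate_one(base | XENMEMF_node(vnode));
}
```

With the mock state above, balloon_up_one(0, BALLOON_EXACT_THEN_INEXACT)
fails the exact request (node 0 has no free pages) and then succeeds on
the permissive retry with a page from node 1.  The policy stays entirely
in the guest; the hypervisor only has to honour the flags.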

> 
> # HVM vNUMA
> 
> HVM vNUMA is implemented as follows:
> 
> 1. Libxl generates vNUMA information and passes it to hvmloader.
> 2. Hvmloader builds the SRAT table.
> 
> Note that hvmloader is capable of relocating memory. This means
> toolstack and guest can have different ideas of the memory layout.

Why can't hvmloader update the vNUMA tables after it has relocated memory?

David