From: Chris Wright
Subject: Re: [RFC PATCH] Exporting Guest RAM information for NUMA binding
Date: Tue, 8 Nov 2011 09:33:04 -0800
Message-ID: <20111108173304.GA14486@sequoia.sous-sol.org>
References: <20111029184502.GH11038@in.ibm.com> <7816C401-9BE5-48A9-8BA9-4CDAD1B39FC8@suse.de>
In-Reply-To: <7816C401-9BE5-48A9-8BA9-4CDAD1B39FC8@suse.de>
To: Alexander Graf
Cc: Andrea Arcangeli, Peter Zijlstra, kvm list, bharata@linux.vnet.ibm.com, qemu-devel Developers, dipankar@in.ibm.com, Vaidyanathan S
List-Id: kvm.vger.kernel.org

* Alexander Graf (agraf@suse.de) wrote:
> On 29.10.2011, at 20:45, Bharata B Rao wrote:
> > As guests become NUMA aware, it becomes important for them to have
> > correct NUMA policies when they run on NUMA-aware hosts. Currently,
> > limited support for NUMA binding is available via libvirt, where it
> > is possible to apply a NUMA policy to the guest as a whole. However,
> > multinode guests would benefit if guest memory belonging to different
> > guest nodes were mapped appropriately to different host NUMA nodes.
> >
> > To achieve this we would need QEMU to expose information about guest
> > RAM ranges (Guest Physical Address - GPA) and their host virtual
> > address mappings (Host Virtual Address - HVA). Using the GPA and HVA,
> > an external tool like libvirt would be able to divide the guest RAM
> > as per the guest NUMA node geometry and bind guest memory nodes to
> > the corresponding host memory nodes using the HVA. This needs both
> > QEMU (and libvirt) changes as well as changes in the kernel.
>
> Ok, let's take a step back here. You are basically growing libvirt into
> a memory resource manager that knows how much memory is available on
> which nodes and how these nodes would possibly fit into the host's
> memory layout.
>
> Shouldn't that be the kernel's job? It seems to me that architecturally
> the kernel is the place I would want my memory resource controls to be.

I think that both Peter and Andrea are looking at this. Before we commit
to an API in QEMU that has different semantics than a possible new kernel
interface (one that QEMU could perhaps use directly to inform the kernel
of the binding/relationship between a vcpu thread and its memory at VM
startup), it would be useful to see what these guys are working on...

thanks,
-chris
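
P.S. For concreteness, here's a minimal sketch (not the patch under
discussion) of what the binding step looks like from inside the process
that owns the guest RAM mapping, i.e. QEMU itself, using the existing
mbind(2) interface. The hva, size, and host_node values are hypothetical
stand-ins for the GPA->HVA ranges described above. Note that mbind()
only operates on the caller's own address space, which is exactly why an
external tool like libvirt can't do this per-range today without new
kernel support.

#include <numaif.h>   /* mbind(2), MPOL_*; link with -lnuma */
#include <stdio.h>

/* Bind one guest node's RAM range [hva, hva + size) to host_node.
 * hva/size would come from the GPA->HVA ranges QEMU exposes;
 * host_node from whatever policy the management layer chose. */
static int bind_guest_node_ram(void *hva, unsigned long size,
                               int host_node)
{
    unsigned long nodemask = 1UL << host_node;

    /* MPOL_BIND restricts the range's allocations to host_node;
     * MPOL_MF_MOVE also migrates pages already faulted in. */
    if (mbind(hva, size, MPOL_BIND, &nodemask,
              sizeof(nodemask) * 8, MPOL_MF_MOVE) != 0) {
        perror("mbind");
        return -1;
    }
    return 0;
}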