From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 0/3] KVM-userspace: add NUMA support for guests
Date: Mon, 01 Dec 2008 17:37:46 +0200
Message-ID: <493404CA.3060404@redhat.com>
References: <492F1DD9.8030901@amd.com> <49318A10.7080801@redhat.com>
 <4933F177.5040802@amd.com> <4933F4B0.7040500@redhat.com>
 <4934027D.8080904@codemonkey.ws>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Andre Przywara, kvm@vger.kernel.org, "Daniel P. Berrange", Andi Kleen
To: Anthony Liguori
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:41772 "EHLO mx2.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751334AbYLAPh6 (ORCPT ); Mon, 1 Dec 2008 10:37:58 -0500
In-Reply-To: <4934027D.8080904@codemonkey.ws>
Sender: kvm-owner@vger.kernel.org
List-ID:

Anthony Liguori wrote:
> Avi Kivity wrote:
>> Andre Przywara wrote:
>>
>>> Any other useful commands for the monitor? Maybe (temporary) VCPU
>>> migration without page migration?
>>
>> Right now vcpu migration is done externally (we export the thread IDs
>> so management can pin them as it wishes). If we add numa support, I
>> think it makes sense to do it internally as well. I suggest using the
>> same syntax for the monitor as for the command line; that's simplest
>> to learn and to implement.
>
> I see no compelling reason to do cpu placement internally. It can be
> done quite effectively externally.
>
> Memory allocation is tough, but I don't think it's out of reach.
> Looking at the numactl man page, you can do:
>
>     numactl --offset=1G --length=1G --membind=1 --file /dev/shm/A --touch
>         Bind the second gigabyte in the tmpfs file /dev/shm/A to node 1.
>
>
> Since we can already create VMs with the -mem-path argument, if you
> create a 2GB guest and want it to span two numa nodes, you could do:
>
>     numactl --offset=0G --length=1G --membind=0 --file /dev/shm/A --touch
>     numactl --offset=1G --length=1G --membind=1 --file /dev/shm/A --touch
>
> And then create the VM with:
>
>     qemu-system-x86_64 -mem-path /dev/shm/A -mem 2G ...
>
> What's best about this approach is that you get full access to what
> numactl is capable of. Interleaving, rebalancing, etc.

It looks horribly difficult and unintuitive. It forces you to use
-mem-path (which is an abomination; the only reason it lives is that we
can't allocate large pages without it).

-- 
error compiling committee.c: too many arguments to function
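
For reference, the external numactl recipe quoted in this message can be sketched for an arbitrary number of nodes. This is only an illustration, not something proposed in the thread: the helper name plan_numa_binds and the equal-slice split are assumptions, and the script prints the commands (a dry run) instead of running them, so it works without numactl installed. The numactl options are the ones in the quoted man page example.

```shell
#!/bin/sh
# Dry run: print one numactl command per NUMA node, binding equal
# GiB-sized slices of a tmpfs-backed file to successive nodes.
# plan_numa_binds is a hypothetical helper, not part of numactl or QEMU.
plan_numa_binds() {
    file=$1        # backing file, e.g. /dev/shm/A
    total_gb=$2    # guest memory size in GiB
    nodes=$3       # number of host NUMA nodes to span

    # Assumption: the guest memory divides evenly across the nodes.
    if [ $((total_gb % nodes)) -ne 0 ]; then
        echo "total size must divide evenly across nodes" >&2
        return 1
    fi

    slice_gb=$((total_gb / nodes))
    node=0
    while [ "$node" -lt "$nodes" ]; do
        offset_gb=$((node * slice_gb))
        echo "numactl --offset=${offset_gb}G --length=${slice_gb}G --membind=${node} --file ${file} --touch"
        node=$((node + 1))
    done
}

# The 2GB, two-node case from the quoted text.
plan_numa_binds /dev/shm/A 2 2
```

For the 2GB, two-node case this prints the same two numactl lines as the quoted example. Piping the output to sh would execute them, after which the guest would be started with -mem-path pointing at the same file.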