From: Anthony Liguori <anthony@codemonkey.ws>
To: Andre Przywara <andre.przywara@amd.com>
Cc: Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, "Daniel P. Berrange" <berrange@redhat.com>
Subject: Re: [PATCH 0/3] v2: KVM-userspace: add NUMA support for guests
Date: Fri, 05 Dec 2008 08:28:08 -0600
Message-ID: <49393A78.5030601@codemonkey.ws>
In-Reply-To: <49392CB6.9000000@amd.com>

Hi Andre,

This patch series needs to be posted to qemu-devel.  I know qemu doesn't 
do true SMP yet, but it will in the relatively near future.  Either way, 
some of the design points need review from a larger audience than is 
present on kvm-devel.

I'm not a big fan of the libnuma dependency.  I'm willing to concede 
this if there's wide agreement that we should support this directly in 
QEMU.

I don't think there's such a thing as a casual NUMA user.  The default 
NUMA policy in Linux is node-local memory.  As long as a VM is smaller 
than a single node, everything will work out fine.

In the event that the VM is larger than a single node, if a user is 
creating it via qemu-system-x86_64, they're going to either not care at 
all about NUMA, or be familiar enough with the numactl tools that 
they'll probably just want to use those.  Once you've got your head 
around the fact that VCPUs are just threads and the memory is just a 
shared memory segment, any knowledgeable sysadmin will have no problem 
doing whatever sort of NUMA layout they want.

The other case is where management tools are creating VMs.  In this 
case, it's probably better to use numactl as an external tool because 
then it keeps things consistent wrt CPU pinning.
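
A minimal sketch of that external-tool approach, roughly as a management tool might do it. The helper name, the node numbers and the QEMU arguments are illustrative assumptions; --cpunodebind and --membind are the standard numactl options for restricting CPUs and memory to a set of host nodes.

```python
import shlex


def numactl_command(host_nodes, qemu_argv):
    """Prefix a QEMU command line with numactl (hypothetical helper).

    All threads of the resulting process (and so all VCPUs) run on the CPUs
    of `host_nodes`, and its memory is allocated from those nodes only.
    """
    nodes = ",".join(str(int(n)) for n in host_nodes)
    return ["numactl", f"--cpunodebind={nodes}", f"--membind={nodes}", *qemu_argv]


if __name__ == "__main__":
    # Illustrative values only: a 2 GB, 4-VCPU guest confined to host node 1.
    argv = numactl_command([1], ["qemu-system-x86_64", "-m", "2048", "-smp", "4"])
    print(shlex.join(argv))
```

Because the policy is applied from outside, the same tool can decide CPU pinning and memory placement in one place.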

There's also a good argument for not introducing CPU pinning directly 
into QEMU.  There are multiple ways to effectively do CPU pinning.  You 
can use taskset, you can use cpusets, or even something like libcgroup.
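
As a rough illustration of what taskset-style pinning amounts to, here is a sketch using Python's os.sched_setaffinity, which wraps the same Linux sched_setaffinity(2) call. It is Linux-only, and the helper name is an assumption; a real tool would pass the thread id of each VCPU thread rather than 0.

```python
import os


def pin_to_cpus(task_id, cpus):
    """Restrict a task to the given host CPUs (hypothetical helper).

    `task_id` is a PID or thread id; 0 means the calling thread.
    Returns the affinity mask that is in effect afterwards.
    """
    os.sched_setaffinity(task_id, set(cpus))
    return os.sched_getaffinity(task_id)


if __name__ == "__main__":
    # Pin the current thread to the first CPU it is currently allowed to use.
    first_cpu = sorted(os.sched_getaffinity(0))[0]
    print(pin_to_cpus(0, [first_cpu]))
```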

If you refactor the series so that the libnuma patch is the very last 
one and submit to qemu-devel, I'll review and apply all of the first 
patches.  We can continue to discuss the last patch independently of the 
first three if needed.

Regards,

Anthony Liguori

Andre Przywara wrote:
> Hi,
>
> this patch series introduces support for multiple NUMA nodes within KVM 
> guests.
> This is the second try, incorporating several requests from the list:
> - use the QEMU firmware configuration interface instead of CMOS-RAM
> - detect presence of libnuma automatically, can be disabled with
>   ./configure --disable-numa
> This only applies to the host side, the command line and guest (BIOS)
> side are always built and functional, although this configuration
> is only useful for research and debugging
> - use a more flexible command line interface allowing:
>   - specifying the distribution of memory across the guest nodes:
>     mem:1536M;512M
>   - specifying the distribution of the CPUs:
>     cpu:0-2;3
>   - specifying the host nodes the guest nodes should be pinned to:
>     pin:3;2
> All of these options are optional, in case of mem and cpu the 
> resources are split equally across all guest nodes if omitted. Please 
> note that at least in Linux SRAT takes precedence over E820, so the 
> total usable memory will be the sum specified at the mem: option 
> (although QEMU will still allocate the amount at -m).
> If pin: is omitted, the guest nodes will be pinned to those host nodes 
> where the threads happen to be scheduled at start-up time. This 
> requires the (v)getcpu (v)syscall to be usable, which is true for 
> kernels from 2.6.19 onward and glibc >= 2.6 (sched_getcpu()). I have a 
> hack if glibc doesn't support this; tell me if you are interested.
> The only non-optional argument is the number of guest nodes, a 
> possible command line looks like:
> -numa 3,mem:1024M;512M;512M,cpu:0-1;2;3
> Please note that you have to quote the semicolons on the shell.
>
> The monitor command is left out for now and will be sent later.
>
> Please apply.
>
> Regards,
> Andre.
>
> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
>
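
To make the quoted -numa syntax concrete, here is a small parsing sketch in Python. It is not the code from the patch series. The function names, the returned dictionary layout, and the check that each list has exactly one entry per guest node are assumptions of this sketch, and only the "M" size suffix shown in the examples is handled.

```python
def parse_cpu_range(spec):
    """Turn '0-2' into [0, 1, 2] and '3' into [3]."""
    first, _, last = spec.partition("-")
    return list(range(int(first), int(last or first) + 1))


def parse_size_mb(spec):
    """Turn '512M' into 512; other suffixes are outside this sketch."""
    if not spec.endswith("M"):
        raise ValueError(f"unsupported size: {spec!r}")
    return int(spec[:-1])


def parse_numa_option(arg):
    """Parse e.g. '3,mem:1024M;512M;512M,cpu:0-1;2;3' into a dict."""
    fields = arg.split(",")
    config = {"nodes": int(fields[0]), "mem": None, "cpu": None, "pin": None}
    converters = {"mem": parse_size_mb, "cpu": parse_cpu_range, "pin": int}
    for field in fields[1:]:
        key, sep, value = field.partition(":")
        if not sep or key not in converters:
            raise ValueError(f"unknown sub-option: {field!r}")
        entries = value.split(";")
        # Assumption of this sketch: one entry per guest node.
        if len(entries) != config["nodes"]:
            raise ValueError(f"{key}: expected {config['nodes']} entries")
        config[key] = [converters[key](entry) for entry in entries]
    return config


if __name__ == "__main__":
    print(parse_numa_option("3,mem:1024M;512M;512M,cpu:0-1;2;3"))
```

Sub-options left as None correspond to the "split equally" and "pin where scheduled" defaults described in the quoted text.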

