From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Andre Przywara <andre.przywara@amd.com>,
kvm@vger.kernel.org, "Daniel P. Berrange" <berrange@redhat.com>,
Andi Kleen <ak@suse.de>
Subject: Re: [PATCH 0/3] KVM-userspace: add NUMA support for guests
Date: Mon, 01 Dec 2008 09:49:04 -0600
Message-ID: <49340770.9000908@codemonkey.ws>
In-Reply-To: <493404CA.3060404@redhat.com>

Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>> I see no compelling reason to do cpu placement internally. It can be
>> done quite effectively externally.
>>
>> Memory allocation is tough, but I don't think it's out of reach.
>> Looking at the numactl man page, you can do:
>>
>> numactl --offset=1G --length=1G --membind=1 --file /dev/shm/A --touch
>> Bind the second gigabyte in the tmpfs file /dev/shm/A to node 1.
>>
>>
>> Since we can already create VMs with the -mem-path argument, if you
>> create a 2GB guest and want it to span two NUMA nodes, you could do:
>>
>> numactl --offset=0G --length=1G --membind=0 --file /dev/shm/A --touch
>> numactl --offset=1G --length=1G --membind=1 --file /dev/shm/A --touch
>>
>> And then create the VM with:
>>
>> qemu-system-x86_64 -mem-path /dev/shm/A -m 2G ...
>>
>> What's best about this approach is that you get full access to what
>> numactl is capable of: interleaving, rebalancing, etc.
>
> It looks horribly difficult and unintuitive. It forces you to use
> -mem-path (which is an abomination; the only reason it lives is that
> we can't allocate large pages without it).

As opposed to inventing new options for QEMU that convey all of the same
information in a slightly different way? We're stuck with -mem-path, so we
might as well make good use of it.

The proposed syntax is:

  qemu -numanode node=1,cpu=2,cpu=3,start=1G,size=1G,hostnode=3

The new syntax would be:

  qemu -smp 4 -numa nodes=2,cpus=1:2:3:4,mem=1G:1G -mem-path /dev/hugetlbfs/foo

Then you would have to look up the thread ids, and do:

  taskset <vcpu1>
  taskset <vcpu2>
  taskset <vcpu3>
  taskset <vcpu4>

  numactl -o 0G -l 1G -m 0 -f /dev/hugetlbfs/foo
  numactl -o 1G -l 1G -m 1 -f /dev/hugetlbfs/foo

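To make the memory half of that concrete, here is a rough sketch of how a wrapper could generate the numactl calls for any number of guest nodes. The function name numa_bind_plan is made up for illustration, and it assumes equally sized nodes with guest node N backed by host node N. It only prints the commands; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) one numactl command per
# guest node, binding consecutive slices of the -mem-path backing file.
# Assumption: guest node N maps to host node N, all nodes the same size.
numa_bind_plan() {
    file=$1     # backing file passed to -mem-path
    nodes=$2    # number of guest NUMA nodes
    gb=$3       # size of each node in gigabytes
    node=0
    while [ "$node" -lt "$nodes" ]; do
        offset=$((node * gb))
        echo "numactl --offset=${offset}G --length=${gb}G --membind=${node} --file ${file} --touch"
        node=$((node + 1))
    done
}

# The 2GB, two-node guest from earlier in the thread.
numa_bind_plan /dev/shm/A 2 1
```

For the two-node case this prints the same pair of commands quoted above, so a management tool could review the plan before applying it.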
This may look like a lot more, but it's not going to be nearly enough to
specify a NUMA placement on startup. What if you have a very large NUMA
system and want to rebalance virtual machines? You need a mechanism to
do this that now has to be exposed through the monitor. In fact, you'll
almost certainly introduce a taskset-like monitor command and a
numactl-like monitor command.
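
On the cpu side, rebalancing from outside is just a matter of re-running taskset against the VCPU threads with a different cpu list. A minimal sketch, assuming you already know the VCPU thread ids (on Linux the threads of a process are listed under /proc/<pid>/task; working out which of them are VCPUs is left out here). The function name vcpu_pin_plan and the thread ids in the example are placeholders, and the commands are only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a "taskset -cp" command
# for each thread id, pinning it to the given host cpu list.
vcpu_pin_plan() {
    cpus=$1     # host cpu list, e.g. "0-3" for the cpus of one node
    shift
    for tid in "$@"; do
        echo "taskset -cp ${cpus} ${tid}"
    done
}

# Initial placement: two VCPU threads on the cpus of one host node.
vcpu_pin_plan 0-3 4101 4102
# Later rebalance: move the same threads to another node's cpus.
vcpu_pin_plan 4-7 4101 4102
```

The same call serves for startup placement and for later moves, which is the point: no new monitor command is needed for it.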

Why reinvent the wheel? Plus, taskset and numactl give you a lot of
flexibility. All we're going to do by cooking this stuff into QEMU is
artificially limit ourselves.

Regards,

Anthony Liguori