Re: [PATCH 0/3][RFC] NUMA: add host side pinning

From: Andre Przywara <andre.przywara@amd.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"agraf@suse.de" <agraf@suse.de>
Subject: Re: [PATCH 0/3][RFC] NUMA: add host side pinning
Date: Thu, 24 Jun 2010 08:44:00 +0200
Message-ID: <4C22FEB0.2020002@amd.com>
In-Reply-To: <4C2288DD.3020207@codemonkey.ws>

Anthony Liguori wrote:
> On 06/23/2010 04:09 PM, Andre Przywara wrote:
>> Hi,
>>
>> these three patches add basic NUMA pinning to KVM. According to a user
>> provided assignment parts of the guest's memory will be bound to different
>> host nodes. This should increase performance in large virtual machines
>> and on loaded hosts.
>> These patches are quite basic (but work) and I send them as RFC to get
>> some feedback before implementing stuff in vain.
>>
>> ....
>>
>> Please comment on the approach in general and the implementation.
>>
> 
> If we extended integrated -mem-path with -numa such that a different 
> path could be used with each numa node (and we let an explicit file be 
> specified instead of just a directory), then if I understand correctly, 
> we could use numactl without any specific integration in qemu.  Does 
> this sound correct?
In general, yes. But I consider the whole hugetlbfs approach broken.
Since 2.6.32 or so you can use MAP_HUGETLB together with MAP_ANONYMOUS
in mmap() to avoid hugetlbfs altogether, and I bet that the future will
hold transparent hugepages anyway (RHEL6 already has them).
I am not sure whether you want to keep the -memfile option and extend it
with some pseudo compat glue (faked directory names to be interpreted by
QEMU) to make it work in the future. Either way, in these cases there is
no backing file left for numactl to act on, so the external numactl
approach would no longer work.
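
To illustrate the MAP_HUGETLB point, here is a minimal sketch of an
anonymous huge page allocation with a fallback to normal pages. It is
not taken from the posted patches; alloc_guest_ram() and its used_huge
out-parameter are hypothetical names chosen for this example, and it
assumes a Linux host whose libc headers define MAP_HUGETLB.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): allocate guest RAM,
 * preferring huge pages. Returns the mapping or NULL on failure;
 * *used_huge reports which of the two attempts succeeded.
 */
void *alloc_guest_ram(size_t size, int *used_huge)
{
    const int base = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    /* First attempt: anonymous huge pages, no hugetlbfs mount involved. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, base | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }

    /* Typically fails when no huge pages are reserved on the host,
     * so fall back to ordinary anonymous memory. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, base, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    *used_huge = 0;
    return p;
}
```

Since no file is involved in either path, there is nothing an external
"numactl --file" invocation could attach a policy to.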

> IOW:
> 
> qemu -numa node,mem=1G,nodeid=0,cpus=0-1,memfile=/dev/shm/node0.mem 
> -numa node,mem=2G,nodeid=1,cpus=1-2,memfile=/dev/shm/node1.mem
> 
> It's then possible to say:
> 
> numactl --file /dev/shm/node0.mem --interleave=0,1
> numactl --file /dev/shm/node1.mem --membind=2
> 
> I think this approach is nicer because it gives the user a lot more 
> flexibility without having us chase other tools like numactl.  For 
> instance, your patches only support pinning and not interleaving.
That's right. I put it on the list ;-)

Thanks for the good hint on the huge pages issue, as this is not
properly handled in the current implementation. I will think about a
proper way to handle this, but would still opt for an (at least
partially) QEMU-integrated solution.
I am still open for discussion, though, as I see your point about
avoiding a duplicate NUMA implementation in numactl and QEMU.
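
For reference, at the syscall level pinning and interleaving differ
only in the policy mode passed to mbind(2). The following is a rough
sketch under stated assumptions, not code from the posted patches:
build_nodemask(), set_region_policy() and MASK_WORDS are hypothetical
names, and the raw syscall is used so the example does not depend on
the libnuma headers.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h> /* MPOL_BIND, MPOL_INTERLEAVE */

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS 4 /* assumption: enough bits for this example's hosts */

/* Set one bit per host node in mask; returns -1 on an out-of-range node. */
int build_nodemask(const int *nodes, size_t count,
                   unsigned long mask[MASK_WORDS])
{
    memset(mask, 0, MASK_WORDS * sizeof(unsigned long));
    for (size_t i = 0; i < count; i++) {
        if (nodes[i] < 0 ||
            (size_t)nodes[i] >= MASK_WORDS * BITS_PER_WORD) {
            return -1;
        }
        mask[nodes[i] / BITS_PER_WORD] |= 1UL << (nodes[i] % BITS_PER_WORD);
    }
    return 0;
}

/*
 * Apply a policy to [addr, addr + len).
 * mode is MPOL_BIND for pinning or MPOL_INTERLEAVE for interleaving.
 * Returns 0 on success, -1 on error (errno set by the kernel on failure).
 */
long set_region_policy(void *addr, size_t len, int mode,
                       const int *nodes, size_t count)
{
    unsigned long mask[MASK_WORDS];

    if (build_nodemask(nodes, count, mask) < 0) {
        return -1;
    }
    /* The kernel decrements maxnode before use, so pass one more than
     * the number of bits in the mask (libnuma does the same). */
    return syscall(SYS_mbind, addr, len, mode, mask,
                   MASK_WORDS * BITS_PER_WORD + 1, 0);
}
```

With this, binding a region to host node 2 would be
set_region_policy(p, len, MPOL_BIND, (int[]){2}, 1), and interleaving
over nodes 0 and 1 would use MPOL_INTERLEAVE with (int[]){0, 1}. The
call can fail on hosts without those nodes or without NUMA support.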

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 488-3567-12
