Re: [PATCH 4/4] NUMA: realize NUMA memory pinning

From: Andre Przywara <andre.przywara@amd.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	"avi@redhat.com" <avi@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH 4/4] NUMA: realize NUMA memory pinning
Date: Mon, 23 Aug 2010 23:16:56 +0200
Message-ID: <4C72E548.4030701@amd.com>
In-Reply-To: <4C72CBA5.1020805@codemonkey.ws>

Anthony Liguori wrote:
> On 08/23/2010 01:59 PM, Marcelo Tosatti wrote:
>> On Wed, Aug 11, 2010 at 03:52:18PM +0200, Andre Przywara wrote:
>>    
>>> According to the user-provided assignment bind the respective part
>>> of the guest's memory to the given host node. This uses Linux'
>>> mbind syscall (which is wrapped only in libnuma) to realize the
>>> pinning right after the allocation.
>>> Failures are not fatal, but produce a warning.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
>>> ...
>>>      
>> Why is it not possible (or perhaps not desired) to change the binding
>> after the guest is started?
>>
>> Sounds unflexible.
>>    
The solution is to introduce a monitor interface for adjusting the 
pinning later. It would allow both changing only the affinity (which 
affects future fault-ins only) and actually copying the memory (more 
costly).
Actually this is the next item on my list, but I wanted to get the 
basics in first to avoid recoding parts afterwards. Also, I am not (yet) 
familiar with the QMP protocol.
> 
> We really need a solution that lets a user use a tool like numactl 
> outside of the QEMU instance.
I fear that is not how the Linux NUMA API is meant to work. In 
contrast to the VCPU threads, which are externally visible entities 
(PIDs), the memory should be private to the QEMU process. While you can 
change the NUMA allocation policy of the _whole_ process, there is no 
way to externally distinguish parts of the process' memory. Although you 
could later (and externally) migrate already faulted pages (via 
move_pages(2) and by looking in /proc/$$/numa_maps), you would let an 
external tool interfere with QEMU's internal memory management. Take for 
instance a change of the allocation policy regarding the 1MB and 
3.5-4GB holes. An external tool would have to either track such changes 
or you simply could not change such things in QEMU. So what is wrong 
with keeping that code in QEMU, which knows best about the internals and 
already has flexible and powerful ways (command line and QMP) of 
manipulating its behavior?
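
To illustrate what an external tool actually has to work with, here is a
minimal sketch that parses numa_maps-style lines and sums the pages per
host node. The sample lines are made up for illustration (they are not
output from a real QEMU process), and the helper names are hypothetical.
The point is that such a tool only sees address ranges, policies and
page counts; nothing in this view says which range backs which guest
node.

```python
import re
from collections import defaultdict

# Matches per-node page counts such as "N0=1000" in a numa_maps line.
_NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")


def parse_numa_maps_line(line):
    """Return (start_address, policy, {node: pages}) for one line."""
    fields = line.split()
    start = int(fields[0], 16)   # first field: start address of the range (hex)
    policy = fields[1]           # second field: e.g. "default" or "bind:0"
    pages = {}
    for field in fields[2:]:
        match = _NODE_FIELD.match(field)
        if match:
            pages[int(match.group(1))] = int(match.group(2))
    return start, policy, pages


def pages_per_node(lines):
    """Sum the faulted-in pages per host node over all given lines."""
    totals = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue
        _, _, pages = parse_numa_maps_line(line)
        for node, count in pages.items():
            totals[node] += count
    return dict(totals)


# Illustrative sample only: two bound ranges and one with default policy.
SAMPLE = [
    "7f5c00000000 bind:0 anon=1000 dirty=1000 N0=1000 kernelpagesize_kB=4",
    "7f5c40000000 bind:1 anon=800 dirty=800 N1=800 kernelpagesize_kB=4",
    "7f5c80000000 default anon=12 dirty=12 N0=7 N1=5 kernelpagesize_kB=4",
]
```

Calling pages_per_node(SAMPLE) gives the per-node totals, but mapping an
address range back to "guest node 1" still requires knowledge that only
QEMU has.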

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12

Thread overview: 13+ messages
2010-08-11 13:52 [PATCH 0/4]: NUMA: add host binding Andre Przywara
2010-08-11 13:52 ` [PATCH 1/4] NUMA: change existing NUMA guest code to use new bitmap implementation Andre Przywara
2010-08-11 13:52 ` [PATCH 2/4] NUMA: add Linux libnuma detection Andre Przywara
2010-08-11 13:52 ` [PATCH 3/4] NUMA: parse new host dependent command line options Andre Przywara
2010-08-11 13:52 ` [PATCH 4/4] NUMA: realize NUMA memory pinning Andre Przywara
2010-08-23 18:59   ` Marcelo Tosatti
2010-08-23 19:27     ` Anthony Liguori
2010-08-23 21:16       ` Andre Przywara [this message]
2010-08-23 21:27         ` Anthony Liguori
2010-08-31 20:54           ` Andrew Theurer
2010-08-31 22:03             ` Anthony Liguori
2010-09-01  3:38               ` Andrew Theurer
2010-09-09 20:00               ` Andre Przywara
