qemu-devel.nongnu.org archive mirror
From: Anthony Liguori <anthony@codemonkey.ws>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	kvm list <kvm@vger.kernel.org>,
	dipankar@in.ibm.com,
	qemu-devel Developers <qemu-devel@nongnu.org>,
	Alexander Graf <agraf@suse.de>,
	Chris Wright <chrisw@sous-sol.org>,
	bharata@linux.vnet.ibm.com, Vaidyanathan S <svaidy@in.ibm.com>
Subject: Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding
Date: Mon, 21 Nov 2011 19:51:21 -0600
Message-ID: <4ECB0019.7020800@codemonkey.ws>
In-Reply-To: <1321894980.28118.16.camel@twins>

On 11/21/2011 11:03 AM, Peter Zijlstra wrote:
> On Mon, 2011-11-21 at 21:30 +0530, Bharata B Rao wrote:
>>
>> In the original post of this mail thread, I proposed a way to export
>> guest RAM ranges (Guest Physical Address-GPA) and their corresponding
>> host virtual mappings (Host Virtual Address-HVA) from QEMU (via QEMU monitor).
>> The idea was to use these GPA-to-HVA mappings from tools like libvirt to bind
>> specific parts of the guest RAM to different host nodes. This needed an
>> extension to the existing mbind() to allow binding memory of one process (QEMU)
>> from a different process (libvirt). This was needed since we wanted to do all this from
>> libvirt.
>>
>> Hence I was coming from that background when I asked for extending
>> ms_mbind() to take a tid parameter. If QEMU community thinks that NUMA
>> binding should all be done from outside of QEMU, it is needed, otherwise
>> what you have should be sufficient.
>
> That's just retarded, and no you won't get such extensions. Poking at
> another process's virtual address space is just daft. Esp. if there's no
> actual reason for it.

Yes, that would be a terrible interface.

Fundamentally, the entity that should be deciding what memory should be present
and where it should be located is the kernel.  I'm fundamentally opposed to trying
to make QEMU override the scheduler/mm by using cpu or memory pinning in QEMU.

From what I can tell about ms_mbind(), it just uses process knowledge to bind
specific areas of memory to a memsched group and lets the kernel decide what to
do with that knowledge.  This is exactly the type of interface that QEMU should
be using.

QEMU should tell the kernel enough information such that the kernel can make 
good decisions.  QEMU should not be the one making the decisions.

It looks like ms_mbind() takes a flags argument, which I assume takes the same
flags as mbind().  The current implementation ignores flags and just uses MPOL_BIND.

I would hope that the flags argument would only be treated as advisory by the 
kernel.
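
For reference, here is a minimal sketch of what binding a region from inside
the owning process looks like with the existing mbind() call, using MPOL_BIND
and a caller-supplied flags value. ms_mbind() is deliberately not shown, since
its signature is not spelled out in this mail. The helper names
(nodemask_for_node, bind_region_to_node) are hypothetical and only for
illustration; this goes through raw syscall(2) and is not QEMU code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>   /* MPOL_BIND, MPOL_MF_STRICT, MPOL_MF_MOVE */

/* Hypothetical helper: a one-word node mask with only `node` set. */
unsigned long nodemask_for_node(unsigned int node)
{
    assert(node < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << node;
}

/*
 * Hypothetical helper: bind [addr, addr + len) of the *calling* process
 * to a single host node. addr must be page aligned. `flags` is passed
 * straight through to mbind(), e.g. 0 or MPOL_MF_STRICT | MPOL_MF_MOVE.
 * Returns 0 on success, -1 with errno set on failure.
 */
long bind_region_to_node(void *addr, size_t len, unsigned int node,
                         unsigned int flags)
{
    unsigned long mask = nodemask_for_node(node);
    /* One more than the number of bits in the mask, as libnuma does. */
    unsigned long maxnode = sizeof(mask) * CHAR_BIT + 1;

    return syscall(SYS_mbind, addr, len, MPOL_BIND, &mask, maxnode, flags);
}
```

Note that there is no pid or tid argument here: the call only ever acts on the
caller's own address space, which is the property being argued for above.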

Regards,

Anthony Liguori

>
> Furthermore, it would make libvirt a required part of qemu, and since I
> don't think I've ever used libvirt that's another reason to object, I
> don't need that stinking mess.
>


Thread overview: 29+ messages
2011-10-29 18:45 [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding Bharata B Rao
2011-10-29 19:57 ` Alexander Graf
2011-10-30  9:32   ` Vaidyanathan Srinivasan
2011-11-08 17:33   ` Chris Wright
2011-11-21 15:18     ` Bharata B Rao
2011-11-21 15:25       ` Peter Zijlstra
2011-11-21 16:00         ` Bharata B Rao
2011-11-21 17:03           ` Peter Zijlstra
2011-11-21 22:50             ` Chris Wright
2011-11-22  1:57               ` Anthony Liguori
2011-11-22  1:51             ` Anthony Liguori [this message]
2011-11-23 15:03               ` Andrea Arcangeli
2011-11-23 18:34                 ` Alexander Graf
2011-11-23 20:19                   ` Andrea Arcangeli
2011-11-30 16:22                   ` Dipankar Sarma
2011-11-30 16:25                     ` Peter Zijlstra
2011-11-30 16:33                       ` Chris Wright
2011-11-30 17:41                     ` Andrea Arcangeli
2011-12-01 17:25                       ` Dipankar Sarma
2011-12-01 17:36                         ` Andrea Arcangeli
2011-12-01 17:49                           ` Dipankar Sarma
2011-12-01 17:40                 ` Peter Zijlstra
2011-12-22 11:01                   ` Marcelo Tosatti
2011-12-22 17:13                     ` Anthony Liguori
2011-12-22 17:55                       ` Marcelo Tosatti
2011-12-22 19:04                     ` Peter Zijlstra
2011-12-22 11:24                   ` Marcelo Tosatti
2011-11-21 18:03         ` Avi Kivity
2011-11-21 19:31           ` Peter Zijlstra
