Re: [PATCH 0/3][RFC] NUMA: add host side pinning

From: Alexander Graf <agraf@suse.de>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Avi Kivity <avi@redhat.com>,
	Andre Przywara <andre.przywara@amd.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH 0/3][RFC] NUMA: add host side pinning
Date: Mon, 28 Jun 2010 18:26:01 +0200
Message-ID: <4C28CD19.9000001@suse.de>
In-Reply-To: <4C28CBC5.80109@codemonkey.ws>

Anthony Liguori wrote:
> On 06/24/2010 06:42 AM, Avi Kivity wrote:
>> On 06/24/2010 02:34 PM, Andre Przywara wrote:
>>>> Non-anonymous memory doesn't work well with ksm and transparent
>>>> hugepages.  Is it possible to use anonymous memory rather than file
>>>> backed?
>>>
>>> I'd prefer non-file backed, too. But that is how the current huge
>>> pages implementation is done. We could use MAP_HUGETLB and declare
>>> NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find
>>> an easy way to detect the presence of the MAP_HUGETLB flag. If the
>>> kernel does not support it, it seems that mmap silently ignores it
>>> and uses 4KB pages instead.
>>
>> That sucks, unfortunately it is normal practice.  However it is a
>> soft failure, everything works just a bit slower.  So it's probably
>> acceptable.
>>
>>>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>>>> am not sure whether calling numactl via system() and friends is
>>>>> OK, I'd prefer to run the syscalls directly (like in patch 3/3)
>>>>> and pull the necessary options into the -numa pin,... command
>>>>> line. We could mimic numactl's syntax here.
>>>>
>>>> Definitely not use system(), but IIRC numactl has a library interface?
>>> Right, that is what I include in patch 3/3 and use. I got the
>>> impression Anthony wanted to avoid reimplementing parts of numactl,
>>> especially enabling the full flexibility of the command line
>>> interface (like specifying nodes, policies and interleaving).
>>> I want QEMU to use the library and pull the necessary options into
>>> the -numa pin,... parsing, even if this means duplicating numactl
>>> functionality.
>>>
>>
>> I agree with that.  It's a lot easier to use a single tool than to
>> try to integrate things yourself, the unix tradition of grep | sort |
>> uniq -c | sort -n notwithstanding.  Especially when one of the tools
>> is qemu.
>
> I couldn't disagree more here.  This is why we don't support CPU pinning
> and instead provide PID information for each VCPU thread.
>
> The folks that want to use pinning are not novice users.  They are not
> going to be happy unless you can make full use of existing tools. 
> That means replicating all of numactl's functionality (which is not
> what the current patches do) or enabling numactl to be used with a guest.

So how about some QMP plumbing that would allow numactl to create the
VMs at defined ranges? So you'd basically get numactl --run-qemu --
qemu-kvm -blah -foo

Alex
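
For reference, the MAP_HUGETLB concern quoted above can be sketched in C
as follows. This is only an illustration under stated assumptions, not
code from the patch series: alloc_guest_ram() and the hugetlb_accepted
out-parameter are hypothetical names. The helper asks for an anonymous
huge page mapping and falls back to normal pages if that call fails. As
noted in the thread, a kernel that predates the flag may accept the call
and hand back 4KB pages, so a successful first mmap() only tells you the
flag was not rejected.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: if the headers predate MAP_HUGETLB, treat it as "no flag". */
#ifndef MAP_HUGETLB
#define MAP_HUGETLB 0
#endif

/*
 * Hypothetical helper (not part of QEMU or the patches under review).
 * Maps len bytes of anonymous memory, trying MAP_HUGETLB first.
 * *hugetlb_accepted is set to 1 if the mmap() call carrying the flag
 * succeeded, 0 if we fell back to a normal mapping. A value of 1 does
 * not prove huge pages are in use on a kernel that ignores the flag.
 * Returns NULL if both attempts fail.
 */
static void *alloc_guest_ram(size_t len, int *hugetlb_accepted)
{
    const int prot = PROT_READ | PROT_WRITE;
    const int base = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    *hugetlb_accepted = 0;

    if (MAP_HUGETLB != 0) {
        /* len should be a multiple of the huge page size. */
        p = mmap(NULL, len, prot, base | MAP_HUGETLB, -1, 0);
        if (p != MAP_FAILED) {
            *hugetlb_accepted = 1;
            return p;
        }
    }

    /* Soft failure: same mapping, backed by normal pages. */
    p = mmap(NULL, len, prot, base, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would pass a length that is a multiple of the huge page size
(for example 4 MB on a host with 2 MB huge pages) and release the region
with munmap() using the same length. Because the mapping stays anonymous
rather than file backed, it would avoid the ksm and transparent hugepage
objection raised earlier in the thread.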
