From: Avi Kivity
Subject: Re: [PATCH 0/3][RFC] NUMA: add host side pinning
Date: Thu, 24 Jun 2010 14:42:11 +0300
Message-ID: <4C234493.2050408@redhat.com>
References: <1277327377-29629-1-git-send-email-andre.przywara@amd.com>
 <4C2288DD.3020207@codemonkey.ws>
 <865764AB-4E51-4ED4-8832-AED6A237A9D3@suse.de>
 <4C233A6D.7030805@amd.com>
 <4C233DAB.60106@redhat.com>
 <4C2342D1.4090103@amd.com>
In-Reply-To: <4C2342D1.4090103@amd.com>
To: Andre Przywara
Cc: Alexander Graf, Anthony Liguori, "kvm@vger.kernel.org"

On 06/24/2010 02:34 PM, Andre Przywara wrote:
>> Non-anonymous memory doesn't work well with ksm and transparent
>> hugepages. Is it possible to use anonymous memory rather than file
>> backed?
>
> I'd prefer non-file backed, too. But that is how the current huge
> pages implementation is done. We could use MAP_HUGETLB and declare
> NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find an
> easy way to detect the presence of the MAP_HUGETLB flag. If the kernel
> does not support it, it seems that mmap silently ignores it and uses
> 4KB pages instead.

That sucks, unfortunately it is normal practice. However it is a soft
failure, everything works just a bit slower. So it's probably
acceptable.

>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>> am not sure whether calling numactl via system() and friends is OK,
>>> I'd prefer to run the syscalls directly (like in patch 3/3) and pull
>>> the necessary options into the -numa pin,... command line. We could
>>> mimic numactl's syntax here.
>>
>> Definitely not use system(), but IIRC numactl has a library interface?
> Right, that is what I include in patch 3/3 and use. I got the
> impression Anthony wanted to avoid reimplementing parts of numactl,
> especially enabling the full flexibility of the command line interface
> (like specifying nodes, policies and interleaving).
> I want QEMU to use the library and pull the necessary options into the
> -numa pin,... parsing, even if this means duplicating numactl
> functionality.
>

I agree with that. It's a lot easier to use a single tool than to try
to integrate things yourself, the unix tradition of
grep | sort | uniq -c | sort -n notwithstanding. Especially when one of
the tools is qemu.

-- 
error compiling committee.c: too many arguments to function