Date: Thu, 11 Mar 2010 17:05:05 +0100
From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [PATCH QEMU] transparent hugepage support
Message-ID: <20100311160505.GG5677@random.random>
References: <20100311151427.GE5677@random.random> <4B9911B0.5000302@redhat.com>
In-Reply-To: <4B9911B0.5000302@redhat.com>
List-Id: qemu-devel.nongnu.org
To: Avi Kivity
Cc: qemu-devel@nongnu.org

On Thu, Mar 11, 2010 at 05:52:16PM +0200, Avi Kivity wrote:
> That is a little wasteful. How about a hint to mmap() requesting proper
> alignment (MAP_HPAGE_ALIGN)?

So you suggest adding a new kernel feature to mmap? I'm not sure it's worth it, considering it would also increase the number of vmas, because it would have to leave a hole.
Wasting 2M-4k of virtual memory is likely cheaper than having one more vma in the rbtree for every page fault. So I think it's better to just malloc and adjust ourselves to the next aligned offset, which is done in userland by qemu_memalign I think. What we could ask the kernel is the HPAGE_SIZE. Also, thinking a bit more about it, what we really care about is the HOST_HPAGE_SIZE. That said, I doubt it makes a lot of difference for kvm, and this only changes the kvm path. I'm open to suggestions of where to get the HPAGE_SIZE from and how to call it...

> Failing that, modify qemu_memalign() to trim excess memory.
>
> Come to think of it, posix_memalign() needs to do that (but doesn't).

It's hard to tell because of the amount of #ifdefs in .c files, but it seems to be using posix_memalign. If we don't touch these additional pages allocated and there's no transparent hugepage support in the kernel, you won't waste any more memory, and fewer vmas will be generated this way than with a kernel option to reduce the virtual memory waste. Basically the idea is to waste virtual memory to avoid wasting cpu. In short, we should make sure it only wastes virtual memory...