From: Marcelo Tosatti
Subject: Re: patch: qemu + hugetlbfs..
Date: Wed, 9 Jul 2008 14:03:01 -0300
Message-ID: <20080709170301.GA11439@dmt.cnet>
In-Reply-To: <48740F86.3050306@codemonkey.ws>
To: Anthony Liguori
Cc: john cooper, kvm@vger.kernel.org, john.cooper@redhat.com

On Tue, Jul 08, 2008 at 08:08:22PM -0500, Anthony Liguori wrote:
> john cooper wrote:
>> I like it even less. MAP_POPULATE does not fault in physical
>> hpages to the map. Again this was a qemu-side interim bandaid.
>
> Really? That would seem like a bug in hugetlbfs to me.

This is Linux's behaviour for all filesystems. There is no error checking
on MAP_POPULATE's attempt to prefault pages.

>>>> +/* we failed to fault in hpage *a, fall back to conventional page
>>>> mapping
>>>> + */
>>>> +int remap_hpage(void *a, int sz)
>>>> +{
>>>> +    ASSERT(!(sz & (EXEC_PAGESIZE - 1)));
>>>> +    if (munmap(a, sz) < 0)
>>>> +        perror("remap_hpage: munmap");
>>>> +    else if (mmap(a, sz, PROT_READ|PROT_WRITE,
>>>> +        MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0) == MAP_FAILED)
>>>> +        perror("remap_hpage: mmap");
>>>> +    else
>>>> +        return (1);
>>>> +    return (0);
>>>> +}
>>>>
>>>
>>> I think this would be simplified with MAP_POPULATE since you can fail
>>> in large chunks of memory instead of potentially having a highly
>>> fragmented set of VMAs.
>>
>> Here for 4K pages we only need to setup the map. If we later
>> fault on a physically absent 4K page we'll wait if a page isn't
>> immediately available. Rather in the case of a hpage being
>> unavailable, we'll terminate. Note at this point we've effectively
>> locked onto whatever hpages we've been able to map as they can't
>> be reclaimed from us until we exit.
>
> Right now. Once we drop references to the large pages, there's nothing
> preventing them from being reclaimed in the future. That's what I'm
> concerned about.

This is just a temporary workaround until a better solution is in place,
as John mentioned.
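
For reference, the fallback discussed above could be sketched roughly as
follows. This is not the code from John's patch: remap_anon() is a
hypothetical name, and the sketch assumes that both the address and the
size are aligned to the page size of whatever is currently mapped there.
It relies on MAP_FIXED replacing the existing mapping in a single call,
so there is no window between munmap() and mmap() in which the range is
unmapped.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): replace the range
 * [a, a + sz) with ordinary anonymous memory.
 * Returns 0 on success, -1 on failure. */
static int remap_anon(void *a, size_t sz)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);

    /* Assumption: the caller passes a page-aligned address and size. */
    assert(((uintptr_t)a & (pagesz - 1)) == 0);
    assert((sz & (pagesz - 1)) == 0);

    /* MAP_FIXED discards whatever was mapped in the range and installs
     * the new anonymous mapping, so no separate munmap() is needed. */
    if (mmap(a, sz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED) {
        perror("remap_anon: mmap");
        return -1;
    }
    return 0;
}
```

As with the original, the 4K pages behind the new mapping are only
allocated when they are first touched, and any data previously in the
range is lost.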