From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754182AbbCFVdG (ORCPT );
	Fri, 6 Mar 2015 16:33:06 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:26535 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751276AbbCFVdD (ORCPT );
	Fri, 6 Mar 2015 16:33:03 -0500
Message-ID: <54FA1CFE.1000500@oracle.com>
Date: Fri, 06 Mar 2015 13:32:46 -0800
From: Mike Kravetz
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: David Rientjes
CC: Michal Hocko , Andrew Morton ,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Aneesh Kumar , Joonsoo Kim
Subject: Re: [RFC 0/3] hugetlbfs: optionally reserve all fs pages at mount time
References: <1425077893-18366-1-git-send-email-mike.kravetz@oracle.com>
	<20150302151009.2ae58f4430f9f34b81533821@linux-foundation.org>
	<54F50BD6.1030706@oracle.com>
	<20150306151045.GA23443@dhcp22.suse.cz>
	<54F9F8F1.4020203@oracle.com>
In-Reply-To:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: acsinet22.oracle.com [141.146.126.238]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/06/2015 01:14 PM, David Rientjes wrote:
> On Fri, 6 Mar 2015, Mike Kravetz wrote:
>
>> Thanks for the CONFIG_CGROUP_HUGETLB suggestion, however I do not
>> believe this will be a satisfactory solution for my usecase.  As you
>> point out, cgroups could be set up (by a sysadmin) for every hugetlb
>> user/application.  In this case, the sysadmin needs to have knowledge
>> of every huge page user/application and configure appropriately.
>>
>> I was approaching this from the point of view of the application.  The
>> application wants the guarantee of a minimum number of huge pages,
>> independent of other users/applications.  The "reserve" approach allows
>> the application to set aside those pages at initialization time.  If it
>> can not get the pages it needs, it can refuse to start, or configure
>> itself to use less, or take other action.
>>
>
> Would it be too difficult to modify the application to mmap() the
> hugepages at startup so they are no longer free in the global pool but
> rather get marked as reserved so other applications cannot map them?  That
> should return MAP_FAILED if there is an insufficient number of hugepages
> available to be reserved (HugePages_Rsvd in /proc/meminfo).

The application is a database with multiple processes/tasks that will
come and go over time.  I thought about having one task do a big mmap()
at initialization time, but then the issue is how to coordinate with
the other tasks and their requests to allocate/free pages.

--
Mike Kravetz
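
For reference, the mmap()-at-startup idea quoted above could look roughly
like the sketch below.  It is only an illustration, not code from the
application or from the RFC: the helper names reserve_huge_pages() and
release_huge_pages() are hypothetical, the 2 MiB page size is an assumption
(check Hugepagesize in /proc/meminfo), and it uses an anonymous MAP_HUGETLB
mapping to stay self-contained rather than a file on a hugetlbfs mount.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS / MAP_HUGETLB on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumed huge page size; 2 MiB is the usual x86_64 default. */
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/*
 * Hypothetical helper: try to map npages huge pages at startup.
 * For hugetlb mappings made without MAP_NORESERVE the kernel reserves
 * the pages at mmap() time, so a failure here means the pool could not
 * cover the request.  Returns NULL on failure.
 */
void *reserve_huge_pages(size_t npages)
{
    void *p;

    /* Reject empty requests and sizes that would overflow size_t. */
    if (npages == 0 || npages > SIZE_MAX / HUGE_PAGE_SIZE)
        return NULL;

    p = mmap(NULL, npages * HUGE_PAGE_SIZE, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: unmap the region and drop its reservation. */
int release_huge_pages(void *p, size_t npages)
{
    assert(p != NULL);
    return munmap(p, npages * HUGE_PAGE_SIZE);
}
```

A caller would check for NULL at initialization and then refuse to start or
fall back to a smaller request.  Note that a private anonymous mapping like
this belongs to a single process, so it does not address the coordination
problem between tasks that come and go; sharing would need a MAP_SHARED
mapping of a file under the hugetlbfs mount, or some similar scheme.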