From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Gonzalez Monroy, Sergio" <sergio.gonzalez.monroy@intel.com>
Subject: Re: libhugetlbfs
Date: Thu, 23 Jul 2015 10:29:53 +0100
Message-ID: <55B0B411.5050903@intel.com>
References: <3797202.NyZr8XgqE1@xps13> <55B09913.8040100@intel.com>
 <1504831.JexCQJ5PJA@xps13>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: dev@dpdk.org
To: Thomas Monjalon <thomas.monjalon@6wind.com>
Return-path: <dev-bounces@dpdk.org>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id E36BAC332
 for <dev@dpdk.org>; Thu, 23 Jul 2015 11:29:59 +0200 (CEST)
In-Reply-To: <1504831.JexCQJ5PJA@xps13>

On 23/07/2015 09:12, Thomas Monjalon wrote:
> 2015-07-23 08:34, Gonzalez Monroy, Sergio:
>> On 22/07/2015 11:40, Thomas Monjalon wrote:
>>> Sergio,
>>>
>>> As the maintainer of memory allocation, would you consider using
>>> libhugetlbfs in DPDK for Linux?
>>> It may simplify a part of our memory allocator and avoid some potential
>>> bugs which would be already fixed in the dedicated lib.
>> I did have a look at it a couple of months ago and I thought there were
>> a few issues:
>> - get_hugepage_region/get_huge_pages only allocate default-size huge
>>     pages (you can set a different default huge page size with
>>     environment variables, but there is no support for multiple sizes),
>>     plus we have no guarantee of physically contiguous pages.
> Speaking about that, we don't always need contiguous pages.
> Maybe we should take it into account when reserving memory.
> Flags such as DMA (locked physical pages that are not swappable) and
> CONTIGUOUS could be considered.
Sure. I think I also mentioned this as possible future work in the 
Dynamic Memzones RFC.
>> - That leaves us with
>>     hugetlbfs_unlinked_fd/hugetlbfs_unlinked_fd_for_size. These APIs
>>     wouldn't simplify the current code much, just the allocation of the
>>     pages themselves (i.e. creating a file in a hugetlbfs mount).
>>     Then there is the issue with multi-process: because they return a
>>     file descriptor while unlinking the file, we would need some sort
>>     of inter-process communication to pass the descriptors to
>>     secondary processes.
>> - Not a big deal, but AFAIK it is not possible to have multiple mount
>>     points for the same hugepage size, and even if you do,
>>     hugetlbfs_find_path_for_size always returns the same path
>>     (i.e. the first one found in the list).
>> - We still need to parse /proc/self/pagemap to get the physical
>>     addresses of mapped hugepages.
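
For reference, the inter-process communication mentioned in the second
point would most likely mean passing the descriptors over a Unix domain
socket with SCM_RIGHTS. A minimal sketch, assuming Linux and an already
connected AF_UNIX socket between primary and secondary; send_fd and
recv_fd are hypothetical helper names for illustration, not existing
DPDK or libhugetlbfs functions:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected AF_UNIX socket. */
static int
send_fd(int sock, int fd)
{
	char dummy = 'F'; /* at least one byte of ordinary data is sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union { /* union keeps the control buffer suitably aligned */
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new fd or -1 on error. */
static int
recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The secondary would then mmap() the received descriptor itself. This is
extra machinery compared to today, where a secondary process can simply
open the hugepage file by path.
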
>>
>> I guess that if we were to push for a new API such as
>> hugetlbfs_fd_for_size, we could use it for the hugepage allocation,
>> but we would still have to parse /proc/self/pagemap to get the
>> physical addresses and then order those hugepages.
>>
>> Thoughts?
> Why not extend the API and push our code to this lib?
> It would allow us to share the maintenance.
>
> The same move could be done to libpciaccess.
I don't disagree with the idea of using libhugetlbfs; I just tried to
point out that it's not just a drop-in replacement.
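
As an illustration of what would stay on our side whichever allocation
API is used, here is a minimal sketch of the pagemap lookup. It assumes
Linux and the documented /proc/self/pagemap entry layout (one 64-bit
entry per base page, bit 63 = page present, bits 0-54 = page frame
number). The names virt_to_phys, pagemap_entry_to_phys and PHYS_ADDR_BAD
are illustrative only and not taken from DPDK or libhugetlbfs:

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PHYS_ADDR_BAD UINT64_MAX         /* hypothetical error value */
#define PM_PRESENT    (1ULL << 63)       /* bit 63: page present in RAM */
#define PM_PFN_MASK   ((1ULL << 55) - 1) /* bits 0-54: page frame number */

/* Decode one 64-bit pagemap entry into a physical address. */
static uint64_t
pagemap_entry_to_phys(uint64_t entry, uint64_t page_size, uint64_t offset)
{
	if (!(entry & PM_PRESENT))
		return PHYS_ADDR_BAD;
	return (entry & PM_PFN_MASK) * page_size + offset;
}

/* Look up the physical address backing a mapped virtual address. */
static uint64_t
virt_to_phys(const void *virt)
{
	uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
	uint64_t va = (uint64_t)(uintptr_t)virt;
	uint64_t entry;
	ssize_t n;
	int fd;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return PHYS_ADDR_BAD;
	/* Entries are indexed by base page number, also for hugepages. */
	n = pread(fd, &entry, sizeof(entry),
		  (off_t)(va / page_size * sizeof(entry)));
	close(fd);
	if (n != (ssize_t)sizeof(entry))
		return PHYS_ADDR_BAD;
	return pagemap_entry_to_phys(entry, page_size, va % page_size);
}
```

The page has to be touched (or locked) before the lookup, otherwise the
present bit is not set. Depending on the kernel version, the PFN field
may also read as zero for a process without CAP_SYS_ADMIN, so the result
should be treated with care. Ordering the hugepages afterwards is then a
sort on the returned addresses.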

Sergio