Date: Mon, 12 Sep 2016 13:09:10 +0300
From: "Kirill A. Shutemov"
To: Dan Williams
Cc: linux-mm@kvack.org, Andrea Arcangeli, Xiao Guangrong, Arnd Bergmann,
 linux-nvdimm@ml01.01.org, linux-api@vger.kernel.org, Dave Hansen,
 linux-kernel@vger.kernel.org, Andrew Morton, "Kirill A. Shutemov"
Subject: Re: [RFC PATCH 1/2] mm, mincore2(): retrieve dax and tlb-size attributes of an address range
Message-ID: <20160912100910.GC23346@node.shutemov.name>
References: <147361509579.17004.5258725187329709824.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <147361509579.17004.5258725187329709824.stgit@dwillia2-desk3.amr.corp.intel.com>

On Sun, Sep 11, 2016 at 10:31:35AM -0700, Dan Williams wrote:
> As evidenced by this bug report [1], userspace libraries are interested
> in whether a mapping is DAX mapped, i.e. no intervening page cache.
> Rather than using the ambiguous VM_MIXEDMAP flag in smaps, provide an
> explicit "is dax" indication as a new flag in the page vector populated
> by mincore.
>
> There are also cases, particularly for testing and validating a
> configuration to know the hardware mapping geometry of the pages in a
> given process address range.
> Consider filesystem-dax where a
> configuration needs to take care to align partitions and block
> allocations before huge page mappings might be used, or
> anonymous-transparent-huge-pages where a process is opportunistically
> assigned large pages. mincore2() allows these configurations to be
> surveyed and validated.
>
> The implementation takes advantage of the unused bits in the per-page
> byte returned for each PAGE_SIZE extent of a given address range. The
> new format of each vector byte is:
>
> (TLB_SHIFT - PAGE_SHIFT) << 2 | vma_is_dax() << 1 | page_present
>
> [1]: https://lkml.org/lkml/2016/9/7/61
>
> Cc: Arnd Bergmann
> Cc: Andrea Arcangeli
> Cc: Andrew Morton
> Cc: Dave Hansen
> Cc: Xiao Guangrong
> Cc: Kirill A. Shutemov
> Signed-off-by: Dan Williams
> ---
>  include/linux/syscalls.h               |    2 +
>  include/uapi/asm-generic/mman-common.h |    3 +
>  kernel/sys_ni.c                        |    1
>  mm/mincore.c                           |  126 +++++++++++++++++++++++++-------
>  4 files changed, 104 insertions(+), 28 deletions(-)
>
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index d02239022bd0..4aa2ee7e359a 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -467,6 +467,8 @@ asmlinkage long sys_munlockall(void);
>  asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
>  asmlinkage long sys_mincore(unsigned long start, size_t len,
>  				unsigned char __user * vec);
> +asmlinkage long sys_mincore2(unsigned long start, size_t len,
> +				unsigned char __user * vec, int flags);

We have had a few attempts to extend the mincore(2) interface/functionality
before. None of them ended up upstream. How does this attempt compare to
the previous ones?

-- 
Kirill A. Shutemov
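
For reference, the vector byte format quoted above can be sketched in
userspace C roughly as follows. This is only an illustration of the bit
layout given in the RFC description, not code from the patch: every
identifier here (VEC_PRESENT, VEC_DAX, VEC_ORDER_SHIFT, struct vec_info,
vec_encode, vec_decode, vec_mapping_size) is hypothetical, and the page
shift is passed in as a parameter rather than taken from any real header.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Bit layout as stated in the RFC description:
 *   (TLB_SHIFT - PAGE_SHIFT) << 2 | vma_is_dax() << 1 | page_present
 * All names below are hypothetical and chosen for illustration only.
 */
#define VEC_PRESENT     0x01u   /* bit 0: page is present */
#define VEC_DAX         0x02u   /* bit 1: vma is DAX mapped */
#define VEC_ORDER_SHIFT 2       /* bits 2..7: TLB_SHIFT - PAGE_SHIFT */

struct vec_info {
	bool present;
	bool dax;
	unsigned int order;     /* TLB_SHIFT - PAGE_SHIFT */
};

/* Pack the three fields into one vector byte. */
static unsigned char vec_encode(unsigned int order, bool dax, bool present)
{
	assert(order < 64);     /* only six bits remain above the two flags */
	return (unsigned char)((order << VEC_ORDER_SHIFT) |
			       (dax ? VEC_DAX : 0u) |
			       (present ? VEC_PRESENT : 0u));
}

/* Unpack one vector byte into its three fields. */
static struct vec_info vec_decode(unsigned char byte)
{
	struct vec_info info = {
		.present = (byte & VEC_PRESENT) != 0,
		.dax     = (byte & VEC_DAX) != 0,
		.order   = (unsigned int)byte >> VEC_ORDER_SHIFT,
	};
	return info;
}

/* Size in bytes of the hardware mapping that covers the page. */
static unsigned long vec_mapping_size(struct vec_info info,
				      unsigned int page_shift)
{
	return 1UL << (page_shift + info.order);
}
```

Under this reading, and assuming 4 KiB base pages (page shift 12), a
present DAX page covered by a 2 MiB mapping would have order 21 - 12 = 9,
so vec_encode(9, true, true) gives 0x27 and vec_mapping_size() on the
decoded byte gives 2097152.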