Date: Tue, 1 Dec 2015 14:31:19 +0100
From: Greg Kurz
To: "Michael S. Tsirkin"
Cc: Paolo Bonzini, "Aneesh Kumar K.V", qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] mmap-alloc: use same backend for all mappings
Message-ID: <20151201143119.42af4ae1@bahia.local>
In-Reply-To: <20151201125659-mutt-send-email-mst@redhat.com>

On Tue, 1 Dec 2015 12:57:47 +0200
"Michael S. Tsirkin" wrote:

> On Tue, Dec 01, 2015 at 04:23:11PM +0530, Aneesh Kumar K.V wrote:
> > "Michael S. Tsirkin" writes:
> >
> > > On Mon, Nov 30, 2015 at 02:46:31PM +0100, Greg Kurz wrote:
> > >> On Mon, 30 Nov 2015 15:06:33 +0200
> > >> "Michael S. Tsirkin" wrote:
> > >>
> > >
> > > ....
> > >>
> > >> On ppc64, the address space is divided in 256MB-sized segments where all pages
> > >> have the same size. This is a hw limitation IIUC. I don't know if it can be
> > >> fixed and I'll let Ben comment on it.
> > >
> > > But it's anonymous memory with PROT_NONE. There should be no pages there:
> > > just a chunk of virtual memory reserved.
> > >
> >
> > ppc64 use page size (called as base page size) to find the hash slot in
> > which we find the virtual address to real address translation. All the
> > pages in a segment should have same base page size. Hugetlb pages have a
> > base page size of 16M whereas a regular linux page have 64K. mmap will
> > fail to map a hugetlb mapping in a segment that already have regular
> > pages mapped.
> >
> > -aneesh
>
>
> I see this in kernel:
>
> 	} else if (flags & MAP_HUGETLB) {
> 		struct user_struct *user = NULL;
> 		struct hstate *hs;
>
> 		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & SHM_HUGE_MASK);
> 		if (!hs)
> 			return -EINVAL;
>
> 		len = ALIGN(len, huge_page_size(hs));
> 		/*
> 		 * VM_NORESERVE is used because the reservations will be
> 		 * taken when vm_ops->mmap() is called
> 		 * A dummy user value is used because we are not locking
> 		 * memory so no accounting is necessary
> 		 */
> 		file = hugetlb_file_setup(HUGETLB_ANON_FILE, len,
> 				VM_NORESERVE,
> 				&user, HUGETLB_ANONHUGE_INODE,
> 				(flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
> 		if (IS_ERR(file))
> 			return PTR_ERR(file);
> 	}
>
> So maybe it's a question of passing in MAP_HUGETLB and the
> correct size mask.
>

I guess you are talking about the PROT_NONE mapping here ^^.

How do we know that the fd points to hugepages?

And what's the difference between passing MAP_HUGETLB and passing
a hugetlbfs backed fd + MAP_NORESERVE? I think the latter is easier
because we don't need to guess if backend is hugetlbfs.
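
To make the second option concrete, here is a minimal sketch of reserving
the PROT_NONE region from the same backend that will later hold the real
mapping. It is only an illustration under stated assumptions, not the code
from the patch: the helper names reserve_region() and commit_region() are
hypothetical, and the caller is assumed to pass either -1 (anonymous
memory) or an fd that is already open on the intended backend, for example
a hugetlbfs file.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and MAP_NORESERVE */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `size` bytes of virtual address space with
 * no access rights. With fd == -1 the reservation is anonymous; otherwise
 * it is backed by the same fd as the final mapping, and MAP_NORESERVE
 * asks the kernel not to reserve backing pages for this placeholder.
 */
static void *reserve_region(int fd, size_t size)
{
    int flags = MAP_PRIVATE;

    assert(size > 0);
    if (fd == -1) {
        flags |= MAP_ANONYMOUS;
    } else {
        flags |= MAP_NORESERVE;
    }
    return mmap(NULL, size, PROT_NONE, flags, fd, 0);
}

/*
 * Hypothetical helper: replace the reservation at `base` with a usable
 * read/write mapping from the same backend, using MAP_FIXED.
 */
static void *commit_region(void *base, int fd, size_t size, int shared)
{
    int flags = MAP_FIXED | (shared ? MAP_SHARED : MAP_PRIVATE);

    if (fd == -1) {
        flags |= MAP_ANONYMOUS;
    }
    return mmap(base, size, PROT_READ | PROT_WRITE, flags, fd, 0);
}
```

A caller would do base = reserve_region(fd, size), check it against
MAP_FAILED, then call commit_region(base, fd, size, shared). Since both
calls take the same fd, nothing here has to guess whether the backend is
hugetlbfs. For a hugetlbfs fd, size is assumed to be a multiple of the huge
page size.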