From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40525)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1a3lrj-0001Q5-Ft
	for qemu-devel@nongnu.org; Tue, 01 Dec 2015 09:25:15 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1a3lrf-00036g-Em
	for qemu-devel@nongnu.org; Tue, 01 Dec 2015 09:25:11 -0500
Received: from mx1.redhat.com ([209.132.183.28]:60841)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1a3lrf-00036b-7Q
	for qemu-devel@nongnu.org; Tue, 01 Dec 2015 09:25:07 -0500
Date: Tue, 1 Dec 2015 16:25:03 +0200
From: "Michael S. Tsirkin"
Message-ID: <20151201162407-mutt-send-email-mst@redhat.com>
References: <20151130105044.12269.21261.stgit@bahia.huguette.org>
	<20151130150353-mutt-send-email-mst@redhat.com>
	<20151130144631.4736280b@bahia.local>
	<20151130185328-mutt-send-email-mst@redhat.com>
	<878u5eqw2w.fsf@linux.vnet.ibm.com>
	<20151201125659-mutt-send-email-mst@redhat.com>
	<87vb8i2wm8.fsf@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87vb8i2wm8.fsf@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH] mmap-alloc: use same backend for all mappings
To: "Aneesh Kumar K.V"
Cc: Paolo Bonzini, qemu-devel@nongnu.org, Greg Kurz

On Tue, Dec 01, 2015 at 05:45:27PM +0530, Aneesh Kumar K.V wrote:
> "Michael S. Tsirkin" writes:
>
> > On Tue, Dec 01, 2015 at 04:23:11PM +0530, Aneesh Kumar K.V wrote:
> >> "Michael S. Tsirkin" writes:
> >>
> >> > On Mon, Nov 30, 2015 at 02:46:31PM +0100, Greg Kurz wrote:
> >> >> On Mon, 30 Nov 2015 15:06:33 +0200
> >> >> "Michael S. Tsirkin" wrote:
> >> >>
> >>
> >>
> >> ....
> >> >>
> >> >> On ppc64, the address space is divided in 256MB-sized segments where all pages
> >> >> have the same size. This is a hw limitation IIUC. I don't know if it can be
> >> >> fixed and I'll let Ben comment on it.
> >> >
> >> > But it's anonymous memory with PROT_NONE. There should be no pages there:
> >> > just a chunk of virtual memory reserved.
> >> >
> >>
> >> ppc64 use page size (called as base page size) to find the hash slot in
> >> which we find the virtual address to real address translation. All the
> >> pages in a segment should have same base page size. Hugetlb pages have a
> >> base page size of 16M whereas a regular linux page have 64K. mmap will
> >> fail to map a hugetlb mapping in a segment that already have regular
> >> pages mapped.
> >>
> >> -aneesh
> >
> >
> > I see this in kernel:
> >
> >         } else if (flags & MAP_HUGETLB) {
> >                 struct user_struct *user = NULL;
> >                 struct hstate *hs;
> >
> >                 hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & SHM_HUGE_MASK);
> >                 if (!hs)
> >                         return -EINVAL;
> >
> >                 len = ALIGN(len, huge_page_size(hs));
> >                 /*
> >                  * VM_NORESERVE is used because the reservations will be
> >                  * taken when vm_ops->mmap() is called
> >                  * A dummy user value is used because we are not locking
> >                  * memory so no accounting is necessary
> >                  */
> >                 file = hugetlb_file_setup(HUGETLB_ANON_FILE, len,
> >                                 VM_NORESERVE,
> >                                 &user, HUGETLB_ANONHUGE_INODE,
> >                                 (flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
> >                 if (IS_ERR(file))
> >                         return PTR_ERR(file);
> >         }
> >
> > So maybe it's a question of passing in MAP_HUGETLB and the
> > correct size mask.
> >
>
> Can you explain this more ?
>
> If the question is do we need to pass fd and remove MAP_ANONYMOUS to map
> hugetlb, we don't. A good example is
> tools/testing/selftest/vm/map_hugetlb.c
>
> If the question is whether we will loose hugepages on mmap even if the
> mapping is PROT_NONE, then the answer is we do in the form of hugetlb
> reservation.
>
> -aneesh

The question is whether passing MAP_HUGETLB to the PROT_NONE mapping with
fd == -1 will get a mapping in the correct slice on ppc.

-- 
MST
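
For concreteness, the kind of call being asked about could be sketched
roughly as below. This is only an illustration, not the code in
mmap-alloc; reserve_range() and its parameters are hypothetical names made
up for this sketch. Passing 0 as extra_flags gives the plain PROT_NONE
reservation, and passing MAP_HUGETLB gives the variant in question.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not QEMU code): reserve len bytes of address
 * space without making them accessible.
 *
 * extra_flags is 0 for a regular reservation, or MAP_HUGETLB to ask for
 * a hugetlb-backed one. Returns the mapping address, or MAP_FAILED with
 * errno set by mmap().
 */
static void *reserve_range(size_t len, int extra_flags)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | extra_flags;

    /* fd == -1 and offset 0: no backing file is passed in. */
    return mmap(NULL, len, PROT_NONE, flags, -1, 0);
}
```

With MAP_HUGETLB the call can fail on a host with no free huge pages,
since the reservation is taken even for PROT_NONE as noted above, so a
caller has to check for MAP_FAILED rather than assume success. Whether the
resulting address lands in a suitable slice on ppc is exactly the open
question and is not something this sketch demonstrates.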