From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t
Date: Thu, 07 May 2015 10:56:24 -0700
Message-ID: <554BA748.9030804@linux.intel.com>
References: <20150506200219.40425.74411.stgit@dwillia2-desk3.amr.corp.intel.com>
 <20150507173641.GA21781@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: Linus Torvalds, Linux Kernel Mailing List, Boaz Harrosh, Jan Kara,
 Mike Snitzer, Neil Brown, Benjamin Herrenschmidt, Heiko Carstens,
 Chris Mason, Paul Mackerras, "H. Peter Anvin", Christoph Hellwig,
 Alasdair Kergon, "linux-nvdimm@lists.01.org", Mel Gorman,
 Matthew Wilcox, Ross Zwisler, Rik van Riel, Martin Schwidefsky,
 Jens Axboe, Theodore Ts'o, "Martin K. Petersen", Julia Lawall,
 Ingo Molnar
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On 05/07/2015 10:42 AM, Dan Williams wrote:
> On Thu, May 7, 2015 at 10:36 AM, Ingo Molnar wrote:
>> * Dan Williams wrote:
>> So is there anything fundamentally wrong about creating struct page
>> backing at mmap() time (and making sure aliased mmaps share struct
>> page arrays)?
>
> Something like "get_user_pages() triggers memory hotplug for
> persistent memory", so they are actual real struct pages?  Can we do
> memory hotplug at that granularity?

We've traditionally limited them to SECTION_SIZE granularity, which is
128MB IIRC.  There are also assumptions in places that you can do page++
within a MAX_ORDER block if !CONFIG_HOLES_IN_ZONE.

But, in all practicality, a lot of those places are in code like the
buddy allocator.  If your PTEs all have _PAGE_SPECIAL set and we're not
ever expecting these fake 'struct page's to hit these code paths, it
probably doesn't matter.

You can probably get away with just allocating PAGE_SIZE worth of
'struct page' (which is 64) and mapping it in to vmemmap[].
The worst case is that you'll eat 1 page of space for each outstanding
page of I/O.  That's a lot better than 2MB of temporary 'struct page'
space per page of I/O that it would take with a traditional hotplug
operation.
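
For anyone who wants to check the arithmetic behind the "64" and "2MB"
figures, here is a rough userspace sketch.  It assumes 4KB pages, a
64-byte 'struct page' and 128MB sections (typical x86-64 values, not
read from any kernel config).  The EX_* constants and both helper names
are made up for illustration and are not kernel APIs.

```c
#include <assert.h>

/* Assumed x86-64 values for illustration; not taken from a real kernel. */
#define EX_PAGE_SIZE        4096UL          /* 4KB base page */
#define EX_STRUCT_PAGE_SIZE 64UL            /* assumed sizeof(struct page) */
#define EX_SECTION_SIZE     (128UL << 20)   /* 128MB sparsemem section */

/* Number of 'struct page' entries that fit in one page of vmemmap[]. */
unsigned long struct_pages_per_page(unsigned long page_size,
                                    unsigned long struct_page_size)
{
    assert(struct_page_size > 0);
    return page_size / struct_page_size;
}

/* Bytes of 'struct page' metadata needed to describe region_size bytes. */
unsigned long memmap_bytes_for_region(unsigned long region_size,
                                      unsigned long page_size,
                                      unsigned long struct_page_size)
{
    assert(page_size > 0);
    return (region_size / page_size) * struct_page_size;
}
```

With those assumed values, struct_pages_per_page(EX_PAGE_SIZE,
EX_STRUCT_PAGE_SIZE) is 64, and memmap_bytes_for_region() for one
EX_SECTION_SIZE section is 2097152 bytes (2MB).  So one PAGE_SIZE chunk
of vmemmap[] holds 64 'struct page's covering 256KB of memory, and the
worst case of one such chunk per page of I/O costs 512 times less than
populating a whole section.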