From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerome Glisse
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t
Date: Thu, 7 May 2015 13:30:58 -0400
Message-ID: <20150507173057.GA5966@gmail.com>
References: <20150506200219.40425.74411.stgit@dwillia2-desk3.amr.corp.intel.com> <20150507161807.GA1671@lst.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Linus Torvalds , Dan Williams , Linux Kernel Mailing List , Boaz Harrosh , Jan Kara , Mike Snitzer , Neil Brown , Benjamin Herrenschmidt , Dave Hansen , Heiko Carstens , Chris Mason , Paul Mackerras , "H. Peter Anvin" , Alasdair Kergon , "linux-nvdimm@lists.01.org" , Ingo Molnar , Mel Gorman , Matthew Wilcox , Ross Zwisler , Rik van Riel , Martin Schwidefsky , Jens Axboe , Theodore Ts'o ,
To: Christoph Hellwig
Return-path:
Content-Disposition: inline
In-Reply-To: <20150507161807.GA1671@lst.de>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Thu, May 07, 2015 at 06:18:07PM +0200, Christoph Hellwig wrote:
> On Wed, May 06, 2015 at 05:19:48PM -0700, Linus Torvalds wrote:
> > What is the primary thing that is driving this need? Do we have a very
> > concrete example?
>
> FYI, I plan to to implement RAID acceleration using nvdimms, and I plan to
> ue pages for that. The code just merge for 4.1 can easily support page
> backing, and I plan to use that for now. This still leaves support
> for the gigantic intel nvdimms discovered over EFI out, but given that
> I don't have access to them, and I dont know of any publically available
> there's little I can do for now. But adding on demand allocate struct
> pages for the seems like the easiest way forward. Boaz already has
> code to allocate pages for them, although not on demand but at boot / plug in
> time.

I think other folks here might be interested, so I am CCing Paul.
For GPUs we are facing a similar issue: presenting GPU memory to the
kernel in a coherent way (coherent from a design and Linux kernel
concept point of view).

For this, dynamically allocated struct pages might effectively be a
solution that could be shared between the persistent memory and GPU
folks. We could even enforce something like VMEMMAP and have a special
carve-out region where we can dynamically map/unmap the backing pages
for a range of device pfns. This would also allow us to catch people
trying to access such pages: we could add a set of new helpers like
get_page_dev()/put_page_dev() ... and only the _dev versions would work
on this new kind of memory; regular get_page()/put_page() would throw
an error. This should make sure that only legitimate users are
referencing such pages.

One issue might be that we can run out of kernel address space with
48 bits, but if such a monstrous computer ever sees the light of day,
its builders might consider using a CPU with more bits.

Another issue is that we might care about 32-bit platforms too, but
that's solvable at a small cost.

Cheers,
Jérôme
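
A minimal userspace sketch of the get_page_dev()/put_page_dev() idea
above, just to make the intended split concrete. Everything here is
hypothetical: struct sketch_page is a stand-in and not the kernel's
struct page, the sketch_* names and the is_device flag are made up for
illustration, and returning -1 stands in for whatever error or warning
the real helpers would raise. Rejecting regular pages in the _dev
helpers is also an assumption; the proposal only says that the regular
helpers would refuse the new kind of memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; NOT the kernel's layout. */
struct sketch_page {
	int refcount;   /* number of references currently held */
	bool is_device; /* assumed flag: page backs device (GPU/pmem) memory */
};

/* Regular helper: refuses device pages. Returns 0 on success, -1 on misuse. */
int sketch_get_page(struct sketch_page *page)
{
	if (page->is_device)
		return -1; /* the "throw an error" case from the proposal */
	page->refcount++;
	return 0;
}

/* Regular helper: drops a reference on a non-device page. */
int sketch_put_page(struct sketch_page *page)
{
	if (page->is_device)
		return -1;
	assert(page->refcount > 0); /* underflow would be a caller bug */
	page->refcount--;
	return 0;
}

/* _dev helper: only legitimate users of device memory call this. */
int sketch_get_page_dev(struct sketch_page *page)
{
	if (!page->is_device)
		return -1; /* assumption: _dev helpers reject regular pages */
	page->refcount++;
	return 0;
}

/* _dev helper: drops a reference on a device page. */
int sketch_put_page_dev(struct sketch_page *page)
{
	if (!page->is_device)
		return -1;
	assert(page->refcount > 0);
	page->refcount--;
	return 0;
}
```

With this split, a caller that reaches a device-backed page through the
regular helpers gets an error and the refcount is left untouched, which
is the "catch people trying to access such pages" property.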