From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751440AbbCTUbf (ORCPT ); Fri, 20 Mar 2015 16:31:35 -0400
Received: from mga11.intel.com ([192.55.52.93]:22150 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750999AbbCTUbd (ORCPT ); Fri, 20 Mar 2015 16:31:33 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.11,438,1422950400"; d="scan'208";a="683298767"
Date: Fri, 20 Mar 2015 16:31:36 -0400
From: Matthew Wilcox
To: Rik van Riel
Cc: Andrew Morton, Dan Williams, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, axboe@kernel.dk, linux-nvdimm@ml01.01.org, Dave Hansen, linux-raid@vger.kernel.org, mgorman@suse.de, hch@infradead.org, linux-fsdevel@vger.kernel.org, "Michael S. Tsirkin"
Subject: Re: [RFC PATCH 0/7] evacuate struct page from the block layer
Message-ID: <20150320203136.GM4003@linux.intel.com>
References: <20150316201640.33102.33761.stgit@dwillia2-desk3.amr.corp.intel.com> <20150318132650.3336261c58829f49a9af8675@linux-foundation.org> <20150319134313.GF4003@linux.intel.com> <550C490E.1080708@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <550C490E.1080708@redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 20, 2015 at 12:21:34PM -0400, Rik van Riel wrote:
> On 03/19/2015 09:43 AM, Matthew Wilcox wrote:
>
> > 1. Construct struct pages for persistent memory
> > 1a. Permanently
> > 1b. While the pages are under I/O
>
> Michael Tsirkin and I have been doing some thinking about what
> it would take to allocate struct pages per 2MB area permanently,
> and allocate additional struct pages for 4kB pages on demand,
> when a 2MB area is broken up into 4kB pages.

Ah! I've looked at that a couple of times as well.
I asked our database performance team what impact freeing up the memmap
would have on their performance. They told me that doubling the amount
of memory generally resulted in approximately a 40% performance
improvement. Extrapolating linearly, freeing up 1.5% additional memory
would result in about a 0.6% performance improvement, which I thought
was probably too small a return on investment to justify turning memmap
into a two-level data structure.

Persistent memory might change that calculation somewhat ... but I'm
not convinced. Certainly, if we already had the ability to allocate
'struct superpage', I wouldn't be pushing for page-less I/Os; I'd just
allocate these data structures for PM. Even if they were 128 bytes in
size, that's only a 25MB overhead per 400GB NV-DIMM, which feels quite
reasonable to me.

> This should work for both DRAM and persistent memory.
>
> I am still not convinced it is worthwhile to have struct pages
> for persistent memory though, but I am willing to change my mind.

There's a lot of code out there that relies on each struct page
describing exactly PAGE_SIZE bytes. I'm cool with replacing
'struct page' with 'struct superpage' [1] in the biovec and auditing
all of the code which touches it ... but that's going to be a lot of
code! I'm not sure it's less code than going directly to 'just do I/O
on PFNs'.

[1] Please, somebody come up with a better name!
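
For anyone who wants to check the arithmetic above, here is a rough
sketch in Python. It is not kernel code. The helper names are made up
for illustration, the 64-byte 'struct page' size is my assumption (the
message only gives the resulting ~1.5% figure), sizes are treated as
binary units, and the performance estimate assumes the 40%-per-doubling
figure scales linearly down to small changes.

```python
# Back-of-the-envelope check of the estimates in the message above.
# All names and the 64-byte struct page size are illustrative assumptions.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def memmap_fraction(struct_size, span):
    """Fraction of memory spent on one struct_size-byte descriptor per span bytes."""
    return struct_size / span


def descriptor_overhead(capacity, span, struct_size):
    """Total bytes of descriptors needed to cover capacity bytes of memory."""
    return (capacity // span) * struct_size


def linear_gain(extra_fraction, gain_per_doubling=0.40):
    """Linear extrapolation: +100% memory gives gain_per_doubling improvement."""
    return extra_fraction * gain_per_doubling


if __name__ == "__main__":
    # Assumed 64-byte struct page per 4kB page: the "about 1.5%" memmap cost.
    print(f"memmap fraction:   {memmap_fraction(64, 4 * KIB):.4%}")
    # 1.5% more memory at 40% per doubling: the "about 0.6%" estimate.
    print(f"estimated gain:    {linear_gain(0.015):.2%}")
    # One 128-byte descriptor per 2MB area on a 400GB NV-DIMM.
    per_2mb = descriptor_overhead(400 * GIB, 2 * MIB, 128)
    print(f"per-2MB overhead:  {per_2mb / MIB:.2f} MiB")
    # For comparison: one assumed 64-byte struct page per 4kB page.
    per_4kb = descriptor_overhead(400 * GIB, 4 * KIB, 64)
    print(f"per-4kB overhead:  {per_4kb / GIB:.2f} GiB")
```

The last two lines are the interesting comparison: per-2MB descriptors
cost megabytes on a 400GB device, while per-4kB struct pages (at the
assumed 64 bytes each) cost gigabytes.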