From: Rik van Riel <riel@redhat.com>
To: Boaz Harrosh <boaz@plexistor.com>,
	Matthew Wilcox <willy@linux.intel.com>,
	Boaz Harrosh <openosd@gmail.com>
Cc: axboe@kernel.dk, linux-arch@vger.kernel.org,
	linux-raid@vger.kernel.org, linux-nvdimm@ml01.01.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	linux-kernel@vger.kernel.org, hch@infradead.org,
	Linus Torvalds <torvalds@osdl.org>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	mgorman@suse.de
Subject: Re: [Linux-nvdimm] [RFC PATCH 0/7] evacuate struct page from the block layer
Date: Fri, 20 Mar 2015 11:56:08 -0400
Message-ID: <550C4318.8010200@redhat.com>
In-Reply-To: <55098DFE.8080502@plexistor.com>

On 03/18/2015 10:38 AM, Boaz Harrosh wrote:
> On 03/18/2015 03:06 PM, Matthew Wilcox wrote:

>>> I'm not one to be afraid of hard work if it were for a good cause, but for what?
>>> Really, for what? The block layer, RDMA, networking, splice, and whatever
>>> else anyone wants to imagine doing with pmem already work perfectly
>>> stably, right now!
>>
>> The overhead.  Allocating a struct page for every 4k page in a 400GB DIMM
>> (the current capacity available from one NV-DIMM vendor) occupies 6.4GB.
>> That's an unacceptable amount of overhead.
>>
> 
> So let's fix the stacks to work nicely with 2M pages. That said, we can
> also allocate the struct page from pmem if we need to. The fact remains
> that we need state down the different stacks, and this is the current
> design overall.

Fixing the stack to work with 2M pages will be just as invasive,
and just as much work, as making it work without a struct page.

What state do you need, exactly?

The struct page in the VM is mostly used for two things:
1) to get a memory address of the data
2) refcounting, to make sure the page does not go away
   during an IO operation, a copy, etc.

Persistent memory cannot be paged out, so (2) is not a concern as
long as we ensure that the object the page belongs to does not go away.
There are no seek times, so moving the data around may not be necessary
either, making (1) not a concern.

The only case where (1) would be a concern is if we wanted to move
data in persistent memory around for better NUMA locality. However,
persistent memory DIMMs are on their way to being too large for moving
the memory to be practical anyway. All we can usefully do is detect
where programs are accessing memory, and move the programs there.

What state do you need that is not already represented?

1.5% overhead isn't a whole lot, but it appears to be unnecessary.

If you have a convincing argument as to why we need a struct page,
you might want to articulate it in order to convince us.

-- 
All rights reversed

