linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Matthew Wilcox <willy@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 00/21] Support ext4 on NV-DIMMs
Date: Wed, 27 Aug 2014 14:46:22 -0700	[thread overview]
Message-ID: <20140827144622.ed81195a1d94799bb57a3207@linux-foundation.org> (raw)
In-Reply-To: <20140827211250.GH3285@linux.intel.com>

On Wed, 27 Aug 2014 17:12:50 -0400 Matthew Wilcox <willy@linux.intel.com> wrote:

> On Wed, Aug 27, 2014 at 01:06:13PM -0700, Andrew Morton wrote:
> > On Tue, 26 Aug 2014 23:45:20 -0400 Matthew Wilcox <matthew.r.wilcox@intel.com> wrote:
> > 
> > > One of the primary uses for NV-DIMMs is to expose them as a block device
> > > and use a filesystem to store files on the NV-DIMM.  While that works,
> > > it currently wastes memory and CPU time buffering the files in the page
> > > cache.  We have support in ext2 for bypassing the page cache, but it
> > > has some races which are unfixable in the current design.  This series
> > > of patches rewrite the underlying support, and add support for direct
> > > access to ext4.
> > 
> > Sat down to read all this but I'm finding it rather unwieldy - it's
> > just a great blob of code.  Is there some overall
> > what-it-does-and-how-it-does-it roadmap?
> 
> The overall goal is to map persistent memory / NV-DIMMs directly to
> userspace.  We have that functionality in the XIP code, but the way
> it's structured is unsuitable for filesystems like ext4 & XFS, and
> it has some pretty ugly races.

When thinking about looking at the patchset I wonder things like how
does mmap work, in what situations does a page get COWed, how do we
handle partial pages at EOF, etc.  I guess that's all part of the
filemap_xip legacy, the details of which I've totally forgotten.

> Patches 1 & 3 are simply bug-fixes.  They should go in regardless of
> the merits of anything else in this series.
> 
> Patch 2 changes the API for the direct_access block_device_operation so
> it can report more than a single page at a time.  As the series evolved,
> this work also included moving support for partitioning into the VFS
> where it belongs, handling various error cases in the VFS and so on.
> 
> Patch 4 is an optimisation.  It's poor form to make userspace take two
> faults for the same dereference.
> 
> Patch 5 gives us a VFS flag for the DAX property, which lets us get rid of
> the get_xip_mem() method later on.
> 
> Patch 6 is also prep work; Al Viro liked it enough that it's now in
> his tree.
> 
> The new DAX code is then dribbled in over patches 7-11, split up by
> functional area.  At each stage, the ext2-xip code is converted over to
> the new DAX code.
> 
> Patches 12-18 delete the remnants of the old XIP code, and fix the things
> in ext2 that Jan didn't like when he reviewed them for ext4 :-)
> 
> Patches 19 & 20 are the work to make ext4 use DAX.
> 
> Patch 21 is some final cleanup of references to the old XIP code, renaming
> it all to DAX.

hrm.

> > Some explanation of why one would use ext4 instead of, say,
> > suitably-modified ramfs/tmpfs/rd/etc?
> 
> ramfs and tmpfs really rely on the page cache.  They're not exactly
> built for permanence either.  brd also relies on the page cache, and
> there's a clear desire to use a filesystem instead of a block device
> for all the usual reasons of access permissions, grow/shrink, etc.
> 
> Some people might want to use XFS instead of ext4.  We're starting with
> ext4, but we've been keeping an eye on what other filesystems might want
> to use.  btrfs isn't going to use the DAX code, but some of the other
> pieces will probably come in handy.
> 
> There are also at least three people working on their own filesystems
> specially designed for persistent memory.  I wish them all the best
> ... but I'd like to get this infrastructure into place.

This is the sort of thing which first-timers (this one at least) like
to see in [0/n].

> > Performance testing results?
> 
> I haven't been running any performance tests.  What sort of performance
> tests would be interesting for you to see?

fs benchmarks?  `dd' would be a good start ;)

I assume (because I wasn't told!) that there are two objectives here:

1) reduce memory consumption by not maintaining pagecache and
2) reduce CPU cost by avoiding the double-copies.

These things are pretty easily quantified.  And really they must be
quantified as part of the developer testing, because if you find
they've worsened then holy cow, what went wrong.

> > Carsten Otte wrote filemap_xip.c and may be a useful reviewer of this
> > work.
> 
> I cc'd him on some earlier versions and didn't hear anything back.  It felt
> rude to keep plying him with 20+ patches every month.

OK.

> > All the patch subjects violate Documentation/SubmittingPatches
> > section 15 ;)
> 
> errr ... which bit?  I used git format-patch to create them.

None of the patch titles identify the subsystem(s) which they're
hitting.  eg, "Introduce IS_DAX(inode)" is an ext2 patch, but nobody
would know that from browsing the titles.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-08-27 21:46 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-27  3:45 [PATCH v10 00/21] Support ext4 on NV-DIMMs Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 01/21] axonram: Fix bug in direct_access Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 02/21] Change direct_access calling convention Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 03/21] Fix XIP fault vs truncate race Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 04/21] Allow page fault handlers to perform the COW Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 05/21] Introduce IS_DAX(inode) Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 06/21] Add copy_to_iter(), copy_from_iter() and iov_iter_zero() Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 07/21] Replace XIP read and write with DAX I/O Matthew Wilcox
2014-09-14 14:11   ` Boaz Harrosh
2014-08-27  3:45 ` [PATCH v10 08/21] Replace ext2_clear_xip_target with dax_clear_blocks Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 09/21] Replace the XIP page fault handler with the DAX page fault handler Matthew Wilcox
2014-09-03  7:47   ` Dave Chinner
2014-09-10 15:23     ` Matthew Wilcox
2014-09-11  3:09       ` Dave Chinner
2014-09-24 15:43         ` Matthew Wilcox
2014-09-25  1:01           ` Dave Chinner
2014-08-27  3:45 ` [PATCH v10 10/21] Replace xip_truncate_page with dax_truncate_page Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 11/21] Replace XIP documentation with DAX documentation Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 12/21] Remove get_xip_mem Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 13/21] ext2: Remove ext2_xip_verify_sb() Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 14/21] ext2: Remove ext2_use_xip Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 15/21] ext2: Remove xip.c and xip.h Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 16/21] Remove CONFIG_EXT2_FS_XIP and rename CONFIG_FS_XIP to CONFIG_FS_DAX Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 17/21] ext2: Remove ext2_aops_xip Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 18/21] Get rid of most mentions of XIP in ext2 Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 19/21] xip: Add xip_zero_page_range Matthew Wilcox
2014-09-03  9:21   ` Dave Chinner
2014-09-04 21:08     ` Matthew Wilcox
2014-09-04 21:36       ` Theodore Ts'o
2014-09-08 18:59         ` Matthew Wilcox
2014-08-27  3:45 ` [PATCH v10 20/21] ext4: Add DAX functionality Matthew Wilcox
2014-09-03 11:13   ` Dave Chinner
2014-09-10 16:49     ` Boaz Harrosh
2014-09-11  4:38       ` Dave Chinner
2014-09-14 12:25         ` Boaz Harrosh
2014-09-15  6:15           ` Dave Chinner
2014-09-15  9:41             ` Boaz Harrosh
2014-08-27  3:45 ` [PATCH v10 21/21] brd: Rename XIP to DAX Matthew Wilcox
2014-08-27 20:06 ` [PATCH v10 00/21] Support ext4 on NV-DIMMs Andrew Morton
2014-08-27 21:12   ` Matthew Wilcox
2014-08-27 21:46     ` Andrew Morton [this message]
2014-08-28  1:30       ` Andy Lutomirski
2014-08-28 16:50         ` Matthew Wilcox
2014-08-28 15:45       ` Matthew Wilcox
2014-08-27 21:22   ` Christoph Lameter
2014-08-27 21:30     ` Andrew Morton
2014-08-27 23:04       ` One Thousand Gnomes
2014-08-28  7:17       ` Dave Chinner
2014-08-30 23:11         ` Christian Stroetmann
2014-08-28  8:08 ` Boaz Harrosh
2014-08-28 22:09   ` Zwisler, Ross
2014-09-03 12:05 ` [PATCH 1/1] xfs: add DAX support Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140827144622.ed81195a1d94799bb57a3207@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).