From: Andrew Morton <akpm@linux-foundation.org>
To: Matthew Wilcox <willy@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 00/21] Support ext4 on NV-DIMMs
Date: Wed, 27 Aug 2014 14:46:22 -0700
Message-ID: <20140827144622.ed81195a1d94799bb57a3207@linux-foundation.org>
In-Reply-To: <20140827211250.GH3285@linux.intel.com>
On Wed, 27 Aug 2014 17:12:50 -0400 Matthew Wilcox <willy@linux.intel.com> wrote:
> On Wed, Aug 27, 2014 at 01:06:13PM -0700, Andrew Morton wrote:
> > On Tue, 26 Aug 2014 23:45:20 -0400 Matthew Wilcox <matthew.r.wilcox@intel.com> wrote:
> >
> > > One of the primary uses for NV-DIMMs is to expose them as a block device
> > > and use a filesystem to store files on the NV-DIMM. While that works,
> > > it currently wastes memory and CPU time buffering the files in the page
> > > cache. We have support in ext2 for bypassing the page cache, but it
> > > has some races which are unfixable in the current design. This series
> > > of patches rewrites the underlying support, and adds support for direct
> > > access to ext4.
> >
> > Sat down to read all this but I'm finding it rather unwieldy - it's
> > just a great blob of code. Is there some overall
> > what-it-does-and-how-it-does-it roadmap?
>
> The overall goal is to map persistent memory / NV-DIMMs directly to
> userspace. We have that functionality in the XIP code, but the way
> it's structured is unsuitable for filesystems like ext4 & XFS, and
> it has some pretty ugly races.

When I think about reviewing the patchset, I wonder things like: how
does mmap work, in what situations does a page get COWed, how do we
handle partial pages at EOF, etc. I guess that's all part of the
filemap_xip legacy, the details of which I've totally forgotten.

> Patches 1 & 3 are simply bug-fixes. They should go in regardless of
> the merits of anything else in this series.
>
> Patch 2 changes the API for the direct_access block_device_operation so
> it can report more than a single page at a time. As the series evolved,
> this work also included moving support for partitioning into the VFS
> where it belongs, handling various error cases in the VFS and so on.
>
> Patch 4 is an optimisation. It's poor form to make userspace take two
> faults for the same dereference.
>
> Patch 5 gives us a VFS flag for the DAX property, which lets us get rid of
> the get_xip_mem() method later on.
>
> Patch 6 is also prep work; Al Viro liked it enough that it's now in
> his tree.
>
> The new DAX code is then dribbled in over patches 7-11, split up by
> functional area. At each stage, the ext2-xip code is converted over to
> the new DAX code.
>
> Patches 12-18 delete the remnants of the old XIP code, and fix the things
> in ext2 that Jan didn't like when he reviewed them for ext4 :-)
>
> Patches 19 & 20 are the work to make ext4 use DAX.
>
> Patch 21 is some final cleanup of references to the old XIP code, renaming
> it all to DAX.
hrm.
> > Some explanation of why one would use ext4 instead of, say,
> > suitably-modified ramfs/tmpfs/rd/etc?
>
> ramfs and tmpfs really rely on the page cache. They're not exactly
> built for permanence either. brd also relies on the page cache, and
> there's a clear desire to use a filesystem instead of a block device
> for all the usual reasons of access permissions, grow/shrink, etc.
>
> Some people might want to use XFS instead of ext4. We're starting with
> ext4, but we've been keeping an eye on what other filesystems might want
> to use. btrfs isn't going to use the DAX code, but some of the other
> pieces will probably come in handy.
>
> There are also at least three people working on their own filesystems
> specially designed for persistent memory. I wish them all the best
> ... but I'd like to get this infrastructure into place.
This is the sort of thing which first-timers (this one at least) like
to see in [0/n].
> > Performance testing results?
>
> I haven't been running any performance tests. What sort of performance
> tests would be interesting for you to see?

fs benchmarks? `dd' would be a good start ;)

I assume (because I wasn't told!) that there are two objectives here:

1) reduce memory consumption by not maintaining pagecache and
2) reduce CPU cost by avoiding the double-copies.

These things are pretty easily quantified. And really they must be
quantified as part of the developer testing, because if you find
they've worsened then holy cow, what went wrong.

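
A rough sketch of how those two numbers might be collected from userspace,
assuming Python 3 on Linux. The function names (parse_cached_kb,
timed_write) and the scratch-file approach are illustrative assumptions,
not anything from the patchset: one helper pulls the "Cached:" figure out
of /proc/meminfo-style text, the other does a dd-style sequential write
and reports how long it took.

```python
import os
import re
import time


def parse_cached_kb(meminfo_text):
    """Return the 'Cached:' value in kB from /proc/meminfo-style text, or None."""
    match = re.search(r"^Cached:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def timed_write(path, total_bytes, block_size=1 << 20):
    """Write total_bytes of zeros to path in block_size chunks, dd-style.

    Returns (bytes_written, elapsed_seconds). The fsync is inside the
    timed region so buffered data is not counted as finished early.
    """
    block = bytes(block_size)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())
    return written, time.perf_counter() - start
```

Reading /proc/meminfo before and after a timed_write() on a DAX mount and
on an ordinary mount would give a first-order comparison of page cache
growth and throughput; it is no substitute for a real fs benchmark.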
> > Carsten Otte wrote filemap_xip.c and may be a useful reviewer of this
> > work.
>
> I cc'd him on some earlier versions and didn't hear anything back. It felt
> rude to keep plying him with 20+ patches every month.
OK.
> > All the patch subjects violate Documentation/SubmittingPatches
> > section 15 ;)
>
> errr ... which bit? I used git format-patch to create them.
None of the patch titles identify the subsystem(s) which they're
hitting. eg, "Introduce IS_DAX(inode)" is an ext2 patch, but nobody
would know that from browsing the titles.
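
A minimal sketch of a check for that convention, assuming the usual
"[PATCH ...] subsystem: summary" subject shape. The helper names
(strip_patch_tag, has_subsystem_prefix) and the regular expressions are
illustrative assumptions, not an existing kernel tool:

```python
import re

# Leading bracketed tags such as "[PATCH v10 05/21]".
_TAG = re.compile(r"^\s*(\[[^\]]*\]\s*)*")
# A short "subsystem:" prefix followed by the summary text.
_PREFIX = re.compile(r"^[A-Za-z0-9_/.,\- ]+?:\s+\S")


def strip_patch_tag(subject):
    """Drop any leading "[...]" tags from a patch subject."""
    return _TAG.sub("", subject, count=1)


def has_subsystem_prefix(subject):
    """True if the subject names a subsystem before the summary."""
    return bool(_PREFIX.match(strip_patch_tag(subject)))
```

With this sketch, "[PATCH v10 13/21] ext2: Remove ext2_xip_verify_sb()"
passes, while "[PATCH v10 05/21] Introduce IS_DAX(inode)" does not.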
Thread overview: 52+ messages
2014-08-27 3:45 [PATCH v10 00/21] Support ext4 on NV-DIMMs Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 01/21] axonram: Fix bug in direct_access Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 02/21] Change direct_access calling convention Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 03/21] Fix XIP fault vs truncate race Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 04/21] Allow page fault handlers to perform the COW Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 05/21] Introduce IS_DAX(inode) Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 06/21] Add copy_to_iter(), copy_from_iter() and iov_iter_zero() Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 07/21] Replace XIP read and write with DAX I/O Matthew Wilcox
2014-09-14 14:11 ` Boaz Harrosh
2014-08-27 3:45 ` [PATCH v10 08/21] Replace ext2_clear_xip_target with dax_clear_blocks Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 09/21] Replace the XIP page fault handler with the DAX page fault handler Matthew Wilcox
2014-09-03 7:47 ` Dave Chinner
2014-09-10 15:23 ` Matthew Wilcox
2014-09-11 3:09 ` Dave Chinner
2014-09-24 15:43 ` Matthew Wilcox
2014-09-25 1:01 ` Dave Chinner
2014-08-27 3:45 ` [PATCH v10 10/21] Replace xip_truncate_page with dax_truncate_page Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 11/21] Replace XIP documentation with DAX documentation Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 12/21] Remove get_xip_mem Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 13/21] ext2: Remove ext2_xip_verify_sb() Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 14/21] ext2: Remove ext2_use_xip Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 15/21] ext2: Remove xip.c and xip.h Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 16/21] Remove CONFIG_EXT2_FS_XIP and rename CONFIG_FS_XIP to CONFIG_FS_DAX Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 17/21] ext2: Remove ext2_aops_xip Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 18/21] Get rid of most mentions of XIP in ext2 Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 19/21] xip: Add xip_zero_page_range Matthew Wilcox
2014-09-03 9:21 ` Dave Chinner
2014-09-04 21:08 ` Matthew Wilcox
2014-09-04 21:36 ` Theodore Ts'o
2014-09-08 18:59 ` Matthew Wilcox
2014-08-27 3:45 ` [PATCH v10 20/21] ext4: Add DAX functionality Matthew Wilcox
2014-09-03 11:13 ` Dave Chinner
2014-09-10 16:49 ` Boaz Harrosh
2014-09-11 4:38 ` Dave Chinner
2014-09-14 12:25 ` Boaz Harrosh
2014-09-15 6:15 ` Dave Chinner
2014-09-15 9:41 ` Boaz Harrosh
2014-08-27 3:45 ` [PATCH v10 21/21] brd: Rename XIP to DAX Matthew Wilcox
2014-08-27 20:06 ` [PATCH v10 00/21] Support ext4 on NV-DIMMs Andrew Morton
2014-08-27 21:12 ` Matthew Wilcox
2014-08-27 21:46 ` Andrew Morton [this message]
2014-08-28 1:30 ` Andy Lutomirski
2014-08-28 16:50 ` Matthew Wilcox
2014-08-28 15:45 ` Matthew Wilcox
2014-08-27 21:22 ` Christoph Lameter
2014-08-27 21:30 ` Andrew Morton
2014-08-27 23:04 ` One Thousand Gnomes
2014-08-28 7:17 ` Dave Chinner
2014-08-30 23:11 ` Christian Stroetmann
2014-08-28 8:08 ` Boaz Harrosh
2014-08-28 22:09 ` Zwisler, Ross
2014-09-03 12:05 ` [PATCH 1/1] xfs: add DAX support Dave Chinner