From: Matthew Wilcox <willy@linux.intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 00/21] Support ext4 on NV-DIMMs
Date: Wed, 27 Aug 2014 17:12:50 -0400
Message-ID: <20140827211250.GH3285@linux.intel.com>
In-Reply-To: <20140827130613.c8f6790093d279a447196f17@linux-foundation.org>

On Wed, Aug 27, 2014 at 01:06:13PM -0700, Andrew Morton wrote:
> On Tue, 26 Aug 2014 23:45:20 -0400 Matthew Wilcox <matthew.r.wilcox@intel.com> wrote:
>
> > One of the primary uses for NV-DIMMs is to expose them as a block device
> > and use a filesystem to store files on the NV-DIMM. While that works,
> > it currently wastes memory and CPU time buffering the files in the page
> > cache. We have support in ext2 for bypassing the page cache, but it
> > has some races which are unfixable in the current design. This series
> > of patches rewrites the underlying support, and adds support for direct
> > access to ext4.
>
> Sat down to read all this but I'm finding it rather unwieldy - it's
> just a great blob of code. Is there some overall
> what-it-does-and-how-it-does-it roadmap?

The overall goal is to map persistent memory / NV-DIMMs directly to
userspace. We have that functionality in the XIP code, but the way
it's structured is unsuitable for filesystems like ext4 & XFS, and
it has some pretty ugly races.

Patches 1 & 3 are simply bug-fixes. They should go in regardless of
the merits of anything else in this series.

Patch 2 changes the API for the direct_access block_device_operation so
it can report more than a single page at a time. As the series evolved,
this work also included moving support for partitioning into the VFS
where it belongs, handling various error cases in the VFS and so on.

Patch 4 is an optimisation. It's poor form to make userspace take two
faults for the same dereference.

Patch 5 gives us a VFS flag for the DAX property, which lets us get rid of
the get_xip_mem() method later on.

Patch 6 is also prep work; Al Viro liked it enough that it's now in
his tree.

The new DAX code is then dribbled in over patches 7-11, split up by
functional area. At each stage, the ext2-xip code is converted over to
the new DAX code.

Patches 12-18 delete the remnants of the old XIP code, and fix the things
in ext2 that Jan didn't like when he reviewed them for ext4 :-)

Patches 19 & 20 are the work to make ext4 use DAX.

Patch 21 is some final cleanup of references to the old XIP code, renaming
it all to DAX.

> Some explanation of why one would use ext4 instead of, say,
> suitably-modified ramfs/tmpfs/rd/etc?

ramfs and tmpfs really rely on the page cache. They're not exactly
built for permanence either. brd also relies on the page cache, and
there's a clear desire to use a filesystem instead of a block device
for all the usual reasons of access permissions, grow/shrink, etc.

Some people might want to use XFS instead of ext4. We're starting with
ext4, but we've been keeping an eye on what other filesystems might want
to use. btrfs isn't going to use the DAX code, but some of the other
pieces will probably come in handy.

There are also at least three people working on their own filesystems
specially designed for persistent memory. I wish them all the best
... but I'd like to get this infrastructure into place.

> Performance testing results?

I haven't been running any performance tests. What sort of performance
tests would be interesting for you to see?

> Carsten Otte wrote filemap_xip.c and may be a useful reviewer of this
> work.

I cc'd him on some earlier versions and didn't hear anything back. It felt
rude to keep plying him with 20+ patches every month.

> All the patch subjects violate Documentation/SubmittingPatches
> section 15 ;)

errr ... which bit? I used git format-patch to create them.