linux-ext4.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>,
	Jan Kara <jack@suse.cz>,
	linux-kernel@vger.kernel.org,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	Jens Axboe <axboe@kernel.dk>,
	Matthew Wilcox <willy@linux.intel.com>,
	linux-block@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@lists.01.org, xfs@oss.sgi.com
Subject: Re: [PATCH v3 3/6] ext4: Online defrag not supported with DAX
Date: Thu, 18 Feb 2016 11:12:23 +1100
Message-ID: <20160218001223.GJ19486@dastard>
In-Reply-To: <20160217215037.GB30126@linux.intel.com>

On Wed, Feb 17, 2016 at 02:50:37PM -0700, Ross Zwisler wrote:
> On Tue, Feb 16, 2016 at 08:34:16PM -0700, Ross Zwisler wrote:
> > Online defrag operations for ext4 are hard coded to use the page cache.
> > See ext4_ioctl() -> ext4_move_extents() -> move_extent_per_page()
> > 
> > When combined with DAX I/O, which circumvents the page cache, this can
> > result in data corruption.  This was observed with xfstests ext4/307 and
> > ext4/308.
> > 
> > Fix this by only allowing online defrag for non-DAX files.
> 
> Jan,
> 
> Thinking about this a bit more, it's probably the case that the data
> corruption I was observing was due to us skipping the writeback of the dirty
> page cache pages because S_DAX was set.
> 
> I do think we have a problem with defrag because it is doing the extent
> swapping using the page cache, and we won't flush the dirty pages due to
> S_DAX being set.
> 
> This patch is the quick and easy answer, and is perhaps appropriate for v4.5.
> 
> Looking forward, though, what do you think the correct solution is?  Making an
> extent swapper that doesn't use the page cache (as I believe XFS has? see
> xfs_swap_extents()),

XFS does the data copy in userspace using direct IO so we don't
care about whether DAX is enabled or not on either the source or
destination inode. i.e. xfs_swap_extents() is a pure
metadata operation, swapping the entire extent tree between two
inodes if the source data has not changed while the copy was in
progress.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

