linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.com>, Matthew Wilcox <willy@linux.intel.com>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	XFS Developers <xfs@oss.sgi.com>, jmoyer <jmoyer@redhat.com>
Subject: Re: [PATCH 2/2] dax: move writeback calls into the filesystems
Date: Tue, 9 Feb 2016 17:01:34 +0100	[thread overview]
Message-ID: <20160209160134.GA12245@quack.suse.cz> (raw)
In-Reply-To: <20160209094353.GF9451@quack.suse.cz>

[-- Attachment #1: Type: text/plain, Size: 3957 bytes --]

On Tue 09-02-16 10:43:53, Jan Kara wrote:
> On Mon 08-02-16 12:55:24, Dan Williams wrote:
> > On Mon, Feb 8, 2016 at 12:18 PM, Dave Chinner <david@fromorbit.com> wrote:
> > [..]
> > >> Setting aside the current block zeroing problem you seem to assuming
> > >> that DAX will always be faster and that may not be true at a media
> > >> level.  Waiting years for some applications to determine if DAX makes
> > >> sense for their use case seems completely reasonable.  In the meantime
> > >> the apps that are already making these changes want to know that a DAX
> > >> mapping request has not silently dropped backed to page cache.  They
> > >> also want to know if they successfully jumped through all the hoops to
> > >> get a larger than pte mapping.
> > >>
> > >> I agree it is useful to be able to force DAX on an unmodified
> > >> application to see what happens, and it follows that if those
> > >> applications want to run in that mode they will need functional
> > >> fsync()...
> > >>
> > >> I would feel better if we were talking about specific applications and
> > >> performance numbers to know if forcing DAX on application is a debug
> > >> facility or a production level capability.  You seem to have already
> > >> made that determination and I'm curious what I'm missing.
> > >
> > > I'm not setting any policy here at all.  This whole argument is
> > > based around the DAX mount option doing "global fs enable or
> > > silently turning it off" and the application not knowing about that.
> > >
> > > The whole point of having a persistent per-inode DAX flags is that
> > > it is a policy mechanism, not a policy.  The application can, if it
> > > is DAX aware, directly control whether DAX is used on a file or not.
> > > The application can even query and clear that persistent inode flag
> > > if it is configured not to (or cannot) use DAX.
> > >
> > > If the filesystem cannot support DAX, then we can error out attempts
> > > to set the DAX flag and then the app knows DAX is not available.
> > > i.e. the attempt to set policy failed. If the flag is set, then the
> > > inode will *always* use DAX - there is no "fall back to page cache"
> > > when DAX is enabled.
> > >
> > > If the applicaiton is not DAX aware, then the admin can control the
> > > DAX policy by manipulating these flags themselves, and hence control
> > > whether DAX is used by the application or not.
> > >
> > > If you think I'm dictating policy for DAX users and application,
> > > then you haven't understood anything I've previously said about why
> > > the DAX mount option needs to die before any of this is considered
> > > production ready. DAX is not an opaque "all or nothing" option. XFS
> > > will provide apps and admins with fine-grained, persistent,
> > > discoverable policy flags to allow admins and applications to set
> > > DAX policies however they see fit. This simply cannot be done if the
> > > only knob you have is a mount option that may or may not stick.
> > 
> > I agree the mount option needs to die, and I fully grok the reasoning.
> >   What I'm concerned with is that a system using fully-DAX-aware
> > applications is forced to incur the overhead of maintaining *sync
> > semantics, periodic sync(2) in particular,  even if it is not relying
> > on those semantics.
> 
> Let me somewhat correct this: IMO hard requirement is maintaining sync(2)
> semantics. Periodic writeback does not have any hard durability guarantees
> and we are free to ignore such requests in ->writepages() (that function
> has enough information in the writeback_control structure to differentiate
> between periodic writeback and data integrity sync) if we decide it is
> useful. Actually, we could do that even for 4.5.

Attached is a version of Ross' patch that will work for sync(2) and
fsync(2) and we won't flush caches during periodic writeback. The patch is
only compile-tested. Ross?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-dax-move-writeback-calls-into-the-filesystems.patch --]
[-- Type: text/x-patch, Size: 0 bytes --]



  reply	other threads:[~2016-02-09 16:01 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-07  7:19 [PATCH 0/2] DAX bdev fixes - move flushing calls to FS Ross Zwisler
2016-02-07  7:19 ` [PATCH 1/2] dax: pass bdev argument to dax_clear_blocks() Ross Zwisler
2016-02-07 18:19   ` Dan Williams
2016-02-08  1:46     ` Ross Zwisler
2016-02-08  4:29       ` Ross Zwisler
2016-02-07 22:03   ` Dave Chinner
2016-02-08  1:44     ` Ross Zwisler
2016-02-08  5:17       ` Dave Chinner
2016-02-08 15:34         ` Ross Zwisler
2016-02-07  7:19 ` [PATCH 2/2] dax: move writeback calls into the filesystems Ross Zwisler
2016-02-07 19:13   ` Dan Williams
2016-02-07 21:50     ` Dave Chinner
2016-02-08  8:18       ` Dan Williams
2016-02-08 20:18         ` Dave Chinner
2016-02-08 20:55           ` Dan Williams
2016-02-08 20:58             ` Jeff Moyer
2016-02-08 22:05               ` Dan Williams
2016-02-09  9:43             ` Jan Kara
2016-02-09 16:01               ` Jan Kara [this message]
2016-02-09 18:06                 ` Ross Zwisler
2016-02-08 18:31     ` Ross Zwisler
2016-02-08 19:23       ` Dan Williams
2016-02-08 10:48   ` Jan Kara
2016-02-08 16:12     ` Ross Zwisler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160209160134.GA12245@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=ross.zwisler@linux.intel.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@linux.intel.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).