linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Christoph Hellwig <hch@lst.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@lists.01.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/7] xfs: always use DAX if mount option is used
Date: Tue, 26 Sep 2017 21:09:57 +1000
Message-ID: <20170926110957.GR10955@dastard>
In-Reply-To: <20170926093548.GB13627@quack2.suse.cz>

On Tue, Sep 26, 2017 at 11:35:48AM +0200, Jan Kara wrote:
> On Tue 26-09-17 09:38:12, Dave Chinner wrote:
> > On Mon, Sep 25, 2017 at 05:13:58PM -0600, Ross Zwisler wrote:
> > > Before support for the per-inode DAX flag was disabled, the XFS code had
> > > an issue where the user couldn't reliably tell whether or not DAX was being
> > > used to service page faults and I/O when the DAX mount option was used.  In
> > > this case each inode within the mounted filesystem started with S_DAX set
> > > due to the mount option, but it could be cleared if someone touched the
> > > individual inode flag.
> > > 
> > > For example (v4.13 and before):
> > > 
> > >   # mount | grep dax
> > >   /dev/pmem0 on /mnt type xfs
> > >   (rw,relatime,seclabel,attr2,dax,inode64,sunit=4096,swidth=4096,noquota)
> > > 
> > >   # touch /mnt/a /mnt/b   # both files currently use DAX
> > > 
> > >   # xfs_io -c "lsattr" /mnt/*  # neither has the DAX inode option set
> > >   ----------e----- /mnt/a
> > >   ----------e----- /mnt/b
> > > 
> > >   # xfs_io -c "chattr -x" /mnt/a  # this clears S_DAX for /mnt/a
> > > 
> > >   # xfs_io -c "lsattr" /mnt/*
> > >   ----------e----- /mnt/a
> > >   ----------e----- /mnt/b
> > 
> > That's really a bug in the lsattr code, yes? If we've cleared the
> > S_DAX flag for the inode, then why is it being reported in lsattr?
> > Or if we failed to clear the S_DAX flag in the 'chattr -x' call,
> > then isn't that the bug that needs fixing?
> > 
> > Remember, the whole point of the dax inode flag was to be able to
> > override the mount option setting so that admins could turn off/on
> > dax for the things that didn't/did work with DAX correctly so they
> > didn't need multiple filesystems on pmem to segregate the apps that
> > did/didn't work with DAX...
> 
> So I think there is some confusion that is created by the fact that whether
> DAX is used or not is controlled by both a mount option and an inode flag.
> We could define that "Inode flag always wins" which is what you seem to
> suggest above but then mount option has no practical effect since on-disk
> S_DAX flag will always overrule it.

Well, quite frankly, I never wanted the mount option for XFS. It was
supposed to be for initial testing only, then we'd /always/ use the
inode flags. For a filesystem to default to using DAX, we
set the DAX flag on the root inode at mkfs time, and then everything
inode flag based just works.

But it seems that we're now stuck with the stupid, blunt, brute
force mount option because that's what the first commit on ext4
used.  Now we're just about stuck with this silly "but we can't turn
it off" problem because of the mount option overriding everything.

If we have to keep the mount option, then let's fix it to mean "mount
option sets inheritable inode flag on directory creation" and
/maybe/ "mount option sets inode flag on file creation".

This then allows the inode flag to control everything else. i.e. the
mount option sets the initial flag value rather than the behaviour
of the inode. The behaviour of the inode should be entirely
controlled by the inode flag, hence after initial creation the
chattr +/-x commands do what they advertise regardless of the mount
option value.
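
As a rough illustration of the semantics proposed above, here is a
minimal userspace sketch in C. It is a toy model, not kernel or XFS
code: the struct and every model_* name are hypothetical, and the
mount_seeds_files parameter only exists to represent the "/maybe/"
case for file creation. Inheritance from the parent directory is
assumed to apply to both files and directories.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not a kernel structure. */
struct model_inode {
	bool is_dir;
	bool dax_flag;	/* models the persistent per-inode DAX flag */
};

/*
 * Creation-time policy: the flag is inherited from the parent
 * directory, and the mount option may seed it. Seeding always applies
 * to new directories; for new files it is optional (the "/maybe/" case).
 */
struct model_inode model_create(const struct model_inode *parent,
				bool is_dir, bool mount_dax,
				bool mount_seeds_files)
{
	struct model_inode ip = { .is_dir = is_dir, .dax_flag = false };

	if (parent != NULL && parent->dax_flag)
		ip.dax_flag = true;	/* inherited from parent directory */
	if (mount_dax && (is_dir || mount_seeds_files))
		ip.dax_flag = true;	/* mount option sets initial value */
	return ip;
}

/* After creation only the inode flag decides; no mount option here. */
bool model_uses_dax(const struct model_inode *ip)
{
	return ip->dax_flag;
}

/* Models chattr +x (set == true) and chattr -x (set == false). */
void model_chattr_x(struct model_inode *ip, bool set)
{
	ip->dax_flag = set;
}

/* Small walk-through of the intended behaviour. */
void model_self_check(void)
{
	struct model_inode root = { .is_dir = true, .dax_flag = false };
	struct model_inode dir = model_create(&root, true, true, false);
	struct model_inode file = model_create(&dir, false, true, false);

	assert(dir.dax_flag);		/* seeded by the mount option */
	assert(model_uses_dax(&file));	/* inherited from dir */
	model_chattr_x(&file, false);	/* chattr -x now sticks */
	assert(!model_uses_dax(&file));
}
```

The point of the sketch is that model_uses_dax() never looks at the
mount option, so clearing the flag on one file stays effective even
when the filesystem is mounted with dax.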

Yes, it means that existing users are going to have to run chattr -R
+x on their pmem filesystems to get the inode flags on disk, but
this is all tagged with EXPERIMENTAL and this is the sort of change
that is expected from experimental functionality.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


