linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Brian Foster <bfoster@redhat.com>,
	Eric Sandeen <sandeen@redhat.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Jeff Moyer <jmoyer@redhat.com>
Subject: Re: [PATCH, RFC] xfs: completely disable toggling DAX flag via ioctl on reg files
Date: Fri, 27 Jul 2018 09:17:54 +1000
Message-ID: <20180726231754.GB2234@dastard>
In-Reply-To: <892f27ac-f985-bf8f-1cc9-bc0a996136f7@sandeen.net>

On Thu, Jul 26, 2018 at 06:23:58AM -0700, Eric Sandeen wrote:
> On 7/26/18 5:08 AM, Brian Foster wrote:
> > On Wed, Jul 25, 2018 at 02:20:54PM -0700, Eric Sandeen wrote:
> >> 742d842 xfs: disable per-inode DAX flag was, I think, intended
> >> as a short-term workaround to avoid races when toggling DAX on
> >> and off of active inodes until mm/ sorted that out.
> >>
> >> (It's also a confusing title, as it didn't really disable
> >> per-inode DAX at all.)
> >>
> >> However, it has the surprising (to me, at least) result that while
> >> the ioctl succeeds, no behavior changes until the inode is cycled
> >> out of cache and re-read from disk at some unknown later time.
> >> This seems to badly violate the principle of least surprise.
> >>
> >> Until said races are properly resolved, it seems most prudent to
> >> disallow modification of the flag on regular files altogether.
> >> We can still allow per-inode DAX flagging via directory inheritance.
> >>
> >> Since DAX is still flagged as experimental (in part due to these
> >> concerns) I don't think it's a problem to (temporarily?) break
> >> this interface further.
> >>
> >> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >> ---
> > 
> > I'm not in tune with the latest state of dax, but if the situation is
> > that we don't currently have a means to correctly switch the per-inode
> > state for an active inode (and thus have simply skipped changing the
> > online flag while carrying on with the on-disk flag, leading to this
> > inode cache cycling requirement), then I think this makes sense. The
> > current interface is essentially incomplete, I don't see any reason to
> > allow unless/until it actually works sanely.
> > 
> > BTW, what bits are actually missing to make that happen? Why is the
> > flush/inval currently in this function not sufficient?
> 
> TBH I don't actually know the low-level details. :/

Page faults aren't synchronised with filesystem locks, so we can
change the aops callout behaviour halfway through a page fault,
i.e. the first half of the page fault sees the S_DAX flag and does
prep work based on that, the second half of the page fault doesn't
see the S_DAX flag and assumes it's working on a page cache page
that doesn't exist and things go bang...

As it is, I don't think we can remove this now - people are using
the on-disk flags already, and the inherit flag from the directory
has none of the problems of changing S_DAX dynamically. Hence just
disabling it is the wrong thing to do because it removes the ability
for people to manage the flags that are already on disk....

I'd much prefer we fix the page fault synchronisation problem than
break stuff that /isn't actually broken/. Yes, its current
behaviour is suboptimal, but that is only supposed to be /temporary/
until the aops callout problem is fixed.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

