From: Christoph Hellwig <hch@infradead.org>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
Benjamin Coddington <bcodding@redhat.com>,
Tom Haynes <thomas.haynes@primarydata.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: Client never uses DATA_SYNC
Date: Fri, 7 Nov 2014 23:06:48 -0800
Message-ID: <20141108070648.GA18993@infradead.org>
In-Reply-To: <20141107155307.GG22638@fieldses.org>
On Fri, Nov 07, 2014 at 10:53:08AM -0500, J. Bruce Fields wrote:
> By the way, the nfsd code is only using i_version when
> IS_I_VERSION(inode), otherwise it falls back on ctime. Do we have some
> easy way to check for change attribute support now? Otherwise we're
> ignoring it on xfs and btrfs.
Both btrfs and xfs set MS_I_VERSION. Btw, could you resend your patches
to move this out of s_flags?
> > there is no difference anyway,
> > as they update the change attribute on every write,
>
> You mean by that that the change attribute on these filesystems will
> reach the disk at the same time as the write, regardless of whether
> someone does sync or datasync?
Not necessarily at exactly the same time, but vfs_fsync_range will ensure
that we flush all data for the range, and then flush all metadata.
With the datasync flag set to 1 we will skip inodes where only the
timestamps are dirty. Interestingly, ext4 considers the change
attribute a skippable timestamp update, XFS doesn't, and btrfs doesn't
even try to optimize fdatasync, so we have three different behaviors
for three different filesystems here - my previous post was just based
on the XFS behavior.
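To make the three behaviors concrete, here is a minimal toy model in Python. It is not kernel code: the `Dirty` flags only loosely mirror the kernel's I_DIRTY_SYNC / I_DIRTY_DATASYNC bits, and the policy names ("ext4-like" and so on) and both functions are hypothetical, chosen purely to restate the description above.

```python
from enum import Flag, auto


class Dirty(Flag):
    """Toy stand-ins for inode dirty bits (assumption, not the kernel's types)."""
    NONE = 0
    SYNC = auto()      # inode is dirty, but a datasync may skip it
    DATASYNC = auto()  # inode is dirty in a way a datasync must flush


def dirty_after_overwrite(policy: str) -> Dirty:
    """Dirty state left by an overwrite that only bumps timestamps
    and the change attribute (no size or block-allocation change)."""
    if policy == "ext4-like":
        # Change attribute treated as a skippable timestamp update.
        return Dirty.SYNC
    if policy == "xfs-like":
        # Change attribute update is not considered skippable.
        return Dirty.DATASYNC
    if policy == "btrfs-like":
        # The dirty class does not matter: datasync is not optimized.
        return Dirty.SYNC
    raise ValueError(f"unknown policy: {policy}")


def flushes_inode(policy: str, datasync: bool) -> bool:
    """Return True if the sync call writes the inode metadata to disk."""
    dirty = dirty_after_overwrite(policy)
    if policy == "btrfs-like":
        # No fdatasync optimization: any dirty inode is flushed.
        return dirty != Dirty.NONE
    if datasync:
        # Skip inodes where only timestamp-class state is dirty.
        return Dirty.DATASYNC in dirty
    return dirty != Dirty.NONE


if __name__ == "__main__":
    for policy in ("ext4-like", "xfs-like", "btrfs-like"):
        print(policy, "fdatasync persists change attribute:",
              flushes_inode(policy, datasync=True))
```

In this model a full sync persists the change attribute for every policy; only the datasync case differs between them.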
> I'm not completely following. So if the spec had a definite statement
> one way or the other, would that be good enough to make the distinction
> useful? If we could specify the behavior from scratch, what do you
> think would be the right choice?
>
> I find it hard to figure out the consequences of the change attribute not
> being written at the same time as the write, and whether there's some
> reasonable second-best behavior the server can provide in the case it
> doesn't write them to disk together atomically. It doesn't currently
> seem like there's much a client can really count on after boot.
Tom, do you think it's reasonable to propose an erratum for 4.0/4.1 that
explicitly allows the behavior of updating the change attribute in memory
on a DATA_SYNC4 write, but not necessarily persisting it? What about
COMMIT? Using datasync there would provide even more benefits in
practice.
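As a rough sketch of what such a relaxed DATA_SYNC4 would mean after a server crash, consider the toy model below. The `StableHow` values follow the protocol's stable_how4 enum; `ToyServerFile` and its methods are hypothetical, and the model treats each write as a whole-file overwrite to keep it short.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """stable_how4 values from the NFSv4 WRITE operation."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


class ToyServerFile:
    """Hypothetical server-side file with separate in-memory and on-disk state."""

    def __init__(self) -> None:
        self.mem_data = b""
        self.mem_change = 0   # change attribute as seen by clients
        self.disk_data = b""
        self.disk_change = 0  # change attribute that survives a crash

    def write(self, data: bytes, stable: StableHow) -> None:
        # Every write updates the data and bumps the change attribute in memory.
        self.mem_data = data
        self.mem_change += 1
        if stable >= StableHow.DATA_SYNC4:
            # Data is made stable for DATA_SYNC4 and FILE_SYNC4.
            self.disk_data = self.mem_data
        if stable == StableHow.FILE_SYNC4:
            # Assumed relaxed behavior: only FILE_SYNC4 persists the change attribute.
            self.disk_change = self.mem_change

    def crash_and_reboot(self) -> None:
        # After a crash only the on-disk state remains.
        self.mem_data = self.disk_data
        self.mem_change = self.disk_change


if __name__ == "__main__":
    f = ToyServerFile()
    f.write(b"v1", StableHow.FILE_SYNC4)
    f.write(b"v2", StableHow.DATA_SYNC4)
    f.crash_and_reboot()
    print(f.mem_data, f.mem_change)
```

In this model the data from the DATA_SYNC4 write survives the reboot, but the change attribute falls back to its pre-write value, so a client that cached the old contents under that value could not tell that the file had changed. That is the case any erratum would need to address.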
I guess I just need to take this to the IETF list.