From: Christoph Hellwig <hch@infradead.org>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
Benjamin Coddington <bcodding@redhat.com>,
Tom Haynes <thomas.haynes@primarydata.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: Client never uses DATA_SYNC
Date: Fri, 7 Nov 2014 23:06:48 -0800
Message-ID: <20141108070648.GA18993@infradead.org>
In-Reply-To: <20141107155307.GG22638@fieldses.org>
On Fri, Nov 07, 2014 at 10:53:08AM -0500, J. Bruce Fields wrote:
> By the way, the nfsd code is only using i_version when
> IS_I_VERSION(inode), otherwise it falls back on ctime. Do we have some
> easy way to check for change attribute support now? Otherwise we're
> ignoring it on xfs and btrfs.
Both btrfs and xfs set MS_I_VERSION. Btw, could you resend your patches
to move this out of s_flags?
> > there is no difference anyway,
> > as they update the change attribute on every write,
>
> You mean by that that the change attribute on these filesystems will
> reach the disk at the same time as the write, regardless of whether
> someone does sync or datasync?
Not necessarily at exactly the same time, but vfs_fsync_range will ensure
that we first flush all data for the range, and then flush all metadata.
With the datasync flag set to 1 we will skip inodes where only the
timestamps are dirty. Interestingly ext4 considers the change
attribute a skippable timestamp update, XFS doesn't, and btrfs doesn't
even try to optimize fdatasync, so we have three different behaviors
for three different filesystems here. My previous post was just based
on the XFS behavior.
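
To make the three cases concrete, here is a toy model of the decision
described above. It is only a sketch, not kernel code: the function name,
the string labels for dirty inode state and the use of plain strings for
the filesystems are all made up for illustration, and the model encodes
nothing beyond the behavior stated in this mail.

```python
# Toy model only: these labels are illustrative, not kernel identifiers.
TIMESTAMPS = "timestamps"
CHANGE_ATTR = "change_attr"
SIZE = "size"

KNOWN_FILESYSTEMS = ("ext4", "xfs", "btrfs")


def inode_written_on_sync(fs, dirty, datasync):
    """Return True if the modelled filesystem writes the inode out.

    fs       -- one of KNOWN_FILESYSTEMS
    dirty    -- iterable of dirty inode fields (labels defined above)
    datasync -- True for fdatasync-like behavior, False for a full fsync
    """
    if fs not in KNOWN_FILESYSTEMS:
        raise ValueError(f"unmodelled filesystem: {fs!r}")

    dirty = frozenset(dirty)
    if not dirty:
        return False  # clean inode, nothing to write
    if not datasync:
        return True  # full fsync: every dirty field goes to disk
    if fs == "btrfs":
        return True  # modelled as: no fdatasync optimization at all

    # With datasync set, an inode with only skippable fields dirty is skipped.
    skippable = {TIMESTAMPS}
    if fs == "ext4":
        # ext4 is modelled as treating the change attribute like a timestamp.
        skippable.add(CHANGE_ATTR)
    return bool(dirty - skippable)


def after_plain_overwrite(fs, datasync):
    """Case of an overwrite that dirties only timestamps and change attribute."""
    return inode_written_on_sync(fs, {TIMESTAMPS, CHANGE_ATTR}, datasync)
```

In this model after_plain_overwrite("ext4", True) is False while the same
call for "xfs" or "btrfs" is True, which is the difference that matters for
a server relying on a persisted change attribute.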
> I'm not completely following. So if the spec had a definite statement
> one way or the other, would that be good enough to make the distinction
> used to? If we could specify the behavior from scratch, what do you
> think would be the right choice?
>
> I find it hard to figure out the consequences of the change attribute not
> being written at the same time as the write, and whether there's some
> reasonable second-best behavior the server can provide in the case it
> doesn't write them to disk together atomically. It doesn't currently
> seem like there's much a client can really count on after boot.
Tom, do you think it's reasonable to propose an erratum for 4.0/4.1 that
explicitly allows the behavior of updating the change attribute in memory
on a DATA_SYNC4 write, but not necessarily persisting it? What about
COMMIT? Using datasync there would provide even more benefits in
practice.
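
For illustration, a server-side write path under that reading might look
roughly like the following. This is a userspace sketch under assumptions,
not nfsd code: handle_write and demo are hypothetical names, and only the
stable_how4 values (UNSTABLE4 = 0, DATA_SYNC4 = 1, FILE_SYNC4 = 2) come
from the protocol. Mapping DATA_SYNC4 to fdatasync is exactly the behavior
being asked about here, not something the spec currently states.

```python
import os
import tempfile
from enum import IntEnum


class StableHow(IntEnum):
    """stable_how4 values as defined for the NFSv4 WRITE operation."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def handle_write(fd, offset, data, stable):
    """Hypothetical WRITE handler: returns (bytes written, committed level)."""
    count = os.pwrite(fd, data, offset)
    if stable == StableHow.FILE_SYNC4:
        # Data and all dirty inode metadata must be on stable storage.
        os.fsync(fd)
    elif stable == StableHow.DATA_SYNC4:
        # Assumption under discussion: data plus the metadata needed to read
        # it back; a timestamp-like change attribute may stay in memory.
        os.fdatasync(fd)
    # UNSTABLE4: no flush here, the client is expected to send COMMIT later.
    return count, StableHow(stable)


def demo():
    """Write twice to a scratch file with different stability levels."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = os.open(os.path.join(tmp, "file"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            first = handle_write(fd, 0, b"abcd", StableHow.DATA_SYNC4)
            second = handle_write(fd, 4, b"ef", StableHow.FILE_SYNC4)
            return first, second, os.pread(fd, 6, 0)
        finally:
            os.close(fd)
```

A COMMIT handler could make the same fsync versus fdatasync choice, which
is where the larger practical benefit would be expected.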
I guess I just need to take this to the IETF list.