linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: Jeff Layton <jlayton@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>,
	Olga Kornievskaia <kolga@netapp.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	Chandan Babu R <chandan.babu@oracle.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v8 1/5] fs: add infrastructure for multigrain timestamps
Date: Fri, 22 Sep 2023 13:31:36 -0400	[thread overview]
Message-ID: <20230922173136.qpodogsb26wq3ujj@moria.home.lan> (raw)
In-Reply-To: <20230922-ctime-v8-1-45f0c236ede1@kernel.org>

On Fri, Sep 22, 2023 at 01:14:40PM -0400, Jeff Layton wrote:
> The VFS always uses coarse-grained timestamps when updating the ctime
> and mtime after a change. This has the benefit of allowing filesystems
> to optimize away a lot metadata updates, down to around 1 per jiffy,
> even when a file is under heavy writes.
> 
> Unfortunately, this has always been an issue when we're exporting via
> NFS, which traditionally relied on timestamps to validate caches. A lot
> of changes can happen in a jiffy, and that can lead to cache-coherency
> issues between hosts.
> 
> NFSv4 added a dedicated change attribute that must change value after
> any change to an inode. Some filesystems (btrfs, ext4 and tmpfs) utilize
> the i_version field for this, but the NFSv4 spec allows a server to
> generate this value from the inode's ctime.
> 
> What we need is a way to only use fine-grained timestamps when they are
> being actively queried.
> 
> POSIX generally mandates that when the the mtime changes, the ctime must
> also change. The kernel always stores normalized ctime values, so only
> the first 30 bits of the tv_nsec field are ever used.
> 
> Use the 31st bit of the ctime tv_nsec field to indicate that something
> has queried the inode for the mtime or ctime. When this flag is set,
> on the next mtime or ctime update, the kernel will fetch a fine-grained
> timestamp instead of the usual coarse-grained one.
> 
> Filesytems can opt into this behavior by setting the FS_MGTIME flag in
> the fstype. Filesystems that don't set this flag will continue to use
> coarse-grained timestamps.

Interesting...

So in bcachefs, for most inode fields the btree inode is the "master
copy"; we do inode updates via btree transactions, and then on
successful transaction commit we update the VFS inode to match.

(exceptions: i_size, i_blocks)

I'd been contemplating switching to that model for timestamp updates as
well, since that would allow us to get rid of our
super_operations.write_inode method - except we probably wouldn't want
to do that since it would likely make timestamp updates too expensive.

And now with your scheme of stashing extra state in timespec, I'm glad
we didn't.

Still, timestamp updates are a bit messier than I'd like, would be
lovely to figure out a way to clean that up - right now we have an
awkward mix of "sometimes timestamp updates happen in a btree
transaction first, other times just the VFS inode is updated and marked
dirty".

xfs doesn't have .write_inode, so it's probably time to study what it
does...

  reply	other threads:[~2023-09-22 17:32 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-22 17:14 [PATCH v8 0/5] fs: multigrain timestamps for XFS's change_cookie Jeff Layton
2023-09-22 17:14 ` [PATCH v8 1/5] fs: add infrastructure for multigrain timestamps Jeff Layton
2023-09-22 17:31   ` Kent Overstreet [this message]
2023-09-22 18:22     ` Jeff Layton
2023-09-22 17:14 ` [PATCH v8 2/5] fs: optimize away some fine-grained timestamp updates Jeff Layton
2023-09-22 17:14 ` [PATCH v8 3/5] fs: have setattr_copy handle multigrain timestamps appropriately Jeff Layton
2023-09-22 17:14 ` [PATCH v8 4/5] fs: add timestamp_truncate_to_gran helper Jeff Layton
2023-09-22 17:14 ` [PATCH v8 5/5] xfs: switch to multigrain timestamps Jeff Layton
2023-09-23  7:15 ` [PATCH v8 0/5] fs: multigrain timestamps for XFS's change_cookie Amir Goldstein
2023-09-23 10:22   ` Jeff Layton
2023-09-23 14:58     ` Amir Goldstein
2023-09-25 10:08       ` Jeff Layton
2023-09-23 10:46   ` Jeff Layton
2023-09-23 14:52     ` Amir Goldstein
2023-09-24 22:18       ` Dave Chinner
2023-09-25 10:14         ` Jeff Layton
2023-09-25 22:32           ` Dave Chinner
2023-09-26 11:31             ` Jeff Layton
2023-09-26 23:33               ` Dave Chinner
2023-09-27 10:26                 ` Jeff Layton
2023-09-23 20:43     ` Amir Goldstein
2023-09-24 11:31 ` Christian Brauner
2023-09-24 22:44   ` NeilBrown
2023-09-25 10:17     ` Jeff Layton
2023-09-26 12:10       ` Christian Brauner
2023-09-26 12:18     ` Christian Brauner
2023-09-26 12:51       ` Jeff Layton
2023-09-26 14:29         ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230922173136.qpodogsb26wq3ujj@moria.home.lan \
    --to=kent.overstreet@linux.dev \
    --cc=Dai.Ngo@oracle.com \
    --cc=brauner@kernel.org \
    --cc=chandan.babu@oracle.com \
    --cc=chuck.lever@oracle.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=kolga@netapp.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=tom@talpey.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).