From: Amir Goldstein <amir73il@gmail.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>,
Chuck Lever <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>,
Olga Kornievskaia <kolga@netapp.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Chandan Babu R <chandan.babu@oracle.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
Linus Torvalds <torvalds@linux-foundation.org>,
Kent Overstreet <kent.overstreet@linux.dev>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v8 0/5] fs: multigrain timestamps for XFS's change_cookie
Date: Sat, 23 Sep 2023 17:58:46 +0300
Message-ID: <CAOQ4uxjfbq=u3PYi_+ZiiAjub92o0-KeNT__ZRKSmRogLtF75Q@mail.gmail.com>
In-Reply-To: <4b106847d5202aec0e14fdbbe93b070b7ea97477.camel@kernel.org>
On Sat, Sep 23, 2023 at 1:22 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Sat, 2023-09-23 at 10:15 +0300, Amir Goldstein wrote:
> > On Fri, Sep 22, 2023 at 8:15 PM Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > My initial goal was to implement multigrain timestamps on most major
> > > filesystems, so we could present them to userland, and use them for
> > > NFSv3, etc.
> > >
> > > With the current implementation however, we can't guarantee that a file
> > > with a coarse grained timestamp modified after one with a fine grained
> > > timestamp will always appear to have a later value. This could confuse
> > > some programs like make, rsync, find, etc. that depend on strict
> > > ordering requirements for timestamps.
> > >
> > > The goal of this version is more modest: fix XFS's change attribute.
> > > XFS's change attribute is bumped on atime updates in addition to other
> > > deliberate changes. This makes it unsuitable for export via nfsd.
> > >
> > > Jan Kara suggested keeping this functionality internal-only for now and
> > > plumbing the fine grained timestamps through getattr [1]. This set takes
> > > a slightly different approach and has XFS use the fine-grained attr to
> > > fake up STATX_CHANGE_COOKIE in its getattr routine itself.
> > >
> > > While we keep fine-grained timestamps in struct inode, when presenting
> > > the timestamps via getattr, we truncate them at a granularity of number
> > > of ns per jiffy,
> >
> > That's not good, because a fine-grained mtime set explicitly by the
> > user would be truncated too, and booting with different kernels (HZ)
> > would change the observed timestamps of files.
> >
>
> That's a very good point.
>
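To make the HZ dependence concrete, here is a minimal userspace sketch.
It assumes truncation means rounding the nanosecond field down to a
multiple of NSEC_PER_SEC / HZ. The helper name trunc_nsec_to_jiffy() is
made up for illustration and is not a function from the patch set.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper, not from the series: round a nanosecond field
 * down to a multiple of one jiffy, where a jiffy is assumed to last
 * NSEC_PER_SEC / hz nanoseconds.
 */
long trunc_nsec_to_jiffy(long nsec, long hz)
{
	long gran;

	assert(hz > 0 && hz <= NSEC_PER_SEC);
	gran = NSEC_PER_SEC / hz;	/* ns per jiffy */
	return nsec - (nsec % gran);	/* round down to the jiffy boundary */
}
```

With a stored tv_nsec of 123456789, this model reports 123000000 on an
HZ=1000 kernel and 120000000 on an HZ=250 kernel, so the same on-disk
mtime would look different after booting a kernel with another HZ.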
> > > which allows us to smooth over the fuzz that causes
> > > ordering problems.
> > >
> >
> > The reported ordering problems (e.g. cp -u) are not even limited to the
> > scope of a single fs, right?
> >
>
> It isn't. Most of the tools we're concerned with don't generally care
> about filesystem boundaries.
>
> > Thinking out loud - if the QUERIED bit was not per inode timestamp
> > but instead in a global fs_multigrain_ts variable, then all the inodes
> > of all the mgtime fs would be using globally ordered timestamps.
> >
> > That should eliminate the reported issues with time reorder for
> > fine vs coarse grained timestamps.
> >
> > The risk of extra unneeded "change cookie" updates compared to
> > per inode QUERIED bit may exist, but I think it is a rather small overhead
> > and maybe worth the tradeoff of having to maintain a real per inode
> > "change cookie" in addition to a "globally ordered mgtime"?
> >
> > If this idea is acceptable, you may still be able to salvage the reverted
> > ctime series for 6.7, because the change to use global mgtime should
> > be quite trivial?
> >
>
> This is basically the idea I was going to look at next once I got some
> other stuff settled here: Basically, when we apply a fine-grained
> timestamp to an inode, we'd advance the coarse-grained clock that
> filesystems use to that value.
>
> It could cause some write amplification: if you are streaming writes to
> a bunch of files at the same time and someone stats one of them, then
> they'd all end up getting an extra inode transaction. That doesn't sound
> _too_ bad on its face, but I probably need to implement it and then run
> some numbers to see.
>
Several journal transactions within a single jiffy tick?
If the ctime/change_cookie of an inode is updated once within the scope
of a single running transaction, I don't think it matters how many
times it would be updated, but maybe I am missing something.

The problem is probably going to be that the seqlock of the
coarse-grained clock is going to be invalidated way too frequently to be
"read mostly" in the presence of an ls -lR workload, but again, I did
not study the implementation, so I may be way off.
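
For reference, the ordering idea discussed above can be modelled in a
few lines of userspace C. This is only a sketch under my own
assumptions, not the kernel timekeeping code: struct mg_clock and the
mg_*() helpers are hypothetical names, times are plain nanosecond
counters, and there is no locking at all.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model state, in nanoseconds. */
struct mg_clock {
	uint64_t coarse_ns;	/* value cached at the last timer tick */
	uint64_t floor_ns;	/* latest fine-grained value handed out */
};

/* Timer tick: refresh the cached coarse value. */
void mg_tick(struct mg_clock *c, uint64_t now_ns)
{
	assert(now_ns >= c->coarse_ns);	/* model assumes monotonic time */
	c->coarse_ns = now_ns;
}

/* Fine-grained stamp: return the precise time and raise the global floor. */
uint64_t mg_fine(struct mg_clock *c, uint64_t now_ns)
{
	if (now_ns > c->floor_ns)
		c->floor_ns = now_ns;
	return now_ns;
}

/* Coarse stamp: never earlier than any fine-grained stamp already issued. */
uint64_t mg_coarse(const struct mg_clock *c)
{
	return c->coarse_ns > c->floor_ns ? c->coarse_ns : c->floor_ns;
}
```

In this model a coarse stamp taken after a fine-grained one can never
appear older, which is the ordering property we want. The write to
floor_ns in mg_fine() is the part I am worried about: in a concurrent
implementation every such update would disturb readers of the coarse
clock.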
Thanks,
Amir.