From: Jeff Layton <jlayton@kernel.org>
To: Trond Myklebust <trondmy@hammerspace.com>,
"david@fromorbit.com" <david@fromorbit.com>
Cc: "darrick.wong@oracle.com" <darrick.wong@oracle.com>,
"djwong@kernel.org" <djwong@kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] xfs: fix i_version handling in xfs
Date: Thu, 18 Aug 2022 07:03:42 -0400 [thread overview]
Message-ID: <b8cf4464cc31dc262a2d38e66265c06bf1e35751.camel@kernel.org> (raw)
In-Reply-To: <0e41fb378e57794ab2a8a714b44ef92733598e8e.camel@hammerspace.com>
On Thu, 2022-08-18 at 04:15 +0000, Trond Myklebust wrote:
> On Thu, 2022-08-18 at 13:37 +1000, Dave Chinner wrote:
> > On Thu, Aug 18, 2022 at 01:11:09AM +0000, Trond Myklebust wrote:
> > > On Wed, 2022-08-17 at 08:42 +1000, Dave Chinner wrote:
> > > >
> > > > In XFS, we've defined the on-disk i_version field to mean
> > > > "increments with any persistent inode data or metadata change",
> > > > regardless of what the high level applications that use i_version
> > > > might actually require.
> > > >
> > > > That some network filesystem might only need a subset of the
> > > > metadata to be covered by i_version is largely irrelevant - if we
> > > > don't cover every persistent inode metadata change with
> > > > i_version,
> > > > then applications that *need* stuff like atime change
> > > > notification
> > > > can't be supported.
> > >
> > > OK, I'll bite...
> > >
> > > What real world application are we talking about here, and why
> > > can't it
> > > just read both the atime + i_version if it cares?
> >
> > The whole point of i_version is that the aplication can skip the
> > storage and comparison of individual metadata fields to determine if
> > anythign changed. If you're going to store multiple fields and
> > compare them all in addition to the change attribute, then what is
> > the change attribute actually gaining you?
>
> Information that is not contained in the fields themselves. Such as
> metadata about the fact that they were explicitly changed by an
> application.
>
> >
> > > The value of the change attribute lies in the fact that it gives
> > > you
> > > ctime semantics without the time resolution limitation.
> > > i.e. if the change attribute has changed, then you know that
> > > someone
> > > has explicitly modified either the file data or the file metadata
> > > (with
> > > the emphasis being on the word "explicitly").
> > > Implicit changes such as the mtime change due to a write are
> > > reflected
> > > only because they are necessarily also accompanied by an explicit
> > > change to the data contents of the file.
> > > Implicit changes, such as the atime changes due to a read are not
> > > reflected in the change attribute because there is no explicit
> > > change
> > > being made by an application.
> >
> > That's the *NFSv4 requirements*, not what people were asking XFS to
> > support in a persistent change attribute 10-15 years ago. What NFS
> > required was just one of the inputs at the time, and what we
> > implemented has kept NFSv4 happy for the past decade. I've mentioned
> > other requirements elsewhere in the thread
>
> NFS can work with a change attribute that tells it to invalidate its
> cache on every read. The only side effect will be that the performance
> graph will act as if you were filtering it through a cow's digestive
> system...
>
> > The problem we're talking about here is essentially a relatime
> > filtering issue; it's triggering an filesystem update because the
> > first access after a modification triggers an on-disk atime update
> > rahter than just storing it in memory.
>
> No. It's not... You appear to be discarding valuable information about
> why the attributes changed. I've been asking you for reasons why, and
> you're avoiding the question.
>
> > This is not a filesystem issue - the VFS controls when the on-disk
> > updates occur, and that what NFSv4 appears to need changed.
> > If NFS doesn't want the filesystem to bump change counters for
> > on-disk atime updates, then it should be asking the VFS to keep the
> > atime updates in memory. e.g. use lazytime semantics.
> >
> > This way, every filesystem will have the same behaviour, regardless
> > of how they track/store persistent change count metadata.
>
> Right now, the i_version updates are not exported via any common API,
> so any piss poor performance side-effects of the implementation are
> pretty much limited to the kernel users (NFS and... ???)
>
> Who do you expect to use this attribute if it were to be exported via
> statx() as Jeff is proposing, and why is the XFS behaviour appropriate?
> It already differs from the behaviour of both btrfs and NFS, so the
> argument that this will magically consolidate behaviour can be ignored.
>
Thanks Trond,
That's been exactly the point I've been trying to make. The _only_
consumers of i_version at this time are the kernel's NFS server and IMA.
Both of them will still work with the i_version being updated due to
atime updates, but their performance suffers.
The change I'm proposing should bring xfs in line with other providers
of i_version as well. btrfs already behaves correctly, and I have a
proposed patch for ext4 which should fix it. The ext4 devs seem amenable
to it so far.
Dave, you keep talking about the xfs i_version counter as if there are
other applications already relying on its behavior, but I don't see how
that can be. There is no way for userland applications to fetch the
counter currently.
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2022-08-18 11:03 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-16 13:17 [PATCH] xfs: fix i_version handling in xfs Jeff Layton
2022-08-16 15:43 ` Darrick J. Wong
2022-08-16 15:58 ` Jeff Layton
2022-08-16 22:42 ` Dave Chinner
2022-08-16 23:57 ` Dave Chinner
2022-08-17 12:02 ` Jeff Layton
2022-08-18 1:07 ` Dave Chinner
2022-08-18 11:12 ` Jeff Layton
2022-08-18 0:34 ` NeilBrown
2022-08-18 1:32 ` Dave Chinner
2022-08-18 1:52 ` NeilBrown
2022-08-18 2:22 ` Trond Myklebust
2022-08-18 3:00 ` Dave Chinner
2022-08-19 0:35 ` NeilBrown
2022-08-18 11:00 ` Jeff Layton
2022-08-18 23:43 ` NeilBrown
2022-08-18 1:11 ` Trond Myklebust
2022-08-18 3:37 ` Dave Chinner
2022-08-18 4:15 ` Trond Myklebust
2022-08-18 11:03 ` Jeff Layton [this message]
2022-08-23 0:05 ` Dave Chinner
2022-08-23 1:33 ` Trond Myklebust
2022-08-16 17:14 ` David Wysochanski
2022-08-16 23:37 ` Dave Chinner
2022-08-17 12:10 ` Jeff Layton
2022-08-17 21:57 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b8cf4464cc31dc262a2d38e66265c06bf1e35751.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).