From: "Darrick J. Wong" <djwong@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Amir Goldstein <amir73il@gmail.com>,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
hch@lst.de
Subject: Re: [PATCH 14/13] xfs: make XFS_IOC_COMMIT_RANGE freshness data opaque
Date: Thu, 7 Mar 2024 15:25:49 -0800 [thread overview]
Message-ID: <20240307232549.GI1927156@frogsfrogsfrogs> (raw)
In-Reply-To: <98b41ce0e577cffcc45c7d29781ca2d85ed19d5e.camel@kernel.org>
On Sat, Mar 02, 2024 at 07:43:53AM -0500, Jeff Layton wrote:
> On Fri, 2024-03-01 at 18:48 -0800, Darrick J. Wong wrote:
> > On Fri, Mar 01, 2024 at 08:31:21AM -0500, Jeff Layton wrote:
> > > On Thu, 2024-02-29 at 15:27 -0800, Darrick J. Wong wrote:
> > > > On Tue, Feb 27, 2024 at 08:52:58PM +0200, Amir Goldstein wrote:
> > > > > On Tue, Feb 27, 2024 at 7:46 PM Darrick J. Wong <djwong@kernel.org> wrote:
> > > > > >
> > > > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > > >
> > > > > > To head off bikeshedding about the fields in xfs_commit_range, let's
> > > > > > make it an opaque u64 array and require the userspace program to call
> > > > > > a third ioctl to sample the freshness data for us. If we ever converge
> > > > > > on a definition for i_version then we can use that; for now we'll just
> > > > > > use mtime/ctime like the old swapext ioctl.
> > > > >
> > > > > This addresses my concerns about using mtime/ctime.
> > > >
> > > > Oh good! :)
> > > >
> > > > > I have to say, Darrick, that I think that referring to this concern as
> > > > > bikeshedding is not being honest.
> > > > >
> > > > > I do hate nit picking reviews and I do hate "maybe also fix the world"
> > > > > review comments, but I think the question about using mtime/ctime in
> > > > > this new API was not out of place
> > > >
> > > > I agree, your question about mtime/ctime:
> > > >
> > > > "Maybe a stupid question, but under which circumstances would mtime
> > > > change and ctime not change? Why are both needed?"
> > > >
> > > > was a very good question. But perhaps that statement referred to the
> > > > other part of that thread.
> > > >
> > > > > and I think that making the freshness
> > > > > data opaque is better for everyone in the long run and hopefully, this will
> > > > > help you move to the things you care about faster.
> > > >
> > > > I wish you'd suggested an opaque blob that the fs can lay out however it
> > > > wants instead of suggesting specifically the change cookie. I'm very
> > > > much ok with an opaque freshness blob that allows future flexibility in
> > > > how we define the blob's contents.
> > > >
> > > > I was however very upset about the Jeff's suggestion of using i_version.
> > > > I apologize for using all caps in that reply, and snarling about it in
> > > > the commit message here. The final version of this patch will not have
> > > > that.
> > > >
> > > > That said, I don't think it is at all helpful to suggest using a file
> > > > attribute whose behavior is as yet unresolved. Multigrain timestamps
> > > > were a clever idea, regrettably reverted. As far as I could tell when I
> > > > wrote my reply, neither had NFS implemented a better behavior and
> > > > quietly merged it; nor have Jeff and Dave produced any sort of candidate
> > > > patchset to fix all the resulting issues in XFS.
> > > >
> > > > Reading "I realize that STATX_CHANGE_COOKIE is currently kernel
> > > > internal" made me think "OH $deity, they wants me to do that work
> > > > too???"
> > > >
> > > > A better way to have woreded that might've been "How about switching
> > > > this to a fs-determined structure so that we can switch the freshness
> > > > check to i_version when that's fully working on XFS?"
> > > >
> > > > The problem I have with reading patch review emails is that I can't
> > > > easily tell whether an author's suggestion is being made in a casual
> > > > offhand manner? Or if it reflects something they feel strongly needs
> > > > change before merging.
> > > >
> > > > In fairness to you, Amir, I don't know how much you've kept on top of
> > > > that i_version vs. XFS discussion. So I have no idea if you were aware
> > > > of the status of that work.
> > > >
> > >
> > > Sorry, I didn't mean to trigger anyone, but I do have real concerns
> > > about any API that attempts to use timestamps to detect whether
> > > something has changed.
> > >
> > > We learned that lesson in NFS in the 90's. VFS timestamp resolution is
> > > just not enough to show whether there was a change to a file -- full
> > > stop.
> > >
> > > I get the hand-wringing over i_version definitions and I don't care to
> > > rehash that discussion here, but I'll point out that this is a
> > > (proposed) XFS-private interface:
> > >
> > > What you could do is expose the XFS change counter (the one that gets
> > > bumped for everything, even atime updates, possibly via different
> > > ioctl), and use that for your "freshness" check.
> > >
> > > You'd unfortunately get false negative freshness checks after read
> > > operations, but you shouldn't get any false positives (which is real
> > > danger with timestamps).
> >
> > I don't see how would that work for this usecase? You have to sample
> > file2 before reflinking file2's contents to file1, writing the changes
> > to file1, and executing COMMIT_RANGE. Setting the xfs-private REFLINK
> > inode flag on file2 will trigger an iversion update even though it won't
> > change mtime or ctime. The COMMIT then fails due to the inode flags
> > change.
> >
> > Worse yet, applications aren't going to know if a particular access is
> > actually the one that will trigger an atime update. So this will just
> > fail unpredictably.
> >
> > If iversion was purely a write counter then I would switch the freshness
> > implementation to use it. But it's not, and I know this to be true
> > because I tried that and could not get COMMIT_RANGE to work reliably.
> > I suppose the advantage of the blob thing is that we actually /can/
> > switch over whenever it's ready.
> >
>
> Yeah, that's the other part -- you have to be willing to redrive the I/O
> every time the freshness check fails, which can get expensive depending
> on how active the file is. Again this is an XFS interface, so I don't
> really have a dog in this fight. If you think timestamps are good
> enough, then so be it.
>
> All I can do is mention that it has been our experience in the NFS world
> that relying on timestamps like this will eventually lead to data
> corruption. The race conditions may be tight, and much of the time the
> race may be benign, but if you do this enough you'll eventually get
> bitten, and end up exchanging data when you shouldn't have.
>
> All of that said, I think this is great discussion fodder for LSF this
> year. I feel like the time is right to consider these sorts of
> interfaces that do synchronized I/O without locking. I've already
> proposed a discussion around the state of the i_version counter, so
> maybe we can chat about it then?
Yes. I've gotten an invitation, so corporate approval and dumb injuries
notwithstanding, I'll be there this year. :)
--D
> --
> Jeff Layton <jlayton@kernel.org>
>
next prev parent reply other threads:[~2024-03-07 23:25 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-27 2:18 [PATCHSET v29.4 03/13] xfs: atomic file content exchanges Darrick J. Wong
2024-02-27 2:21 ` [PATCH 01/14] vfs: export remap and write check helpers Darrick J. Wong
2024-02-28 15:40 ` Christoph Hellwig
2024-02-27 2:21 ` [PATCH 02/14] xfs: introduce new file range exchange ioctls Darrick J. Wong
2024-02-28 15:44 ` Christoph Hellwig
2024-02-28 19:35 ` Darrick J. Wong
2024-02-28 19:37 ` Christoph Hellwig
2024-02-28 23:00 ` Darrick J. Wong
2024-02-29 13:22 ` Christoph Hellwig
2024-02-29 17:10 ` Darrick J. Wong
2024-02-29 19:42 ` Christoph Hellwig
2024-02-27 2:21 ` [PATCH 03/14] xfs: create a log incompat flag for atomic file mapping exchanges Darrick J. Wong
2024-02-28 15:44 ` Christoph Hellwig
2024-02-27 2:21 ` [PATCH 04/14] xfs: introduce a file mapping exchange log intent item Darrick J. Wong
2024-02-28 15:45 ` Christoph Hellwig
2024-02-27 2:22 ` [PATCH 05/14] xfs: create deferred log items for file mapping exchanges Darrick J. Wong
2024-02-28 15:49 ` Christoph Hellwig
2024-02-28 19:55 ` Darrick J. Wong
2024-02-28 22:08 ` Christoph Hellwig
2024-02-28 22:56 ` Darrick J. Wong
2024-02-27 2:22 ` [PATCH 06/14] xfs: bind together the front and back ends of the file range exchange code Darrick J. Wong
2024-02-28 15:49 ` Christoph Hellwig
2024-02-27 2:22 ` [PATCH 07/14] xfs: add error injection to test file mapping exchange recovery Darrick J. Wong
2024-02-28 15:50 ` Christoph Hellwig
2024-02-27 2:22 ` [PATCH 08/14] xfs: condense extended attributes after a mapping exchange operation Darrick J. Wong
2024-02-28 15:50 ` Christoph Hellwig
2024-02-27 2:23 ` [PATCH 09/14] xfs: condense directories " Darrick J. Wong
2024-02-28 15:51 ` Christoph Hellwig
2024-02-27 2:23 ` [PATCH 10/14] xfs: condense symbolic links " Darrick J. Wong
2024-02-28 15:51 ` Christoph Hellwig
2024-02-27 2:23 ` [PATCH 11/14] xfs: make file range exchange support realtime files Darrick J. Wong
2024-02-28 15:51 ` Christoph Hellwig
2024-02-27 2:23 ` [PATCH 12/14] xfs: support non-power-of-two rtextsize with exchange-range Darrick J. Wong
2024-02-28 15:51 ` Christoph Hellwig
2024-02-27 2:24 ` [PATCH 13/14] docs: update swapext -> exchmaps language Darrick J. Wong
2024-02-28 15:52 ` Christoph Hellwig
2024-02-27 2:24 ` [PATCH 14/14] xfs: enable logged file mapping exchange feature Darrick J. Wong
2024-02-28 15:52 ` Christoph Hellwig
2024-02-27 9:23 ` [PATCHSET v29.4 03/13] xfs: atomic file content exchanges Amir Goldstein
2024-02-27 10:53 ` Jeff Layton
2024-02-27 16:06 ` Darrick J. Wong
2024-03-01 13:16 ` Jeff Layton
2024-02-27 23:46 ` Dave Chinner
2024-02-28 10:30 ` Jeff Layton
2024-02-28 10:58 ` Amir Goldstein
2024-02-28 11:01 ` Jeff Layton
2024-02-27 15:45 ` Darrick J. Wong
2024-02-27 16:58 ` Amir Goldstein
2024-02-27 17:46 ` [PATCH 14/13] xfs: make XFS_IOC_COMMIT_RANGE freshness data opaque Darrick J. Wong
2024-02-27 18:52 ` Amir Goldstein
2024-02-29 23:27 ` Darrick J. Wong
2024-03-01 13:00 ` Amir Goldstein
2024-03-01 13:31 ` Jeff Layton
2024-03-02 2:48 ` Darrick J. Wong
2024-03-02 12:43 ` Jeff Layton
2024-03-07 23:25 ` Darrick J. Wong [this message]
2024-02-28 1:50 ` [PATCHSET v29.4 03/13] xfs: atomic file content exchanges Colin Walters
2024-02-29 20:18 ` Darrick J. Wong
2024-02-29 22:43 ` Colin Walters
2024-03-01 0:03 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240307232549.GI1927156@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=amir73il@gmail.com \
--cc=hch@lst.de \
--cc=jlayton@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox