From: Jeff Layton <jlayton@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
hch@lst.de, neilb@suse.de, bfields@fieldses.org,
amir73il@gmail.com, jack@suse.de, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 14/19] xfs: convert to new i_version API
Date: Thu, 14 Dec 2017 06:16:19 -0500
Message-ID: <1513250179.3504.2.camel@kernel.org>
In-Reply-To: <20171214021707.GG5858@dastard>
On Thu, 2017-12-14 at 13:17 +1100, Dave Chinner wrote:
> On Wed, Dec 13, 2017 at 07:10:22PM -0500, Jeff Layton wrote:
> > On Thu, 2017-12-14 at 10:25 +1100, Dave Chinner wrote:
> > > So now I've looked at the last patch .....
> > >
> > > On Thu, Dec 14, 2017 at 09:48:37AM +1100, Dave Chinner wrote:
> > > > On Wed, Dec 13, 2017 at 09:20:12AM -0500, Jeff Layton wrote:
> > > > > From: Jeff Layton <jlayton@redhat.com>
> > > > >
> > > > > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > > > > ---
> > > > > fs/xfs/libxfs/xfs_inode_buf.c | 5 +++--
> > > > > fs/xfs/xfs_icache.c | 4 ++--
> > > > > fs/xfs/xfs_inode.c | 2 +-
> > > > > fs/xfs/xfs_inode_item.c | 2 +-
> > > > > fs/xfs/xfs_trans_inode.c | 2 +-
> > > > > 5 files changed, 8 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> > > > > index 6b7989038d75..6b47de201391 100644
> > > > > --- a/fs/xfs/libxfs/xfs_inode_buf.c
> > > > > +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> > > > > @@ -264,7 +264,8 @@ xfs_inode_from_disk(
> > > > > to->di_flags = be16_to_cpu(from->di_flags);
> > > > >
> > > > > if (to->di_version == 3) {
> > > > > - inode->i_version = be64_to_cpu(from->di_changecount);
> > > > > + inode_set_iversion_queried(inode,
> > > > > + be64_to_cpu(from->di_changecount));
> > > >
> > > > So we use the "kernel managed" (really not sure what that means)
> > > > set function here to read it off disk, but...
> > >
> > > This stores the value from disk in the incore inode as "val << 1",
> > > then sets the lowest bit to indicate that it has been "queried"
> > > so that it will be incremented on the first modification.
> > >
> > > Why do we initialise values read from disk as "queried"? This means
> > > the i_version will change once every time it's brought into memory
> > > and modified, regardless of whether anyone is looking at it. What
> > > purpose does this serve?
> > >
> >
> > I don't think we want to store the QUERIED bit.
> >
> > It's always possible that we crash at an inopportune time and a query
> > happened vs. this value before this thing hit the backing store.
> >
> > If we always set the queried bit when we load it from disk, then we know
> > that that scenario is harmless, at the negligible expense of having to
> > bump it on the first write.
>
> Reasonable. Needs documentation.
>
Will do.

FWIW, there's another reason to do it this way too: backward
compatibility. If we don't try to store the queried bit then we should
be able to go back and forth between legacy kernels and the ones with
the new i_version handling without any trouble. The older kernels will
just bump the count more frequently.
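
To make the compatibility argument concrete, here is a rough userspace
sketch of the scheme as described in this thread: the in-memory value is
the counter shifted left by one, with the lowest bit as the "queried"
flag, and only the counter itself goes to disk. All of the names below
(sk_to_disk, sk_from_disk, sk_modify, legacy_modify) are made up for
illustration and are not the kernel API, and the sketch ignores atomicity
entirely.

```c
#include <stdint.h>

/*
 * Illustrative model only, not kernel code. Bit 0 of the in-memory value
 * is the "queried" flag; the change counter lives in the bits above it.
 */
#define SK_QUERIED 1ULL

/* What reaches the disk: the counter alone, with the flag stripped. */
static uint64_t sk_to_disk(uint64_t incore)
{
	return incore >> 1;
}

/* New-style load: always assume the on-disk value was already observed. */
static uint64_t sk_from_disk(uint64_t ondisk)
{
	return (ondisk << 1) | SK_QUERIED;
}

/* New-style modification: bump only if flagged; the bump clears the flag. */
static uint64_t sk_modify(uint64_t incore)
{
	if (incore & SK_QUERIED)
		return ((incore >> 1) + 1) << 1;
	return incore;
}

/* Legacy-style modification: a plain counter, bumped on every change. */
static uint64_t legacy_modify(uint64_t ondisk)
{
	return ondisk + 1;
}
```

Since the flag never reaches the disk, both styles read and write the
same plain counter. Under this model a new-style kernel bumps once after
load and then skips further bumps until someone queries the value, while
a legacy-style kernel simply bumps on every change.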
> > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > > index 801274126648..be6d87980dd5 100644
> > > > > --- a/fs/xfs/xfs_inode.c
> > > > > +++ b/fs/xfs/xfs_inode.c
> > > > > @@ -833,7 +833,7 @@ xfs_ialloc(
> > > > > ip->i_d.di_flags = 0;
> > > > >
> > > > > if (ip->i_d.di_version == 3) {
> > > > > - inode->i_version = 1;
> > > > > + inode_set_iversion(inode, 1);
> > > >
> > > > But here you are using the "filesystem managed" mode to set the
> > > > new value. Why? How is this any different from reading the value
> > > > off disk and setting it?
> > >
> > > Still don't understand why this is different to reading the inode
> > > from disk....
> >
> > This is allocating a brand new, never before seen inode. There's no
> > way this i_version could have ever been seen, so there's no need to flag
> > it as queried.
>
> More documentation. People are going to need to know this stuff to
> be able to implement/maintain this stuff in working order - it's no
> longer a simple, obvious "just increment the counter on
> modification" variable and that has potential ramifications for
> filesystems that store this on disk.
>
>
Definitely. I'm finding that documenting this has been the hardest part.
Thanks for the review so far!
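
As a starting point for that documentation, the difference between the
two setters discussed above can be modelled roughly like this. It is a
simplified, non-atomic userspace sketch: struct model_inode and the
model_* helpers are hypothetical names chosen for illustration, and the
real helpers in the patch set may well differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative userspace model; names and layout are assumptions. */
#define MODEL_QUERIED 1ULL	/* lowest bit: the value may have been seen */

struct model_inode {
	uint64_t raw;		/* (counter << 1) | queried flag */
};

/* Brand new inode: nobody can have seen the value, so the flag stays clear. */
static void model_set(struct model_inode *ino, uint64_t val)
{
	ino->raw = val << 1;
}

/* Value read from disk: assume it may have been seen before a crash. */
static void model_set_queried(struct model_inode *ino, uint64_t val)
{
	ino->raw = (val << 1) | MODEL_QUERIED;
}

/* An observer fetches the counter, which marks the value as queried. */
static uint64_t model_query(struct model_inode *ino)
{
	ino->raw |= MODEL_QUERIED;
	return ino->raw >> 1;
}

/* Read the counter without marking it, e.g. when writing it back to disk. */
static uint64_t model_peek(const struct model_inode *ino)
{
	return ino->raw >> 1;
}

/*
 * Called on modification. Bumps the counter only if the current value was
 * flagged as queried, and clears the flag. Returns true if a bump happened.
 */
static bool model_maybe_inc(struct model_inode *ino)
{
	if (!(ino->raw & MODEL_QUERIED))
		return false;
	ino->raw = ((ino->raw >> 1) + 1) << 1;
	return true;
}
```

In this model a freshly allocated inode set to 1 via model_set() is not
bumped by a modification until something calls model_query() on it,
whereas the same value loaded via model_set_queried() is bumped on the
first modification after load.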
--
Jeff Layton <jlayton@kernel.org>