From: Mingming Cao <cmm@us.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-ext4@vger.kernel.org, nfsv4@linux-nfs.org
Subject: Re: [EXT4 set 4][PATCH 1/5] i_version:64 bit inode version
Date: Tue, 10 Jul 2007 22:27:57 -0400
Message-ID: <1184120877.20193.49.camel@localhost.localdomain>
In-Reply-To: <20070710212237.06aac2a3.akpm@linux-foundation.org>
On Tue, 2007-07-10 at 21:22 -0700, Andrew Morton wrote:
> On Tue, 10 Jul 2007 20:19:16 -0400 Mingming Cao <cmm@us.ibm.com> wrote:
>
> > On Tue, 2007-07-10 at 18:22 -0700, Andrew Morton wrote:
> > > On Tue, 10 Jul 2007 18:09:40 -0400 Mingming Cao <cmm@us.ibm.com> wrote:
> > >
> > > > On Tue, 2007-07-10 at 16:30 -0700, Andrew Morton wrote:
> > > > > On Sun, 01 Jul 2007 03:37:04 -0400
> > > > > Mingming Cao <cmm@us.ibm.com> wrote:
> > > > >
> > > > > > This patch converts the 32-bit i_version in the generic inode to a 64-bit
> > > > > > i_version field.
> > > > > >
> > > > >
> > > > > That's obvious from the patch. But what was the reason for making this
> > > > > (unrelated to ext4) change?
> > > > >
> > > >
> > > > The need came from NFSv4.
> > > >
> > > > On Fri, 2007-05-25 at 18:25 +0200, Jean noel Cordenner wrote:
> > > > > Hi,
> > > > >
> > > > > This is an update of the i_version patch.
> > > > > The i_version field is a 64bit counter that is set on every inode
> > > > > creation and that is incremented every time the inode data is modified
> > > > > (similarly to the "ctime" time-stamp).
> > > > > The aim is to fulfill a NFSv4 requirement for rfc3530:
> > > > > "5.5. Mandatory Attributes - Definitions
> > > > > > Name    #  DataType  Access  Description
> > > > > > ___________________________________________________________________
> > > > > > change  3  uint64    READ    A value created by the server that
> > > > > >                              the client can use to determine if
> > > > > >                              file data, directory contents or
> > > > > >                              attributes of the object have been
> > > > > >                              modified. The server may return the
> > > > > >                              object's time_metadata attribute for
> > > > > >                              this attribute's value but only if
> > > > > >                              the filesystem object can not be
> > > > > >                              updated more frequently than the
> > > > > >                              resolution of time_metadata.
> > > > > "
> > > > >
> > > >
> > > > > Please update the changelog for this.
> > > > >
> > > >
> > > > Is the above description clear to you?
> > > >
> > >
> > > Yes, thanks. It doesn't actually tell us why we want to implement
> > > this attribute and it doesn't tell us what the implications of failing
> > > to do so are, but I guess we can take that on trust from the NFS guys.
> > >
> > > But I suspect the ext4 implementation doesn't actually do this. afaict we
> > > won't update i_version for file overwrites (especially if s_time_gran can
> > > indeed be 1,000,000,000) and of course for MAP_SHARED modifications. What
> > > would be the implications of this?
> > >
> >
> > In the case of an overwrite (file data updated), I assume the ctime/mtime
> > is being updated and the inode is being dirtied, so the version number
> > is being updated.
> >
> > vfs_write()->..
> > ->__generic_file_aio_write_nolock()
> > ->file_update_time()
> > ->mark_inode_dirty_sync()
> > ->__mark_inode_dirty(I_DIRTY_SYNC)
> > ->ext4_dirty_inode()
> > ->ext4_mark_inode_dirty()
>
> That assumes an mtime update for every write(). OK, so two writes in a
> single nanosecond won't be happening. But in that case why is this code:
>
> static inline struct timespec ext4_current_time(struct inode *inode)
> {
> 	return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ?
> 		current_fs_time(inode->i_sb) : CURRENT_TIME_SEC;
> }
>
> checking (s_time_gran < NSEC_PER_SEC) ??
>
Ext4 can still mount and read an ext3 filesystem, which by default uses the
old 128-byte inode size and so has no room for nanosecond timestamps, so
nanosecond timestamp granularity is not always guaranteed. It depends on the
inode size (it must be larger than 128 bytes). By default, a fresh ext4
filesystem increases the inode size to 256 bytes to make room for nanosecond
timestamps, inode versioning, etc.
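
To make the granularity point concrete, here is a minimal userspace sketch
(not kernel code). The helper names trunc_to_gran() and stamps_differ() are
made up for illustration, and the truncation rule is an assumption that only
approximates what the kernel does when it rounds a timestamp down to
s_time_gran. It shows that two writes within the same second keep distinct
timestamps at 1 ns granularity but collapse to the same stored value at
1 s granularity:

```c
#include <time.h>

#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Round a timestamp down to a multiple of gran nanoseconds.
 * Assumes 1 <= gran <= DEMO_NSEC_PER_SEC. With gran equal to one
 * second, tv_nsec always becomes 0.
 */
static struct timespec trunc_to_gran(struct timespec t, long gran)
{
	t.tv_nsec -= t.tv_nsec % gran;
	return t;
}

/*
 * Return 1 if two write times are still distinguishable after being
 * stored at the given granularity, 0 if they collapse to one value.
 */
static int stamps_differ(struct timespec a, struct timespec b, long gran)
{
	a = trunc_to_gran(a, gran);
	b = trunc_to_gran(b, gran);
	return a.tv_sec != b.tv_sec || a.tv_nsec != b.tv_nsec;
}
```

For example, with writes at 100.25 s and 100.75 s, stamps_differ() with a
granularity of 1 reports a difference, while with DEMO_NSEC_PER_SEC it does
not, which is the case for an ext3-format filesystem with 128-byte inodes.
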
> Overall it is a bit unpleasing to rely upon mtime updates for a correct NFS
> server implementation: if we were to later decrease s_time_gran (as we
> might do, for performance reasons), the NFS server implementation starts
> reporting incorrect information.
>
:( that is a problem...
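
A rough sketch of the difference, only to illustrate the concern and not what
the patch does today: struct demo_inode and both helpers below are
hypothetical names, and the "time driven" variant is a simplified model of
bumping the version only when the stored mtime actually changes. The
"counted" variant bumps the version on every modification, so it does not
depend on the timestamp granularity at all.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical, simplified stand-in for an in-memory inode. */
struct demo_inode {
	struct timespec mtime;	/* stored at the filesystem's granularity */
	uint64_t version;	/* change counter reported to the NFS server */
};

/* Bump the version only if the truncated mtime differs from the stored one. */
static void demo_write_time_driven(struct demo_inode *inode,
				   struct timespec now, long gran)
{
	now.tv_nsec -= now.tv_nsec % gran;
	if (now.tv_sec != inode->mtime.tv_sec ||
	    now.tv_nsec != inode->mtime.tv_nsec) {
		inode->mtime = now;
		inode->version++;
	}
}

/* Bump the version on every modification, whatever the timestamp says. */
static void demo_write_counted(struct demo_inode *inode,
			       struct timespec now, long gran)
{
	now.tv_nsec -= now.tv_nsec % gran;
	inode->mtime = now;
	inode->version++;
}
```

With a one-second granularity, two writes in the same second advance the
time-driven counter once and the unconditional counter twice, so only the
latter lets a client notice the second write.
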
> > > And how does the NFS server know that the filesystem implements i_version?
> > > Will a zero-value of i_version have special significance, telling the
> > > server to not send this attribute, perhaps?
> >
> > Bruce raised this question a few days back when he reviewed this
> > patch. I think the solution is to add a superblock flag for filesystems
> > that support inode versioning, probably at the VFS layer?
>
> That would work.
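
A sketch of how such a flag might be consulted, under stated assumptions:
DEMO_SB_I_VERSION, struct demo_sb, struct demo_ino and both helpers are
hypothetical names invented here, not an existing VFS interface. When the
flag is absent, the sketch falls back to the ctime, which the RFC text quoted
above allows; packing seconds and nanoseconds into one 64-bit value is just
one possible layout.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical superblock flag: the filesystem maintains i_version. */
#define DEMO_SB_I_VERSION (1UL << 0)

struct demo_sb {
	unsigned long s_flags;
};

struct demo_ino {
	const struct demo_sb *sb;
	uint64_t version;	/* meaningful only if the flag is set */
	struct timespec ctime;
};

/* True if the filesystem advertises inode versioning. */
static bool demo_has_i_version(const struct demo_ino *inode)
{
	return (inode->sb->s_flags & DEMO_SB_I_VERSION) != 0;
}

/*
 * Value to report as the NFSv4 "change" attribute: the version counter
 * when supported, otherwise the ctime packed as seconds in the high
 * 32 bits and nanoseconds in the low 32 bits.
 */
static uint64_t demo_change_attr(const struct demo_ino *inode)
{
	if (demo_has_i_version(inode))
		return inode->version;
	return ((uint64_t)inode->ctime.tv_sec << 32) |
	       (uint64_t)inode->ctime.tv_nsec;
}
```

With an explicit flag, a zero i_version needs no special meaning: the server
decides from the superblock alone which source to use.
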