From: Neil Brown <neilb@suse.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>, linux-nfs@vger.kernel.org
Subject: Re: Thoughts about cache consistency and directories in particular.
Date: Sat, 21 Feb 2009 08:04:38 +1100
Message-ID: <18847.6886.50844.260910@notabene.brown>
In-Reply-To: message from J. Bruce Fields on Friday February 20

On Friday February 20, bfields@fieldses.org wrote:
> On Sat, Feb 21, 2009 at 06:47:58AM +1100, Neil Brown wrote:
> >
> > I cannot see how there is a regression here. Subsequent getattrs will
> > show all modifications (if you wait at least one second).
> > The first getattr returns '-0.000000001', which is different from any
> > previously returned mtime.
> > Any subsequent getattr will return 0.999999999 or 1, depending on when
> > it arrives.
>
> Sorry, I guess I misread "smallest difference that can be reported by
> the protocol" as "smallest difference supported by the filesystem"!
>
> The former is currently *always* smaller than the latter, so you're
> reporting an mtime that will never arise in any other way. So you're
> right, this results in strictly more cache revalidations in every case.
>
> It may turn out that this mtime-1 case ends up being the typical case,
> since a single logical file modification may appear as multiple writes
> on the server, and those are likely to come in rapid succession.

When you look at a file that is changing, yes. When you look at a
file that hasn't changed for a while, you just get normal mtime.

>
> A "make" that takes less than one second, on an ext3 export, may result
> in targets with earlier mtimes than sources.

No. If they were both changed in the last second, they will both have
1 nanosecond subtracted from the mtime, so they will still look like
they have the same mtime.

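A minimal userspace sketch of the adjustment as described in this thread, not the actual nfsd patch. It assumes a filesystem with one-second timestamp granularity (as with ext3) and treats "changed in the last second" as "the getattr arrives less than one second after the stored mtime". The names reported_mtime() and ts_to_ns() are hypothetical.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Convert a timespec to nanoseconds so two times can be subtracted. */
static long long ts_to_ns(struct timespec t)
{
	return (long long)t.tv_sec * NSEC_PER_SEC + t.tv_nsec;
}

/*
 * Hypothetical helper: the mtime reported to the client, given the
 * mtime stored by the filesystem and the time the getattr arrives.
 * Assumption: the filesystem stores timestamps with 1 s granularity.
 */
static struct timespec reported_mtime(struct timespec mtime, struct timespec now)
{
	struct timespec out = mtime;
	long long age;

	assert(mtime.tv_nsec >= 0 && mtime.tv_nsec < NSEC_PER_SEC);
	age = ts_to_ns(now) - ts_to_ns(mtime);

	/*
	 * The file changed less than a second ago, so another write could
	 * still land without moving the stored mtime.  Report 1 ns less,
	 * so the value returned after the second has passed differs.
	 */
	if (age >= 0 && age < NSEC_PER_SEC) {
		if (out.tv_nsec == 0) {
			out.tv_sec -= 1;
			out.tv_nsec = NSEC_PER_SEC - 1;
		} else {
			out.tv_nsec -= 1;
		}
	}
	return out;
}
```

With a stored mtime of 100.000000000 and a getattr at 100.4, this reports 99.999999999 for both a source and a target written in that second, so the two still compare equal. A getattr at 101.5 reports 100.000000000 for both.
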
>
> (Why not mtime+1? And why not ctime?)

Not "mtime + 1" because mtime should normally be monotonically
increasing.
If you touch a file into the future, then when that moment comes, its
mtime will jump backwards 1 nanosecond, then jump forwards again. But
I don't think that is a problem.

Why not what with ctime?
I agree with Trond that we need to apply this adjustment to ctime too.
But we cannot set mtime == ctime. That would just be wrong.

>
> > The only possible regression is that sometimes we will flush the cache
> > when previously we didn't. In each case where that changes, the
> > client cannot possibly know whether it needs to or not, so flushing
> > rather than not flushing is the safest option.
> >
> > >
> > > By the way, I have one sadly neglected todo here: ext4 has a real nfsv4
> > > changeattribute, which needs to be hooked up to the nfsd code.
> >
> > Does it?
> > I just had a quick look, found that it stores a 64 bit number on disk
> > which is stored in inode->i_version.
> > And this is incremented for directory operations. But it doesn't seem
> > to be changed for file operations.
> >
> > But maybe I missed something.
>
> After some mucking around with git and git grep... looks like the
> inode_inc_iversion() calls do the job. Note there's an i_version mount
> option that's required.

Ahh.... Now if only they had called that function
"inode_inc_i_version", then my grep would have found it...

So in nfsd we can simply do:

	if (IS_I_VERSION(fhp->fh_dentry->d_inode)) {
		change_attribute = fhp->fh_dentry->d_inode->i_version;
	} else {
		change_attribute = kstat.ctime;
	}

??

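A userspace model of that selection, as a sketch only. struct model_inode and change_attribute() are hypothetical stand-ins for the kernel inode and the nfsd code path, and has_i_version models IS_I_VERSION(). Since ctime is a timespec while the change attribute is a single 64-bit value, the fallback below assumes one possible packing (seconds in the high 32 bits, nanoseconds in the low 32 bits); that packing is an assumption, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the inode fields the snippet above reads. */
struct model_inode {
	bool has_i_version;	/* models IS_I_VERSION(inode) */
	uint64_t i_version;	/* counter bumped on each change */
	struct timespec ctime;	/* fallback when no counter exists */
};

/* Pick the 64-bit change attribute for one inode. */
static uint64_t change_attribute(const struct model_inode *inode)
{
	assert(inode != NULL);

	if (inode->has_i_version)
		return inode->i_version;

	/* Assumed packing: seconds in the high word, nanoseconds in the low. */
	return ((uint64_t)inode->ctime.tv_sec << 32) |
	       (uint32_t)inode->ctime.tv_nsec;
}
```

The fallback inherits the timestamp granularity of the filesystem, which is why a real counter is preferable where the i_version mount option is enabled.
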
NeilBrown