From: Neil Brown <neilb@suse.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>, linux-nfs@vger.kernel.org
Subject: Re: Thoughts about cache consistency and directories in particular.
Date: Sat, 21 Feb 2009 06:47:58 +1100
Message-ID: <18847.2286.101191.989726@notabene.brown>
In-Reply-To: message from J. Bruce Fields on Friday February 20

On Friday February 20, bfields@fieldses.org wrote:
> On Fri, Feb 20, 2009 at 01:18:18PM +1100, Neil Brown wrote:
> > I also came up with a simple way to demonstrate a related caching
> > anomaly:
> >
> > - Create a localhost mount
> > - create a directory
> > - "ls -l" the directory via NFS
> > - create a file directly
>
> (And do this within a second (or a jiffy, depending on filesystem) of
> the directory creation?)
Sorry, I meant to include that detail - Yes.
>
> > - look again via NFS.
>
> >
> > The directory will appear empty via NFS but it is not. And this
> > cache anomaly does not time out (though memory pressure could
> > eventually remove it).
> >
> > There is a script below which reproduces both anomalies (providing
> > /export is exported and /mnt is available).
> >
> >
> > Can anything be done about this?
> >
> >
> > 1/ The client could flush the cache for a directory when ctime
> > changes as well as when mtime or size change.
> > This would help solve the "rsync -a without --omit-dir-times"
> > problem (and also another weird problem I had reported that
> > involved strange behaviour from a NetApp filer).
> > It might increase the number of READDIR requests in some cases.
> > Would that be enough of an increase to be a real problem?
> > It would be no worse than NFSv4 which - as the Linux NFS server
> > uses ctime to produce the changeattr - refreshes both directories
> > and files when the ctime changes.
>
> This has come up before, but I can't remember what the argument was
> against it....
Trond seems happy with it now, and the NFSv4 server effectively
imposes it. So maybe there are no remaining arguments against it?
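
As a rough illustration of option 1 (a Python sketch, not the actual
client code; DirAttrs and must_flush are names made up for this
example), the check would look something like the following. It assumes
the "rsync -a" case leaves a directory's mtime and size unchanged while
its ctime moves forward:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirAttrs:
    """Hypothetical subset of the attributes a client caches for a directory."""
    mtime_ns: int
    ctime_ns: int
    size: int


def must_flush(cached: DirAttrs, fresh: DirAttrs, check_ctime: bool = True) -> bool:
    """Decide whether cached directory contents should be discarded.

    With check_ctime=False only mtime and size are compared.
    With check_ctime=True a ctime change also forces a flush (option 1).
    """
    changed = cached.mtime_ns != fresh.mtime_ns or cached.size != fresh.size
    if check_ctime:
        changed = changed or cached.ctime_ns != fresh.ctime_ns
    return changed


# Assumed scenario: entries were added, then the directory mtime was put
# back to its old value, so only ctime differs between the two samples.
before = DirAttrs(mtime_ns=100, ctime_ns=100, size=4096)
after = DirAttrs(mtime_ns=100, ctime_ns=105, size=4096)

flush_without_ctime = must_flush(before, after, check_ctime=False)
flush_with_ctime = must_flush(before, after, check_ctime=True)
```

Only the ctime comparison notices the change in this scenario; the
price is an extra READDIR whenever ctime moves for some other reason.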
>
> > 2/ The server could lie about the mtime.
> > In particular, if the mtime for a file was the same as the current
> > time - to the granularity of the filesystem storing the file -
> > then reduce the mtime that is reported by the smallest difference that
> > can be reported by the protocol.
> > That would be one microsecond for v2, and one nanosecond for v3
> > and v4.
>
> Assume for simplicity's sake the time granularity is a second, and
> measure time in seconds in the following examples:
>
> Your proposal offers an improvement in this example (currently,
> subsequent getattrs will not reflect the final modification):
>
> t=0.1 modify
> t=0.2 getattr
> t=0.3 modify
>
> Your proposal causes a regression in the following example:
>
> t=0.1 modify
> t=0.2 getattr
> t=1.1 modify
> t=1.2 modify
>
I cannot see how there is a regression here. Subsequent getattrs will
show all modifications (if you wait at least one second).
The first getattr returns '-0.000000001', which is different from any
previously returned mtime.
Any subsequent getattr will return 0.999999999 or 1, depending on when
it arrives.
The only possible regression is that sometimes we will flush the cache
when previously we didn't. In each case where that changes, the
client cannot possibly know whether it needs to or not, so flushing
rather than not flushing is the safest option.
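
To make the arithmetic concrete, here is a minimal Python sketch of the
rule in option 2, assuming one-second filesystem granularity and times
held as integer nanoseconds. reported_mtime and its parameters are
hypothetical names, not anything in nfsd:

```python
NSEC_PER_SEC = 1_000_000_000


def reported_mtime(stored_mtime_ns: int, now_ns: int,
                   granularity_ns: int = NSEC_PER_SEC,
                   protocol_tick_ns: int = 1) -> int:
    """Return the mtime the server would report under option 2.

    If the stored mtime falls in the current granularity tick, a later
    modification in the same tick could not change it, so report a value
    one protocol tick earlier (1 ns for v3/v4, 1000 ns for v2).
    """
    # Truncate "now" to the filesystem's timestamp granularity.
    now_truncated = now_ns - now_ns % granularity_ns
    if stored_mtime_ns == now_truncated:
        return stored_mtime_ns - protocol_tick_ns
    return stored_mtime_ns


# Second example from the quoted text, with one-second granularity:
#   t=0.1 modify  -> stored mtime is 0
#   t=0.2 getattr
first = reported_mtime(0, 200_000_000)                    # -1 ns, i.e. -0.000000001
#   t=1.1 and t=1.2 modify -> stored mtime is 1 second
early = reported_mtime(NSEC_PER_SEC, 1_500_000_000)       # 0.999999999 s
late = reported_mtime(NSEC_PER_SEC, 2_500_000_000)        # 1 s
```

Each of the three values differs from the one reported before it, so a
client comparing mtimes would flush at each step.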
>
> By the way, I have one sadly neglected todo here: ext4 has a real nfsv4
> changeattribute, which needs to be hooked up to the nfsd code.
Does it?
I just had a quick look, and found that it stores a 64-bit number on
disk which is kept in inode->i_version.
And this is incremented for directory operations. But it doesn't seem
to be changed for file operations.
But maybe I missed something.
Thanks,
NeilBrown