Linux NFS development
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>, linux-nfs@vger.kernel.org
Subject: Re: Thoughts about cache consistency and directories in particular.
Date: Fri, 20 Feb 2009 10:56:47 -0800
Message-ID: <1235156207.31066.21.camel@heimdal.trondhjem.org>
In-Reply-To: <18846.4842.625445.980681-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>

On Fri, 2009-02-20 at 13:18 +1100, Neil Brown wrote:
> Hi,
>  I've been thinking about cache consistency, particularly of
>  directories, in response to a customer whose NFS was getting confused
>  by their usage of "rsync -a" without the "--omit-dir-times" flag:
>  A client would see an old copy of a directory and never get a more
>  up-to-date copy because the mtime appeared not to change.
> 
>  This results in a situation where a directory has wrong data cached
>  and there is no way to force that cache to be flushed.
> 
>  This contrasts with files where you can always flush the file
>  contents by taking a read lock on the file.
> 
>  I also came up with a simple way to demonstrate a related caching
>  anomaly:
> 
>   - Create a localhost mount
>   - create a directory
>   - "ls -l" the directory via NFS
>   - create a file directly
>   - look again via NFS.
> 
>   The directory will appear empty via NFS but it is not.  And this
>   cache anomaly does not time out (though memory pressure could
>   eventually remove it).
> 
>   There is a script below which reproduces both anomalies (providing
>   /export is exported and /mnt is available).
>  
> 
>  Can anything be done about this?
> 
> 
>  1/ The client could flush the cache for a directory when ctime
>     changes as well as when mtime or size change.
>     This would help solve the "rsync -a without --omit-dir-times"
>     problem (and also another weird problem I had reported that
>     involved strange behaviour from a NetApp filer).
>     It might increase the number of READDIR requests in some cases.
>     Would that be enough of an increase to be a real problem?
>     It would be no worse than NFSv4 which - as the Linux NFS server
>     uses ctime to produce the changeattr - refreshes both directories
>     and files when the ctime changes.

It should work fine. The ctime tracks the mtime in all cases except when
you setacl, setfattr, chown, chgrp, chmod, or touch the directory. Those
should be very rare operations for pretty much any workload...
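
To make option 1 concrete, here is a minimal Python sketch of the revalidation decision, not the actual client code. The `DirAttrs` type, the `cache_is_stale` helper, and the sample attribute values are all hypothetical; the only thing taken from the discussion is the rule itself: flush when mtime or size changes, and additionally when ctime changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirAttrs:
    """Directory attributes as seen by the client (illustrative names)."""
    mtime_ns: int
    ctime_ns: int
    size: int


def cache_is_stale(cached: DirAttrs, fresh: DirAttrs,
                   check_ctime: bool = True) -> bool:
    """Return True if the cached directory contents should be flushed."""
    # Existing rule described in the thread: mtime or size changed.
    if cached.mtime_ns != fresh.mtime_ns or cached.size != fresh.size:
        return True
    # Option 1: a ctime change alone is also enough to flush.
    return check_ctime and cached.ctime_ns != fresh.ctime_ns


# Assumed scenario: "rsync -a" adds entries and then restores the directory
# mtime, so only the ctime differs between the two attribute snapshots.
before = DirAttrs(mtime_ns=1_000, ctime_ns=1_000, size=4096)
after = DirAttrs(mtime_ns=1_000, ctime_ns=2_000, size=4096)

stale_with_ctime = cache_is_stale(before, after)
stale_without_ctime = cache_is_stale(before, after, check_ctime=False)
```

With the ctime check enabled the stale directory is detected; without it the client keeps the old copy, which matches the rsync report.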

>  2/ The server could lie about the mtime.
>     In particular, if the mtime for a file was the same as the current
>     time - to the granularity of the filesystem storing the file -
>     then reduce the mtime that is reported by the smallest difference that
>     can be reported by the protocol.
>     That would be one microsecond for v2, and one nanosecond for v3
>     and v4.
> 
>     This is something I've thought about (and probably muttered about)
>     in various forms at various times over the years, but this time I
>     think I am actually happy with the formulation of the solution and
>     want to push forward with it.
> 
> 
> 
>  Option 1, by itself, would mostly resolve the rsync issue and have
>  no effect on my little test case.
>  Option 2 by itself would have no effect on the rsync issue but would
>  nicely resolve my little test case.
>  Together they should significantly reduce the number of caching
>  anomalies.

I'm assuming that option 2 applies to the ctime as well as the mtime,
otherwise applying option 1 will void the effects of option 2?
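
For illustration, a rough Python sketch of option 2 written so that the same adjustment can be applied to either timestamp. It is only a model of the rule as stated above, not server code: `protocol_step_ns` and `reported_time_ns` are hypothetical helpers, and all times are assumed to be nanoseconds since the epoch.

```python
NS_PER_US = 1_000


def protocol_step_ns(nfs_version: int) -> int:
    """Smallest reportable time difference: 1 us for v2, 1 ns for v3 and v4."""
    return NS_PER_US if nfs_version == 2 else 1


def reported_time_ns(stored_ns: int, now_ns: int,
                     fs_granularity_ns: int, nfs_version: int) -> int:
    """Timestamp the server would report for a stored mtime or ctime."""
    # "Same as the current time" is judged at the filesystem's granularity.
    if stored_ns // fs_granularity_ns == now_ns // fs_granularity_ns:
        # Further changes within this tick would be invisible, so report a
        # slightly older time; a later reply will then look different.
        return stored_ns - protocol_step_ns(nfs_version)
    return stored_ns


ONE_SECOND = 1_000_000_000  # assumed filesystem timestamp granularity

# Stored time falls in the same one-second tick as "now": report it reduced.
same_tick_v3 = reported_time_ns(5 * ONE_SECOND, 5 * ONE_SECOND + 400, ONE_SECOND, 3)
same_tick_v2 = reported_time_ns(5 * ONE_SECOND, 5 * ONE_SECOND + 400, ONE_SECOND, 2)
# The tick has passed: report the stored value unchanged.
later_v3 = reported_time_ns(5 * ONE_SECOND, 6 * ONE_SECOND + 100, ONE_SECOND, 3)
```

Calling the helper for both mtime and ctime keeps the two options from undoing each other: a client that flushes on ctime changes would still see a difference once the tick has passed.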

Note also that the client now has the 'lookupcache' mount option that
can be set to ensure stricter revalidation of lookups.

Cheers
  Trond

