From: Neil Brown <neilb@suse.de>
To: Trond Myklebust <trond.myklebust@fys.uio.no>, linux-nfs@vger.kernel.org
Subject: [PATCH] Should we expect close-to-open consistency on directories?
Date: Tue, 20 Apr 2010 17:22:38 +1000
Message-ID: <20100420172238.520eaa89@notabene.brown>
Hi Trond et al,
It has come to my attention that NFS directories are not revalidated
consistently, so how promptly a client sees changes depends on how the
directory is named.
If, on the client, you have a loop like:
while true; do sleep 1; ls -l $dirname ; done
and then on the server you make changes to the named directory, there are
some cases where you will see changes promptly and some where you won't.
In particular, if $dirname is '.' or the name of an NFS mountpoint, then
changes can be delayed by up to acdirmax. If it is any other path, i.e. with
a non-trivial path component that is in the NFS filesystem, then changes
are seen promptly.
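
For anyone who would rather reproduce this without ls, the loop above can
be approximated in userspace roughly as follows. This is only a sketch:
count_entries() and watch_dir() are names made up for illustration, and it
compares entry counts rather than the full 'ls -l' output, so it will only
notice files being added or removed on the server.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open the directory afresh (as each ls invocation does) and count the
 * entries other than "." and "..".  Returns -1 if it cannot be opened. */
static int count_entries(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *de;
	int n = 0;

	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") && strcmp(de->d_name, ".."))
			n++;
	}
	closedir(dir);
	return n;
}

/* Poll 'path' once a second for 'rounds' rounds and report every change
 * in the entry count.  Returns the number of changes observed. */
static int watch_dir(const char *path, int rounds)
{
	int last = count_entries(path);
	int changes = 0;

	for (int i = 0; i < rounds; i++) {
		int now = count_entries(path);

		if (now != last) {
			printf("round %d: %d -> %d entries\n", i, last, now);
			last = now;
			changes++;
		}
		sleep(1);
	}
	return changes;
}
```

Calling watch_dir(".", 60) from inside the directory and watch_dir() with
a longer path to the same directory, while creating files on the server,
should show the difference in delay described above.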
This seems to me to relate to "close to open" consistency. Of course with
directories the 'close' side isn't relevant, but I still think it should be
that when you open a directory it validates the 'change' attribute on that
directory over the wire.
However the Linux VFS never tells NFS when a directory is opened. The
current correct behaviour for most directories is achieved through
d_revalidate == nfs_lookup_revalidate.
For '.' and mountpoints we need a different approach. Possibly the VFS could
be changed to tell the filesystem when such a directory is opened. However I
don't feel up to that at the moment.
An alternative is to do a revalidation in nfs_readdir as below. i.e. when
readdir sees f_pos == 0, it requests a revalidation of the page cache.
This has two problems:
1/ a seek before the first read would cause the revalidation to be skipped.
This can be fixed by putting a similar test in nfs_llseek_dir, or maybe
triggering off 'dir_cookie == NULL' rather than 'f_pos == 0'.
2/ A normal open/readdir sequence will validate a directory twice, once in the
lookup and once in the readdir. This is probably undesirable, but it is
not clear to me how to fix it.
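
The two problems can be illustrated with a small userspace model. This is
a toy, not kernel code: struct toy_dir and the toy_*() functions are
invented for illustration, a counter stands in for an over-the-wire
revalidation, and the check_in_seek flag is just one guess at what "a
similar test in nfs_llseek_dir" might look like.

```c
#include <stdbool.h>

/* Toy stand-in for an open directory; not a kernel type. */
struct toy_dir {
	long f_pos;		/* models filp->f_pos */
	int revalidations;	/* models over-the-wire revalidations */
};

/* Models open.  For most paths the lookup already revalidates; for '.'
 * and mountpoints (via_lookup == false) nothing is revalidated. */
static void toy_open(struct toy_dir *d, bool via_lookup)
{
	d->f_pos = 0;
	d->revalidations = via_lookup ? 1 : 0;
}

/* Models the patched readdir: revalidate only when f_pos == 0. */
static void toy_readdir(struct toy_dir *d)
{
	if (d->f_pos == 0)
		d->revalidations++;
	d->f_pos++;		/* pretend one entry was returned */
}

/* Models a seek.  With check_in_seek set, the same f_pos == 0 test is
 * applied before the position changes (an assumed variant). */
static void toy_llseek(struct toy_dir *d, long pos, bool check_in_seek)
{
	if (check_in_seek && d->f_pos == 0)
		d->revalidations++;
	d->f_pos = pos;
}
```

In this model, opening '.' and reading gives one revalidation; opening '.',
seeking, then reading gives none unless check_in_seek is set (problem 1);
and opening via a lookup and then reading gives two (problem 2).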
So: is it reasonable to view the current behaviour as 'wrong'?
Any suggestions on how to craft a less problematic fix?
Thanks,
NeilBrown
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index a1f6b44..df4f0a6 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -560,6 +560,9 @@ static int nfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
 	desc->entry = &my_entry;
 
 	nfs_block_sillyrename(dentry);
+	if (filp->f_pos == 0)
+		/* Force attribute validity at open */
+		NFS_I(inode)->cache_validity |= NFS_INO_REVAL_PAGECACHE;
 	res = nfs_revalidate_mapping(inode, filp->f_mapping);
 	if (res < 0)
 		goto out;