From: Erez Zadok <ezk@cs.sunysb.edu>
To: akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
viro@ftp.linux.org.uk, hch@infradead.org,
Erez Zadok <ezk@cs.sunysb.edu>
Subject: [PATCH 3/9] Unionfs: update cache-coherency detection heuristics
Date: Tue, 13 Nov 2007 05:10:22 -0500 [thread overview]
Message-ID: <1194948629600-git-send-email-ezk@cs.sunysb.edu> (raw)
In-Reply-To: <11949486283697-git-send-email-ezk@cs.sunysb.edu>
Use a small delay to reduce the number of times unionfs has to detect
changed mtime's/ctime's, and also reduce the potential for false positives.
See Documentation/filesystems/unionfs/concepts.txt for a detailed
discussion.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
---
Documentation/filesystems/unionfs/concepts.txt | 18 ++++++++++++++
fs/unionfs/dentry.c | 29 ++++++++++++++---------
fs/unionfs/union.h | 3 ++
3 files changed, 39 insertions(+), 11 deletions(-)
diff --git a/Documentation/filesystems/unionfs/concepts.txt b/Documentation/filesystems/unionfs/concepts.txt
index 37a62d8..7654ccc 100644
--- a/Documentation/filesystems/unionfs/concepts.txt
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -177,5 +177,23 @@ would be really nice if the VFS would export an optional file system hook
->file_revalidate (similarly to dentry->d_revalidate) that will be called
before each VFS op that has a "struct file" in it.
+Certain file systems have micro-second granularity (or better) for inode
+times, and asynchronous actions could cause those times to change with some
+small delay. In such cases, Unionfs may see a changed inode time that only
+differs by a tiny fraction of a second: such a change may be a false
+positive indication that the lower object has changed, whereas if unionfs
+waits a little longer, that false indication will not be seen. (These false
+positives are harmless, because they would at most cause unionfs to
+re-validate an object that may need no revalidation, and print a debugging
+message that clutters the console/logs.) Therefore, to minimize the chances
+of these situations, we delay the detection of changed times by a small
+factor of a few seconds, called UNIONFS_MIN_CC_TIME (which defaults to 3
+seconds, as does NFS). This means that we will detect the change, only a
+couple of seconds later, if indeed the time change persists in the lower
+file object. This delayed detection has an added performance benefit: we
+reduce the number of times that unionfs has to revalidate objects, in case
+there's a lot of concurrent activity on both the upper and lower objects,
+for the same file(s). Lastly, this delayed time attribute detection is
+similar to how NFS clients operate (e.g., acregmin).
For more information, see <http://unionfs.filesystems.org/>.
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index a3d7b6e..aed4d39 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -192,6 +192,16 @@ out:
/*
* Determine if the lower inode objects have changed from below the unionfs
* inode. Return true if changed, false otherwise.
+ *
+ * We check if the mtime or ctime have changed. However, the inode times
+ * can be changed by anyone without much protection, including
+ * asynchronously. This can sometimes cause unionfs to find that the lower
+ * file system doesn't change its inode times quick enough, resulting in a
+ * false positive indication (which is harmless, it just makes unionfs do
+ * extra work in re-validating the objects). To minimize the chances of
+ * these situations, we delay the detection of changed times by
+ * UNIONFS_MIN_CC_TIME (which defaults to 3 seconds, as with NFS's
+ * acregmin).
*/
bool is_newer_lower(const struct dentry *dentry)
{
@@ -211,26 +221,23 @@ bool is_newer_lower(const struct dentry *dentry)
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (!lower_inode)
continue;
- /*
- * We may want to apply other tests to determine if the
- * lower inode's data has changed, but checking for changed
- * ctime and mtime on the lower inode should be enough.
- */
- if (unlikely(timespec_compare(&inode->i_mtime,
- &lower_inode->i_mtime) < 0)) {
+
+ /* check if mtime/ctime have changed */
+ if (unlikely((lower_inode->i_mtime.tv_sec -
+ inode->i_mtime.tv_sec) > UNIONFS_MIN_CC_TIME)) {
pr_info("unionfs: new lower inode mtime "
"(bindex=%d, name=%s)\n", bindex,
dentry->d_name.name);
show_dinode_times(dentry);
- return true; /* mtime changed! */
+ return true;
}
- if (unlikely(timespec_compare(&inode->i_ctime,
- &lower_inode->i_ctime) < 0)) {
+ if (unlikely((lower_inode->i_ctime.tv_sec -
+ inode->i_ctime.tv_sec) > UNIONFS_MIN_CC_TIME)) {
pr_info("unionfs: new lower inode ctime "
"(bindex=%d, name=%s)\n", bindex,
dentry->d_name.name);
show_dinode_times(dentry);
- return true; /* ctime changed! */
+ return true;
}
}
return false; /* default: lower is not newer */
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 09b9ac9..f61e32d 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -63,6 +63,9 @@
/* maximum number of branches we support, to avoid memory blowup */
#define UNIONFS_MAX_BRANCHES 128
+/* minimum time (seconds) required for time-based cache-coherency */
+#define UNIONFS_MIN_CC_TIME 3
+
/* Operations vectors defined in specific files. */
extern struct file_operations unionfs_main_fops;
extern struct file_operations unionfs_dir_fops;
--
1.5.2.2
next prev parent reply other threads:[~2007-11-13 10:10 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-11-13 10:10 [GIT PULL -mm] 0/9 Unionfs updates/cleanups/fixes Erez Zadok
2007-11-13 10:10 ` [PATCH 1/9] Unionfs: flush and release updates Erez Zadok
2007-11-13 10:10 ` [PATCH 2/9] Unionfs: use i_size wrappers Erez Zadok
2007-11-13 10:10 ` Erez Zadok [this message]
2007-11-13 10:10 ` [PATCH 4/9] Unionfs: writepage updates Erez Zadok
2007-11-13 10:10 ` [PATCH 5/9] Unionfs: clear partial read in readpage Erez Zadok
2007-11-13 10:10 ` [PATCH 6/9] Unionfs: debugging updates Erez Zadok
2007-11-13 10:10 ` [PATCH 7/9] Unionfs: remove unnecessary lower atime updates Erez Zadok
2007-11-13 10:10 ` [PATCH 8/9] Unionfs: fold do_readpage into unionfs_readpage Erez Zadok
2007-11-13 10:10 ` [PATCH 9/9] Unionfs: move debugging checks inside locks Erez Zadok
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1194948629600-git-send-email-ezk@cs.sunysb.edu \
--to=ezk@cs.sunysb.edu \
--cc=akpm@linux-foundation.org \
--cc=hch@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@ftp.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).