linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] d_prune dentry_operation
@ 2011-10-06  4:26 Sage Weil
  2011-10-06  4:26 ` [PATCH] vfs: add d_prune dentry operation Sage Weil
  0 siblings, 1 reply; 15+ messages in thread
From: Sage Weil @ 2011-10-06  4:26 UTC
  To: linux-fsdevel, viro; +Cc: hch, ceph-devel, rwheeler, linux-kernel, Sage Weil

Ceph goes to great lengths to keep its client-side cache coherent,
allowing many operations (lookups, creates, readdir) to be performed
without any server interaction when the client holds the proper leases.

Sadly, this functionality is all currently disabled because we cannot 
handle races between dcache pruning and any of those activities with the 
current VFS interface.

This patch adds a d_prune hook that allows the filesystem to be informed
before a dentry is removed from the cache.  Merging this for 3.2-rc1
will make Ceph users and their metadata-intensive workloads very happy.

If anybody has any issues at all with this, please tell me, so I can 
make my case or revise my approach.

Thanks!
sage


Sage Weil (1):
  vfs: add d_prune dentry operation

 Documentation/filesystems/Locking |    1 +
 fs/dcache.c                       |    8 ++++++++
 include/linux/dcache.h            |    3 +++
 3 files changed, 12 insertions(+), 0 deletions(-)

-- 
1.7.2.5
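
For readers unfamiliar with the mechanism, here is a minimal userspace sketch of the idea described above: the filesystem is told about a dentry before it leaves the cache. It is not kernel code. The structs are cut-down mocks, and `mock_set_d_op()`, `mock_prune()`, the `hashed` field and the `demo_*` names are assumptions made for illustration; only the `d_prune` member and the `DCACHE_OP_PRUNE` value come from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DCACHE_OP_PRUNE 0x80000	/* flag value used by the patch */

struct dentry;

/* Mock: the real struct has many more operations. */
struct dentry_operations {
	void (*d_prune)(struct dentry *);
};

/* Mock dentry: "hashed" stands in for hash-table membership. */
struct dentry {
	unsigned int d_flags;
	const struct dentry_operations *d_op;
	int hashed;
};

static int prune_calls;	/* how many times the hook ran */
static int seen_hashed;	/* was the dentry still hashed inside the hook? */

/* Record in d_flags whether the ops table provides d_prune. */
static void mock_set_d_op(struct dentry *dentry,
			  const struct dentry_operations *op)
{
	dentry->d_op = op;
	if (op->d_prune)
		dentry->d_flags |= DCACHE_OP_PRUNE;
}

/* Notify the filesystem first, then drop the dentry from the cache. */
static void mock_prune(struct dentry *dentry)
{
	if (dentry->d_flags & DCACHE_OP_PRUNE)
		dentry->d_op->d_prune(dentry);
	dentry->hashed = 0;
}

/* Hypothetical filesystem hook. */
static void demo_d_prune(struct dentry *dentry)
{
	prune_calls++;
	seen_hashed = dentry->hashed;
}

static const struct dentry_operations demo_ops = { .d_prune = demo_d_prune };
static const struct dentry_operations empty_ops = { .d_prune = NULL };

/* Prunes one dentry with a hook and one without; returns the hook count. */
int demo_prune_notification(void)
{
	struct dentry with_hook = { .hashed = 1 };
	struct dentry without_hook = { .hashed = 1 };

	mock_set_d_op(&with_hook, &demo_ops);
	mock_set_d_op(&without_hook, &empty_ops);

	mock_prune(&with_hook);
	assert(!with_hook.hashed);
	mock_prune(&without_hook);
	assert(!without_hook.hashed);

	return prune_calls;
}
```

Calling `demo_prune_notification()` from a small `main()` exercises both paths. Caching the "has a hook" fact in `d_flags` keeps the common no-hook case to a single bit test, without dereferencing `d_op`.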


* [PATCH] vfs: add d_prune dentry operation
@ 2011-07-08 21:10 Sage Weil
  2011-07-26 23:24 ` Sage Weil
  0 siblings, 1 reply; 15+ messages in thread
From: Sage Weil @ 2011-07-08 21:10 UTC
  To: linux-fsdevel, linux-kernel

This adds a d_prune dentry operation that is called by the VFS prior to 
pruning (i.e. unhashing and killing) a hashed dentry from the dcache. This 
allows the file system to maintain visibility into the contents and 
consistency of the dcache:

 - dentry eviction/pruning now calls into the fs
 - any modifications (unlink, rename, create, etc.) are protected by 
   i_mutex and call into the fs

This will be used by Ceph to maintain a flag indicating whether the 
complete contents of a directory are contained in the dcache, allowing it 
to satisfy lookups and readdir with cached results without additional 
server communication.

Signed-off-by: Sage Weil <sage@newdream.net>
---
 Documentation/filesystems/Locking |    1 +
 fs/dcache.c                       |    8 ++++++++
 include/linux/dcache.h            |    3 +++
 3 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 57d827d..b3fa7c8 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -29,6 +29,7 @@ d_hash		no		no		no		maybe
 d_compare:	yes		no		no		maybe
 d_delete:	no		yes		no		no
 d_release:	no		no		yes		no
+d_prune:	no		yes		no		no
 d_iput:		no		no		yes		no
 d_dname:	no		no		no		no
 d_automount:	no		no		yes		no
diff --git a/fs/dcache.c b/fs/dcache.c
index 37f72ee..74d1a30 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -673,6 +673,8 @@ static void try_prune_one_dentry(struct dentry *dentry)
 			spin_unlock(&dentry->d_lock);
 			return;
 		}
+		if (dentry->d_flags & DCACHE_OP_PRUNE)
+			dentry->d_op->d_prune(dentry);
 		dentry = dentry_kill(dentry, 1);
 	}
 }
@@ -879,6 +881,8 @@ static void shrink_dcache_for_umount_subtree(struct dentry *dentry)
 
 	/* detach this root from the system */
 	spin_lock(&dentry->d_lock);
+	if (dentry->d_flags & DCACHE_OP_PRUNE)
+		dentry->d_op->d_prune(dentry);
 	dentry_lru_del(dentry);
 	__d_drop(dentry);
 	spin_unlock(&dentry->d_lock);
@@ -895,6 +899,8 @@ static void shrink_dcache_for_umount_subtree(struct dentry *dentry)
 					    d_u.d_child) {
 				spin_lock_nested(&loop->d_lock,
 						DENTRY_D_LOCK_NESTED);
+				if (loop->d_flags & DCACHE_OP_PRUNE)
+					loop->d_op->d_prune(loop);
 				dentry_lru_del(loop);
 				__d_drop(loop);
 				spin_unlock(&loop->d_lock);
@@ -1363,6 +1369,8 @@ void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op)
 		dentry->d_flags |= DCACHE_OP_REVALIDATE;
 	if (op->d_delete)
 		dentry->d_flags |= DCACHE_OP_DELETE;
+	if (op->d_prune)
+		dentry->d_flags |= DCACHE_OP_PRUNE;
 
 }
 EXPORT_SYMBOL(d_set_d_op);
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 19d90a5..681f46f 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -165,6 +165,7 @@ struct dentry_operations {
 			unsigned int, const char *, const struct qstr *);
 	int (*d_delete)(const struct dentry *);
 	void (*d_release)(struct dentry *);
+	void (*d_prune)(struct dentry *);
 	void (*d_iput)(struct dentry *, struct inode *);
 	char *(*d_dname)(struct dentry *, char *, int);
 	struct vfsmount *(*d_automount)(struct path *);
@@ -219,6 +220,8 @@ struct dentry_operations {
 #define DCACHE_MANAGED_DENTRY \
 	(DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT)
 
+#define DCACHE_OP_PRUNE         0x80000
+
 extern seqlock_t rename_lock;
 
 static inline int dname_external(struct dentry *dentry)
-- 
1.7.0
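
The commit message says Ceph will use the hook to track whether a directory's complete contents are in the dcache. A minimal userspace sketch of that idea follows. It does not show Ceph's actual implementation: `struct demo_dir_info`, `demo_fs_d_prune()`, `demo_dir_is_complete()` and the mock `struct dentry` are hypothetical, and the hook simply clears a flag on the parent directory. In the kernel the hook would run under `d_lock` (per the hunks above), so a real version could not block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DCACHE_OP_PRUNE 0x80000	/* flag value used by the patch */

struct dentry;

/* Mock: only the operation added by the patch. */
struct dentry_operations {
	void (*d_prune)(struct dentry *);
};

/* Mock dentry with just the fields this sketch needs. */
struct dentry {
	unsigned int d_flags;
	const struct dentry_operations *d_op;
	struct dentry *d_parent;
	void *d_fsdata;		/* per-dentry filesystem state */
};

/* Hypothetical per-directory state kept by the filesystem. */
struct demo_dir_info {
	bool complete;		/* every child is present in the cache */
};

/* Hypothetical hook: losing any child makes the parent incomplete. */
static void demo_fs_d_prune(struct dentry *dentry)
{
	struct dentry *parent = dentry->d_parent;
	struct demo_dir_info *info;

	if (!parent || parent == dentry)	/* root has no parent dir */
		return;
	info = parent->d_fsdata;
	if (info)
		info->complete = false;
}

static const struct dentry_operations demo_fs_ops = {
	.d_prune = demo_fs_d_prune,
};

/* True if lookups and readdir could be answered from the cache alone. */
static bool demo_dir_is_complete(const struct dentry *dir)
{
	const struct demo_dir_info *info = dir->d_fsdata;

	return info && info->complete;
}

/* Mock of the prune path: call the hook if the flag says it exists. */
static void mock_prune(struct dentry *dentry)
{
	if (dentry->d_flags & DCACHE_OP_PRUNE)
		dentry->d_op->d_prune(dentry);
}

/* Returns 1 if the flag was set before the prune and cleared after it. */
int demo_complete_flag(void)
{
	struct demo_dir_info info = { .complete = true };
	struct dentry dir = { .d_fsdata = &info };
	struct dentry child = { .d_parent = &dir };
	bool before, after;

	dir.d_parent = &dir;		/* mock root: its own parent */
	child.d_op = &demo_fs_ops;
	child.d_flags |= DCACHE_OP_PRUNE;

	before = demo_dir_is_complete(&dir);
	mock_prune(&child);
	after = demo_dir_is_complete(&dir);

	assert(before);
	return before && !after;
}
```

Once the flag is cleared, a miss in the cache can no longer be treated as "file does not exist", so in this sketch the filesystem would have to go back to the server until it has repopulated the directory.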



end of thread, other threads:[~2011-10-28 17:02 UTC | newest]

Thread overview: 15+ messages
2011-10-06  4:26 [PATCH] d_prune dentry_operation Sage Weil
2011-10-06  4:26 ` [PATCH] vfs: add d_prune dentry operation Sage Weil
2011-10-06 21:21   ` Christoph Hellwig
2011-10-06 22:20     ` Sage Weil
2011-10-09 13:21       ` Christoph Hellwig
2011-10-10  5:11       ` Dave Chinner
2011-10-10 11:23         ` Christoph Hellwig
2011-10-10 16:19         ` Sage Weil
2011-10-10 16:21           ` Christoph Hellwig
2011-10-11 15:39             ` Sage Weil
2011-10-11 21:56               ` Dave Chinner
2011-10-28 12:16           ` Christoph Hellwig
2011-10-28 17:02             ` Sage Weil
  -- strict thread matches above, loose matches on Subject: below --
2011-07-08 21:10 Sage Weil
2011-07-26 23:24 ` Sage Weil
