[PATCH 00/12] Implement NFSv4 delegations, take 5

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 00/12] Implement NFSv4 delegations, take 5
@ 2012-10-16 22:01 J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 02/12] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Previous posting:

	http://mid.gmane.org/<1346878524-10585-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

This is just a rebase to 3.7-rc1.  Jeff Layton's audit patches fixed the
last problem I know of with this patchset, so as far as I'm concerned
it's ready.  But it needs review.

Summary, as before:

NFSv3 clients can only find out about changes to a file by polling.
NFSv4 instead allows a client to get a "delegation" on a file and then
operate on a file without talking to the server.  Consistency is
maintained by requiring the server to recall the delegation before
allowing any conflicting operation.

But our kfsd doesn't always recall delegations on conflicting operations
from local (non-NFS) users of an exported filesystem.

This patch series fixes that by defining a new lock type to represent a
delegation.  This new lock type isn't available to userspace for
now--nfsd is the only user.

Delegations come in both "read" and "write" flavors, but I'm only
implementing read delegations for now.

Delegations are similar to leases.  The main difference is that
delegations need to be broken on any operation that changes a file's
attributes or the set of links pointing to it (like link, unlink, and
rename).

Such operations take several locks (including at least one i_mutex on a
directory).  Delegations can take a long time (about a minute) to recall
when NFS clients are unresponsive.  To avoid blocking a lot of unrelated
operations, this version of the patches drops all of those locks before
waiting.

To that end, I'm passing an extra inode ** to functions like vfs_unlink.
When vfs_unlink finds a delegation it can then pass back the offending
inode so that the caller can wait for the delegation recall.  If the
caller passes in NULL, then it instead gets -EWOULDBLOCK on encountering
a delegation.

In fact, callers outside the vfs are mostly passing in NULL:

	- nfsd wants to imediately return NFS4ERR_DELAY to callers on
	  encountering a delegation, instead of waiting.
	- I assume that anyone exporting the fileystem underlying an
	  ecryptfs mount is making a mistake, and that it's better to
	  return an error than to wait.
	- Ditto for fscache.

But those other callers could be taught to wait as well if necessary.

--b.

J. Bruce Fields (12):
  vfs: pull ext4's double-i_mutex-locking into common code
  vfs: don't use PARENT/CHILD lock classes for non-directories
  vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  vfs: take i_mutex on renamed file
  locks: introduce new FL_DELEG lock flag
  locks: implement delegations
  namei: minor vfs_unlink cleanup
  locks: break delegations on unlink
  locks: helper functions for delegation breaking
  locks: break delegations on rename
  locks: break delegations on link
  locks: break delegations on any attribute modification

 Documentation/filesystems/directory-locking |   31 +++++++---
 drivers/base/devtmpfs.c                     |    6 +-
 fs/attr.c                                   |    5 +-
 fs/cachefiles/interface.c                   |    4 +-
 fs/cachefiles/namei.c                       |    4 +-
 fs/ecryptfs/inode.c                         |    6 +-
 fs/ext4/move_extent.c                       |   40 +------------
 fs/hpfs/namei.c                             |    2 +-
 fs/inode.c                                  |   35 ++++++++++-
 fs/locks.c                                  |   51 ++++++++++++----
 fs/namei.c                                  |   83 +++++++++++++++++++++------
 fs/nfsd/nfs4state.c                         |    2 +-
 fs/nfsd/vfs.c                               |   14 +++--
 fs/open.c                                   |   21 +++++--
 fs/utimes.c                                 |    9 ++-
 include/linux/fs.h                          |   72 +++++++++++++++++++----
 ipc/mqueue.c                                |    2 +-
 17 files changed, 273 insertions(+), 114 deletions(-)

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 02/12] vfs: don't use PARENT/CHILD lock classes for non-directories
  2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
@ 2012-10-16 22:01 ` J. Bruce Fields
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Reserve I_MUTEX_PARENT and I_MUTEX_CHILD for locking of actual
directories.

(Also I_MUTEX_QUOTA isn't really a meaningful name for this locking
class any more; fixed in a later patch.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/inode.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 9d5b6bd..413b043 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -976,11 +976,11 @@ EXPORT_SYMBOL(unlock_new_inode);
 void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
 {
 	if (inode1 < inode2) {
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
+		mutex_lock(&inode1->i_mutex);
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_QUOTA);
 	} else {
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
+		mutex_lock(&inode2->i_mutex);
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_QUOTA);
 	}
 }
 EXPORT_SYMBOL(lock_two_nondirectories);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

[parent not found: <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]

* [PATCH 01/12] vfs: pull ext4's double-i_mutex-locking into common code
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2012-10-16 22:01   ` J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 03/12] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields,
	Theodore Ts'o, Andreas Dilger

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

We want to do this elsewhere as well.

Cc: "Theodore Ts'o" <tytso-3s7WtUTddSA@public.gmane.org>
Cc: Andreas Dilger <adilger.kernel-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/ext4/move_extent.c |   40 ++--------------------------------------
 fs/inode.c            |   29 +++++++++++++++++++++++++++++
 include/linux/fs.h    |    3 +++
 3 files changed, 34 insertions(+), 38 deletions(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 292daee..ee02681 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -1197,42 +1197,6 @@ mext_check_arguments(struct inode *orig_inode,
 }
 
 /**
- * mext_inode_double_lock - Lock i_mutex on both @inode1 and @inode2
- *
- * @inode1:	the inode structure
- * @inode2:	the inode structure
- *
- * Lock two inodes' i_mutex
- */
-static void
-mext_inode_double_lock(struct inode *inode1, struct inode *inode2)
-{
-	BUG_ON(inode1 == inode2);
-	if (inode1 < inode2) {
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
-	} else {
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
-	}
-}
-
-/**
- * mext_inode_double_unlock - Release i_mutex on both @inode1 and @inode2
- *
- * @inode1:     the inode that is released first
- * @inode2:     the inode that is released second
- *
- */
-
-static void
-mext_inode_double_unlock(struct inode *inode1, struct inode *inode2)
-{
-	mutex_unlock(&inode1->i_mutex);
-	mutex_unlock(&inode2->i_mutex);
-}
-
-/**
  * ext4_move_extents - Exchange the specified range of a file
  *
  * @o_filp:		file structure of the original file
@@ -1321,7 +1285,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp,
 		return -EINVAL;
 	}
 	/* Protect orig and donor inodes against a truncate */
-	mext_inode_double_lock(orig_inode, donor_inode);
+	lock_two_nondirectories(orig_inode, donor_inode);
 
 	/* Wait for all existing dio workers */
 	ext4_inode_block_unlocked_dio(orig_inode);
@@ -1529,7 +1493,7 @@ out:
 	double_up_write_data_sem(orig_inode, donor_inode);
 	ext4_inode_resume_unlocked_dio(orig_inode);
 	ext4_inode_resume_unlocked_dio(donor_inode);
-	mext_inode_double_unlock(orig_inode, donor_inode);
+	unlock_two_nondirectories(orig_inode, donor_inode);
 
 	return ret;
 }
diff --git a/fs/inode.c b/fs/inode.c
index b03c719..9d5b6bd 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -969,6 +969,35 @@ void unlock_new_inode(struct inode *inode)
 EXPORT_SYMBOL(unlock_new_inode);
 
 /**
+ * lock_two_nondirectories - take two i_mutexes on non-directory objects
+ * @inode1: first inode to lock
+ * @inode2: second inode to lock
+ */
+void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
+{
+	if (inode1 < inode2) {
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
+	} else {
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
+	}
+}
+EXPORT_SYMBOL(lock_two_nondirectories);
+
+/**
+ * unlock_two_nondirectories - release locks from lock_two_nondirectories()
+ * @inode1: first inode to unlock
+ * @inode2: second inode to unlock
+ */
+void unlock_two_nondirectories(struct inode *inode1, struct inode *inode2)
+{
+	mutex_unlock(&inode1->i_mutex);
+	mutex_unlock(&inode2->i_mutex);
+}
+EXPORT_SYMBOL(unlock_two_nondirectories);
+
+/**
  * iget5_locked - obtain an inode from a mounted file system
  * @sb:		super block of file system
  * @hashval:	hash value (usually inode number) to get
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 001c7cf..83da035 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -568,6 +568,9 @@ enum inode_i_mutex_lock_class
 	I_MUTEX_QUOTA
 };
 
+void lock_two_nondirectories(struct inode *, struct inode*);
+void unlock_two_nondirectories(struct inode *, struct inode*);
+
 /*
  * NOTE: in a 32bit arch with a preemptable kernel and
  * an UP compile the i_size_read/write must be atomic
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 03/12] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2012-10-16 22:01   ` [PATCH 01/12] vfs: pull ext4's double-i_mutex-locking into common code J. Bruce Fields
@ 2012-10-16 22:01   ` J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 04/12] vfs: take i_mutex on renamed file J. Bruce Fields
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

I_MUTEX_QUOTA is now just being used whenever we want to lock two
non-directories.  So the name isn't right.  I_MUTEX_NONDIR2 isn't
especially elegant but it's the best I could think of.

Also fix some outdated documentation.

Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/inode.c         |    4 ++--
 include/linux/fs.h |    9 ++++++---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 413b043..2cffc18 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -977,10 +977,10 @@ void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
 {
 	if (inode1 < inode2) {
 		mutex_lock(&inode1->i_mutex);
-		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_QUOTA);
+		mutex_lock_nested(&inode2->i_mutex, I_MUTEX_NONDIR2);
 	} else {
 		mutex_lock(&inode2->i_mutex);
-		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_QUOTA);
+		mutex_lock_nested(&inode1->i_mutex, I_MUTEX_NONDIR2);
 	}
 }
 EXPORT_SYMBOL(lock_two_nondirectories);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 83da035..7ffc865 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -554,10 +554,13 @@ static inline int inode_unhashed(struct inode *inode)
  * 0: the object of the current VFS operation
  * 1: parent
  * 2: child/target
- * 3: quota file
+ * 3: xattr
+ * 4: second non-directory
+ * The last is for certain operations (such as rename) which lock two
+ * non-directories at once.
  *
  * The locking order between these classes is
- * parent -> child -> normal -> xattr -> quota
+ * parent -> child -> normal -> xattr -> second non-directory
  */
 enum inode_i_mutex_lock_class
 {
@@ -565,7 +568,7 @@ enum inode_i_mutex_lock_class
 	I_MUTEX_PARENT,
 	I_MUTEX_CHILD,
 	I_MUTEX_XATTR,
-	I_MUTEX_QUOTA
+	I_MUTEX_NONDIR2
 };
 
 void lock_two_nondirectories(struct inode *, struct inode*);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 04/12] vfs: take i_mutex on renamed file
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2012-10-16 22:01   ` [PATCH 01/12] vfs: pull ext4's double-i_mutex-locking into common code J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 03/12] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
@ 2012-10-16 22:01   ` J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 08/12] locks: break delegations on unlink J. Bruce Fields
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

A read delegation is used by NFSv4 as a guarantee that a client can
perform local read opens without informing the server.

The open operation takes the last component of the pathname as an
argument, thus is also a lookup operation, and giving the client the
above guarantee means informing the client before we allow anything that
would change the set of names pointing to the inode.

Therefore, we need to break delegations on rename, link, and unlink.

We also need to prevent new delegations from being acquired while one of
these operations is in progress.

We could add some completely new locking for that purpose, but it's
simpler to use the i_mutex, since that's already taken by all the
operations we care about.

The single exception is rename.  So, modify rename to take the i_mutex
on the file that is being renamed.

Also fix up lockdep and Documentation/filesystems/directory-locking to
reflect the change.

Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 Documentation/filesystems/directory-locking |   31 +++++++++++++++++++--------
 fs/namei.c                                  |   12 ++++++++---
 2 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
index ff7b611..09bbf9a 100644
--- a/Documentation/filesystems/directory-locking
+++ b/Documentation/filesystems/directory-locking
@@ -2,6 +2,10 @@
 kinds of locks - per-inode (->i_mutex) and per-filesystem
 (->s_vfs_rename_mutex).
 
+	When taking the i_mutex on multiple non-directory objects, we
+always acquire the locks in order by increasing address.  We'll call
+that "inode pointer" order in the following.
+
 	For our purposes all operations fall in 5 classes:
 
 1) read access.  Locking rules: caller locks directory we are accessing.
@@ -12,8 +16,9 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
 locks victim and calls the method.
 
 4) rename() that is _not_ cross-directory.  Locking rules: caller locks
-the parent, finds source and target, if target already exists - locks it
-and then calls the method.
+the parent and finds source and target.  If target already exists, lock
+it.  If source is a non-directory, lock it.  If that means we need to
+lock both, lock them in inode pointer order.
 
 5) link creation.  Locking rules:
 	* lock parent
@@ -30,7 +35,9 @@ rules:
 		fail with -ENOTEMPTY
 	* if new parent is equal to or is a descendent of source
 		fail with -ELOOP
-	* if target exists - lock it.
+	* If target exists, lock it.  If source is a non-directory, lock
+	  it.  In case that means we need to lock both source and target,
+	  do so in inode pointer order.
 	* call the method.
 
 
@@ -56,9 +63,11 @@ objects - A < B iff A is an ancestor of B.
     renames will be blocked on filesystem lock and we don't start changing
     the order until we had acquired all locks).
 
-(3) any operation holds at most one lock on non-directory object and
-    that lock is acquired after all other locks.  (Proof: see descriptions
-    of operations).
+(3) locks on non-directory objects are acquired only after locks on
+    directory objects, and are acquired in inode pointer order.
+    (Proof: all operations but renames take lock on at most one
+    non-directory object, except renames, which take locks on source and
+    target in inode pointer order in the case they are not directories.)
 
 	Now consider the minimal deadlock.  Each process is blocked on
 attempt to acquire some lock and already holds at least one lock.  Let's
@@ -66,9 +75,13 @@ consider the set of contended locks.  First of all, filesystem lock is
 not contended, since any process blocked on it is not holding any locks.
 Thus all processes are blocked on ->i_mutex.
 
-	Non-directory objects are not contended due to (3).  Thus link
-creation can't be a part of deadlock - it can't be blocked on source
-and it means that it doesn't hold any locks.
+	By (3), any process holding a non-directory lock can only be
+waiting on another non-directory lock with a larger address.  Therefore
+the process holding the "largest" such lock can always make progress, and
+non-directory objects are not included in the set of contended locks.
+
+	Thus link creation can't be a part of deadlock - it can't be
+blocked on source and it means that it doesn't hold any locks.
 
 	Any contended object is either held by cross-directory rename or
 has a child that is also contended.  Indeed, suppose that it is held by
diff --git a/fs/namei.c b/fs/namei.c
index d1895f3..921a1ee 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3650,7 +3650,8 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
  *	   That's where 4.4 screws up. Current fix: serialization on
  *	   sb->s_vfs_rename_mutex. We might be more accurate, but that's another
  *	   story.
- *	c) we have to lock _three_ objects - parents and victim (if it exists).
+ *	c) we have to lock _four_ objects - parents and victim (if it exists),
+ *	   and source (if it is not a directory).
  *	   And that - after we got ->i_mutex on parents (until then we don't know
  *	   whether the target exists).  Solution: try to be smart with locking
  *	   order for inodes.  We rely on the fact that tree topology may change
@@ -3726,6 +3727,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 			    struct inode *new_dir, struct dentry *new_dentry)
 {
 	struct inode *target = new_dentry->d_inode;
+	struct inode *source = old_dentry->d_inode;
 	int error;
 
 	error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry);
@@ -3734,7 +3736,9 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 
 	dget(new_dentry);
 	if (target)
-		mutex_lock(&target->i_mutex);
+		lock_two_nondirectories(source, target);
+	else
+		mutex_lock(&source->i_mutex);
 
 	error = -EBUSY;
 	if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
@@ -3750,7 +3754,9 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 		d_move(old_dentry, new_dentry);
 out:
 	if (target)
-		mutex_unlock(&target->i_mutex);
+		unlock_two_nondirectories(source, target);
+	else
+		mutex_unlock(&source->i_mutex);
 	dput(new_dentry);
 	return error;
 }
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 08/12] locks: break delegations on unlink
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2012-10-16 22:01   ` [PATCH 04/12] vfs: take i_mutex on renamed file J. Bruce Fields
@ 2012-10-16 22:01   ` J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 10/12] locks: break delegations on rename J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 11/12] locks: break delegations on link J. Bruce Fields
  5 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields,
	David Howells, Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

We need to break delegations on any operation that changes the set of
links pointing to an inode.  Start with unlink.

Such operations also hold the i_mutex on a parent directory.  Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of a unresponsive NFS client.  To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:

	acquire locks
	...
	test for delegation; if found:
		take reference on inode
		release locks
		wait for delegation break
		drop reference on inode
		retry

It is possible this could never terminate.  (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.)  But this seems very unlikely.

The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack.  We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.

Cc: David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Tyler Hicks <tyhicks-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Dustin Kirkland <dustin.kirkland-Bv2LyzZ6GzxBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 drivers/base/devtmpfs.c |    2 +-
 fs/cachefiles/namei.c   |    2 +-
 fs/ecryptfs/inode.c     |    2 +-
 fs/namei.c              |   23 ++++++++++++++++++++---
 fs/nfsd/vfs.c           |    2 +-
 include/linux/fs.h      |    2 +-
 ipc/mqueue.c            |    2 +-
 7 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 147d1a4..21099da 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -317,7 +317,7 @@ static int handle_remove(const char *nodename, struct device *dev)
 			mutex_lock(&dentry->d_inode->i_mutex);
 			notify_change(dentry, &newattrs);
 			mutex_unlock(&dentry->d_inode->i_mutex);
-			err = vfs_unlink(parent.dentry->d_inode, dentry);
+			err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
 			if (!err || err == -ENOENT)
 				deleted = 1;
 		}
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index b0b5f7c..10cf5c6 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -295,7 +295,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 		if (ret < 0) {
 			cachefiles_io_error(cache, "Unlink security error");
 		} else {
-			ret = vfs_unlink(dir->d_inode, rep);
+			ret = vfs_unlink(dir->d_inode, rep, NULL);
 
 			if (preemptive)
 				cachefiles_mark_object_buried(cache, rep);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index cc7709e..30d7e59 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -153,7 +153,7 @@ static int ecryptfs_do_unlink(struct inode *dir, struct dentry *dentry,
 
 	dget(lower_dentry);
 	lower_dir_dentry = lock_parent(lower_dentry);
-	rc = vfs_unlink(lower_dir_inode, lower_dentry);
+	rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
 	if (rc) {
 		printk(KERN_ERR "Error in vfs_unlink; rc = [%d]\n", rc);
 		goto out_unlock;
diff --git a/fs/namei.c b/fs/namei.c
index 7dddff6..ed57a97 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3373,7 +3373,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 	return do_rmdir(AT_FDCWD, pathname);
 }
 
-int vfs_unlink(struct inode *dir, struct dentry *dentry)
+int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegated_inode)
 {
 	struct inode *target = dentry->d_inode;
 	int error = may_delete(dir, dentry, 0);
@@ -3390,11 +3390,20 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
+			error = break_deleg(target, O_WRONLY|O_NONBLOCK);
+			if (error) {
+				if (error == -EWOULDBLOCK && delegated_inode) {
+					*delegated_inode = target;
+					ihold(target);
+				}
+				goto out;
+			}
 			error = dir->i_op->unlink(dir, dentry);
 			if (!error)
 				dont_mount(dentry);
 		}
 	}
+out:
 	mutex_unlock(&target->i_mutex);
 
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
@@ -3419,6 +3428,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 	struct dentry *dentry;
 	struct nameidata nd;
 	struct inode *inode = NULL;
+	struct inode *delegated_inode = NULL;
 
 	name = user_path_parent(dfd, pathname, &nd);
 	if (IS_ERR(name))
@@ -3432,7 +3442,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 	error = mnt_want_write(nd.path.mnt);
 	if (error)
 		goto exit1;
-
+retry_deleg:
 	mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT);
 	dentry = lookup_hash(&nd);
 	error = PTR_ERR(dentry);
@@ -3447,13 +3457,20 @@ static long do_unlinkat(int dfd, const char __user *pathname)
 		error = security_path_unlink(&nd.path, dentry);
 		if (error)
 			goto exit2;
-		error = vfs_unlink(nd.path.dentry->d_inode, dentry);
+		error = vfs_unlink(nd.path.dentry->d_inode, dentry, &delegated_inode);
 exit2:
 		dput(dentry);
 	}
 	mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
 	if (inode)
 		iput(inode);	/* truncate the inode here */
+	if (delegated_inode) {
+		error = break_deleg(delegated_inode, O_WRONLY);
+		iput(delegated_inode);
+		delegated_inode = NULL;
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(nd.path.mnt);
 exit1:
 	path_put(&nd.path);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c120b48..c51f4c9 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1893,7 +1893,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	if (host_err)
 		goto out_put;
 	if (type != S_IFDIR)
-		host_err = vfs_unlink(dirp, rdentry);
+		host_err = vfs_unlink(dirp, rdentry, NULL);
 	else
 		host_err = vfs_rmdir(dirp, rdentry);
 	if (!host_err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a05ac23..5ea5f5f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1399,7 +1399,7 @@ extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
 extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
 extern int vfs_rmdir(struct inode *, struct dentry *);
-extern int vfs_unlink(struct inode *, struct dentry *);
+extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
 extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
 
 /*
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 71a3ca1..733cc8d 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -875,7 +875,7 @@ SYSCALL_DEFINE1(mq_unlink, const char __user *, u_name)
 		err = -ENOENT;
 	} else {
 		ihold(inode);
-		err = vfs_unlink(dentry->d_parent->d_inode, dentry);
+		err = vfs_unlink(dentry->d_parent->d_inode, dentry, NULL);
 	}
 	dput(dentry);
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 10/12] locks: break delegations on rename
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2012-10-16 22:01   ` [PATCH 08/12] locks: break delegations on unlink J. Bruce Fields
@ 2012-10-16 22:01   ` J. Bruce Fields
  2012-10-16 22:01   ` [PATCH 11/12] locks: break delegations on link J. Bruce Fields
  5 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields,
	David Howells

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Cc: David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cachefiles/namei.c |    2 +-
 fs/namei.c            |   30 ++++++++++++++++++++++++------
 fs/nfsd/vfs.c         |    2 +-
 include/linux/fs.h    |    2 +-
 4 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 10cf5c6..ef23d2a 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -397,7 +397,7 @@ try_again:
 		cachefiles_io_error(cache, "Rename security error %d", ret);
 	} else {
 		ret = vfs_rename(dir->d_inode, rep,
-				 cache->graveyard->d_inode, grave);
+				 cache->graveyard->d_inode, grave, NULL);
 		if (ret != 0 && ret != -ENOMEM)
 			cachefiles_io_error(cache,
 					    "Rename failed with error %d", ret);
diff --git a/fs/namei.c b/fs/namei.c
index 66228a8..4d0982b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3735,7 +3735,8 @@ out:
 }
 
 static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
-			    struct inode *new_dir, struct dentry *new_dentry)
+			    struct inode *new_dir, struct dentry *new_dentry,
+			    struct inode **delegated_inode)
 {
 	struct inode *target = new_dentry->d_inode;
 	struct inode *source = old_dentry->d_inode;
@@ -3755,6 +3756,14 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
 	if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
 		goto out;
 
+	error = try_break_deleg(source, delegated_inode);
+	if (error)
+		goto out;
+	if (target) {
+		error = try_break_deleg(target, delegated_inode);
+		if (error)
+			goto out;
+	}
 	error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
 	if (error)
 		goto out;
@@ -3773,7 +3782,8 @@ out:
 }
 
 int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
-	       struct inode *new_dir, struct dentry *new_dentry)
+	       struct inode *new_dir, struct dentry *new_dentry,
+	       struct inode **delegated_inode)
 {
 	int error;
 	int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
@@ -3801,7 +3811,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (is_dir)
 		error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry);
 	else
-		error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry);
+		error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry,delegated_inode);
 	if (!error)
 		fsnotify_move(old_dir, new_dir, old_name, is_dir,
 			      new_dentry->d_inode, old_dentry);
@@ -3817,8 +3827,9 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname,
 	struct dentry *old_dentry, *new_dentry;
 	struct dentry *trap;
 	struct nameidata oldnd, newnd;
-	struct filename *from;
-	struct filename *to;
+	struct inode *delegated_inode = NULL;
+	char *from;
+	char *to;
 	int error;
 
 	from = user_path_parent(olddfd, oldname, &oldnd);
@@ -3854,6 +3865,7 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname,
 	newnd.flags &= ~LOOKUP_PARENT;
 	newnd.flags |= LOOKUP_RENAME_TARGET;
 
+retry_deleg:
 	trap = lock_rename(new_dir, old_dir);
 
 	old_dentry = lookup_hash(&oldnd);
@@ -3890,13 +3902,19 @@ SYSCALL_DEFINE4(renameat, int, olddfd, const char __user *, oldname,
 	if (error)
 		goto exit5;
 	error = vfs_rename(old_dir->d_inode, old_dentry,
-				   new_dir->d_inode, new_dentry);
+				   new_dir->d_inode, new_dentry,
+				   &delegated_inode);
 exit5:
 	dput(new_dentry);
 exit4:
 	dput(old_dentry);
 exit3:
 	unlock_rename(new_dir, old_dir);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(oldnd.path.mnt);
 exit2:
 	path_put(&newnd.path);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c51f4c9..903fae1 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1820,7 +1820,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 		if (host_err)
 			goto out_dput_new;
 	}
-	host_err = vfs_rename(fdir, odentry, tdir, ndentry);
+	host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL);
 	if (!host_err) {
 		host_err = commit_metadata(tfhp);
 		if (!host_err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0e761e7..84183db 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1400,7 +1400,7 @@ extern int vfs_symlink(struct inode *, struct dentry *, const char *);
 extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
 extern int vfs_rmdir(struct inode *, struct dentry *);
 extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
-extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
+extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **);
 
 /*
  * VFS dentry helper functions.
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 11/12] locks: break delegations on link
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                     ` (4 preceding siblings ...)
  2012-10-16 22:01   ` [PATCH 10/12] locks: break delegations on rename J. Bruce Fields
@ 2012-10-16 22:01   ` J. Bruce Fields
  5 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields,
	Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Cc: Tyler Hicks <tyhicks-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Dustin Kirkland <dustin.kirkland-Bv2LyzZ6GzxBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/ecryptfs/inode.c |    2 +-
 fs/namei.c          |   18 ++++++++++++++----
 fs/nfsd/vfs.c       |    2 +-
 include/linux/fs.h  |    2 +-
 4 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 30d7e59..a3b2b08 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -475,7 +475,7 @@ static int ecryptfs_link(struct dentry *old_dentry, struct inode *dir,
 	dget(lower_new_dentry);
 	lower_dir_dentry = lock_parent(lower_new_dentry);
 	rc = vfs_link(lower_old_dentry, lower_dir_dentry->d_inode,
-		      lower_new_dentry);
+		      lower_new_dentry, NULL);
 	if (rc || !lower_new_dentry->d_inode)
 		goto out_lock;
 	rc = ecryptfs_interpose(lower_new_dentry, new_dentry, dir->i_sb);
diff --git a/fs/namei.c b/fs/namei.c
index 4d0982b..85d77c1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3543,7 +3543,7 @@ SYSCALL_DEFINE2(symlink, const char __user *, oldname, const char __user *, newn
 	return sys_symlinkat(oldname, AT_FDCWD, newname);
 }
 
-int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry)
+int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry, struct inode **delegated_inode)
 {
 	struct inode *inode = old_dentry->d_inode;
 	unsigned max_links = dir->i_sb->s_max_links;
@@ -3579,8 +3579,11 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 		error =  -ENOENT;
 	else if (max_links && inode->i_nlink >= max_links)
 		error = -EMLINK;
-	else
-		error = dir->i_op->link(old_dentry, dir, new_dentry);
+	else {
+		error = try_break_deleg(inode, delegated_inode);
+		if (!error)
+			error = dir->i_op->link(old_dentry, dir, new_dentry);
+	}
 	mutex_unlock(&inode->i_mutex);
 	if (!error)
 		fsnotify_link(dir, inode, new_dentry);
@@ -3601,6 +3604,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 {
 	struct dentry *new_dentry;
 	struct path old_path, new_path;
+	struct inode *delegated_inode = NULL;
 	int how = 0;
 	int error;
 
@@ -3624,6 +3628,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 	if (error)
 		return error;
 
+retry_deleg:
 	new_dentry = user_path_create(newdfd, newname, &new_path, 0);
 	error = PTR_ERR(new_dentry);
 	if (IS_ERR(new_dentry))
@@ -3638,9 +3643,14 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
 	error = security_path_link(old_path.dentry, &new_path, new_dentry);
 	if (error)
 		goto out_dput;
-	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry);
+	error = vfs_link(old_path.dentry, new_path.dentry->d_inode, new_dentry, &delegated_inode);
 out_dput:
 	done_path_create(&new_path, new_dentry);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 out:
 	path_put(&old_path);
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 903fae1..5e36961 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1717,7 +1717,7 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
 		err = nfserrno(host_err);
 		goto out_dput;
 	}
-	host_err = vfs_link(dold, dirp, dnew);
+	host_err = vfs_link(dold, dirp, dnew, NULL);
 	if (!host_err) {
 		err = nfserrno(commit_metadata(ffhp));
 		if (!err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 84183db..f12a380 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1397,7 +1397,7 @@ extern int vfs_create(struct inode *, struct dentry *, umode_t, bool);
 extern int vfs_mkdir(struct inode *, struct dentry *, umode_t);
 extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
-extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
+extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
 extern int vfs_rmdir(struct inode *, struct dentry *);
 extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
 extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 05/12] locks: introduce new FL_DELEG lock flag
  2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 02/12] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
       [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2012-10-16 22:01 ` J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 06/12] locks: implement delegations J. Bruce Fields
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

For now FL_DELEG is just a synonym for FL_LEASE.  So this patch doesn't
change behavior.

Next we'll modify break_lease to treat FL_DELEG leases differently, to
account for the fact that NFSv4 delegations should be broken in more
situations than Windows oplocks.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c          |    2 +-
 fs/nfsd/nfs4state.c |    2 +-
 include/linux/fs.h  |    1 +
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index a94e331..7fac262 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -131,7 +131,7 @@
 
 #define IS_POSIX(fl)	(fl->fl_flags & FL_POSIX)
 #define IS_FLOCK(fl)	(fl->fl_flags & FL_FLOCK)
-#define IS_LEASE(fl)	(fl->fl_flags & FL_LEASE)
+#define IS_LEASE(fl)	(fl->fl_flags & (FL_LEASE|FL_DELEG))
 
 static bool lease_breaking(struct file_lock *fl)
 {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d0237f8..ac7c70e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2825,7 +2825,7 @@ static struct file_lock *nfs4_alloc_init_lease(struct nfs4_delegation *dp, int f
 		return NULL;
 	locks_init_lock(fl);
 	fl->fl_lmops = &nfsd_lease_mng_ops;
-	fl->fl_flags = FL_LEASE;
+	fl->fl_flags = FL_DELEG;
 	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
 	fl->fl_end = OFFSET_MAX;
 	fl->fl_owner = (fl_owner_t)(dp->dl_file);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7ffc865..eae4f10 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -816,6 +816,7 @@ static inline int file_check_writeable(struct file *filp)
 
 #define FL_POSIX	1
 #define FL_FLOCK	2
+#define FL_DELEG	4	/* NFSv4 delegation */
 #define FL_ACCESS	8	/* not trying to lock, just looking */
 #define FL_EXISTS	16	/* when unlocking, test for existence */
 #define FL_LEASE	32	/* lease held on this file */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 06/12] locks: implement delegations
  2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
                   ` (2 preceding siblings ...)
  2012-10-16 22:01 ` [PATCH 05/12] locks: introduce new FL_DELEG lock flag J. Bruce Fields
@ 2012-10-16 22:01 ` J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 07/12] namei: minor vfs_unlink cleanup J. Bruce Fields
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Implement NFSv4 delegations at the vfs level using the new FL_DELEG lock
type.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c         |   49 +++++++++++++++++++++++++++++++++++++++----------
 include/linux/fs.h |   18 +++++++++++++++---
 2 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 7fac262..17532ac 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1176,28 +1176,40 @@ static void time_out_leases(struct inode *inode)
 	}
 }
 
+static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
+{
+	if ((breaker->fl_flags & FL_DELEG) && (lease->fl_flags & FL_LEASE))
+		return false;
+	return locks_conflict(breaker, lease);
+}
+
 /**
  *	__break_lease	-	revoke all outstanding leases on file
  *	@inode: the inode of the file to return
- *	@mode: the open mode (read or write)
+ *	@mode: O_RDONLY: break only write leases; O_WRONLY or O_RDWR:
+ *	    break all leases
+ *	@type: FL_LEASE: break leases and delegations; FL_DELEG: break
+ *	    only delegations
  *
  *	break_lease (inlined for speed) has checked there already is at least
  *	some kind of lock (maybe a lease) on this file.  Leases are broken on
  *	a call to open() or truncate().  This function can sleep unless you
  *	specified %O_NONBLOCK to your open().
  */
-int __break_lease(struct inode *inode, unsigned int mode)
+int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 {
 	int error = 0;
 	struct file_lock *new_fl, *flock;
 	struct file_lock *fl;
 	unsigned long break_time;
 	int i_have_this_lease = 0;
+	bool lease_conflict = false;
 	int want_write = (mode & O_ACCMODE) != O_RDONLY;
 
 	new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
 	if (IS_ERR(new_fl))
 		return PTR_ERR(new_fl);
+	new_fl->fl_flags = type;
 
 	lock_flocks();
 
@@ -1207,13 +1219,16 @@ int __break_lease(struct inode *inode, unsigned int mode)
 	if ((flock == NULL) || !IS_LEASE(flock))
 		goto out;
 
-	if (!locks_conflict(flock, new_fl))
+	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+		if (leases_conflict(fl, new_fl)) {
+			lease_conflict = true;
+			if (fl->fl_owner == current->files)
+				i_have_this_lease = 1;
+		}
+	}
+	if (!lease_conflict)
 		goto out;
 
-	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next)
-		if (fl->fl_owner == current->files)
-			i_have_this_lease = 1;
-
 	break_time = 0;
 	if (lease_break_time > 0) {
 		break_time = jiffies + lease_break_time * HZ;
@@ -1222,6 +1237,8 @@ int __break_lease(struct inode *inode, unsigned int mode)
 	}
 
 	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+		if (!leases_conflict(fl, new_fl))
+			continue;
 		if (want_write) {
 			if (fl->fl_flags & FL_UNLOCK_PENDING)
 				continue;
@@ -1263,7 +1280,7 @@ restart:
 		 */
 		for (flock = inode->i_flock; flock && IS_LEASE(flock);
 				flock = flock->fl_next) {
-			if (locks_conflict(new_fl, flock))
+			if (leases_conflict(new_fl, flock))
 				goto restart;
 		}
 		error = 0;
@@ -1343,9 +1360,20 @@ int generic_add_lease(struct file *filp, long arg, struct file_lock **flp)
 	struct file_lock *fl, **before, **my_before = NULL, *lease;
 	struct dentry *dentry = filp->f_path.dentry;
 	struct inode *inode = dentry->d_inode;
+	bool is_deleg = (*flp)->fl_flags & FL_DELEG;
 	int error;
 
 	lease = *flp;
+	/*
+	 * In the delegation case we need mutual exclusion with
+	 * a number of operations that take the i_mutex.  We trylock
+	 * because delegations are an optional optimization, and if
+	 * there's some chance of a conflict--we'd rather not
+	 * bother, maybe that's a sign this just isn't a good file to
+	 * hand out a delegation on.
+	 */
+	if (is_deleg && !mutex_trylock(&inode->i_mutex))
+		return -EAGAIN;
 
 	error = -EAGAIN;
 	if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
@@ -1397,9 +1425,10 @@ int generic_add_lease(struct file *filp, long arg, struct file_lock **flp)
 		goto out;
 
 	locks_insert_lock(before, lease);
-	return 0;
-
+	error = 0;
 out:
+	if (is_deleg)
+		mutex_unlock(&inode->i_mutex);
 	return error;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index eae4f10..a05ac23 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -938,7 +938,7 @@ extern int vfs_test_lock(struct file *, struct file_lock *);
 extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct file_lock *);
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int flock_lock_file_wait(struct file *filp, struct file_lock *fl);
-extern int __break_lease(struct inode *inode, unsigned int flags);
+extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
 extern void lease_get_mtime(struct inode *, struct timespec *time);
 extern int generic_setlease(struct file *, long, struct file_lock **);
 extern int vfs_setlease(struct file *, long, struct file_lock **);
@@ -1051,7 +1051,7 @@ static inline int flock_lock_file_wait(struct file *filp,
 	return -ENOLCK;
 }
 
-static inline int __break_lease(struct inode *inode, unsigned int mode)
+static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
 {
 	return 0;
 }
@@ -1827,9 +1827,17 @@ static inline int locks_verify_truncate(struct inode *inode,
 static inline int break_lease(struct inode *inode, unsigned int mode)
 {
 	if (inode->i_flock)
-		return __break_lease(inode, mode);
+		return __break_lease(inode, mode, FL_LEASE);
 	return 0;
 }
+
+static inline int break_deleg(struct inode *inode, unsigned int mode)
+{
+	if (inode->i_flock)
+		return __break_lease(inode, mode, FL_DELEG);
+	return 0;
+}
+
 #else /* !CONFIG_FILE_LOCKING */
 static inline int locks_mandatory_locked(struct inode *inode)
 {
@@ -1869,6 +1877,10 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
+static inline int break_deleg(struct inode *inode, unsigned int mode)
+{
+	return 0;
+}
 #endif /* CONFIG_FILE_LOCKING */
 
 /* fs/open.c */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 07/12] namei: minor vfs_unlink cleanup
  2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
                   ` (3 preceding siblings ...)
  2012-10-16 22:01 ` [PATCH 06/12] locks: implement delegations J. Bruce Fields
@ 2012-10-16 22:01 ` J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 09/12] locks: helper functions for delegation breaking J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 12/12] locks: break delegations on any attribute modification J. Bruce Fields
  6 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We'll be using dentry->d_inode in one more place.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/namei.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 921a1ee..7dddff6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3375,6 +3375,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 
 int vfs_unlink(struct inode *dir, struct dentry *dentry)
 {
+	struct inode *target = dentry->d_inode;
 	int error = may_delete(dir, dentry, 0);
 
 	if (error)
@@ -3383,7 +3384,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 	if (!dir->i_op->unlink)
 		return -EPERM;
 
-	mutex_lock(&dentry->d_inode->i_mutex);
+	mutex_lock(&target->i_mutex);
 	if (d_mountpoint(dentry))
 		error = -EBUSY;
 	else {
@@ -3394,11 +3395,11 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
 				dont_mount(dentry);
 		}
 	}
-	mutex_unlock(&dentry->d_inode->i_mutex);
+	mutex_unlock(&target->i_mutex);
 
 	/* We don't d_delete() NFS sillyrenamed files--they still exist. */
 	if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
-		fsnotify_link_count(dentry->d_inode);
+		fsnotify_link_count(target);
 		d_delete(dentry);
 	}
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 09/12] locks: helper functions for delegation breaking
  2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
                   ` (4 preceding siblings ...)
  2012-10-16 22:01 ` [PATCH 07/12] namei: minor vfs_unlink cleanup J. Bruce Fields
@ 2012-10-16 22:01 ` J. Bruce Fields
  2012-10-16 22:01 ` [PATCH 12/12] locks: break delegations on any attribute modification J. Bruce Fields
  6 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

We'll need the same logic for rename and link.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/namei.c         |   13 +++----------
 include/linux/fs.h |   33 +++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ed57a97..66228a8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3390,14 +3390,9 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 	else {
 		error = security_inode_unlink(dir, dentry);
 		if (!error) {
-			error = break_deleg(target, O_WRONLY|O_NONBLOCK);
-			if (error) {
-				if (error == -EWOULDBLOCK && delegated_inode) {
-					*delegated_inode = target;
-					ihold(target);
-				}
+			error = try_break_deleg(target, delegated_inode);
+			if (error)
 				goto out;
-			}
 			error = dir->i_op->unlink(dir, dentry);
 			if (!error)
 				dont_mount(dentry);
@@ -3465,9 +3460,7 @@ exit2:
 	if (inode)
 		iput(inode);	/* truncate the inode here */
 	if (delegated_inode) {
-		error = break_deleg(delegated_inode, O_WRONLY);
-		iput(delegated_inode);
-		delegated_inode = NULL;
+		error = break_deleg_wait(&delegated_inode);
 		if (!error)
 			goto retry_deleg;
 	}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5ea5f5f..0e761e7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1770,6 +1770,9 @@ extern bool our_mnt(struct vfsmount *mnt);
 
 extern int current_umask(void);
 
+extern void ihold(struct inode * inode);
+extern void iput(struct inode *);
+
 /* /sys/fs */
 extern struct kobject *fs_kobj;
 
@@ -1838,6 +1841,28 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 	return 0;
 }
 
+static inline int try_break_deleg(struct inode *inode, struct inode **delegated_inode)
+{
+	int ret;
+
+	ret = break_deleg(inode, O_WRONLY|O_NONBLOCK);
+	if (ret == -EWOULDBLOCK && delegated_inode) {
+		*delegated_inode = inode;
+		ihold(inode);
+	}
+	return ret;
+}
+
+static inline int break_deleg_wait(struct inode **delegated_inode)
+{
+	int ret;
+
+	ret = break_deleg(*delegated_inode, O_WRONLY);
+	iput(*delegated_inode);
+	*delegated_inode = NULL;
+	return ret;
+}
+
 #else /* !CONFIG_FILE_LOCKING */
 static inline int locks_mandatory_locked(struct inode *inode)
 {
@@ -1881,6 +1906,12 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
 {
 	return 0;
 }
+
+static inline int try_break_deleg(struct inode *inode, struct delegated_inode **inode)
+{
+	return 0;
+}
+
 #endif /* CONFIG_FILE_LOCKING */
 
 /* fs/open.c */
@@ -2189,8 +2220,6 @@ extern loff_t vfs_llseek(struct file *file, loff_t offset, int origin);
 extern int inode_init_always(struct super_block *, struct inode *);
 extern void inode_init_once(struct inode *);
 extern void address_space_init_once(struct address_space *mapping);
-extern void ihold(struct inode * inode);
-extern void iput(struct inode *);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
 extern int inode_needs_sync(struct inode *inode);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 12/12] locks: break delegations on any attribute modification
  2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
                   ` (5 preceding siblings ...)
  2012-10-16 22:01 ` [PATCH 09/12] locks: helper functions for delegation breaking J. Bruce Fields
@ 2012-10-16 22:01 ` J. Bruce Fields
  6 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-10-16 22:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-nfs, linux-fsdevel, J. Bruce Fields, Mikulas Patocka,
	David Howells, Tyler Hicks, Dustin Kirkland

From: "J. Bruce Fields" <bfields@redhat.com>

NFSv4 uses leases to guarantee that clients can cache metadata as well
as data.

Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 drivers/base/devtmpfs.c   |    4 ++--
 fs/attr.c                 |    5 ++++-
 fs/cachefiles/interface.c |    4 ++--
 fs/ecryptfs/inode.c       |    2 +-
 fs/hpfs/namei.c           |    2 +-
 fs/inode.c                |    6 +++++-
 fs/nfsd/vfs.c             |    8 ++++++--
 fs/open.c                 |   21 +++++++++++++++++----
 fs/utimes.c               |    9 ++++++++-
 include/linux/fs.h        |    2 +-
 10 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 21099da..5295e4c 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -210,7 +210,7 @@ static int handle_create(const char *nodename, umode_t mode, struct device *dev)
 		newattrs.ia_mode = mode;
 		newattrs.ia_valid = ATTR_MODE;
 		mutex_lock(&dentry->d_inode->i_mutex);
-		notify_change(dentry, &newattrs);
+		notify_change(dentry, &newattrs, NULL);
 		mutex_unlock(&dentry->d_inode->i_mutex);
 
 		/* mark as kernel-created inode */
@@ -315,7 +315,7 @@ static int handle_remove(const char *nodename, struct device *dev)
 			newattrs.ia_valid =
 				ATTR_UID|ATTR_GID|ATTR_MODE;
 			mutex_lock(&dentry->d_inode->i_mutex);
-			notify_change(dentry, &newattrs);
+			notify_change(dentry, &newattrs, NULL);
 			mutex_unlock(&dentry->d_inode->i_mutex);
 			err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
 			if (!err || err == -ENOENT)
diff --git a/fs/attr.c b/fs/attr.c
index cce7df5..ad0a943 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -164,7 +164,7 @@ void setattr_copy(struct inode *inode, const struct iattr *attr)
 }
 EXPORT_SYMBOL(setattr_copy);
 
-int notify_change(struct dentry * dentry, struct iattr * attr)
+int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **delegated_inode)
 {
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
@@ -240,6 +240,9 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
 	error = security_inode_setattr(dentry, attr);
 	if (error)
 		return error;
+	error = try_break_deleg(inode, delegated_inode);
+	if (error)
+		return error;
 
 	if (inode->i_op->setattr)
 		error = inode->i_op->setattr(dentry, attr);
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 67bef6d..67674b4 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -417,14 +417,14 @@ static int cachefiles_attr_changed(struct fscache_object *_object)
 		_debug("discard tail %llx", oi_size);
 		newattrs.ia_valid = ATTR_SIZE;
 		newattrs.ia_size = oi_size & PAGE_MASK;
-		ret = notify_change(object->backer, &newattrs);
+		ret = notify_change(object->backer, &newattrs, NULL);
 		if (ret < 0)
 			goto truncate_failed;
 	}
 
 	newattrs.ia_valid = ATTR_SIZE;
 	newattrs.ia_size = ni_size;
-	ret = notify_change(object->backer, &newattrs);
+	ret = notify_change(object->backer, &newattrs, NULL);
 
 truncate_failed:
 	mutex_unlock(&object->backer->d_inode->i_mutex);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index a3b2b08..d751d8a 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -992,7 +992,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct iattr *ia)
 		lower_ia.ia_valid &= ~ATTR_MODE;
 
 	mutex_lock(&lower_dentry->d_inode->i_mutex);
-	rc = notify_change(lower_dentry, &lower_ia);
+	rc = notify_change(lower_dentry, &lower_ia, NULL);
 	mutex_unlock(&lower_dentry->d_inode->i_mutex);
 out:
 	fsstack_copy_attr_all(inode, lower_inode);
diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c
index 345713d..1b39afd 100644
--- a/fs/hpfs/namei.c
+++ b/fs/hpfs/namei.c
@@ -407,7 +407,7 @@ again:
 			/*printk("HPFS: truncating file before delete.\n");*/
 			newattrs.ia_size = 0;
 			newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
-			err = notify_change(dentry, &newattrs);
+			err = notify_change(dentry, &newattrs, NULL);
 			put_write_access(inode);
 			if (!err)
 				goto again;
diff --git a/fs/inode.c b/fs/inode.c
index 2cffc18..03ba513 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1626,7 +1626,11 @@ static int __remove_suid(struct dentry *dentry, int kill)
 	struct iattr newattrs;
 
 	newattrs.ia_valid = ATTR_FORCE | kill;
-	return notify_change(dentry, &newattrs);
+	/*
+	 * Note we call this on write, so notify_change will not
+	 * encounter any conflicting delegations:
+	 */
+	return notify_change(dentry, &newattrs, NULL);
 }
 
 int file_remove_suid(struct file *file)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 5e36961..e8d5d25 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -426,7 +426,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 			goto out_nfserr;
 		fh_lock(fhp);
 
-		host_err = notify_change(dentry, iap);
+		host_err = notify_change(dentry, iap, NULL);
 		err = nfserrno(host_err);
 		fh_unlock(fhp);
 	}
@@ -961,7 +961,11 @@ static void kill_suid(struct dentry *dentry)
 	ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	notify_change(dentry, &ia);
+	/*
+	 * Note we call this on write, so notify_change will not
+	 * encounter any conflicting delegations:
+	 */
+	notify_change(dentry, &ia, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 }
 
diff --git a/fs/open.c b/fs/open.c
index 59071f5..83e15dd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -56,7 +56,7 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
 		newattrs.ia_valid |= ret | ATTR_FORCE;
 
 	mutex_lock(&dentry->d_inode->i_mutex);
-	ret = notify_change(dentry, &newattrs);
+	ret = notify_change(dentry, &newattrs, NULL);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 	return ret;
 }
@@ -452,21 +452,28 @@ out:
 static int chmod_common(struct path *path, umode_t mode)
 {
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 	struct iattr newattrs;
 	int error;
 
 	error = mnt_want_write(path->mnt);
 	if (error)
 		return error;
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
 	error = security_path_chmod(path, mode);
 	if (error)
 		goto out_unlock;
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	error = notify_change(path->dentry, &newattrs);
+	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 out_unlock:
 	mutex_unlock(&inode->i_mutex);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	mnt_drop_write(path->mnt);
 	return error;
 }
@@ -506,6 +513,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 static int chown_common(struct path *path, uid_t user, gid_t group)
 {
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 	int error;
 	struct iattr newattrs;
 	kuid_t uid;
@@ -530,12 +538,17 @@ static int chown_common(struct path *path, uid_t user, gid_t group)
 	if (!S_ISDIR(inode->i_mode))
 		newattrs.ia_valid |=
 			ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
 	error = security_path_chown(path, uid, gid);
 	if (!error)
-		error = notify_change(path->dentry, &newattrs);
+		error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	mutex_unlock(&inode->i_mutex);
-
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 	return error;
 }
 
diff --git a/fs/utimes.c b/fs/utimes.c
index bb0696a..671c27e 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -53,6 +53,7 @@ static int utimes_common(struct path *path, struct timespec *times)
 	int error;
 	struct iattr newattrs;
 	struct inode *inode = path->dentry->d_inode;
+	struct inode *delegated_inode = NULL;
 
 	error = mnt_want_write(path->mnt);
 	if (error)
@@ -101,9 +102,15 @@ static int utimes_common(struct path *path, struct timespec *times)
 				goto mnt_drop_write_and_out;
 		}
 	}
+retry_deleg:
 	mutex_lock(&inode->i_mutex);
-	error = notify_change(path->dentry, &newattrs);
+	error = notify_change(path->dentry, &newattrs, &delegated_inode);
 	mutex_unlock(&inode->i_mutex);
+	if (delegated_inode) {
+		error = break_deleg_wait(&delegated_inode);
+		if (!error)
+			goto retry_deleg;
+	}
 
 mnt_drop_write_and_out:
 	mnt_drop_write(path->mnt);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f12a380..07e7369 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2137,7 +2137,7 @@ extern void emergency_remount(void);
 #ifdef CONFIG_BLOCK
 extern sector_t bmap(struct inode *, sector_t);
 #endif
-extern int notify_change(struct dentry *, struct iattr *);
+extern int notify_change(struct dentry *, struct iattr *, struct inode **);
 extern int inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-10-16 22:01 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-16 22:01 [PATCH 00/12] Implement NFSv4 delegations, take 5 J. Bruce Fields
2012-10-16 22:01 ` [PATCH 02/12] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
     [not found] ` <1350424897-26894-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-10-16 22:01   ` [PATCH 01/12] vfs: pull ext4's double-i_mutex-locking into common code J. Bruce Fields
2012-10-16 22:01   ` [PATCH 03/12] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
2012-10-16 22:01   ` [PATCH 04/12] vfs: take i_mutex on renamed file J. Bruce Fields
2012-10-16 22:01   ` [PATCH 08/12] locks: break delegations on unlink J. Bruce Fields
2012-10-16 22:01   ` [PATCH 10/12] locks: break delegations on rename J. Bruce Fields
2012-10-16 22:01   ` [PATCH 11/12] locks: break delegations on link J. Bruce Fields
2012-10-16 22:01 ` [PATCH 05/12] locks: introduce new FL_DELEG lock flag J. Bruce Fields
2012-10-16 22:01 ` [PATCH 06/12] locks: implement delegations J. Bruce Fields
2012-10-16 22:01 ` [PATCH 07/12] namei: minor vfs_unlink cleanup J. Bruce Fields
2012-10-16 22:01 ` [PATCH 09/12] locks: helper functions for delegation breaking J. Bruce Fields
2012-10-16 22:01 ` [PATCH 12/12] locks: break delegations on any attribute modification J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).