* [PATCH 1/6] gfs2: Get rid of I_MUTEX_QUOTA usage
2012-05-31 1:50 rename locking J. Bruce Fields
@ 2012-05-31 1:50 ` J. Bruce Fields
[not found] ` <1338429016-20119-2-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-05-31 12:14 ` Bob Peterson
[not found] ` <1338429016-20119-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
` (2 subsequent siblings)
3 siblings, 2 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-05-31 1:50 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs, linux-fsdevel, Jan Kara, Steven Whitehouse,
J. Bruce Fields
From: Jan Kara <jack@suse.cz>
GFS2 uses i_mutex on its system quota inode to synchronize writes to
quota file. Since this is an internal inode to GFS2 (not part of directory
hiearchy or visible by user) we are safe to define locking rules for it. So
let's just get it its own locking class to make it clear.
CC: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
fs/gfs2/ops_fstype.c | 8 ++++++++
fs/gfs2/quota.c | 2 +-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index b8c250f..c714ef7 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -19,6 +19,7 @@
#include <linux/mount.h>
#include <linux/gfs2_ondisk.h>
#include <linux/quotaops.h>
+#include <linux/lockdep.h>
#include "gfs2.h"
#include "incore.h"
@@ -766,6 +767,7 @@ fail:
return error;
}
+static struct lock_class_key gfs2_quota_imutex_key;
static int init_inodes(struct gfs2_sbd *sdp, int undo)
{
@@ -803,6 +805,12 @@ static int init_inodes(struct gfs2_sbd *sdp, int undo)
fs_err(sdp, "can't get quota file inode: %d\n", error);
goto fail_rindex;
}
+ /*
+ * i_mutex on quota files is special. Since this inode is hidden system
+ * file, we are safe to define locking ourselves.
+ */
+ lockdep_set_class(&sdp->sd_quota_inode->i_mutex,
+ &gfs2_quota_imutex_key);
error = gfs2_rindex_update(sdp);
if (error)
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index b97178e..8ed70e7 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -772,7 +772,7 @@ static int do_sync(unsigned int num_qd, struct gfs2_quota_data **qda)
return -ENOMEM;
sort(qda, num_qd, sizeof(struct gfs2_quota_data *), sort_qd, NULL);
- mutex_lock_nested(&ip->i_inode.i_mutex, I_MUTEX_QUOTA);
+ mutex_lock(&ip->i_inode.i_mutex);
for (qx = 0; qx < num_qd; qx++) {
error = gfs2_glock_nq_init(qda[qx]->qd_gl, LM_ST_EXCLUSIVE,
GL_NOCACHE, &ghs[qx]);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 13+ messages in thread
[parent not found: <1338429016-20119-2-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/6] gfs2: Get rid of I_MUTEX_QUOTA usage
[not found] ` <1338429016-20119-2-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2012-05-31 7:50 ` Steven Whitehouse
0 siblings, 0 replies; 13+ messages in thread
From: Steven Whitehouse @ 2012-05-31 7:50 UTC (permalink / raw)
To: J. Bruce Fields
Cc: Al Viro, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Jan Kara
Hi,
Acked-by: Steven Whitehouse <swhiteho-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Eventually we should be able to drop i_mutex from here entirely, but
that proved more problematic that I'd expected, so this is a good
interim fix,
Steve.
On Wed, 2012-05-30 at 21:50 -0400, J. Bruce Fields wrote:
> From: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
>
> GFS2 uses i_mutex on its system quota inode to synchronize writes to
> quota file. Since this is an internal inode to GFS2 (not part of directory
> hiearchy or visible by user) we are safe to define locking rules for it. So
> let's just get it its own locking class to make it clear.
>
> CC: Steven Whitehouse <swhiteho-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
> Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
> fs/gfs2/ops_fstype.c | 8 ++++++++
> fs/gfs2/quota.c | 2 +-
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> index b8c250f..c714ef7 100644
> --- a/fs/gfs2/ops_fstype.c
> +++ b/fs/gfs2/ops_fstype.c
> @@ -19,6 +19,7 @@
> #include <linux/mount.h>
> #include <linux/gfs2_ondisk.h>
> #include <linux/quotaops.h>
> +#include <linux/lockdep.h>
>
> #include "gfs2.h"
> #include "incore.h"
> @@ -766,6 +767,7 @@ fail:
> return error;
> }
>
> +static struct lock_class_key gfs2_quota_imutex_key;
>
> static int init_inodes(struct gfs2_sbd *sdp, int undo)
> {
> @@ -803,6 +805,12 @@ static int init_inodes(struct gfs2_sbd *sdp, int undo)
> fs_err(sdp, "can't get quota file inode: %d\n", error);
> goto fail_rindex;
> }
> + /*
> + * i_mutex on quota files is special. Since this inode is hidden system
> + * file, we are safe to define locking ourselves.
> + */
> + lockdep_set_class(&sdp->sd_quota_inode->i_mutex,
> + &gfs2_quota_imutex_key);
>
> error = gfs2_rindex_update(sdp);
> if (error)
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index b97178e..8ed70e7 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -772,7 +772,7 @@ static int do_sync(unsigned int num_qd, struct gfs2_quota_data **qda)
> return -ENOMEM;
>
> sort(qda, num_qd, sizeof(struct gfs2_quota_data *), sort_qd, NULL);
> - mutex_lock_nested(&ip->i_inode.i_mutex, I_MUTEX_QUOTA);
> + mutex_lock(&ip->i_inode.i_mutex);
> for (qx = 0; qx < num_qd; qx++) {
> error = gfs2_glock_nq_init(qda[qx]->qd_gl, LM_ST_EXCLUSIVE,
> GL_NOCACHE, &ghs[qx]);
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/6] gfs2: Get rid of I_MUTEX_QUOTA usage
2012-05-31 1:50 ` [PATCH 1/6] gfs2: Get rid of I_MUTEX_QUOTA usage J. Bruce Fields
[not found] ` <1338429016-20119-2-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2012-05-31 12:14 ` Bob Peterson
1 sibling, 0 replies; 13+ messages in thread
From: Bob Peterson @ 2012-05-31 12:14 UTC (permalink / raw)
To: J. Bruce Fields
Cc: linux-nfs, linux-fsdevel, Jan Kara, Steven Whitehouse, Al Viro
----- Original Message -----
| From: Jan Kara <jack@suse.cz>
|
| GFS2 uses i_mutex on its system quota inode to synchronize writes to
| quota file. Since this is an internal inode to GFS2 (not part of
| directory
| hiearchy or visible by user) we are safe to define locking rules for
| it. So
| let's just get it its own locking class to make it clear.
Hi,
Acked-by: Bob Peterson <rpeterso@redhat.com>
Regards,
Bob Peterson
Red Hat File Systems
^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <1338429016-20119-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* [PATCH 2/6] vfs: fix outdated i_mutex_lock_class documentation
[not found] ` <1338429016-20119-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2012-05-31 1:50 ` J. Bruce Fields
[not found] ` <1338429016-20119-3-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-05-31 1:50 ` [PATCH 4/6] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
2012-05-31 1:50 ` [PATCH 6/6] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
2 siblings, 1 reply; 13+ messages in thread
From: J. Bruce Fields @ 2012-05-31 1:50 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields
From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
include/linux/fs.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 038076b..5584a29 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -861,7 +861,8 @@ static inline int inode_unhashed(struct inode *inode)
* 0: the object of the current VFS operation
* 1: parent
* 2: child/target
- * 3: quota file
+ * 3: xattr
+ * 4: quota file
*
* The locking order between these classes is
* parent -> child -> normal -> xattr -> quota
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 4/6] vfs: don't use PARENT/CHILD lock classes for non-directories
[not found] ` <1338429016-20119-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-05-31 1:50 ` [PATCH 2/6] vfs: fix outdated i_mutex_lock_class documentation J. Bruce Fields
@ 2012-05-31 1:50 ` J. Bruce Fields
2012-05-31 1:50 ` [PATCH 6/6] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
2 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-05-31 1:50 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields
From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Reserve I_MUTEX_PARENT and I_MUTEX_CHILD for locking of actual
directories.
(Also I_MUTEX_QUOTA isn't really a meaningful name for this locking
class any more; fixed in a later patch.)
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/inode.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index e48a96f..0272ac3 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -978,12 +978,12 @@ void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
if (inode1 == inode2 || inode2 == NULL)
mutex_lock(&inode1->i_mutex);
else if (inode1 < inode2) {
- mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
- mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
+ mutex_lock(&inode1->i_mutex);
+ mutex_lock_nested(&inode2->i_mutex, I_MUTEX_QUOTA);
} else {
- mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
- mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
+ mutex_lock(&inode2->i_mutex);
+ mutex_lock_nested(&inode1->i_mutex, I_MUTEX_QUOTA);
}
}
EXPORT_SYMBOL(lock_two_nondirectories);
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 6/6] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
[not found] ` <1338429016-20119-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-05-31 1:50 ` [PATCH 2/6] vfs: fix outdated i_mutex_lock_class documentation J. Bruce Fields
2012-05-31 1:50 ` [PATCH 4/6] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
@ 2012-05-31 1:50 ` J. Bruce Fields
2 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-05-31 1:50 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, J. Bruce Fields
From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
I_MUTEX_QUOTA is now just being used whenever we want to lock two
non-directories. So the name isn't right. I_MUTEX_NONDIR2 isn't
especially elegant but it's the best I could think of.
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/inode.c | 4 ++--
include/linux/fs.h | 8 +++++---
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index 0272ac3..9c828ab 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -979,11 +979,11 @@ void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
mutex_lock(&inode1->i_mutex);
else if (inode1 < inode2) {
mutex_lock(&inode1->i_mutex);
- mutex_lock_nested(&inode2->i_mutex, I_MUTEX_QUOTA);
+ mutex_lock_nested(&inode2->i_mutex, I_MUTEX_NONDIR2);
} else {
mutex_lock(&inode2->i_mutex);
- mutex_lock_nested(&inode1->i_mutex, I_MUTEX_QUOTA);
+ mutex_lock_nested(&inode1->i_mutex, I_MUTEX_NONDIR2);
}
}
EXPORT_SYMBOL(lock_two_nondirectories);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 308c2a0..f0b5a5d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -862,10 +862,12 @@ static inline int inode_unhashed(struct inode *inode)
* 1: parent
* 2: child/target
* 3: xattr
- * 4: quota file
+ * 4: second non-directory
+ * The last is for certain operations (such as rename) which lock two
+ * non-directories at once.
*
* The locking order between these classes is
- * parent -> child -> normal -> xattr -> quota
+ * parent -> child -> normal -> xattr -> second non-directory
*/
enum inode_i_mutex_lock_class
{
@@ -873,7 +875,7 @@ enum inode_i_mutex_lock_class
I_MUTEX_PARENT,
I_MUTEX_CHILD,
I_MUTEX_XATTR,
- I_MUTEX_QUOTA
+ I_MUTEX_NONDIR2
};
void lock_two_nondirectories(struct inode *, struct inode*);
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/6] vfs: pull ext4's double-i_mutex-locking into common code
2012-05-31 1:50 rename locking J. Bruce Fields
2012-05-31 1:50 ` [PATCH 1/6] gfs2: Get rid of I_MUTEX_QUOTA usage J. Bruce Fields
[not found] ` <1338429016-20119-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2012-05-31 1:50 ` J. Bruce Fields
2012-05-31 1:50 ` [PATCH 5/6] vfs: take i_mutex on renamed file J. Bruce Fields
3 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-05-31 1:50 UTC (permalink / raw)
To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields
From: "J. Bruce Fields" <bfields@redhat.com>
We want to do this elsewhere as well.
Also, compare pointers instead of inode numbers.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
fs/ext4/move_extent.c | 21 ++-------------------
fs/inode.c | 36 ++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 3 +++
3 files changed, 41 insertions(+), 19 deletions(-)
diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index c5826c6..b87d94a 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -1086,19 +1086,7 @@ mext_inode_double_lock(struct inode *inode1, struct inode *inode2)
if (ret < 0)
goto out;
- if (inode1 == inode2) {
- mutex_lock(&inode1->i_mutex);
- goto out;
- }
-
- if (inode1->i_ino < inode2->i_ino) {
- mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
- mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
- } else {
- mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
- mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
- }
-
+ lock_two_nondirectories(inode1, inode2);
out:
return ret;
}
@@ -1123,12 +1111,7 @@ mext_inode_double_unlock(struct inode *inode1, struct inode *inode2)
if (ret < 0)
goto out;
- if (inode1)
- mutex_unlock(&inode1->i_mutex);
-
- if (inode2 && inode2 != inode1)
- mutex_unlock(&inode2->i_mutex);
-
+ unlock_two_nondirectories(inode1, inode2);
out:
return ret;
}
diff --git a/fs/inode.c b/fs/inode.c
index 6bc8761..e48a96f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -969,6 +969,42 @@ void unlock_new_inode(struct inode *inode)
EXPORT_SYMBOL(unlock_new_inode);
/**
+ * lock_two_nondirectories - take two i_mutexes on non-directory objects
+ * @inode1: first inode to lock; must be non-NULL
+ * @inode2: second inode to lock; optional, may equal first or be NULL.
+ */
+void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
+{
+ if (inode1 == inode2 || inode2 == NULL)
+ mutex_lock(&inode1->i_mutex);
+ else if (inode1 < inode2) {
+ mutex_lock_nested(&inode1->i_mutex, I_MUTEX_PARENT);
+ mutex_lock_nested(&inode2->i_mutex, I_MUTEX_CHILD);
+
+ } else {
+ mutex_lock_nested(&inode2->i_mutex, I_MUTEX_PARENT);
+ mutex_lock_nested(&inode1->i_mutex, I_MUTEX_CHILD);
+ }
+}
+EXPORT_SYMBOL(lock_two_nondirectories);
+
+/**
+ * lock_two_nondirectories - release locks from lock_two_nondirectories()
+ * @inode1: first inode to unlock
+ * @inode2: second inode to lock
+ *
+ * Arguments must be same as those given to corresponding
+ * lock_two_nondirectories() call.
+ */
+void unlock_two_nondirectories(struct inode *inode1, struct inode *inode2)
+{
+ mutex_unlock(&inode1->i_mutex);
+ if (inode2 && inode2 != inode1)
+ mutex_unlock(&inode2->i_mutex);
+}
+EXPORT_SYMBOL(unlock_two_nondirectories);
+
+/**
* iget5_locked - obtain an inode from a mounted file system
* @sb: super block of file system
* @hashval: hash value (usually inode number) to get
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5584a29..308c2a0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -876,6 +876,9 @@ enum inode_i_mutex_lock_class
I_MUTEX_QUOTA
};
+void lock_two_nondirectories(struct inode *, struct inode*);
+void unlock_two_nondirectories(struct inode *, struct inode*);
+
/*
* NOTE: in a 32bit arch with a preemptable kernel and
* an UP compile the i_size_read/write must be atomic
--
1.7.9.5
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 5/6] vfs: take i_mutex on renamed file
2012-05-31 1:50 rename locking J. Bruce Fields
` (2 preceding siblings ...)
2012-05-31 1:50 ` [PATCH 3/6] vfs: pull ext4's double-i_mutex-locking into common code J. Bruce Fields
@ 2012-05-31 1:50 ` J. Bruce Fields
3 siblings, 0 replies; 13+ messages in thread
From: J. Bruce Fields @ 2012-05-31 1:50 UTC (permalink / raw)
To: Al Viro; +Cc: linux-nfs, linux-fsdevel, J. Bruce Fields
From: "J. Bruce Fields" <bfields@redhat.com>
A read delegation is used by NFSv4 as a guarantee that a client can
perform local read opens without informing the server.
The open operation takes the last component of the pathname as an
argument, thus is also a lookup operation, and giving the client the
above guarantee means informing the client before we allow anything that
would change the set of names pointing to the inode.
Therefore, we need to break delegations on rename, link, and unlink.
We also need to prevent new delegations from being acquired while one of
these operations is in progress.
We could add some completely new locking for that purpose, but it's
simpler to use the i_mutex, since that's already taken by all the
operations we care about.
The single exception is rename. So, modify rename to take the i_mutex
on the file that is being renamed.
Also fix up lockdep and Documentation/filesystems/directory-locking to
reflect the change.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
Documentation/filesystems/directory-locking | 30 +++++++++++++++++++--------
fs/namei.c | 7 +++----
2 files changed, 24 insertions(+), 13 deletions(-)
diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
index ff7b611..9e8a629 100644
--- a/Documentation/filesystems/directory-locking
+++ b/Documentation/filesystems/directory-locking
@@ -2,6 +2,10 @@
kinds of locks - per-inode (->i_mutex) and per-filesystem
(->s_vfs_rename_mutex).
+ When taking the i_mutex on multiple non-directory objects, we
+always acquire the locks in order by increasing address. We'll call
+that "inode pointer" order in the following.
+
For our purposes all operations fall in 5 classes:
1) read access. Locking rules: caller locks directory we are accessing.
@@ -12,8 +16,9 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
locks victim and calls the method.
4) rename() that is _not_ cross-directory. Locking rules: caller locks
-the parent, finds source and target, if target already exists - locks it
-and then calls the method.
+the parent and finds source and target. If source and target both
+exist, they are locked in inode pointer order. Otherwise lock just
+source. Then call method.
5) link creation. Locking rules:
* lock parent
@@ -30,7 +35,8 @@ rules:
fail with -ENOTEMPTY
* if new parent is equal to or is a descendent of source
fail with -ELOOP
- * if target exists - lock it.
+ * If target exists, lock both source and target, in inode
+ pointer order. Otherwise lock just source.
* call the method.
@@ -56,9 +62,11 @@ objects - A < B iff A is an ancestor of B.
renames will be blocked on filesystem lock and we don't start changing
the order until we had acquired all locks).
-(3) any operation holds at most one lock on non-directory object and
- that lock is acquired after all other locks. (Proof: see descriptions
- of operations).
+(3) locks on non-directory objects are acquired only after locks on
+ directory objects, and are acquired in inode pointer order.
+ (Proof: all operations but renames take lock on at most one
+ non-directory object, except renames, which take locks on source and
+ target in inode pointer order.)
Now consider the minimal deadlock. Each process is blocked on
attempt to acquire some lock and already holds at least one lock. Let's
@@ -66,9 +74,13 @@ consider the set of contended locks. First of all, filesystem lock is
not contended, since any process blocked on it is not holding any locks.
Thus all processes are blocked on ->i_mutex.
- Non-directory objects are not contended due to (3). Thus link
-creation can't be a part of deadlock - it can't be blocked on source
-and it means that it doesn't hold any locks.
+ By (3), any process holding a non-directory lock can only be
+waiting on another non-directory lock with a larger address. Therefore
+the process holding the "largest" such lock can always make progress, and
+non-directory objects are not included in the set of contended locks.
+
+ Thus link creation can't be a part of deadlock - it can't be
+blocked on source and it means that it doesn't hold any locks.
Any contended object is either held by cross-directory rename or
has a child that is also contended. Indeed, suppose that it is held by
diff --git a/fs/namei.c b/fs/namei.c
index c651f02..421810b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3162,6 +3162,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
struct inode *new_dir, struct dentry *new_dentry)
{
struct inode *target = new_dentry->d_inode;
+ struct inode *source = old_dentry->d_inode;
int error;
error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry);
@@ -3169,8 +3170,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
return error;
dget(new_dentry);
- if (target)
- mutex_lock(&target->i_mutex);
+ lock_two_nondirectories(source, target);
error = -EBUSY;
if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
@@ -3185,8 +3185,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE))
d_move(old_dentry, new_dentry);
out:
- if (target)
- mutex_unlock(&target->i_mutex);
+ unlock_two_nondirectories(source, target);
dput(new_dentry);
return error;
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 13+ messages in thread