From: "J. Bruce Fields" <bfields@redhat.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
jlayton@redhat.com, Dave Chinner <david@fromorbit.com>,
"J. Bruce Fields" <bfields@redhat.com>,
David Howells <dhowells@redhat.com>,
Tyler Hicks <tyhicks@canonical.com>,
Dustin Kirkland <dustin.kirkland@gazzang.com>
Subject: [PATCH 08/16] locks: break delegations on unlink
Date: Wed, 17 Jul 2013 16:50:09 -0400 [thread overview]
Message-ID: <1374094217-31493-10-git-send-email-bfields@redhat.com> (raw)
In-Reply-To: <1374094217-31493-1-git-send-email-bfields@redhat.com>
From: "J. Bruce Fields" <bfields@redhat.com>
We need to break delegations on any operation that changes the set of
links pointing to an inode. Start with unlink.
Such operations also hold the i_mutex on a parent directory. Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of a unresponsive NFS client. To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:
acquire locks
...
test for delegation; if found:
take reference on inode
release locks
wait for delegation break
drop reference on inode
retry
It is possible this could never terminate. (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.) But this seems very unlikely.
The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack. We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.
Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
drivers/base/devtmpfs.c | 2 +-
fs/cachefiles/namei.c | 2 +-
fs/ecryptfs/inode.c | 2 +-
fs/namei.c | 42 +++++++++++++++++++++++++++++++++++++++---
fs/nfsd/vfs.c | 2 +-
include/linux/fs.h | 2 +-
ipc/mqueue.c | 2 +-
7 files changed, 45 insertions(+), 9 deletions(-)
diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index 7413d06..1b8490e 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -324,7 +324,7 @@ static int handle_remove(const char *nodename, struct device *dev)
mutex_lock(&dentry->d_inode->i_mutex);
notify_change(dentry, &newattrs);
mutex_unlock(&dentry->d_inode->i_mutex);
- err = vfs_unlink(parent.dentry->d_inode, dentry);
+ err = vfs_unlink(parent.dentry->d_inode, dentry, NULL);
if (!err || err == -ENOENT)
deleted = 1;
}
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 25badd1..6846fcef 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -294,7 +294,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
if (ret < 0) {
cachefiles_io_error(cache, "Unlink security error");
} else {
- ret = vfs_unlink(dir->d_inode, rep);
+ ret = vfs_unlink(dir->d_inode, rep, NULL);
if (preemptive)
cachefiles_mark_object_buried(cache, rep);
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 67e9b63..0b8b632 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -153,7 +153,7 @@ static int ecryptfs_do_unlink(struct inode *dir, struct dentry *dentry,
dget(lower_dentry);
lower_dir_dentry = lock_parent(lower_dentry);
- rc = vfs_unlink(lower_dir_inode, lower_dentry);
+ rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
if (rc) {
printk(KERN_ERR "Error in vfs_unlink; rc = [%d]\n", rc);
goto out_unlock;
diff --git a/fs/namei.c b/fs/namei.c
index 2b44960..2826bbd 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3431,7 +3431,25 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
return do_rmdir(AT_FDCWD, pathname);
}
-int vfs_unlink(struct inode *dir, struct dentry *dentry)
+/**
+ * vfs_unlink - unlink a filesystem object
+ * @dir: parent directory
+ * @dentry: victim
+ * @delegated_inode: returns victim inode, if the inode is delegated.
+ *
+ * The caller must hold dir->i_mutex.
+ *
+ * If vfs_unlink discovers a delegation, it will return -EWOULDBLOCK and
+ * return a reference to the inode in delegated_inode. The caller
+ * should then break the delegation on that inode and retry. Because
+ * breaking a delegation may take a long time, the caller should drop
+ * dir->i_mutex before doing so.
+ *
+ * Alternatively, a caller may pass NULL for delegated_inode. This may
+ * be appropriate for callers that expect the underlying filesystem not
+ * to be NFS exported.
+ */
+int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegated_inode)
{
struct inode *target = dentry->d_inode;
int error = may_delete(dir, dentry, 0);
@@ -3448,11 +3466,20 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry)
else {
error = security_inode_unlink(dir, dentry);
if (!error) {
+ error = break_deleg(target, O_WRONLY|O_NONBLOCK);
+ if (error) {
+ if (error == -EWOULDBLOCK && delegated_inode) {
+ *delegated_inode = target;
+ ihold(target);
+ }
+ goto out;
+ }
error = dir->i_op->unlink(dir, dentry);
if (!error)
dont_mount(dentry);
}
}
+out:
mutex_unlock(&target->i_mutex);
/* We don't d_delete() NFS sillyrenamed files--they still exist. */
@@ -3477,6 +3504,7 @@ static long do_unlinkat(int dfd, const char __user *pathname)
struct dentry *dentry;
struct nameidata nd;
struct inode *inode = NULL;
+ struct inode *delegated_inode = NULL;
unsigned int lookup_flags = 0;
retry:
name = user_path_parent(dfd, pathname, &nd, lookup_flags);
@@ -3491,7 +3519,7 @@ retry:
error = mnt_want_write(nd.path.mnt);
if (error)
goto exit1;
-
+retry_deleg:
mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT);
dentry = lookup_hash(&nd);
error = PTR_ERR(dentry);
@@ -3506,13 +3534,21 @@ retry:
error = security_path_unlink(&nd.path, dentry);
if (error)
goto exit2;
- error = vfs_unlink(nd.path.dentry->d_inode, dentry);
+ error = vfs_unlink(nd.path.dentry->d_inode, dentry, &delegated_inode);
exit2:
dput(dentry);
}
mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
if (inode)
iput(inode); /* truncate the inode here */
+ inode = NULL;
+ if (delegated_inode) {
+ error = break_deleg(delegated_inode, O_WRONLY);
+ iput(delegated_inode);
+ delegated_inode = NULL;
+ if (!error)
+ goto retry_deleg;
+ }
mnt_drop_write(nd.path.mnt);
exit1:
path_put(&nd.path);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 8ff6a00..ff1fc44 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1910,7 +1910,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
if (host_err)
goto out_put;
if (type != S_IFDIR)
- host_err = vfs_unlink(dirp, rdentry);
+ host_err = vfs_unlink(dirp, rdentry, NULL);
else
host_err = vfs_rmdir(dirp, rdentry);
if (!host_err)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index bc1d8b8..ba29b37 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1469,7 +1469,7 @@ extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
extern int vfs_symlink(struct inode *, struct dentry *, const char *);
extern int vfs_link(struct dentry *, struct inode *, struct dentry *);
extern int vfs_rmdir(struct inode *, struct dentry *);
-extern int vfs_unlink(struct inode *, struct dentry *);
+extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
/*
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ae1996d..95827ce 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -886,7 +886,7 @@ SYSCALL_DEFINE1(mq_unlink, const char __user *, u_name)
err = -ENOENT;
} else {
ihold(inode);
- err = vfs_unlink(dentry->d_parent->d_inode, dentry);
+ err = vfs_unlink(dentry->d_parent->d_inode, dentry, NULL);
}
dput(dentry);
--
1.7.9.5
next prev parent reply other threads:[~2013-07-17 20:50 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-17 20:50 [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
2013-07-17 20:50 ` [PATCH] nfsd4: fix minorversion support interface J. Bruce Fields
2013-07-17 21:08 ` J. Bruce Fields
2013-07-17 20:50 ` [PATCH 02/16] vfs: don't use PARENT/CHILD lock classes for non-directories J. Bruce Fields
2013-07-17 20:50 ` [PATCH 03/16] vfs: rename I_MUTEX_QUOTA now that it's not used for quotas J. Bruce Fields
2013-07-17 20:50 ` [PATCH 04/16] vfs: take i_mutex on renamed file J. Bruce Fields
2013-07-17 20:50 ` J. Bruce Fields [this message]
2013-07-17 20:50 ` [PATCH 10/16] locks: break delegations on rename J. Bruce Fields
2013-07-17 20:50 ` [PATCH 13/16] nfsd4: minor nfs4_setlease cleanup J. Bruce Fields
2013-07-26 10:53 ` Jeff Layton
[not found] ` <1374094217-31493-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-17 20:50 ` [PATCH 01/16] vfs: pull ext4's double-i_mutex-locking into common code J. Bruce Fields
2013-07-17 20:50 ` [PATCH 05/16] locks: introduce new FL_DELEG lock flag J. Bruce Fields
2013-07-17 20:50 ` [PATCH 06/16] locks: implement delegations J. Bruce Fields
2013-07-17 20:50 ` [PATCH 07/16] namei: minor vfs_unlink cleanup J. Bruce Fields
2013-07-17 20:50 ` [PATCH 09/16] locks: helper functions for delegation breaking J. Bruce Fields
[not found] ` <1374094217-31493-11-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 10:50 ` Jeff Layton
2013-07-17 20:50 ` [PATCH 11/16] locks: break delegations on link J. Bruce Fields
2013-07-17 20:50 ` [PATCH 12/16] locks: break delegations on any attribute modification J. Bruce Fields
[not found] ` <1374094217-31493-14-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 10:50 ` Jeff Layton
2013-07-17 20:50 ` [PATCH 14/16] nfsd4: delay setting current_fh in open J. Bruce Fields
[not found] ` <1374094217-31493-16-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 11:11 ` Jeff Layton
2013-07-26 16:04 ` J. Bruce Fields
2013-07-17 20:50 ` [PATCH 15/16] nfsd4: close open-deleg/unlink/rename race J. Bruce Fields
2013-07-26 11:23 ` Jeff Layton
[not found] ` <20130726072326.56113a2c-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2013-07-26 16:04 ` J. Bruce Fields
2013-07-26 21:14 ` J. Bruce Fields
2013-07-17 20:50 ` [PATCH 16/16] nfsd4: break only delegations when appropriate J. Bruce Fields
[not found] ` <1374094217-31493-18-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-07-26 11:24 ` Jeff Layton
2013-07-17 21:09 ` [PATCH 00/16] Implement NFSv4 delegations, take 9 J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1374094217-31493-10-git-send-email-bfields@redhat.com \
--to=bfields@redhat.com \
--cc=david@fromorbit.com \
--cc=dhowells@redhat.com \
--cc=dustin.kirkland@gazzang.com \
--cc=jlayton@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=tyhicks@canonical.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).