* [PATCH] d_move() vs d_unhashed() race: retry under d_lock
@ 2017-09-08 16:21 Goldwyn Rodrigues
From: Goldwyn Rodrigues @ 2017-09-08 16:21 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: viro, alexey.lyashkov, neilb, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

This is a follow-up to Alexey's patch at
https://patchwork.kernel.org/patch/9455345/
with suggestions proposed by Al Viro.

d_move() and d_unhashed() may race: d_move() unhashes the dentry and
then rehashes it, so there is a small window in which d_unhashed()
returns true. This may result in ENOENT (for getcwd). The check must
therefore be made under d_lock. However, in order to keep the fast
path, perform the d_unhashed() check without d_lock first and, in the
unlikely event that it returns true, repeat the check under d_lock.
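
The "check without the lock, recheck under the lock" idea can be sketched
in userspace as follows. This is only an illustration, not kernel code:
struct fake_dentry, fake_dentry_init(), fake_set_hashed() and
fake_unhashed_safe() are hypothetical names, a pthread mutex stands in for
d_lock, and an atomic pointer stands in for d_hash.pprev.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a dentry; not the kernel structure. */
struct fake_dentry {
	pthread_mutex_t lock;	/* plays the role of d_lock */
	_Atomic(void *) pprev;	/* NULL means "unhashed", like d_hash.pprev */
};

/* Start out unhashed, with an initialised lock. */
static void fake_dentry_init(struct fake_dentry *d)
{
	int rc = pthread_mutex_init(&d->lock, NULL);

	assert(rc == 0);
	(void)rc;
	atomic_init(&d->pprev, NULL);
}

/* Writer side: the hash state only ever changes with the lock held. */
static void fake_set_hashed(struct fake_dentry *d, void *where)
{
	pthread_mutex_lock(&d->lock);
	atomic_store(&d->pprev, where);
	pthread_mutex_unlock(&d->lock);
}

/* Lockless check: may observe a transient state set by a writer. */
static bool fake_unhashed(struct fake_dentry *d)
{
	return atomic_load(&d->pprev) == NULL;
}

/* Fast path without the lock; only a "true" answer is rechecked under it. */
static bool fake_unhashed_safe(struct fake_dentry *d)
{
	bool ret = fake_unhashed(d);

	if (ret) {
		pthread_mutex_lock(&d->lock);
		ret = fake_unhashed(d);
		pthread_mutex_unlock(&d->lock);
	}
	return ret;
}
```

The recheck only helps if a mover holds the lock across both the unhash and
the rehash; a reader that takes the lock then cannot observe the state in
between.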

Here is the test case which demonstrates the problem:

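#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

void *thread_main(void *unused)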
{
	int rc;

	for (;;) {
		rc = rename("/tmp/t-a", "/tmp/t-b");
		assert(rc == 0);
		rc = rename("/tmp/t-b", "/tmp/t-a");
		assert(rc == 0);
	}

	return NULL;
}

int main(void)
{
	int rc, i;
	pthread_t ptt;

	rmdir("/tmp/t-a");
	rmdir("/tmp/t-b");

	/* chdir() needs search (x) permission unless run as root */
	rc = mkdir("/tmp/t-a", 0755);
	assert(rc == 0);

	rc = chdir("/tmp/t-a");
	assert(rc == 0);

	rc = pthread_create(&ptt, NULL, thread_main, NULL);
	assert(rc == 0);

	for (i = 0;; i++) {
		char buf[100], *b;
		b = getcwd(buf, sizeof(buf));
		if (b == NULL) {
			printf("getcwd failed on iter %d with %d\n", i,
					errno);
			break;
		}
	}

	return 0;
}

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/dcache.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index f90141387f01..ebda7e20ae86 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3224,6 +3224,18 @@ char *d_absolute_path(const struct path *path,
 	return res;
 }
 
+static inline bool d_unhashed_safe(struct dentry *dentry)
+{
+	bool ret = d_unhashed(dentry);
+	if (unlikely(ret)) {
+		/* retry under d_lock */
+		spin_lock(&dentry->d_lock);
+		ret = d_unhashed(dentry);
+		spin_unlock(&dentry->d_lock);
+	}
+	return ret;
+}
+
 /*
  * same as __d_path but appends "(deleted)" for unlinked files.
  */
@@ -3232,7 +3244,7 @@ static int path_with_deleted(const struct path *path,
 			     char **buf, int *buflen)
 {
 	prepend(buf, buflen, "\0", 1);
-	if (d_unlinked(path->dentry)) {
+	if (d_unhashed_safe(path->dentry)) {
 		int error = prepend(buf, buflen, " (deleted)", 10);
 		if (error)
 			return error;
@@ -3396,7 +3408,7 @@ char *dentry_path(struct dentry *dentry, char *buf, int buflen)
 	char *p = NULL;
 	char *retval;
 
-	if (d_unlinked(dentry)) {
+	if (d_unhashed_safe(dentry)) {
 		p = buf + buflen;
 		if (prepend(&p, &buflen, "//deleted", 10) != 0)
 			goto Elong;
@@ -3453,7 +3465,7 @@ SYSCALL_DEFINE2(getcwd, char __user *, buf, unsigned long, size)
 	get_fs_root_and_pwd_rcu(current->fs, &root, &pwd);
 
 	error = -ENOENT;
-	if (!d_unlinked(pwd.dentry)) {
+	if (!d_unhashed_safe(pwd.dentry)) {
 		unsigned long len;
 		char *cwd = page + PATH_MAX;
 		int buflen = PATH_MAX;
-- 
2.14.1


* Re: [PATCH] d_move() vs d_unhashed() race: retry under d_lock
@ 2017-09-09  0:14 ` NeilBrown
From: NeilBrown @ 2017-09-09  0:14 UTC (permalink / raw)
  To: Goldwyn Rodrigues, linux-fsdevel; +Cc: viro, alexey.lyashkov, Goldwyn Rodrigues


On Fri, Sep 08 2017, Goldwyn Rodrigues wrote:

> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> This is a follow-up to Alexey's patch at
> https://patchwork.kernel.org/patch/9455345/
> with suggestions proposed by Al Viro.
>
> d_move() and d_unhashed() may race: d_move() unhashes the dentry and
> then rehashes it, so there is a small window in which d_unhashed()
> returns true. This may result in ENOENT (for getcwd). The check must
> therefore be made under d_lock. However, in order to keep the fast
> path, perform the d_unhashed() check without d_lock first and, in the
> unlikely event that it returns true, repeat the check under d_lock.

For your consideration, here is an alternate patch which - I believe -
achieves the same end.  I think this approach is a little more robust,
but there isn't a lot in it - Goldwyn's is arguably simpler so might be
better for that reason.

NeilBrown

From dfaa166e2afaed051c388dc9f43d1468020b5e22 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Fri, 8 Sep 2017 16:03:42 +1000
Subject: [PATCH] VFS: close race between getcwd() and d_move()

d_move() will call __d_drop() and then __d_rehash()
on the dentry being moved.  This creates a small window
when the dentry appears to be unhashed.  Many tests
of d_unhashed() are made under ->d_lock and so are safe
from racing with this window, but some aren't.
In particular, getcwd() calls d_unlinked() (which calls
d_unhashed()) without d_lock protection, so it can race.

This race has been seen in practice with Lustre, which uses d_move() as
part of name lookup.  See:
   https://jira.hpdd.intel.com/browse/LU-9735
It could race with a regular rename(), and result in ENOENT instead
of either the 'before' or 'after' name.

We could fix this race by taking d_lock and rechecking when
d_unhashed() reports true.  Alternatively we can remove the window,
which is the approach this patch takes.

When __d_drop and __d_rehash are used to move a dentry, an extra
flag is passed which causes d_hash.pprev to not be cleared, and
to not be tested.
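
The effect of leaving pprev alone can be sketched in userspace with a plain
singly linked hash chain. This is an illustration under simplifying
assumptions, not the kernel's hlist_bl code: struct node, struct head,
node_add_head() and node_del() are hypothetical names, and the bit lock that
hlist_bl keeps in the head pointer is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified chain node; pprev == NULL means "unhashed". */
struct node {
	struct node *next;
	struct node **pprev;
};

struct head {
	struct node *first;
};

static bool node_unhashed(const struct node *n)
{
	return n->pprev == NULL;
}

/* Insert at the head of a chain; pprev becomes non-NULL. */
static void node_add_head(struct node *n, struct head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Unlink from the chain.  When @moving is true, pprev is left as a stale
 * non-NULL value, so node_unhashed() never reports true between this call
 * and the following node_add_head().
 */
static void node_del(struct node *n, bool moving)
{
	struct node *next = n->next;
	struct node **pprev = n->pprev;

	assert(pprev != NULL);	/* must currently be on a chain */
	*pprev = next;
	if (next)
		next->pprev = pprev;
	if (!moving)
		n->pprev = NULL;
}
```

In this sketch the stale pprev must not be dereferenced; it is only valid as
a "still hashed" marker until the node is added to its new chain.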

Signed-off-by: NeilBrown <neilb@suse.com>
---
 fs/dcache.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index f90141387f01..3d1f14c6c306 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -469,8 +469,11 @@ static void dentry_lru_add(struct dentry *dentry)
  * reason (NFS timeouts or autofs deletes).
  *
  * __d_drop requires dentry->d_lock.
+ * ___d_drop takes an extra @moving argument.
+ * If true, d_hash.pprev is not cleared, so there is no transient d_unhashed()
+ * state.
  */
-void __d_drop(struct dentry *dentry)
+static inline void ___d_drop(struct dentry *dentry, bool moving)
 {
 	if (!d_unhashed(dentry)) {
 		struct hlist_bl_head *b;
@@ -486,12 +489,18 @@ void __d_drop(struct dentry *dentry)
 
 		hlist_bl_lock(b);
 		__hlist_bl_del(&dentry->d_hash);
-		dentry->d_hash.pprev = NULL;
+		if (likely(!moving))
+			dentry->d_hash.pprev = NULL;
 		hlist_bl_unlock(b);
 		/* After this call, in-progress rcu-walk path lookup will fail. */
 		write_seqcount_invalidate(&dentry->d_seq);
 	}
 }
+
+void __d_drop(struct dentry *dentry)
+{
+	___d_drop(dentry, false);
+}
 EXPORT_SYMBOL(__d_drop);
 
 void d_drop(struct dentry *dentry)
@@ -2378,10 +2387,10 @@ void d_delete(struct dentry * dentry)
 }
 EXPORT_SYMBOL(d_delete);
 
-static void __d_rehash(struct dentry *entry)
+static void __d_rehash(struct dentry *entry, bool moving)
 {
 	struct hlist_bl_head *b = d_hash(entry->d_name.hash);
-	BUG_ON(!d_unhashed(entry));
+	BUG_ON(!moving && !d_unhashed(entry));
 	hlist_bl_lock(b);
 	hlist_bl_add_head_rcu(&entry->d_hash, b);
 	hlist_bl_unlock(b);
@@ -2397,7 +2406,7 @@ static void __d_rehash(struct dentry *entry)
 void d_rehash(struct dentry * entry)
 {
 	spin_lock(&entry->d_lock);
-	__d_rehash(entry);
+	__d_rehash(entry, false);
 	spin_unlock(&entry->d_lock);
 }
 EXPORT_SYMBOL(d_rehash);
@@ -2571,7 +2580,7 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode)
 		raw_write_seqcount_end(&dentry->d_seq);
 		fsnotify_update_flags(dentry);
 	}
-	__d_rehash(dentry);
+	__d_rehash(dentry, false);
 	if (dir)
 		end_dir_add(dir, n);
 	spin_unlock(&dentry->d_lock);
@@ -2633,7 +2642,7 @@ struct dentry *d_exact_alias(struct dentry *entry, struct inode *inode)
 			alias = NULL;
 		} else {
 			__dget_dlock(alias);
-			__d_rehash(alias);
+			__d_rehash(alias, false);
 			spin_unlock(&alias->d_lock);
 		}
 		spin_unlock(&inode->i_lock);
@@ -2819,8 +2828,8 @@ static void __d_move(struct dentry *dentry, struct dentry *target,
 
 	/* unhash both */
 	/* __d_drop does write_seqcount_barrier, but they're OK to nest. */
-	__d_drop(dentry);
-	__d_drop(target);
+	___d_drop(dentry, true);
+	___d_drop(target, exchange);
 
 	/* Switch the names.. */
 	if (exchange)
@@ -2829,9 +2838,9 @@ static void __d_move(struct dentry *dentry, struct dentry *target,
 		copy_name(dentry, target);
 
 	/* rehash in new place(s) */
-	__d_rehash(dentry);
+	__d_rehash(dentry, true);
 	if (exchange)
-		__d_rehash(target);
+		__d_rehash(target, true);
 
 	/* ... and switch them in the tree */
 	if (IS_ROOT(dentry)) {
-- 
2.14.0.rc0.dirty

