From: NeilBrown <neilb@suse.com>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>, linux-fsdevel@vger.kernel.org
Cc: viro@ZenIV.linux.org.uk, alexey.lyashkov@gmail.com,
Goldwyn Rodrigues <rgoldwyn@suse.com>
Subject: Re: [PATCH] d_move() vs d_unhashed() race: retry under d_lock
Date: Sat, 09 Sep 2017 10:14:11 +1000
Message-ID: <87d170d8z0.fsf@notabene.neil.brown.name>
In-Reply-To: <20170908162109.17906-1-rgoldwyn@suse.de>
On Fri, Sep 08 2017, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> This is a follow-up of Alexey's patch at
> https://patchwork.kernel.org/patch/9455345/
> with suggestions proposed by Al Viro.
>
> d_move() and d_unhashed() may race because there is a small window
> where the dentry is unhashed. This may result in ENOENT (for getcwd).
> This must be checked under d_lock. However, in order to keep the fast
> path, perform the d_unhashed without d_lock first, and in the unlikely
> event that it succeeds, perform the check again under d_lock.
For your consideration, here is an alternate patch which - I believe -
achieves the same end. I think this approach is a little more robust,
but there isn't a lot in it - Goldwyn's is arguably simpler so might be
better for that reason.
NeilBrown
From dfaa166e2afaed051c388dc9f43d1468020b5e22 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Fri, 8 Sep 2017 16:03:42 +1000
Subject: [PATCH] VFS: close race between getcwd() and d_move()
d_move() will call __d_drop() and then __d_rehash()
on the dentry being moved. This creates a small window
when the dentry appears to be unhashed. Many tests
of d_unhashed() are made under ->d_lock and so are safe
from racing with this window, but some aren't.
In particular, getcwd() calls d_unlinked() (which calls
d_unhashed()) without d_lock protection, so it can race.
This race has been seen in practice with lustre, which uses d_move() as
part of name lookup. See:
https://jira.hpdd.intel.com/browse/LU-9735
It could race with a regular rename(), and result in ENOENT instead
of either the 'before' or 'after' name.
We could fix this race by taking d_lock and rechecking when
d_unhashed() reports true. Alternately we can remove the window,
which is the approach this patch takes.
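
For comparison, the "recheck under d_lock" alternative can be sketched
in userspace roughly as below. This is only an illustration and not
kernel code: struct model_dentry, model_lock() and
model_unhashed_stable() are hypothetical names, the spinlock is a
C11 atomic_flag standing in for ->d_lock, and a NULL pprev stands in
for the d_unhashed() state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a dentry; not the kernel struct. */
struct model_dentry {
	atomic_flag lock;	/* plays the role of ->d_lock */
	void *_Atomic pprev;	/* NULL means "unhashed", like d_hash.pprev */
};

static void model_init(struct model_dentry *d, void *pprev)
{
	atomic_flag_clear(&d->lock);
	atomic_init(&d->pprev, pprev);
}

static void model_lock(struct model_dentry *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void model_unlock(struct model_dentry *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Unlocked test: may observe the transient state while a move is in flight. */
static bool model_unhashed(struct model_dentry *d)
{
	return atomic_load(&d->pprev) == NULL;
}

/*
 * Fast path first: if the entry looks hashed, trust that answer.
 * Only when it looks unhashed do we take the lock and test again,
 * so a mover that holds the lock across drop+rehash cannot be seen
 * half way through.
 */
static bool model_unhashed_stable(struct model_dentry *d)
{
	bool ret;

	if (!model_unhashed(d))
		return false;
	model_lock(d);
	ret = model_unhashed(d);
	model_unlock(d);
	return ret;
}
```

The sketch only helps if every mover holds the same lock for the whole
drop-and-rehash sequence; it keeps the common case lock-free at the
cost of a second test in the rare case.
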
When __d_drop and __d_rehash are used to move a dentry, an extra
flag is passed which causes d_hash.pprev to not be cleared, and
to not be tested.
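
To make the effect of the flag concrete, here is a small userspace
model. It is a sketch under simplifying assumptions, not the kernel
implementation: it uses a plain doubly-linked hash chain without the
bit lock that hlist_bl carries in the head pointer, it has no RCU or
seqcount, and all of the model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an hlist node: pprev == NULL means "unhashed". */
struct model_node {
	struct model_node *next;
	struct model_node **pprev;
};

struct model_head {
	struct model_node *first;
};

static bool model_unhashed(const struct model_node *n)
{
	return n->pprev == NULL;
}

static void model_add_head(struct model_node *n, struct model_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Unlink n from its chain. When moving is true, pprev is left holding
 * its old (stale but non-NULL) value, so model_unhashed() never reports
 * true between the drop and the following rehash.
 */
static void model_drop(struct model_node *n, bool moving)
{
	if (model_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	if (!moving)
		n->pprev = NULL;
}

/* Mirrors BUG_ON(!moving && !d_unhashed(entry)) from the patch. */
static void model_rehash(struct model_node *n, struct model_head *h, bool moving)
{
	assert(moving || model_unhashed(n));
	model_add_head(n, h);
}
```

Calling model_drop(n, true) followed by model_rehash(n, other, true)
moves the node to another chain while model_unhashed(n) stays false
throughout, which is the property getcwd() relies on. With
moving == false the old behaviour is kept and the node does appear
unhashed after the drop.
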
Signed-off-by: NeilBrown <neilb@suse.com>
---
fs/dcache.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index f90141387f01..3d1f14c6c306 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -469,8 +469,11 @@ static void dentry_lru_add(struct dentry *dentry)
* reason (NFS timeouts or autofs deletes).
*
* __d_drop requires dentry->d_lock.
+ * ___d_drop takes an extra @moving argument.
+ * If true, d_hash.pprev is not cleared, so there is no transient d_unhashed()
+ * state.
*/
-void __d_drop(struct dentry *dentry)
+static void inline ___d_drop(struct dentry *dentry, bool moving)
{
if (!d_unhashed(dentry)) {
struct hlist_bl_head *b;
@@ -486,12 +489,18 @@ void __d_drop(struct dentry *dentry)
hlist_bl_lock(b);
__hlist_bl_del(&dentry->d_hash);
- dentry->d_hash.pprev = NULL;
+ if (likely(!moving))
+ dentry->d_hash.pprev = NULL;
hlist_bl_unlock(b);
/* After this call, in-progress rcu-walk path lookup will fail. */
write_seqcount_invalidate(&dentry->d_seq);
}
}
+
+void __d_drop(struct dentry *dentry)
+{
+ ___d_drop(dentry, false);
+}
EXPORT_SYMBOL(__d_drop);
void d_drop(struct dentry *dentry)
@@ -2378,10 +2387,10 @@ void d_delete(struct dentry * dentry)
}
EXPORT_SYMBOL(d_delete);
-static void __d_rehash(struct dentry *entry)
+static void __d_rehash(struct dentry *entry, bool moving)
{
struct hlist_bl_head *b = d_hash(entry->d_name.hash);
- BUG_ON(!d_unhashed(entry));
+ BUG_ON(!moving && !d_unhashed(entry));
hlist_bl_lock(b);
hlist_bl_add_head_rcu(&entry->d_hash, b);
hlist_bl_unlock(b);
@@ -2397,7 +2406,7 @@ static void __d_rehash(struct dentry *entry)
void d_rehash(struct dentry * entry)
{
spin_lock(&entry->d_lock);
- __d_rehash(entry);
+ __d_rehash(entry, false);
spin_unlock(&entry->d_lock);
}
EXPORT_SYMBOL(d_rehash);
@@ -2571,7 +2580,7 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode)
raw_write_seqcount_end(&dentry->d_seq);
fsnotify_update_flags(dentry);
}
- __d_rehash(dentry);
+ __d_rehash(dentry, false);
if (dir)
end_dir_add(dir, n);
spin_unlock(&dentry->d_lock);
@@ -2633,7 +2642,7 @@ struct dentry *d_exact_alias(struct dentry *entry, struct inode *inode)
alias = NULL;
} else {
__dget_dlock(alias);
- __d_rehash(alias);
+ __d_rehash(alias, false);
spin_unlock(&alias->d_lock);
}
spin_unlock(&inode->i_lock);
@@ -2819,8 +2828,8 @@ static void __d_move(struct dentry *dentry, struct dentry *target,
/* unhash both */
/* __d_drop does write_seqcount_barrier, but they're OK to nest. */
- __d_drop(dentry);
- __d_drop(target);
+ ___d_drop(dentry, true);
+ ___d_drop(target, exchange);
/* Switch the names.. */
if (exchange)
@@ -2829,9 +2838,9 @@ static void __d_move(struct dentry *dentry, struct dentry *target,
copy_name(dentry, target);
/* rehash in new place(s) */
- __d_rehash(dentry);
+ __d_rehash(dentry, true);
if (exchange)
- __d_rehash(target);
+ __d_rehash(target, true);
/* ... and switch them in the tree */
if (IS_ROOT(dentry)) {
--
2.14.0.rc0.dirty