From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ian Kent Subject: Re: [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held Date: Tue, 24 Mar 2009 13:14:54 +0900 Message-ID: <49C85E3E.7030505@themaw.net> References: <1237493790-5665-1-git-send-email-sage@newdream.net> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, akpm@linux-foundation.org, Al Viro , Andreas Dilger , Yehuda Sadeh To: Sage Weil Return-path: Received: from outbound.icp-qv1-irony-out3.iinet.net.au ([203.59.1.148]:24980 "EHLO outbound.icp-qv1-irony-out3.iinet.net.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750714AbZCXEPB (ORCPT ); Tue, 24 Mar 2009 00:15:01 -0400 In-Reply-To: <1237493790-5665-1-git-send-email-sage@newdream.net> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Sage Weil wrote: > real_lookup() is called by do_lookup() if dentry revalidation fails. If > the cache is re-populated while waiting for i_mutex, it may find that > a d_lookup() subsequently succeeds (see the "Uhhuh! Nasty case" comment). > > Previously, real_lookup() would drop i_mutex and do_revalidate() again. If > revalidate failed _again_, however, it would give up with -ENOENT. The > problem here that network file systems may be invalidating dentries via > server callbacks, e.g. due to concurrent access from another client, and > -ENOENT is frequently the wrong answer. This will be something of a problem for autofs4 (and autofs). It would require fairly significant changes to the revalidate code. > > This problem has been seen with both Lustre and Ceph. It seems possible > to hit this case with NFS as well if the cache lifetime is very short. > > Instead, we should do_revalidate() while i_mutex is still held. If > revalidation fails, we can move on to a ->lookup() and ensure a correct > result without worrying about any subsequent races. > > Note that do_revalidate() is called with i_mutex held elsewhere. For > example, do_filp_open(), lookup_create(), do_unlinkat(), do_rmdir(), > and possibly others all take the directory i_mutex, and then > > -> lookup_hash > -> __lookup_hash > -> cached_lookup > -> do_revalidate > > so this does not introduce any new locking rules for d_revalidate > implementations. > > Yes, the goto is ugly. A cleanup patch follows. > > CC: Al Viro > CC: Andreas Dilger > Signed-off-by: Yehuda Sadeh > Signed-off-by: Sage Weil > --- > fs/namei.c | 5 +++-- > 1 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index c30e33d..b9e7128 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -489,6 +489,7 @@ static struct dentry * real_lookup(struct dentry * parent, struct qstr * name, s > if (!result) { > struct dentry *dentry; > > +do_the_lookup: > /* Don't create child dentry for a dead directory. */ > result = ERR_PTR(-ENOENT); > if (IS_DEADDIR(dir)) > @@ -512,12 +513,12 @@ out_unlock: > * Uhhuh! Nasty case: the cache was re-populated while > * we waited on the semaphore. Need to revalidate. > */ > - mutex_unlock(&dir->i_mutex); > if (result->d_op && result->d_op->d_revalidate) { > result = do_revalidate(result, nd); > if (!result) > - result = ERR_PTR(-ENOENT); > + goto do_the_lookup; > } > + mutex_unlock(&dir->i_mutex); > return result; > } >