From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ian Kent Subject: Re: [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held Date: Tue, 24 Mar 2009 13:18:18 +0900 Message-ID: <49C85F0A.9030409@themaw.net> References: <1237493790-5665-1-git-send-email-sage@newdream.net> <49C85E3E.7030505@themaw.net> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: linux-fsdevel@vger.kernel.org, hch@infradead.org, akpm@linux-foundation.org, Al Viro , Andreas Dilger , Yehuda Sadeh To: Sage Weil Return-path: Received: from outbound.icp-qv1-irony-out3.iinet.net.au ([203.59.1.148]:26747 "EHLO outbound.icp-qv1-irony-out3.iinet.net.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750714AbZCXESY (ORCPT ); Tue, 24 Mar 2009 00:18:24 -0400 In-Reply-To: <49C85E3E.7030505@themaw.net> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Ian Kent wrote: > Sage Weil wrote: >> real_lookup() is called by do_lookup() if dentry revalidation fails. If >> the cache is re-populated while waiting for i_mutex, it may find that >> a d_lookup() subsequently succeeds (see the "Uhhuh! Nasty case" comment). >> >> Previously, real_lookup() would drop i_mutex and do_revalidate() >> again. If >> revalidate failed _again_, however, it would give up with -ENOENT. The >> problem here that network file systems may be invalidating dentries via >> server callbacks, e.g. due to concurrent access from another client, and >> -ENOENT is frequently the wrong answer. > > This will be something of a problem for autofs4 (and autofs). > It would require fairly significant changes to the revalidate code. Or maybe not that significant, I'm not sure yet. > >> >> This problem has been seen with both Lustre and Ceph. It seems possible >> to hit this case with NFS as well if the cache lifetime is very short. >> >> Instead, we should do_revalidate() while i_mutex is still held. If >> revalidation fails, we can move on to a ->lookup() and ensure a correct >> result without worrying about any subsequent races. >> >> Note that do_revalidate() is called with i_mutex held elsewhere. For >> example, do_filp_open(), lookup_create(), do_unlinkat(), do_rmdir(), >> and possibly others all take the directory i_mutex, and then >> >> -> lookup_hash >> -> __lookup_hash >> -> cached_lookup >> -> do_revalidate >> >> so this does not introduce any new locking rules for d_revalidate >> implementations. >> >> Yes, the goto is ugly. A cleanup patch follows. >> >> CC: Al Viro >> CC: Andreas Dilger >> Signed-off-by: Yehuda Sadeh >> Signed-off-by: Sage Weil >> --- >> fs/namei.c | 5 +++-- >> 1 files changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/fs/namei.c b/fs/namei.c >> index c30e33d..b9e7128 100644 >> --- a/fs/namei.c >> +++ b/fs/namei.c >> @@ -489,6 +489,7 @@ static struct dentry * real_lookup(struct dentry * >> parent, struct qstr * name, s >> if (!result) { >> struct dentry *dentry; >> >> +do_the_lookup: >> /* Don't create child dentry for a dead directory. */ >> result = ERR_PTR(-ENOENT); >> if (IS_DEADDIR(dir)) >> @@ -512,12 +513,12 @@ out_unlock: >> * Uhhuh! Nasty case: the cache was re-populated while >> * we waited on the semaphore. Need to revalidate. >> */ >> - mutex_unlock(&dir->i_mutex); >> if (result->d_op && result->d_op->d_revalidate) { >> result = do_revalidate(result, nd); >> if (!result) >> - result = ERR_PTR(-ENOENT); >> + goto do_the_lookup; >> } >> + mutex_unlock(&dir->i_mutex); >> return result; >> } >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html