linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Kent <raven@themaw.net>
To: Sage Weil <sage@newdream.net>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	akpm@linux-foundation.org, Al Viro <viro@zeniv.linux.org.uk>,
	Yehuda Sadeh <yehuda@newdream.net>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held
Date: Fri, 31 Jul 2009 10:03:41 +0800	[thread overview]
Message-ID: <4A7250FD.2020806@themaw.net> (raw)
In-Reply-To: <Pine.LNX.4.64.0907292119200.27357@cobra.newdream.net>

Sage Weil wrote:
> On Thu, 30 Jul 2009, Ian Kent wrote:
>> Sage Weil wrote:
>>> On Wed, 29 Jul 2009, Ian Kent wrote:
>>>>> How is it possible for a revalidate to fail and then to skip over the 
>>>>> ->lookup() call?  That sounds like a problem with the VFS?  (Like the one 
>>>>> I'm trying to fix... you're testing with the real_lookup patch applied, I 
>>>>> hope?)
>>>> It's been hard to work out exactly what's happening but I suspect it is
>>>> due to multiple walks occurring while the the dentry hashed state
>>>> changes. Like, one process hits revalidate, drops the dentry, another
>>>> process comes along and goes straight to lookup and rehashes the dentry,
>>>> the original process, that dropped the dentry, then ends up not calling
>>>> lookup. The race only happens occasionally, and my test uses 10
>>>> processes that tend to hit the same dentrys at the same time at various
>>>> points in the mount tree over 150 iterations. Admittedly, it's a fairly
>>>> stressful test as autofs goes but it does tend to show up races quite well.
>>> If I'm reading namei.c correctly, every ->d_revalidate() that returns NULL 
>>> will always result in a ->lookup() (unless IS_DEADDIR(inode)), without any 
>>> regard for the dentry's hashed/unhashed status.  The exception appears to 
>>> be a call in __link_path_walk when FS_REVAL_DOT is set in the superblock 
>>> (nfs only):
>>>
>>> 		if (nd->path.dentry && nd->path.dentry->d_sb &&
>>> 		    (nd->path.dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) {
>>> 			err = -ESTALE;
>>> 			/* Note: we do not d_invalidate() */
>>> 			if (!nd->path.dentry->d_op->d_revalidate(
>>> 					nd->path.dentry, nd))
>>> 				break;
>>>
>>> in which case userspace would see an ESTALE.
>>>
>>> It sounds like a few well placed printks that include the pid for unhash, 
>>> rehash, revalidate, and lookup would narrow down what is going on?
>> Of course I've been doing that.
>>
>> I think you need to re-read what I wrote above. Your analysis is fine if
>> there is only one process concurrently walking onto the dentry. If a
>> second process beats the revalidate process to real_lookup() and
>> incorrectly rehashes the dentry the following revalidate process will
>> not call ->lookup().
> 
> Sorry, you're right.  I missed the d_lookup call in real_lookup.  So you 
> mean something like:
> 
> do_lookup
> 	__d_lookup -> finds dentry
> 	do_revalidate -> drops dentry, returns null
> 	real_lookup
> 		take i_mutex
> 		d_lookup -> returns dentry rehashed by racing ->lookup()

Well, that's what I thought but looking again it doesn't look like that
can happen. However, the evidence I have speaks for itself, a user space
process is calling ->open() on a mount point with no mount. That should
never happen, it should have waited for the mount. And the code to
ensure that the process that dropped the dentry is the one that
re-hashes it makes the problem go away.

> 
>> My most recent try at this seems to be working.
>> I need to clean up the printks and alter one of the comments and re-test
>> with a full test (I have been working with only half the test enabled as
>> it was most problematic). The testing takes a long time to run, 1 hour
>> 10 minutes for each of two configurations and the criteria for success
>> in at least six consecutive successful runs, that's about thirteen
>> hours. But, hopefully, when that is done I'll have a series to post for
>> comment.
> 
> Great to hear.  I hope my inane blathering hasn't been totally useless! :)

Sadly, on the seventh run of my test I encountered what looks like a
deadlock. I'm working on that now but, but as I progress the testing
takes longer and longer, *sigh*.

Ian

  reply	other threads:[~2009-07-31  2:03 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-19 20:16 [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held Sage Weil
2009-03-19 20:16 ` [PATCH 2/2] vfs: clean up real_lookup Sage Weil
2009-03-19 20:22   ` Christoph Hellwig
2009-03-19 20:35     ` Sage Weil
2009-03-19 20:23 ` [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held Christoph Hellwig
2009-03-24  4:14 ` Ian Kent
2009-03-24  4:18   ` Ian Kent
2009-03-25  4:29     ` Sage Weil
2009-03-25  6:08       ` Ian Kent
2009-03-25 16:11         ` Ian Kent
2009-03-25 19:11           ` Sage Weil
2009-03-26  2:09             ` Ian Kent
2009-03-26  3:53               ` Sage Weil
2009-03-26  8:00                 ` Ian Kent
2009-03-26 10:38                 ` Ian Kent
2009-03-29  8:53                   ` Ian Kent
2009-04-03  0:58                     ` Sage Weil
2009-04-03  2:00                       ` Ian Kent
2009-04-03  3:07                         ` Sage Weil
2009-06-22 17:15                         ` Sage Weil
2009-06-23  0:37                           ` Ian Kent
2009-06-23  2:40                             ` H. Peter Anvin
2009-06-25  7:21                               ` Ian Kent
2009-06-25 13:41                                 ` H. Peter Anvin
2009-06-25 13:58                                   ` Christoph Hellwig
2009-06-23  2:42                             ` H. Peter Anvin
2009-06-24  2:28                             ` Ian Kent
2009-06-24  5:45                               ` Sage Weil
2009-06-24  9:17                                 ` Ian Kent
2009-06-24 17:46                                   ` Sage Weil
2009-06-25  2:50                                     ` Ian Kent
2009-06-25  4:13                                     ` Ian Kent
2009-06-25  4:49                                       ` Sage Weil
2009-06-25  5:52                                         ` Ian Kent
2009-09-17  6:36                                           ` Ian Kent
2009-07-20  2:45                                 ` Ian Kent
2009-07-28 22:47                                   ` Sage Weil
2009-07-29  2:59                                     ` Ian Kent
2009-07-29 16:57                                       ` Sage Weil
2009-07-30  0:56                                         ` Ian Kent
2009-07-30 17:47                                           ` Sage Weil
2009-07-31  2:03                                             ` Ian Kent [this message]
2009-03-26  3:54               ` Ian Kent
2009-03-26  4:03                 ` Sage Weil
2009-03-26  5:07                 ` Ian Kent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A7250FD.2020806@themaw.net \
    --to=raven@themaw.net \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=sage@newdream.net \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yehuda@newdream.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).