All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ian Kent <raven@themaw.net>
To: Sage Weil <sage@newdream.net>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	akpm@linux-foundation.org, Al Viro <viro@zeniv.linux.org.uk>,
	Yehuda Sadeh <yehuda@newdream.net>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held
Date: Wed, 29 Jul 2009 10:59:33 +0800	[thread overview]
Message-ID: <4A6FBB15.8010301@themaw.net> (raw)
In-Reply-To: <Pine.LNX.4.64.0907281537360.24717@cobra.newdream.net>

Sage Weil wrote:
> On Mon, 20 Jul 2009, Ian Kent wrote:
>> Sage Weil wrote:
>>> On Wed, 24 Jun 2009, Ian Kent wrote:
>>>> I'm continuing with this now, but there's a deadlock in there somewhere!
>>> Sorry, are you still working with the patch you posted a few months back?
>>>
>>> 	http://marc.info/?l=linux-fsdevel&m=123831685111213&w=2
>>>
>>> Looking over it, the 
>>>
>>> +		unsigned int lock_held = mutex_is_locked(&dir->i_mutex);
>>> ...
>>> +		if (lock_held) {
>>> +			/* Already pending, send to ->lookup() */
>>> +			d_drop(dentry);
>>>
>>> bit looks highly suspect.  I'm guessing revalidate should never sleep, and 
>>> always kick things off to ->lookup() (to do any waiting on upcall 
>>> completion or whatever else) if the dentry isn't valid now...?
>> I tried your suggestion and have finally come to the conclusion that it
>> cannot work. My own fault really, for not fully understanding why I used
>> the above approach in the first place.
>>
>> I believe that if the mutex is not held then I "must" handle it in the
>> revalidate routine and if the mutex is held I "must" defer to
>> ->lookup(). The only way to send this to ->lookup() is to drop the
>> dentry and rehash it in lookup and the mutex must be held over both
>> calls or it is possible for an execution path to skip over the lookup
>> call when several concurrent processes walk into the same dentry at the
>> same time. AFAICS it isn't possible to detect this and work around it
>> when sending everything to ->lookup()
> 
> How is it possible for a revalidate to fail and then to skip over the 
> ->lookup() call?  That sounds like a problem with the VFS?  (Like the one 
> I'm trying to fix... you're testing with the real_lookup patch applied, I 
> hope?)

It's been hard to work out exactly what's happening but I suspect it is
due to multiple walks occurring while the the dentry hashed state
changes. Like, one process hits revalidate, drops the dentry, another
process comes along and goes straight to lookup and rehashes the dentry,
the original process, that dropped the dentry, then ends up not calling
lookup. The race only happens occasionally, and my test uses 10
processes that tend to hit the same dentrys at the same time at various
points in the mount tree over 150 iterations. Admittedly, it's a fairly
stressful test as autofs goes but it does tend to show up races quite well.

I'm now working on a method to ensure the code path that unhashed the
dentry is the one that rehashes it but that has its own set of
difficulties, partly because walks from the daemon shouldn't block and
the dentry can become unhashed at any time after other code paths have
dropped and rehashed the dentry, since the mutex is not always held
during revalidate.

> 
>> This digression was quite costly in time for me but useful in improving
>> my understanding of the problem. I'm going to return to my original
>> approach, hopefully I will make better progress.
> 
> I'm sorry if I've sent you on a wild goose chase.  If you describe how 
> you're reproducing the problem I can try it locally.

I thought that at the time but it turns out that you didn't, the race is
still present either way. I've since gone back to this approach.

I always knew this was going to be difficult but I didn't expect it to
not be possible, given that I'm using your VFS locking change. If this
continues to be a problem I may need to start thinking about some small
VFS changes to help work around the difficulty.

Ian


  reply	other threads:[~2009-07-29  2:59 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-19 20:16 [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held Sage Weil
2009-03-19 20:16 ` [PATCH 2/2] vfs: clean up real_lookup Sage Weil
2009-03-19 20:22   ` Christoph Hellwig
2009-03-19 20:35     ` Sage Weil
2009-03-19 20:23 ` [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held Christoph Hellwig
2009-03-24  4:14 ` Ian Kent
2009-03-24  4:18   ` Ian Kent
2009-03-25  4:29     ` Sage Weil
2009-03-25  6:08       ` Ian Kent
2009-03-25 16:11         ` Ian Kent
2009-03-25 19:11           ` Sage Weil
2009-03-26  2:09             ` Ian Kent
2009-03-26  3:53               ` Sage Weil
2009-03-26  8:00                 ` Ian Kent
2009-03-26 10:38                 ` Ian Kent
2009-03-29  8:53                   ` Ian Kent
2009-04-03  0:58                     ` Sage Weil
2009-04-03  2:00                       ` Ian Kent
2009-04-03  3:07                         ` Sage Weil
2009-06-22 17:15                         ` Sage Weil
2009-06-23  0:37                           ` Ian Kent
2009-06-23  2:40                             ` H. Peter Anvin
2009-06-25  7:21                               ` Ian Kent
2009-06-25 13:41                                 ` H. Peter Anvin
2009-06-25 13:58                                   ` Christoph Hellwig
2009-06-23  2:42                             ` H. Peter Anvin
2009-06-24  2:28                             ` Ian Kent
2009-06-24  5:45                               ` Sage Weil
2009-06-24  9:17                                 ` Ian Kent
2009-06-24 17:46                                   ` Sage Weil
2009-06-25  2:50                                     ` Ian Kent
2009-06-25  4:13                                     ` Ian Kent
2009-06-25  4:49                                       ` Sage Weil
2009-06-25  5:52                                         ` Ian Kent
2009-09-17  6:36                                           ` Ian Kent
2009-07-20  2:45                                 ` Ian Kent
2009-07-28 22:47                                   ` Sage Weil
2009-07-29  2:59                                     ` Ian Kent [this message]
2009-07-29 16:57                                       ` Sage Weil
2009-07-30  0:56                                         ` Ian Kent
2009-07-30 17:47                                           ` Sage Weil
2009-07-31  2:03                                             ` Ian Kent
2009-03-26  3:54               ` Ian Kent
2009-03-26  4:03                 ` Sage Weil
2009-03-26  5:07                 ` Ian Kent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A6FBB15.8010301@themaw.net \
    --to=raven@themaw.net \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=sage@newdream.net \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yehuda@newdream.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.