From: Ian Kent <raven@themaw.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: autofs mailing list <autofs@linux.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] autofs4 - remove hashed check in validate_wait()
Date: Mon, 08 Jun 2009 13:59:57 +0800 [thread overview]
Message-ID: <4A2CA8DD.2080809@themaw.net> (raw)
In-Reply-To: <4A2CA75E.7040501@themaw.net>
Ian Kent wrote:
> Andrew Morton wrote:
>> On Mon, 08 Jun 2009 13:23:40 +0800 Ian Kent <raven@themaw.net> wrote:
>>
>>> Andrew Morton wrote:
>>>> On Mon, 08 Jun 2009 11:25:37 +0800 Ian Kent <raven@themaw.net> wrote:
>>>>
>>>>> The recent ->lookup() deadlock correction required the directory
>>>>> inode mutex to be dropped while waiting for expire completion. We
>>>>> were concerned about side effects from this change and one has
>>>>> been identified.
>>>>>
>>>>> When checking if a mount has already completed prior to adding a
>>>>> new mount request to the wait queue we check if the dentry is hashed
>>>>> and, if so, if it is a mount point. But, if a mount successfully
>>>>> completed while we slept on the wait queue mutex the dentry must
>>>>> exist for the mount to have completed so the test is not really
>>>>> needed.
>>>>>
>>>>> Mounts can also be done on top of a global root dentry, so for the
>>>>> above case, where a mount request completes and the wait queue entry
>>>>> has already been removed, the hashed test returning false can cause
>>>>> an incorrect callback to the daemon. Also, d_mountpoint() is not
>>>>> sufficient to check if a mount has completed for the multi-mount
>>>>> case when we don't have a real mount at the base of the tree.
>>>>>
>>>> I've been scratching my head trying to work out if this is a
>>>> needed-in-2.6.30 fix, but all I got was a bald spot. Help?
>>> Yeah, and why would you want to know that much about autofs, it's a
>>> wonder I have any hair at all, ;)
>>>
>>> I think so if possible, as it resolves an issue that is a side effect of
>>> the last patch I sent, which resolved a deadlook in ->lookup(). The
>>> problem occurs due to dropping the directory inode mutex before waiting
>>> for an expire. What isn't obvious is that holding the mutex (as we did
>>> previously) causes processes wanting to request mounts for other
>>> directories to wait, so we don't see the contention for the mount
>>> request wait queue that this patch addresses.
>>>
>>> However, the issue only surfaces when there are a number of processes
>>> all trying to perform mounts at the same time. The test I ran used 10
>>> processes all using the same map.
>> <scratch scratch>
>> What are the user-visible effects of the thing-which-this-fixed?
>
> I saw several error messages.
>
> They cause autofs to become quite confused and don't really point to the
> actual problem.
>
> Things like:
>
> handle_packet_missing_direct:1376: can't find map entry for (43,1827932)
>
> which is usually totally fatal (although in this case it wouldn't be
> except that I treat is as such because it normally is).
Oh .. and the requesting user always receives a fail in this case in
spite of the mount actually having previously succeeded.
>
> do_mount_direct: direct trigger not valid or already mounted
> /test/nested/g3c/s1/ss1
>
> which is recoverable, however if this problem is at play it can cause
> autofs to become quite confused as to the dependencies in the mount tree
> because mount triggers end up mounted multiple times. It's hard to
> accurately check for this over mounting case and automount shouldn't
> need to if the kernel module is doing its job.
>
> There was one other message, similar in consequence of this last one but
> I can't locate a log example just now.
And these two cause fails to be returned return to the requesting
process either immediately or shortly after as autofs becomes confused.
Sorry I can't give a clearer description.
Ian
prev parent reply other threads:[~2009-06-08 5:59 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-08 3:25 [PATCH] autofs4 - remove hashed check in validate_wait() Ian Kent
2009-06-08 5:08 ` Andrew Morton
2009-06-08 5:23 ` Ian Kent
2009-06-08 5:41 ` Andrew Morton
2009-06-08 5:53 ` Ian Kent
2009-06-08 5:59 ` Ian Kent [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A2CA8DD.2080809@themaw.net \
--to=raven@themaw.net \
--cc=akpm@linux-foundation.org \
--cc=autofs@linux.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).