From: Andrew Morton <akpm@linux-foundation.org>
To: Ian Kent <raven@themaw.net>
Cc: Kernel Mailing List <linux-kernel@vger.kernel.org>,
autofs mailing list <autofs@linux.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 7/7] autofs4 - fix pending mount race.
Date: Tue, 1 Jul 2008 00:04:22 -0700 [thread overview]
Message-ID: <20080701000422.67e211c3.akpm@linux-foundation.org> (raw)
In-Reply-To: <20080617122411.14725.31739.stgit@raven.themaw.net>
On Tue, 17 Jun 2008 20:24:12 +0800 Ian Kent <raven@themaw.net> wrote:
> Close a race between a pending mount that is about to finish and a new
> lookup for the same directory.
>
> Process P1 triggers a mount of directory foo.
> It sets DCACHE_AUTOFS_PENDING in the ->lookup routine, creates a waitq
> entry for 'foo', and calls out to the daemon to perform the mount.
> The autofs daemon will then create the directory 'foo', using a new dentry
> that will be hashed in the dcache.
>
> Before the mount completes, another process, P2, tries to walk into the
> 'foo' directory. The vfs path walking code finds an entry for 'foo' and
> calls the revalidate method. Revalidate finds that the entry is not
> PENDING (because PENDING was never set on the dentry created by the mkdir),
> but it does find the directory is empty. Revalidate calls try_to_fill_dentry,
> which sets the PENDING flag and then calls into the autofs4 wait code to
> trigger or wait for a mount of 'foo'. The wait code finds the entry for
> 'foo' and goes to sleep waiting for the completion of the mount.
>
> Yet another process, P3, tries to walk into the 'foo' directory. This
> process again finds a dentry in the dcache for 'foo', and calls into
> the autofs revalidate code.
>
> The revalidate code finds that the PENDING flag is set, and so calls
> try_to_fill_dentry.
>
> a) try_to_fill_dentry sets the PENDING flag redundantly for this dentry,
> then calls into the autofs4 wait code.
> b) the autofs4 wait code takes the waitq mutex and searches for an entry
> for 'foo'
>
> Between a and b, P1 is woken up because the mount completed.
> P1 takes the wait queue mutex, clears the PENDING flag from the dentry,
> and removes the waitqueue entry for 'foo' from the list.
>
> When it releases the waitq mutex, P3 (eventually) acquires it. At this
> time, it looks for an existing waitq for 'foo', finds none, and so
> creates a new one and calls out to the daemon to mount the 'foo' directory.
>
> Now, the reason that three processes are required to trigger this race
> is that, because the PENDING flag is not set on the dentry created by
> mkdir, the window for the race would be way to slim for it to ever occur.
> Basically, between the testing of d_mountpoint(dentry) and the taking of
> the waitq mutex, the mount would have to complete and the daemon would
> have to be woken up, and that in turn would have to wake up P1. This is
> simply impossible. Add the third process, though, and it becomes slightly
> more likely.
>
> ...
>
> diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> index 5208cfb..cd21fd4 100644
> --- a/fs/autofs4/waitq.c
> +++ b/fs/autofs4/waitq.c
> @@ -207,19 +207,106 @@ autofs4_find_wait(struct autofs_sb_info *sbi, struct qstr *qstr)
> return wq;
> }
>
> +/*
> + * Check if we have a valid request.
> + * Returns
> + * 1 if the request should continue.
> + * In this case we can return an autofs_wait_queue entry if one is
> + * found or NULL to idicate a new wait needs to be created.
> + * 0 or a negative errno if the request shouldn't continue.
> + */
> +static int validate_request(struct autofs_wait_queue **wait,
> + struct autofs_sb_info *sbi,
> + struct qstr *qstr,
> + struct dentry*dentry, enum autofs_notify notify)
> +{
> + struct autofs_wait_queue *wq;
> + struct autofs_info *ino;
> +
> + /* Wait in progress, continue; */
> + wq = autofs4_find_wait(sbi, qstr);
> + if (wq) {
> + *wait = wq;
> + return 1;
Returns 1 with the mutex held.
> + }
> +
> + *wait = NULL;
> +
> + /* If we don't yet have any info this is a new request */
> + ino = autofs4_dentry_ino(dentry);
> + if (!ino)
> + return 1;
> +
> + /*
> + * If we've been asked to wait on an existing expire (NFY_NONE)
> + * but there is no wait in the queue ...
> + */
> + if (notify == NFY_NONE) {
> + /*
> + * Either we've betean the pending expire to post it's
> + * wait or it finished while we waited on the mutex.
> + * So we need to wait till either, the wait appears
> + * or the expire finishes.
> + */
Wanna have another go at that comment? The grammar and spelling should
cause an oops or something.
> + while (ino->flags & AUTOFS_INF_EXPIRING) {
> + mutex_unlock(&sbi->wq_mutex);
> + schedule_timeout_interruptible(HZ/10);
> + if (mutex_lock_interruptible(&sbi->wq_mutex))
> + return -EINTR;
Returns -EFOO with the mutex not held.
> +
> + wq = autofs4_find_wait(sbi, qstr);
> + if (wq) {
> + *wait = wq;
> + return 1;
> + }
> + }
> +
> + /*
> + * Not ideal but the status has already gone. Of the two
> + * cases where we wait on NFY_NONE neither depend on the
> + * return status of the wait.
> + */
> + return 0;
Returns zero with the mutex held.
> + }
> +
> + /*
> + * If we've been asked to trigger a mount and the request
> + * completed while we waited on the mutex ...
> + */
> + if (notify == NFY_MOUNT) {
> + /*
> + * If the dentry isn't hashed just go ahead and try the
> + * mount again with a new wait (not much else we can do).
> + */
> + if (!d_unhashed(dentry)) {
> + /*
> + * But if the dentry is hashed, that means that we
> + * got here through the revalidate path. Thus, we
> + * need to check if the dentry has been mounted
> + * while we waited on the wq_mutex. If it has,
> + * simply return success.
> + */
> + if (d_mountpoint(dentry))
> + return 0;
> + }
> + }
> +
> + return 1;
> +}
>
> ...
>
> + ret = validate_request(&wq, sbi, &qstr, dentry, notify);
> + if (ret <= 0) {
> + if (ret == 0)
> mutex_unlock(&sbi->wq_mutex);
> - return 0;
> - }
> + kfree(qstr.name);
> + return ret;
> }
Leave the mutex held if it returned 1. Doesn't unlock the mutex if it
returned -EFOO. Presumably callers of this function will unlock the
mutex if it returned zero.
Or something like that. My brain just exploded.
Please double-check the locking protocol here and then document the
sorry thing.
prev parent reply other threads:[~2008-07-01 7:04 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-17 12:23 [PATCH 1/7] autofs4 - don't make expiring dentry negative Ian Kent
2008-06-17 12:23 ` [PATCH 2/7] autofs4 - revert - redo lookup in ttfd Ian Kent
2008-06-17 12:23 ` [PATCH 3/7] autofs4 - use look aside list for lookups Ian Kent
2008-06-17 12:23 ` Ian Kent
2008-06-17 12:23 ` [PATCH 4/7] autofs4 - don't release directory mutex if called in oz_mode Ian Kent
2008-06-17 12:23 ` Ian Kent
2008-06-17 12:24 ` [PATCH 5/7] autofs4 - use lookup intent flags to trigger mounts Ian Kent
2008-06-17 12:24 ` Ian Kent
2008-06-17 12:24 ` [PATCH 6/7] autofs4 - use struct qstr in waitq.c Ian Kent
2008-07-22 23:13 ` Andrew Morton
2008-07-23 5:08 ` Ian Kent
2008-07-23 5:17 ` Andrew Morton
2008-07-23 5:24 ` Ian Kent
2008-06-17 12:24 ` [PATCH 7/7] autofs4 - fix pending mount race Ian Kent
2008-07-01 7:04 ` Andrew Morton [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080701000422.67e211c3.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=autofs@linux.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=raven@themaw.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.