All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ric Wheeler <rwheeler@redhat.com>
To: Anand Avati <avati@redhat.com>
Cc: miklos@szeredi.hu, bfoster@redhat.com,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	fuse-devel@lists.sourceforge.net,
	Alexander Viro <aviro@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Eric Paris <eparis@redhat.com>
Subject: Re: [PATCH] [REPOST] fuse: drop dentry on failed revalidate
Date: Thu, 25 Jul 2013 10:42:16 -0400	[thread overview]
Message-ID: <51F13948.3050100@redhat.com> (raw)
In-Reply-To: <20130725055209.GA15621@sh-el5.eng.rdu2.redhat.com>



On 07/25/2013 01:52 AM, Anand Avati wrote:
> Consider the following sequence of operations:
>
>    // mount same backend at two places
>
>    # mount -t fuse <some_src> /mnt1
>    # mount -t fuse <same_src> /mnt2
>
>    // create a directory chain from 1
>
>    $ mkdir -p /mnt1/a/b
>
>    // load it in 2's cache
>
>    $ stat /mnt2/a/b     # load it in cache
>
>    // recreate same names from 1
>
>    $ rm -rf /mnt1/a
>    $ mkdir -p /mnt1/a/b
>
>    // sleep long enough for entry_timeout to expire
>
>    $ sleep 5
>
>    // access /mnt2/a/b from two threads in parallel
>
>    $ stat /mnt2/a/b & stat /mnt2/a/b
>
> Depending on the race, none/either/both of the commands
> executed in the last step can fail.
>
> This is because both the stat command threads execute the
> resolver in parallel.
>
> - The resolver function lookup_fast() will acquire the dentry
>    (of /mnt2/a) reference with __d_lookup()
>
> - Call to d_revalidate() on the just acquired dentry will fail,
>    (i.e return 0) as FUSE gets a new nodeid from the server.
>
> - In the mean time another resolver thread enters lookup_fast()
>    and acquires the dentry of /mnt2/a with __d_lookup(), effectively
>    making dentry->d_count > 1 [+ child refs]
>
> - Now when first thread calls d_invalidate() because of the failed
>    d_revalidate(), d_invalidate() will find that even after calling
>    shrink_dcache_parent() we are left with d_count > 1, and fails
>    d_invalidate() with EBUSY.
>
> - The failed d_invalidate() makes the resolver use this "stale" dentry
>    as the result of this walk_component() call -- even though it just
>    witnessed d_revalidate() fail on it, only because d_invalidate()
>    could not succeed because of an innocent concurrent resolver in
>    progress.
>
> - Using the stale dentry (and inode), the call progress and stubles
>    with an error as the FUSE server is presented with a dead inode.
>
> - The other thread would fail in d_revalidate() too, and depending
>    on the progress relaitvely made between the two, the second
>    thread's d_invalidate() might get an EBUSY too, and stuble in the
>    same way as the first thread.
>
> If the same stat commands were issued serially, both would succeed.
>
> NFS is faced with a similar situation as FUSE (and in many other ways
> in general too) and it checks for a submounts and conditionally calls
> d_drop(). The call to d_drop() within ->d_revalidate() guarantees the
> success of d_invalidate(), and a fresh lookup would be issued there on.
>
> Signed-off-by: Anand Avati <avati@redhat.com>
> ---

Adding in Al and others to the thread & correcting the fsdevel address :)

Ric

>
> Background:
>
> The previous submission of this patch (on fuse-devel) had review comments
> to investigate doing a d_drop() on the entire subtree rather than just
> on the entry. That approach seems to be very complex. So reposting the
> same patch to kick in the discussion again. This patch follows the NFS
> approach to the problem.
>
>   fs/fuse/dir.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index a1d9047..83c217e 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -226,6 +226,10 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
>   		if (!err) {
>   			struct fuse_inode *fi = get_fuse_inode(inode);
>   			if (outarg.nodeid != get_node_id(inode)) {
> +				if (!have_submounts(entry)) {
> +					shrink_dcache_parent(entry);
> +					d_drop(entry);
> +				}
>   				fuse_queue_forget(fc, forget, outarg.nodeid, 1);
>   				return 0;
>   			}


       reply	other threads:[~2013-07-25 14:42 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20130725055209.GA15621@sh-el5.eng.rdu2.redhat.com>
2013-07-25 14:42 ` Ric Wheeler [this message]
2013-07-30 16:16   ` [PATCH] [REPOST] fuse: drop dentry on failed revalidate Miklos Szeredi
2013-07-30 19:30     ` Ric Wheeler
2013-08-01 16:39       ` Miklos Szeredi
2013-08-01 18:45         ` Ric Wheeler
2013-08-01 18:45           ` Ric Wheeler
2013-08-02  9:02           ` Miklos Szeredi
2013-08-02  9:02             ` Miklos Szeredi
2013-08-02 11:43             ` Jeff Layton
2013-08-02 11:43               ` Jeff Layton
2013-08-02 14:30               ` Miklos Szeredi
2013-08-02 16:58                 ` Jeff Layton
2013-08-02 16:58                   ` Jeff Layton
2013-08-02 12:17     ` Jeff Layton
2013-08-02 14:42       ` Miklos Szeredi
2013-08-02 17:32         ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51F13948.3050100@redhat.com \
    --to=rwheeler@redhat.com \
    --cc=avati@redhat.com \
    --cc=aviro@redhat.com \
    --cc=bfoster@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=eparis@redhat.com \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.