From: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
To: Jeff Layton <jlayton@redhat.com>, linux-nfs@vger.kernel.org
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: Re: [PATCH] nfs: don't queue synchronous NFSv4 close rpc_release to nfsiod
Date: Tue, 15 Feb 2011 16:53:42 +0100 [thread overview]
Message-ID: <4D5AA186.4040802@desy.de> (raw)
In-Reply-To: <1297781939-1400-1-git-send-email-jlayton@redhat.com>
I have seen this as well, but was interpreting as problem in my server.
Tigran.
On 02/15/2011 03:58 PM, Jeff Layton wrote:
> I recently had some of our QA people report some connectathon test
> failures in RHEL5 (2.6.18-based kernel). For some odd reason (maybe
> scheduling differences that make the race more likely?) the problem
> occurs more frequently on s390.
>
> The problem generally manifests itself on NFSv4 as a race where an rmdir
> fails because a silly-renamed file in the directory wasn't deleted in
> time. Looking at traces, what you usually see is the failing rmdir
> attempt that fails with the sillydelete of the file that prevented it
> very soon afterward.
>
> Silly deletes are handled via dentry_iput and in the case of a close on
> NFSv4, the last dentry reference is often held by the CLOSE RPC task.
> nfs4_do_close does the close as an async RPC task that it conditionally
> waits on depending on whether the close is synchronous or not.
>
> It also sets the workqueue for the task to nfsiod_workqueue. When
> tk_workqueue is set, the rpc_release operation is queued to that
> workqueue. rpc_release is where the dentry reference held by the task is
> put. The caller has no way to wait for that to complete, so the close(2)
> syscall can easily return before the rpc_release call is ever done. In
> some cases, that rpc_release is delayed for a long enough to prevent a
> subsequent rmdir of the containing directory.
>
> I believe this is a bug, or at least not ideal behavior. We should try
> not to have the close(2) call return in this situation until the
> sillydelete is done.
>
> I've been able to reproduce this more reliably by adding a 100ms sleep
> at the top of nfs4_free_closedata. I've not seen it "in the wild" on
> mainline kernels, but it seems quite possible when a machine is heavily
> loaded.
>
> This patch fixes this by not setting tk_workqueue in nfs4_do_close when
> the wait flag is set. This makes the final rpc_put_task a synchronous
> operation and should prevent close(2) from returning before the
> dentry_iput is done.
>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
> fs/nfs/nfs4proc.c | 5 ++++-
> 1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 78936a8..4cabfea 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -1988,11 +1988,14 @@ int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, i
> .rpc_client = server->client,
> .rpc_message = &msg,
> .callback_ops = &nfs4_close_ops,
> - .workqueue = nfsiod_workqueue,
> .flags = RPC_TASK_ASYNC,
> };
> int status = -ENOMEM;
>
> + /* rpc_release must be synchronous too if "wait" is set */
> + if (!wait)
> + task_setup_data.workqueue = nfsiod_workqueue;
> +
> calldata = kzalloc(sizeof(*calldata), gfp_mask);
> if (calldata == NULL)
> goto out;
prev parent reply other threads:[~2011-02-15 15:54 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-15 14:58 [PATCH] nfs: don't queue synchronous NFSv4 close rpc_release to nfsiod Jeff Layton
2011-02-15 15:31 ` Trond Myklebust
2011-02-15 16:30 ` Jeff Layton
2011-02-15 23:47 ` Trond Myklebust
2011-02-16 14:09 ` Trond Myklebust
2011-02-16 14:26 ` Trond Myklebust
2011-02-16 14:50 ` Jeff Layton
2011-02-16 15:21 ` Trond Myklebust
2011-02-16 18:13 ` Jeff Layton
2011-02-17 13:40 ` Jeff Layton
2011-02-17 15:10 ` Jeff Layton
2011-02-17 19:47 ` Trond Myklebust
2011-02-17 21:37 ` Jeff Layton
2011-02-18 20:04 ` Jeff Layton
2011-02-18 20:54 ` Trond Myklebust
2011-02-23 20:17 ` Jeff Layton
2011-02-15 15:53 ` Tigran Mkrtchyan [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D5AA186.4040802@desy.de \
--to=tigran.mkrtchyan@desy.de \
--cc=Trond.Myklebust@netapp.com \
--cc=jlayton@redhat.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).