From: Ian Kent <ikent@redhat.com>
To: Paulo Andrade <paulo.cesar.pereira.de.andrade@gmail.com>,
libtirpc-devel@lists.sourceforge.net
Cc: linux-nfs@vger.kernel.org, Paulo Andrade <pcpa@gnu.org>
Subject: Re: [Libtirpc-devel] [PATCH] Do not hold clnt_fd_lock mutex during connect
Date: Thu, 19 May 2016 13:19:36 +0800
Message-ID: <1463635176.3017.75.camel@redhat.com>
In-Reply-To: <1463594091-1289-1-git-send-email-pcpa@gnu.org>
On Wed, 2016-05-18 at 14:54 -0300, Paulo Andrade wrote:
> A user reports that their application connects to multiple servers
> through an RPC interface using libtirpc. When one of the servers misbehaves
> (goes down ungracefully or has a delay of a few seconds in the traffic
> flow), it was observed that the traffic from the client to the other
> servers is also affected by the failing server, i.e. traffic decreases
> or goes to 0 on all the servers.
>
> When investigated further, specifically the behavior of libtirpc at the
> time of the issue, it was observed that all of the application threads
> interacting with libtirpc were blocked on one single lock inside the
> libtirpc library. This was a race condition which had resulted in a
> deadlock and hence the resultant dip/stoppage of traffic.
>
> As an experiment, the user removed libtirpc from the application build
> and used the standard glibc library for RPC communication. In that case,
> everything worked perfectly even while the server nodes were
> misbehaving.
>
> Signed-off-by: Paulo Andrade <pcpa@gnu.org>
> ---
> src/clnt_vc.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/src/clnt_vc.c b/src/clnt_vc.c
> index a72f9f7..2396f34 100644
> --- a/src/clnt_vc.c
> +++ b/src/clnt_vc.c
> @@ -229,27 +229,23 @@ clnt_vc_create(fd, raddr, prog, vers, sendsz, recvsz)
> } else
> assert(vc_cv != (cond_t *) NULL);
>
> - /*
> - * XXX - fvdl connecting while holding a mutex?
> - */
> + mutex_unlock(&clnt_fd_lock);
> +
> slen = sizeof ss;
> if (getpeername(fd, (struct sockaddr *)&ss, &slen) < 0) {
> if (errno != ENOTCONN) {
> rpc_createerr.cf_stat = RPC_SYSTEMERROR;
> rpc_createerr.cf_error.re_errno = errno;
> - mutex_unlock(&clnt_fd_lock);
> thr_sigsetmask(SIG_SETMASK, &(mask), NULL);
> goto err;
> }
Oh, right, the mutex is probably needed to ensure that errno is reliable.
> if (connect(fd, (struct sockaddr *)raddr->buf, raddr->len) <
> 0){
But this is probably where the caller is blocking, so a small variation of this
patch should achieve the required result.
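To make the idea concrete, here is a minimal sketch of one possible variation, not a tested patch against clnt_vc.c. It assumes the lock stays held across the getpeername() check and is dropped only around the connect() call that can block, with errno saved before the lock is taken again. The names demo_fd_lock and demo_connect_if_needed() are hypothetical stand-ins, not libtirpc symbols, and the sketch does not address whether any state guarded by clnt_fd_lock would need to be rechecked after re-acquiring it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical stand-in for libtirpc's clnt_fd_lock. */
static pthread_mutex_t demo_fd_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical helper, not part of libtirpc.
 * Call with demo_fd_lock held; it returns with the lock held again.
 * Returns 0 on success, or the saved errno value on failure.
 */
static int
demo_connect_if_needed(int fd, const struct sockaddr *addr, socklen_t len)
{
	struct sockaddr_storage ss;
	socklen_t slen = sizeof ss;
	int saved_errno = 0;

	/* Non-blocking check, done while the lock is still held. */
	if (getpeername(fd, (struct sockaddr *)&ss, &slen) == 0)
		return 0;		/* already connected */
	if (errno != ENOTCONN)
		return errno;		/* real error, report it */

	/* Only connect() can block for long, so only it runs unlocked. */
	assert(addr != NULL);
	pthread_mutex_unlock(&demo_fd_lock);
	if (connect(fd, addr, len) < 0)
		saved_errno = errno;	/* save before any other call */
	pthread_mutex_lock(&demo_fd_lock);

	return saved_errno;
}
```

The caller would take demo_fd_lock, call demo_connect_if_needed(), and treat a non-zero return as the errno to store in rpc_createerr.cf_error.re_errno. Other threads waiting on the lock are then held up only for the getpeername() check, not for the connect timeout of a misbehaving server.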
Btw, I had a quick look at some of the other code paths and so far it looks
like they lead to clnt_tp_create() or clnt_dg_create() calls.
clnt_dg_create() is not connection oriented so it doesn't have the same mutex
lock problem.
So this patch might be all that's needed.
Ian