From: Calum Mackay <calum.mackay@oracle.com>
To: trondmy@hammerspace.com, anna.schumaker@netapp.com
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH] lockd: don't use timed rebind with TCP
Date: Sat, 3 Oct 2020 18:21:12 +0100
Message-ID: <9f36a510-3721-f454-a2ef-95e5b9beb360@oracle.com>
In-Reply-To: <20201002225750.16452-1-calum.mackay@oracle.com>

Please hold off for now on this one; I think I need to adjust the 
reclaimer a little.

thanks,
calum.

On 02/10/2020 11:57 pm, Calum Mackay wrote:
> It is possible for nlm_bind_host() to clear XPRT_BOUND whilst a connection
> worker is in the middle of trying to reconnect. When the latter notices
> that XPRT_BOUND has been cleared under it, in xs_tcp_finish_connecting(),
> that results in:
> 
> 	xs_tcp_setup_socket: connect returned unhandled error -107
> 
> Worse, it's possible that the two can get into lockstep, resulting in
> the same behaviour repeated indefinitely, with the above error every
> 300 seconds, without ever recovering, and the connection never being
> established. This is most likely to occur when there's a large number
> of NLM client tasks following a server reboot.
> 
> Since the timed rebind would seem not to be needed for TCP in any case,
> whilst the existing connection remains, restrict the timed rebinding to
> UDP only.
> 
> For TCP, we will still rebind when needed, e.g. on timeout, connection
> error (including closure), and in the reclaimer.
> 
> Whilst there, refactor some duplicate code.
> 
> Signed-off-by: Calum Mackay <calum.mackay@oracle.com>
> ---
>   fs/lockd/host.c | 16 +++++++---------
>   1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/lockd/host.c b/fs/lockd/host.c
> index 0afb6d59bad0..6e98c2ed6ffc 100644
> --- a/fs/lockd/host.c
> +++ b/fs/lockd/host.c
> @@ -439,12 +439,7 @@ nlm_bind_host(struct nlm_host *host)
>   	 * RPC rebind is required
>   	 */
>   	if ((clnt = host->h_rpcclnt) != NULL) {
> -		if (time_after_eq(jiffies, host->h_nextrebind)) {
> -			rpc_force_rebind(clnt);
> -			host->h_nextrebind = jiffies + NLM_HOST_REBIND;
> -			dprintk("lockd: next rebind in %lu jiffies\n",
> -					host->h_nextrebind - jiffies);
> -		}
> +		nlm_rebind_host(host);
>   	} else {
>   		unsigned long increment = nlmsvc_timeout;
>   		struct rpc_timeout timeparms = {
> @@ -495,15 +490,18 @@ nlm_bind_host(struct nlm_host *host)
>   }
>   
>   /*
> - * Force a portmap lookup of the remote lockd port
> + * Force a portmap lookup of the remote lockd port, unless we're using a
> + * TCP connection.
>    */
>   void
>   nlm_rebind_host(struct nlm_host *host)
>   {
> -	dprintk("lockd: rebind host %s\n", host->h_name);
> -	if (host->h_rpcclnt && time_after_eq(jiffies, host->h_nextrebind)) {
> +	if (unlikely(host->h_proto == IPPROTO_UDP) && host->h_rpcclnt &&
> +			time_after_eq(jiffies, host->h_nextrebind)) {
>   		rpc_force_rebind(host->h_rpcclnt);
>   		host->h_nextrebind = jiffies + NLM_HOST_REBIND;
> +		dprintk("lockd: rebind host %s; next rebind in %lu jiffies\n",
> +			host->h_name, host->h_nextrebind - jiffies);
>   	}
>   }
>   
> 
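
For illustration only, the gating that the quoted patch proposes can be
modelled in userspace C. This is a minimal sketch, not kernel code: every
sketch_* name, the SKETCH_REBIND_INTERVAL value, and the tick-based clock
are assumptions standing in for nlm_rebind_host(), NLM_HOST_REBIND and
jiffies. It only shows the decision "force a rebind if the host uses UDP,
has an RPC client, and the rebind deadline has passed".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's protocol constants. */
enum sketch_proto { SKETCH_PROTO_TCP = 6, SKETCH_PROTO_UDP = 17 };

/* Assumed placeholder interval in ticks; not the kernel's NLM_HOST_REBIND. */
#define SKETCH_REBIND_INTERVAL 60UL

struct sketch_host {
	enum sketch_proto proto;   /* models host->h_proto */
	bool has_client;           /* models host->h_rpcclnt != NULL */
	unsigned long next_rebind; /* models host->h_nextrebind */
	unsigned int rebinds;      /* counts forced rebinds, for illustration */
};

/* Wrap-tolerant "a is at or after b", in the spirit of time_after_eq(). */
static bool sketch_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Force a rebind only for UDP hosts that have a client and whose
 * deadline has passed. Returns true if a rebind was forced at 'now'.
 */
static bool sketch_rebind_host(struct sketch_host *host, unsigned long now)
{
	if (host->proto != SKETCH_PROTO_UDP || !host->has_client)
		return false;
	if (!sketch_time_after_eq(now, host->next_rebind))
		return false;

	host->rebinds++; /* stands in for rpc_force_rebind() */
	host->next_rebind = now + SKETCH_REBIND_INTERVAL;
	return true;
}

/* Small self-check: a TCP host is never rebound by the timer path. */
static void sketch_selfcheck(void)
{
	struct sketch_host tcp = { SKETCH_PROTO_TCP, true, 0, 0 };

	assert(!sketch_rebind_host(&tcp, 1000));
	assert(tcp.rebinds == 0);
}
```

In this model a UDP host is rebound at most once per interval, while a TCP
host is left alone by the timer. As in the patch description, TCP would
still rebind through other paths, which the sketch does not model.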

-- 
Calum Mackay
Linux Kernel Engineering
Oracle Linux and Virtualisation
