From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: nfsv4@linux-nfs.org, Tom Talpey <Thomas.Talpey@netapp.com>,
	nfs@lists.sourceforge.net
Subject: Re: [PATCH 2/7] SUNRPC: Fix TCP rebinding logic
Date: Wed, 07 Nov 2007 17:48:30 -0500
Message-ID: <473240BE.7000709@oracle.com>
In-Reply-To: <20071107003945.13713.61995.stgit@heimdal.trondhjem.org>

Trond Myklebust wrote:
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Currently the TCP rebinding logic assumes that if we're not using a
> reserved port, then we don't need to reconnect on the same port if a
> disconnection event occurs.

As Johnny Carson used to say: "I did not know that."

I had assumed we always reused the port number whether a privileged port 
had been requested or not.

> This breaks most RPC duplicate reply cache
> implementations.
> 
> Also take into account the fact that xprt_min_resvport and
> xprt_max_resvport may change while we're reconnecting, since the user may
> change them at any time via the sysctls. Ensure that we check the port
> boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
> boundaries change, we only scan the ports a maximum of 2 times.
> 
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  net/sunrpc/xprtsock.c |   59 ++++++++++++++++++++++++++++++++-----------------
>  1 files changed, 38 insertions(+), 21 deletions(-)
> 
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 322e4e2..5a83a40 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1272,34 +1272,53 @@ static void xs_set_port(struct rpc_xprt *xprt, unsigned short port)
>  	}
>  }
>  
> +static unsigned short xs_get_srcport(struct sock_xprt *transport, struct socket *sock)

Long line.

> +{
> +	unsigned short port = transport->port;
> +
> +	if (port == 0 && transport->xprt.resvport)
> +		port = xs_get_random_port();

I don't see a reason to keep xs_get_random_port as a separate function; 
its logic could move in here.
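
A rough user-space sketch of what folding the random pick into the helper might look like. Everything here is a stand-in: sketch_transport, sketch_get_srcport, and the two range variables are hypothetical names, rand() replaces whatever random source the kernel helper uses, and the body of xs_get_random_port is assumed rather than taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for xprt_min_resvport / xprt_max_resvport.
 * The values are assumptions chosen for illustration. */
static unsigned short min_resvport = 665;
static unsigned short max_resvport = 1023;

/* Hypothetical struct that mirrors only the two fields the helper reads. */
struct sketch_transport {
	unsigned short port;	/* last bound source port, 0 if none yet */
	int resvport;		/* non-zero if a reserved port was requested */
};

/* Reuse the previous port if there is one; otherwise pick a random
 * starting port inside the reserved range, but only when a reserved
 * port was requested. Returning 0 lets the stack choose the port. */
static unsigned short sketch_get_srcport(const struct sketch_transport *t)
{
	unsigned short port = t->port;

	if (port == 0 && t->resvport) {
		/* The +1 keeps the modulus non-zero when min == max. */
		unsigned int range = max_resvport - min_resvport + 1u;

		port = (unsigned short)(min_resvport +
					(unsigned int)rand() % range);
	}
	return port;
}
```

The point is only that the random choice sits next to the one condition that needs it.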

> +	return port;
> +}
> +
> +static unsigned short xs_next_srcport(struct sock_xprt *transport, struct socket *sock, unsigned short port)

Long line.

> +{
> +	if (transport->port != 0)
> +		transport->port = 0;
> +	if (!transport->xprt.resvport)
> +		return 0;
> +	if (port <= xprt_min_resvport || port > xprt_max_resvport)
> +		return xprt_max_resvport;
> +	return --port;
> +}
> +
>  static int xs_bind4(struct sock_xprt *transport, struct socket *sock)
>  {
>  	struct sockaddr_in myaddr = {
>  		.sin_family = AF_INET,
>  	};
>  	struct sockaddr_in *sa;
> -	int err;
> -	unsigned short port = transport->port;
> +	int err, nloop = 0;
> +	unsigned short port = xs_get_srcport(transport, sock);
> +	unsigned short last;
>  
> -	if (!transport->xprt.resvport)
> -		port = 0;
>  	sa = (struct sockaddr_in *)&transport->addr;
>  	myaddr.sin_addr = sa->sin_addr;
>  	do {
>  		myaddr.sin_port = htons(port);
>  		err = kernel_bind(sock, (struct sockaddr *) &myaddr,
>  						sizeof(myaddr));
> -		if (!transport->xprt.resvport)
> +		if (port == 0)
>  			break;
>  		if (err == 0) {
>  			transport->port = port;
>  			break;
>  		}
> -		if (port <= xprt_min_resvport)
> -			port = xprt_max_resvport;
> -		else
> -			port--;
> -	} while (err == -EADDRINUSE && port != transport->port);
> +		last = port;
> +		port = xs_next_srcport(transport, sock, port);
> +		if (port > last)
> +			nloop++;

It seems there are cases where a user can adjust the port range in a way 
that defeats this check.  For example, if the port range is 30 to 40 and 
the user changes it to 10 to 20, we keep looping.

Doesn't breaking out of the loop break "hard" NFS requests?

I understand why you would want to move the checks into a separate 
function (xs_bind6 uses the same checks, for example), but doing so adds 
this extra little loop check at the end.  I usually punt when that happens.
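
One way to trace a range-change scenario is a small user-space model of the loop. The sketch below copies only the port stepping from the quoted xs_next_srcport and the nloop counting from xs_bind4. It assumes every bind attempt fails with -EADDRINUSE, and it assumes the range changes exactly once, after a chosen number of attempts. The names next_srcport and count_bind_attempts and the safety cap are hypothetical.

```c
/* Port stepping as in the quoted xs_next_srcport(), with the range
 * passed in explicitly instead of read from the sysctl variables. */
static unsigned short next_srcport(unsigned short port,
				   unsigned short min, unsigned short max)
{
	if (port <= min || port > max)
		return max;
	return --port;
}

/* Model of the bind loop where every bind fails with EADDRINUSE.
 * The range starts as [min, max] and is replaced by [new_min, new_max]
 * once change_after attempts have been made (pass -1 for no change).
 * Returns the number of bind attempts made before nloop reaches 2,
 * or -1 if the safety cap is hit first. */
static int count_bind_attempts(unsigned short start,
			       unsigned short min, unsigned short max,
			       int change_after,
			       unsigned short new_min, unsigned short new_max)
{
	unsigned short port = start, last;
	int nloop = 0, attempts = 0;

	do {
		if (attempts == change_after) {
			min = new_min;
			max = new_max;
		}
		if (++attempts > 100000)	/* cap for the model only */
			return -1;
		/* kernel_bind() would be attempted on 'port' here */
		last = port;
		port = next_srcport(port, min, max);
		if (port > last)	/* wrapped back to the top */
			nloop++;
	} while (nloop != 2);

	return attempts;
}
```

Under these assumptions, starting at port 35 in a range of 30 to 40 and switching to 10 to 20 after three attempts gives count_bind_attempts(35, 30, 40, 3, 10, 20) == 26, so the model does stop after two wraparounds in that case. It does not cover a range that changes repeatedly while the loop runs.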

> +	} while (err == -EADDRINUSE && nloop != 2);
>  	dprintk("RPC:       %s "NIPQUAD_FMT":%u: %s (%d)\n",
>  			__FUNCTION__, NIPQUAD(myaddr.sin_addr),
>  			port, err ? "failed" : "ok", err);
> @@ -1312,28 +1331,27 @@ static int xs_bind6(struct sock_xprt *transport, struct socket *sock)
>  		.sin6_family = AF_INET6,
>  	};
>  	struct sockaddr_in6 *sa;
> -	int err;
> -	unsigned short port = transport->port;
> +	int err, nloop = 0;
> +	unsigned short port = xs_get_srcport(transport, sock);
> +	unsigned short last;
>  
> -	if (!transport->xprt.resvport)
> -		port = 0;
>  	sa = (struct sockaddr_in6 *)&transport->addr;
>  	myaddr.sin6_addr = sa->sin6_addr;
>  	do {
>  		myaddr.sin6_port = htons(port);
>  		err = kernel_bind(sock, (struct sockaddr *) &myaddr,
>  						sizeof(myaddr));
> -		if (!transport->xprt.resvport)
> +		if (port == 0)
>  			break;
>  		if (err == 0) {
>  			transport->port = port;
>  			break;
>  		}
> -		if (port <= xprt_min_resvport)
> -			port = xprt_max_resvport;
> -		else
> -			port--;
> -	} while (err == -EADDRINUSE && port != transport->port);
> +		last = port;
> +		port = xs_next_srcport(transport, sock, port);
> +		if (port > last)
> +			nloop++;
> +	} while (err == -EADDRINUSE && nloop != 2);
>  	dprintk("RPC:       xs_bind6 "NIP6_FMT":%u: %s (%d)\n",
>  		NIP6(myaddr.sin6_addr), port, err ? "failed" : "ok", err);
>  	return err;
> @@ -1815,7 +1833,6 @@ static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
>  	xprt->addrlen = args->addrlen;
>  	if (args->srcaddr)
>  		memcpy(&new->addr, args->srcaddr, args->addrlen);
> -	new->port = xs_get_random_port();
>  
>  	return xprt;
>  }

Moving this little wart into xs_bind?() is a nice clean-up.
