* [PATCH 05/10] mount.nfs: Shorter timeout for TCP connects
From: Chuck Lever @ 2007-08-03 17:23 UTC (permalink / raw)
To: neilb; +Cc: nfs
The standard TCP connect timeout on Linux is 75 seconds, which can be
too long in some cases. The timeout itself can be altered on a system-wide
basis, but we'd like mount to have its own connect timeout that's tunable,
and defaults to a shorter value.
The get_socket() function is a utility function that does TCP connects for
getport, clnt_ping, and other functions. Add logic there to use a
non-blocking connect() and select() in order to time out a connect
operation that's taking too long.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
utils/mount/network.c | 64 +++++++++++++++++++++++++++++++++++++++++++++----
1 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/utils/mount/network.c b/utils/mount/network.c
index 220bbce..89cb976 100644
--- a/utils/mount/network.c
+++ b/utils/mount/network.c
@@ -52,6 +52,10 @@
#define NFS_PORT 2049
#endif
+#define PMAP_TIMEOUT (10)
+#define CONNECT_TIMEOUT (20)
+#define MOUNT_TIMEOUT (30)
+
#if SIZEOF_SOCKLEN_T - 0 == 0
#define socklen_t unsigned int
#endif
@@ -158,12 +162,60 @@ int nfs_gethostbyname(const char *hostname, struct sockaddr_in *saddr)
}
/*
+ * Attempt to connect a socket, but time out after "timeout" seconds.
+ *
+ * On error return, caller closes the socket.
+ */
+static int connect_to(int fd, struct sockaddr *addr,
+			socklen_t addrlen, int timeout)
+{
+	int ret, saved;
+	fd_set rset, wset;
+	struct timeval tv = {
+		.tv_sec = timeout,
+	};
+
+	saved = fcntl(fd, F_GETFL, 0);
+	fcntl(fd, F_SETFL, saved | O_NONBLOCK);
+
+	ret = connect(fd, addr, addrlen);
+	if (ret < 0 && errno != EINPROGRESS)
+		return -1;
+	if (ret == 0)
+		goto out;
+
+	FD_ZERO(&rset);
+	FD_SET(fd, &rset);
+	wset = rset;
+	ret = select(fd + 1, &rset, &wset, NULL, &tv);
+	if (ret == 0) {
+		errno = ETIMEDOUT;
+		return -1;
+	}
+	if (FD_ISSET(fd, &rset) || FD_ISSET(fd, &wset)) {
+		int error;
+		socklen_t len = sizeof(error);
+		if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) < 0)
+			return -1;
+		if (error) {
+			errno = error;
+			return -1;
+		}
+	} else
+		return -1;
+
+out:
+	fcntl(fd, F_SETFL, saved);
+	return 0;
+}
+
+/*
* Create a socket that is locally bound to a reserved or non-reserved
* port. For any failures, RPC_ANYSOCK is returned which will cause
* the RPC code to create the socket instead.
*/
static int get_socket(struct sockaddr_in *saddr, unsigned int p_prot,
- int resvp, int conn)
+ unsigned int timeout, int resvp, int conn)
{
int so, cc, type;
struct sockaddr_in laddr;
@@ -185,7 +237,8 @@ static int get_socket(struct sockaddr_in *saddr, unsigned int p_prot,
goto err_bind;
}
if (type == SOCK_STREAM || (conn && type == SOCK_DGRAM)) {
- cc = connect(so, (struct sockaddr *)saddr, namelen);
+ cc = connect_to(so, (struct sockaddr *)saddr, namelen,
+ timeout);
if (cc < 0)
goto err_connect;
}
@@ -262,7 +315,7 @@ static unsigned short getport(struct sockaddr_in *saddr,
* clnt*create() will create one anyway if this
* fails.
*/
- socket = get_socket(saddr, proto, FALSE, FALSE);
+ socket = get_socket(saddr, proto, PMAP_TIMEOUT, FALSE, FALSE);
if (socket == RPC_ANYSOCK) {
if (proto == IPPROTO_TCP && errno == ETIMEDOUT) {
/*
@@ -552,7 +605,8 @@ CLIENT *mnt_openclnt(clnt_addr_t *mnt_server, int *msock)
CLIENT *clnt = NULL;
mnt_saddr->sin_port = htons((u_short)mnt_pmap->pm_port);
- *msock = get_socket(mnt_saddr, mnt_pmap->pm_prot, TRUE, FALSE);
+ *msock = get_socket(mnt_saddr, mnt_pmap->pm_prot, MOUNT_TIMEOUT,
+ TRUE, FALSE);
if (*msock == RPC_ANYSOCK) {
if (rpc_createerr.cf_error.re_errno == EADDRINUSE)
/*
@@ -608,7 +662,7 @@ int clnt_ping(struct sockaddr_in *saddr, const unsigned long prog,
struct sockaddr dissolve;
rpc_createerr.cf_stat = stat = errno = 0;
- sock = get_socket(saddr, prot, FALSE, TRUE);
+ sock = get_socket(saddr, prot, CONNECT_TIMEOUT, FALSE, TRUE);
if (sock == RPC_ANYSOCK) {
if (errno == ETIMEDOUT) {
/*
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
* Re: [PATCH 05/10] mount.nfs: Shorter timeout for TCP connects
From: Neil Brown @ 2007-08-03 22:20 UTC (permalink / raw)
To: Chuck Lever; +Cc: nfs
On Friday August 3, chuck.lever@oracle.com wrote:
> The standard TCP connect timeout on Linux is 75 seconds, which can be
> too long in some cases. The timeout itself can be altered on a system-wide
> basis, but we'd like mount to have its own connect timeout that's tunable,
> and defaults to a shorter value.
75? "man 7 tcp" suggests about 180, and a simple telnet test confirmed this.
The man page also suggests that this is due to 5 SYN packets being
sent, but I count 6 and intervals of
3, 6, 12, 24, 48, then 96 seconds until it gives up
This makes a total of 189.
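The total can be checked with a few lines of C. This is only a sketch of the arithmetic, assuming the wait doubles after every unanswered SYN; syn_backoff_total() and its parameters are hypothetical names, not kernel tunables.

```c
/*
 * Sum the waits of a doubling SYN retransmission schedule.
 * first_wait: seconds waited after the first SYN (3 in the test above).
 * nwaits:     number of waits before connect() gives up (6 observed).
 * Both are illustrative parameters, not kernel settings.
 */
static unsigned int syn_backoff_total(unsigned int first_wait,
				      unsigned int nwaits)
{
	unsigned int total = 0;
	unsigned int wait = first_wait;
	unsigned int i;

	for (i = 0; i < nwaits; i++) {
		total += wait;	/* 3, 6, 12, 24, 48, 96 */
		wait *= 2;	/* each wait is twice the previous one */
	}
	return total;
}
```

Calling syn_backoff_total(3, 6) returns 189, matching the observed give-up time.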
You can change the number of SYN retries with the TCP_SYNCNT socket
option, but it is probably just as easy to use async connect and
select as you do.
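For comparison, the TCP_SYNCNT alternative might look like the sketch below. It is Linux-specific, and set_syn_retries() is a hypothetical helper that is not part of the patch or of nfs-utils.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Limit the number of SYN retransmissions for one TCP socket.
 * Must be called before connect(). Linux-specific option.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_syn_retries(int fd, int retries)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_SYNCNT,
			  &retries, sizeof(retries));
}
```

This bounds the connect time only indirectly, through the retransmission schedule, which is one reason a timeout expressed in seconds is easier to reason about.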
The sensible timeouts would seem to be (just more than)
3, 9, 21, 45, 93
seconds. You have chosen 10, 20, 30
Any reason for that? I'm having trouble justifying why the portmap,
the mountd and the 'ping' calls should have different timeouts.
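The 3, 9, 21, 45, 93 values are the running sums of the doubling waits, that is, the offsets at which each retransmitted SYN goes out. A small sketch of that calculation, assuming the same doubling schedule; syn_send_offsets() is a hypothetical name:

```c
#include <stddef.h>

/*
 * Fill out[0..n-1] with the offsets, in seconds from the first SYN,
 * at which each retransmitted SYN is sent, assuming the wait starts
 * at first_wait seconds and doubles each time.
 * Returns the number of entries written.
 */
static size_t syn_send_offsets(unsigned int first_wait, size_t n,
			       unsigned int *out)
{
	unsigned int offset = 0;
	unsigned int wait = first_wait;
	size_t i;

	for (i = 0; i < n; i++) {
		offset += wait;		/* running sum of the waits so far */
		out[i] = offset;
		wait *= 2;
	}
	return n;
}
```

With first_wait = 3 and n = 5 the array holds 3, 9, 21, 45, 93. A timeout just past one of these offsets gives the most recent SYN a moment to be answered before the connect is abandoned.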
I would probably go for 12 seconds each. This allows 2 SYN requests,
and 3 seconds for a response to get back. If that is too short, then
maybe 25 seconds.
Choosing timeouts is a very imprecise science, but I'd like to have at
least some understanding of why we pick the numbers we do.
What do you think?
Thanks,
NeilBrown
>
> The get_socket() function is a utility function that does TCP connects for
> getport, clnt_ping, and other functions. Add logic there to use a
> non-blocking connect() and select() in order to time out a connect
> operation that's taking too long.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>
> utils/mount/network.c | 64 +++++++++++++++++++++++++++++++++++++++++++++----
> 1 files changed, 59 insertions(+), 5 deletions(-)
>
> diff --git a/utils/mount/network.c b/utils/mount/network.c
> index 220bbce..89cb976 100644
> --- a/utils/mount/network.c
> +++ b/utils/mount/network.c
> @@ -52,6 +52,10 @@
> #define NFS_PORT 2049
> #endif
>
> +#define PMAP_TIMEOUT (10)
> +#define CONNECT_TIMEOUT (20)
> +#define MOUNT_TIMEOUT (30)
> +
* Re: [PATCH 05/10] mount.nfs: Shorter timeout for TCP connects
From: Chuck Lever @ 2007-08-04 0:29 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
Neil Brown wrote:
> On Friday August 3, chuck.lever@oracle.com wrote:
>> The standard TCP connect timeout on Linux is 75 seconds, which can be
>> too long in some cases. The timeout itself can be altered on a system-wide
>> basis, but we'd like mount to have its own connect timeout that's tunable,
>> and defaults to a shorter value.
>
> 75? "man 7 tcp" suggests about 180, and a simple telnet test confirmed this.
> The man page also suggests that this is due to 5 SYN packets being
> sent, but I count 6 and intervals of
>
> 3, 6, 12, 24, 48, then 96 seconds until it gives up
>
> This makes a total of 189.
The traditional TCP connect timeout is 75 seconds on *BSD, and my simple
test gave about 75 seconds on my distro. I'm not surprised that it varies.
Usually some consideration is given to the expected maximum possible
round-trip. Too short a connect timeout can prevent successful
connections to distant hosts.
> You can change the number of SYN retries with the TCP_SYNCNT socket
> option, but it is probably just as easy to use async connect and
> select as you do.
connect/select is recommended in Stevens' "UNIX Network Programming,
Vol. 1". I think it is recommended because this is the most portable way
to time out a TCP connect.
And, unlike SYNs, "seconds" directly tunes what a user will experience.
> The sensible timeouts would seem to be (just more than)
> 3, 9, 21, 45, 93
> seconds. You have chosen 10, 20, 30
>
> Any reason for that?
I don't really think it's necessary to cleave closely to the SYN behavior.
> I'm having trouble justifying why the portmap,
> the mountd and the 'ping' calls should have different timeouts.
I did this mostly to demonstrate that it could be done. I'm not
strongly attached to the idea of having different timeouts for different
purposes.
> I would probably go for 12 seconds each. This allows 2 SYN requests,
> and 3 seconds for a response to get back. If that is too short, then
> maybe 25 seconds.
GETPORT over UDP uses 3 second retries, and times out after 20 seconds.
The 20 second RPC timeout is determined by the TIMEOUT macro, and it
might be reasonable to make the equivalent TCP timeouts roughly the same.
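As a rough illustration of what those UDP numbers mean, the sketch below counts how many requests go out, assuming one is sent at t=0 and again at every multiple of the retry interval that falls before the overall timeout. The RPC library's actual behavior may differ in detail, and fixed_interval_sends() is a hypothetical name.

```c
/*
 * Number of requests sent with a fixed retry interval, assuming a send
 * at t = 0 and at each multiple of "interval" strictly before "deadline".
 * Both arguments are in seconds. This equals ceil(deadline / interval).
 */
static unsigned int fixed_interval_sends(unsigned int interval,
					 unsigned int deadline)
{
	if (interval == 0)
		return 0;	/* no meaningful schedule */
	return (deadline + interval - 1) / interval;
}
```

Under that assumption, a 3 second retry with a 20 second timeout gives fixed_interval_sends(3, 20) == 7 requests, sent at 0, 3, 6, 9, 12, 15 and 18 seconds.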
> Choosing timeouts is a very imprecise science, but I'd like to have at
> least some understanding of why we pick the numbers we do.
Adding some documentation in comments near the TIMEOUT definitions would
probably be useful for doing future tweaks.