Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client

Linux NFS development
 help / color / mirror / Atom feed

* Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
@ 2008-10-23  4:02 Michel Lespinasse
       [not found] ` <20081023040231.GA13512-Y93EPB1FQwg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Michel Lespinasse @ 2008-10-23  4:02 UTC (permalink / raw)
  To: linux-nfs

I sent this out to LKML earlier but was told this would be a better list.
This has been mentionned in bugzilla already, but I'd like to draw attention
before it gets too late for 2.6.28 (or is it already too late ???)

The following is a common cause of 5-minute NFS hangs here:

* Client has TCP connections to the NFS server, goes to S3 sleep for few hours.
* TCP connections die on the server side.
  (not 100% sure why, do they use some kind of keepalive ???)
* Client resumes from S3.
* Client sends NFS requests down its TCP connections, gets back RST packet.
* [Client hangs for exactly 300 seconds here]
* Client establishes new TCP connections to the NFS server,
  and recovers from the hang.

A tcpdump trace is attached at the end of bugzilla bug 11154:
http://bugzilla.kernel.org/show_bug.cgi?id=11154

Should the client immediately try to reconnect when its existing connection
receives an RST packet ? (the 5 minute delay would make sense to me if
RST was received in reply to a SYN, but I'm not sure about it in the case
of an existing open TCP connection).

If the 5 minute delay after an RST is necessary, could the client avoid it
by explicitly closing/reopening its connections using suspend/resume hooks ?

(I can not work around the issue locally by mounting/unmounting my NFS
shares around the suspend/resume because rootfs also on NFS...)

This NFS setup was working fine in 2.6.24. There has been issues with
2.6.25 and 2.6.26, but I did not confirm if they are the same bug.
2.6.25 usualy recovers after some variable delay and 2.6.26 usualy
does not recover. Bugs 11154 and 11061 have more details about this,
also Ian Campbell has been tracking an NFS issue under load that
appeared at around the same time.

Hope this helps,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
       [not found] ` <20081023040231.GA13512-Y93EPB1FQwg@public.gmane.org>
@ 2008-10-23 15:36   ` Trond Myklebust
  2008-10-23 19:52     ` Michel Lespinasse
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2008-10-23 15:36 UTC (permalink / raw)
  To: Michel Lespinasse; +Cc: linux-nfs

On Wed, 2008-10-22 at 21:02 -0700, Michel Lespinasse wrote:
> I sent this out to LKML earlier but was told this would be a better list.
> This has been mentionned in bugzilla already, but I'd like to draw attention
> before it gets too late for 2.6.28 (or is it already too late ???)
> 
> The following is a common cause of 5-minute NFS hangs here:
> 
> * Client has TCP connections to the NFS server, goes to S3 sleep for few hours.
> * TCP connections die on the server side.
>   (not 100% sure why, do they use some kind of keepalive ???)
> * Client resumes from S3.
> * Client sends NFS requests down its TCP connections, gets back RST packet.
> * [Client hangs for exactly 300 seconds here]
> * Client establishes new TCP connections to the NFS server,
>   and recovers from the hang.
> 
> A tcpdump trace is attached at the end of bugzilla bug 11154:
> http://bugzilla.kernel.org/show_bug.cgi?id=11154
> 
> Should the client immediately try to reconnect when its existing connection
> receives an RST packet ? (the 5 minute delay would make sense to me if
> RST was received in reply to a SYN, but I'm not sure about it in the case
> of an existing open TCP connection).
> 
> If the 5 minute delay after an RST is necessary, could the client avoid it
> by explicitly closing/reopening its connections using suspend/resume hooks ?
> 
> (I can not work around the issue locally by mounting/unmounting my NFS
> shares around the suspend/resume because rootfs also on NFS...)
> 
> This NFS setup was working fine in 2.6.24. There has been issues with
> 2.6.25 and 2.6.26, but I did not confirm if they are the same bug.
> 2.6.25 usualy recovers after some variable delay and 2.6.26 usualy
> does not recover. Bugs 11154 and 11061 have more details about this,
> also Ian Campbell has been tracking an NFS issue under load that
> appeared at around the same time.
> 
> Hope this helps,

Does the appended patch make a difference?

Cheers
  Trond
--------------------------------------------------------------------------------------
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Thu, 23 Oct 2008 11:33:59 -0400
SUNRPC: Respond promptly to server TCP resets

If the server sends us an RST error while we're in the TCP_ESTABLISHED
state, then that will not result in a state change, and so the RPC client
ends up hanging forever (see
http://bugzilla.kernel.org/show_bug.cgi?id=11154)

We can intercept the reset by setting up an sk->sk_error_report callback,
which will then allow us to initiate a proper shutdown and retry...

We also make sure that if the send request receives an ECONNRESET, then we
shutdown too...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/xprtsock.c |   58 +++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 9a288d5..0a50361 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -249,6 +249,7 @@ struct sock_xprt {
 	void			(*old_data_ready)(struct sock *, int);
 	void			(*old_state_change)(struct sock *);
 	void			(*old_write_space)(struct sock *);
+	void			(*old_error_report)(struct sock *);
 };
 
 /*
@@ -698,8 +699,9 @@ static int xs_tcp_send_request(struct rpc_task *task)
 	case -EAGAIN:
 		xs_nospace(task);
 		break;
-	case -ECONNREFUSED:
 	case -ECONNRESET:
+		xs_tcp_shutdown(xprt);
+	case -ECONNREFUSED:
 	case -ENOTCONN:
 	case -EPIPE:
 		status = -ENOTCONN;
@@ -742,6 +744,22 @@ out_release:
 	xprt_release_xprt(xprt, task);
 }
 
+static void xs_save_old_callbacks(struct sock_xprt *transport, struct sock *sk)
+{
+	transport->old_data_ready = sk->sk_data_ready;
+	transport->old_state_change = sk->sk_state_change;
+	transport->old_write_space = sk->sk_write_space;
+	transport->old_error_report = sk->sk_error_report;
+}
+
+static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *sk)
+{
+	sk->sk_data_ready = transport->old_data_ready;
+	sk->sk_state_change = transport->old_state_change;
+	sk->sk_write_space = transport->old_write_space;
+	sk->sk_error_report = transport->old_error_report;
+}
+
 /**
  * xs_close - close a socket
  * @xprt: transport
@@ -765,9 +783,8 @@ static void xs_close(struct rpc_xprt *xprt)
 	transport->sock = NULL;
 
 	sk->sk_user_data = NULL;
-	sk->sk_data_ready = transport->old_data_ready;
-	sk->sk_state_change = transport->old_state_change;
-	sk->sk_write_space = transport->old_write_space;
+
+	xs_restore_old_callbacks(transport, sk);
 	write_unlock_bh(&sk->sk_callback_lock);
 
 	sk->sk_no_check = 0;
@@ -1180,6 +1197,28 @@ static void xs_tcp_state_change(struct sock *sk)
 }
 
 /**
+ * xs_tcp_error_report - callback mainly for catching RST events
+ * @sk: socket
+ */
+static void xs_tcp_error_report(struct sock *sk)
+{
+	struct rpc_xprt *xprt;
+
+	read_lock(&sk->sk_callback_lock);
+	if (sk->sk_err != ECONNRESET || sk->sk_state != TCP_ESTABLISHED)
+		goto out;
+	if (!(xprt = xprt_from_sock(sk)))
+		goto out;
+	dprintk("RPC:       %s client %p...\n"
+			"RPC:       error %d\n",
+			__func__, xprt, sk->sk_err);
+
+	xprt_force_disconnect(xprt);
+out:
+	read_unlock(&sk->sk_callback_lock);
+}
+
+/**
  * xs_udp_write_space - callback invoked when socket buffer space
  *                             becomes available
  * @sk: socket whose state has changed
@@ -1454,10 +1493,9 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 
 		write_lock_bh(&sk->sk_callback_lock);
 
+		xs_save_old_callbacks(transport, sk);
+
 		sk->sk_user_data = xprt;
-		transport->old_data_ready = sk->sk_data_ready;
-		transport->old_state_change = sk->sk_state_change;
-		transport->old_write_space = sk->sk_write_space;
 		sk->sk_data_ready = xs_udp_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
 		sk->sk_no_check = UDP_CSUM_NORCV;
@@ -1589,13 +1627,13 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 
 		write_lock_bh(&sk->sk_callback_lock);
 
+		xs_save_old_callbacks(transport, sk);
+
 		sk->sk_user_data = xprt;
-		transport->old_data_ready = sk->sk_data_ready;
-		transport->old_state_change = sk->sk_state_change;
-		transport->old_write_space = sk->sk_write_space;
 		sk->sk_data_ready = xs_tcp_data_ready;
 		sk->sk_state_change = xs_tcp_state_change;
 		sk->sk_write_space = xs_tcp_write_space;
+		sk->sk_error_report = xs_tcp_error_report;
 		sk->sk_allocation = GFP_ATOMIC;
 
 		/* socket options */



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
  2008-10-23 15:36   ` Trond Myklebust
@ 2008-10-23 19:52     ` Michel Lespinasse
       [not found]       ` <20081023195231.GA2090-Y93EPB1FQwg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Michel Lespinasse @ 2008-10-23 19:52 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

Hi,

On Thu, Oct 23, 2008 at 11:36:47AM -0400, Trond Myklebust wrote:
> Does the appended patch make a difference?
> 
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> Date: Thu, 23 Oct 2008 11:33:59 -0400
> SUNRPC: Respond promptly to server TCP resets

I applied it over a 2.6.27.3 base, suspended the client for 40 minutes
and resumed it, logging what happens from the server side. The resume
went like this:

12:38:53.692785 IP client.329262748 > server.nfs: 100 getattr [|nfs]
12:38:53.699885 arp who-has client tell server
12:38:54.123793 IP client.329262748 > server.nfs: 100 getattr [|nfs]
12:38:54.695888 arp who-has client tell server
12:38:54.696011 arp reply client is-at 00:19:d1:54:0e:39 (oui Unknown)
12:38:54.696020 IP server.nfs > client.882: R 2944642919:2944642919(0) win 0
12:38:54.696024 IP server.nfs > client.882: R 2944642919:2944642919(0) win 0

(I'm still concerned about the 3 second delay here...)

12:38:57.695956 IP client.2 > server.nfs: 0 null
12:38:57.696004 IP server.nfs > client.2: reply ERR 0 null
12:38:57.696133 IP client.882 > server.nfs: . ack 931551254 win 183 <nop,nop,timestamp 448858 1116460165>
12:38:57.696159 IP client.362817180 > server.nfs: 100 getattr [|nfs]
12:38:57.696179 IP server.nfs > client.882: . ack 64732 win 91 <nop,nop,timestamp 1116460165 448858>
12:38:57.696189 IP client.346039964 > server.nfs: 100 getattr [|nfs]
12:38:57.696205 IP server.nfs > client.882: . ack 64832 win 91 <nop,nop,timestamp 1116460165 448858>
12:38:57.696294 IP client.329262748 > server.nfs: 300 getattr [|nfs]
12:38:57.696307 IP server.nfs > client.882: . ack 65132 win 108 <nop,nop,timestamp 1116460165 448858>
12:38:57.765764 IP server.nfs > client.362817180: reply ok 116 getattr [|nfs]
12:38:57.765833 IP server.nfs > client.346039964: reply ok 116 getattr [|nfs]
12:38:57.765879 IP client.882 > server.nfs: . ack 931551370 win 183 <nop,nop,timestamp 448875 1116460182>
12:38:57.765903 IP server.nfs > client.329262748: reply ok 116 getattr [|nfs]
12:38:57.765909 IP client.882 > server.nfs: . ack 931551486 win 183 <nop,nop,timestamp 448875 1116460182>
12:38:57.765933 IP server.nfs > client.379594396: reply ok 116
12:38:57.765953 IP client.882 > server.nfs: . ack 931551602 win 183 <nop,nop,timestamp 448875 1116460182>
12:38:57.766008 IP client.882 > server.nfs: . ack 931551718 win 183 <nop,nop,timestamp 448875 1116460182>
12:38:57.766058 IP server.nfs > client.396371612: reply ok 116
12:38:57.766108 IP client.413148828 > server.nfs: 100 getattr [|nfs]
12:38:57.766149 IP server.nfs > client.882: . ack 65232 win 108 <nop,nop,timestamp 1116460182 448875>
12:38:57.766176 IP server.nfs > client.413148828: reply ok 116 getattr [|nfs]
12:38:57.766274 IP client.882 > server.nfs: . ack 931551950 win 183 <nop,nop,timestamp 448875 1116460182>
12:38:57.769130 IP client.429926044 > server.nfs: 100 getattr [|nfs]
12:38:57.769188 IP server.nfs > client.429926044: reply ok 116 getattr [|nfs]
12:38:57.769418 IP client.446703260 > server.nfs: 100 getattr [|nfs]
12:38:57.769473 IP server.nfs > client.446703260: reply ok 116 getattr [|nfs]

I'll have more results tonight when I try resuming again. If the delay
goes to 6 seconds then the issue is probably still there... I'll report
here either way.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
       [not found]       ` <20081023195231.GA2090-Y93EPB1FQwg@public.gmane.org>
@ 2008-10-23 23:17         ` Trond Myklebust
  2008-10-24  6:57           ` Michel Lespinasse
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2008-10-23 23:17 UTC (permalink / raw)
  To: Michel Lespinasse; +Cc: linux-nfs

On Thu, 2008-10-23 at 12:52 -0700, Michel Lespinasse wrote:
> Hi,
> 
> On Thu, Oct 23, 2008 at 11:36:47AM -0400, Trond Myklebust wrote:
> > Does the appended patch make a difference?
> > 
> > From: Trond Myklebust <Trond.Myklebust@netapp.com>
> > Date: Thu, 23 Oct 2008 11:33:59 -0400
> > SUNRPC: Respond promptly to server TCP resets
> 
> I applied it over a 2.6.27.3 base, suspended the client for 40 minutes
> and resumed it, logging what happens from the server side. The resume
> went like this:
> 
> 12:38:53.692785 IP client.329262748 > server.nfs: 100 getattr [|nfs]
> 12:38:53.699885 arp who-has client tell server
> 12:38:54.123793 IP client.329262748 > server.nfs: 100 getattr [|nfs]
> 12:38:54.695888 arp who-has client tell server
> 12:38:54.696011 arp reply client is-at 00:19:d1:54:0e:39 (oui Unknown)
> 12:38:54.696020 IP server.nfs > client.882: R 2944642919:2944642919(0) win 0
> 12:38:54.696024 IP server.nfs > client.882: R 2944642919:2944642919(0) win 0
> 
> (I'm still concerned about the 3 second delay here...)
> 
> 12:38:57.695956 IP client.2 > server.nfs: 0 null

Does this patch fix that delay?

Cheers
  Trond

-------------------------------------------------------------------------------------
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Thu, 23 Oct 2008 19:14:55 -0400
SUNRPC: Fix the setting of xprt->reestablish_timeout when reconnecting

If the server aborts an established connection, then we should retry
connecting immediately. Since xprt->reestablish_timeout is not reset unless
we go through a TCP_FIN_WAIT1 state, we may end waiting.
The fix is to reset xprt->reestablish_timeout in TCP_ESTABLISHED, and then
rely on the fact that we set it to non-zero values in all other cases when
the server closes the connection.

Also fix a race between xs_connect() and xs_tcp_state_change(). The
value of xprt->reestablish_timeout should be updated before we actually
attempt the connection.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/xprtsock.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 0a50361..ac2aa52 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1144,6 +1144,8 @@ static void xs_tcp_state_change(struct sock *sk)
 			struct sock_xprt *transport = container_of(xprt,
 					struct sock_xprt, xprt);
 
+			xprt->reestablish_timeout = 0;
+
 			/* Reset TCP record info */
 			transport->tcp_offset = 0;
 			transport->tcp_reclen = 0;
@@ -1158,7 +1160,6 @@ static void xs_tcp_state_change(struct sock *sk)
 	case TCP_FIN_WAIT1:
 		/* The client initiated a shutdown of the socket */
 		xprt->connect_cookie++;
-		xprt->reestablish_timeout = 0;
 		set_bit(XPRT_CLOSING, &xprt->state);
 		smp_mb__before_clear_bit();
 		clear_bit(XPRT_CONNECTED, &xprt->state);
@@ -1793,6 +1794,7 @@ static void xs_connect(struct rpc_task *task)
 {
 	struct rpc_xprt *xprt = task->tk_xprt;
 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+	unsigned long timeout;
 
 	if (xprt_test_and_set_connecting(xprt))
 		return;
@@ -1801,12 +1803,12 @@ static void xs_connect(struct rpc_task *task)
 		dprintk("RPC:       xs_connect delayed xprt %p for %lu "
 				"seconds\n",
 				xprt, xprt->reestablish_timeout / HZ);
-		queue_delayed_work(rpciod_workqueue,
-				   &transport->connect_worker,
-				   xprt->reestablish_timeout);
+		timeout = xprt->reestablish_timeout;
 		xprt->reestablish_timeout <<= 1;
 		if (xprt->reestablish_timeout > XS_TCP_MAX_REEST_TO)
 			xprt->reestablish_timeout = XS_TCP_MAX_REEST_TO;
+		queue_delayed_work(rpciod_workqueue,
+				   &transport->connect_worker, timeout);
 	} else {
 		dprintk("RPC:       xs_connect scheduled xprt %p\n", xprt);
 		queue_delayed_work(rpciod_workqueue,



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
  2008-10-23 23:17         ` Trond Myklebust
@ 2008-10-24  6:57           ` Michel Lespinasse
       [not found]             ` <20081024065759.GA2401-Y93EPB1FQwg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Michel Lespinasse @ 2008-10-24  6:57 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Thu, Oct 23, 2008 at 07:17:59PM -0400, Trond Myklebust wrote:
> > (I'm still concerned about the 3 second delay here...)
> Does this patch fix that delay?
> 
> SUNRPC: Fix the setting of xprt->reestablish_timeout when reconnecting

I applied this on top of the previous patch and it worked - but now I'm
not sure if you wanted to test this as an independant patch ???

I'm wondering how the code in xs_tcp_state_change() that sets
reestablish_timeout back to XS_TCP_INIT_REEST_TO managed to not cause trouble.


Can I propose a patch too ? mine looks quite similar to your second patch,
but with the reestablish_timeout logic hopefully simplified...



SUNRPC: Fix the setting of xprt->reestablish_timeout when reconnecting

The proposed logic works as follows:

* The reestablish_timeout is initially set to zero (no delay before the
  initial connection). It is also reset to zero upon establishing a
  successful TCP connection (no delay next time we try to reopen the
  connection, whether it got closed by the client or the server side).
  This fixes an issue where we would wait 5-minute after receiving an RST
  from the server rather than retrying immediately (bug 11154).

* xs_connect() delays the (re)connection by reestablish_timeout jiffies.
  reestablish_timeout for the *next* connection attempt is also
  updated to a nonzero value at that point (constant for UDP case,
  exponential backoff for TCP case). There used to be an issue where
  we would apply the exponential backoff even in the UDP case.

Signed-off-by: Michel Lespinasse <walken-Y93EPB1FQwg@public.gmane.org>
---

--- net/sunrpc/xprtsock.c.orig	2008-10-23 10:43:31.000000000 -0700
+++ net/sunrpc/xprtsock.c	2008-10-23 22:10:50.000000000 -0700
@@ -1134,6 +1134,7 @@
 			transport->tcp_flags =
 				TCP_RCV_COPY_FRAGHDR | TCP_RCV_COPY_XID;
 
+			xprt->reestablish_timeout = 0;
 			xprt_wake_pending_tasks(xprt, 0);
 		}
 		spin_unlock_bh(&xprt->transport_lock);
@@ -1141,7 +1142,6 @@
 	case TCP_FIN_WAIT1:
 		/* The client initiated a shutdown of the socket */
 		xprt->connect_cookie++;
-		xprt->reestablish_timeout = 0;
 		set_bit(XPRT_CLOSING, &xprt->state);
 		smp_mb__before_clear_bit();
 		clear_bit(XPRT_CONNECTED, &xprt->state);
@@ -1154,13 +1154,6 @@
 		xprt_force_disconnect(xprt);
 	case TCP_SYN_SENT:
 		xprt->connect_cookie++;
-	case TCP_CLOSING:
-		/*
-		 * If the server closed down the connection, make sure that
-		 * we back off before reconnecting
-		 */
-		if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
-			xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
 		break;
 	case TCP_LAST_ACK:
 		smp_mb__before_clear_bit();
@@ -1755,25 +1748,25 @@
 {
 	struct rpc_xprt *xprt = task->tk_xprt;
 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+	unsigned long timeout;
 
 	if (xprt_test_and_set_connecting(xprt))
 		return;
 
-	if (transport->sock != NULL) {
-		dprintk("RPC:       xs_connect delayed xprt %p for %lu "
-				"seconds\n",
-				xprt, xprt->reestablish_timeout / HZ);
-		queue_delayed_work(rpciod_workqueue,
-				   &transport->connect_worker,
-				   xprt->reestablish_timeout);
+	timeout = xprt->reestablish_timeout;
+	if (xprt->prot != IPPROTO_TCP)
+		xprt->reestablish_timeout = XS_UDP_REEST_TO;
+	else if (xprt->reestablish_timeout == 0)
+		xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
+	else {
 		xprt->reestablish_timeout <<= 1;
 		if (xprt->reestablish_timeout > XS_TCP_MAX_REEST_TO)
 			xprt->reestablish_timeout = XS_TCP_MAX_REEST_TO;
-	} else {
-		dprintk("RPC:       xs_connect scheduled xprt %p\n", xprt);
-		queue_delayed_work(rpciod_workqueue,
-				   &transport->connect_worker, 0);
 	}
+	dprintk("RPC:       xs_connect delayed xprt %p for %lu seconds\n",
+		xprt, timeout / HZ);
+	queue_delayed_work(rpciod_workqueue, &transport->connect_worker,
+			   timeout);
 }
 
 static void xs_tcp_connect(struct rpc_task *task)
@@ -1935,7 +1928,7 @@
 
 	xprt->bind_timeout = XS_BIND_TO;
 	xprt->connect_timeout = XS_UDP_CONN_TO;
-	xprt->reestablish_timeout = XS_UDP_REEST_TO;
+	xprt->reestablish_timeout = 0;
 	xprt->idle_timeout = XS_IDLE_DISC_TO;
 
 	xprt->ops = &xs_udp_ops;
@@ -2003,7 +1996,7 @@
 
 	xprt->bind_timeout = XS_BIND_TO;
 	xprt->connect_timeout = XS_TCP_CONN_TO;
-	xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
+	xprt->reestablish_timeout = 0;
 	xprt->idle_timeout = XS_IDLE_DISC_TO;
 
 	xprt->ops = &xs_tcp_ops;


Hope this helps,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
       [not found]             ` <20081024065759.GA2401-Y93EPB1FQwg@public.gmane.org>
@ 2008-10-24 12:29               ` Trond Myklebust
  2008-10-24 21:02                 ` Michel Lespinasse
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2008-10-24 12:29 UTC (permalink / raw)
  To: Michel Lespinasse; +Cc: linux-nfs

On Thu, 2008-10-23 at 23:57 -0700, Michel Lespinasse wrote:
> On Thu, Oct 23, 2008 at 07:17:59PM -0400, Trond Myklebust wrote:
> > > (I'm still concerned about the 3 second delay here...)
> > Does this patch fix that delay?
> > 
> > SUNRPC: Fix the setting of xprt->reestablish_timeout when reconnecting
> 
> I applied this on top of the previous patch and it worked - but now I'm
> not sure if you wanted to test this as an independant patch ???

It was meant to be applied incrementally on top of the other.

> I'm wondering how the code in xs_tcp_state_change() that sets
> reestablish_timeout back to XS_TCP_INIT_REEST_TO managed to not cause trouble.
> 
> 
> Can I propose a patch too ? mine looks quite similar to your second patch,
> but with the reestablish_timeout logic hopefully simplified...
> 

This would cause a different regression. The current code is there in
order to ensure that we apply that exponential backoff if and only if
the _server_ closes the TCP connection since that would usually indicate
that it is trying to deal with a resource congestion issue. We don't
need to back off if we were the ones closing the socket.

The issue of UDP exponential backoff is moot: the UDP code doesn't use
xs_connect() at all.

Cheers
  Trond


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
  2008-10-24 12:29               ` Trond Myklebust
@ 2008-10-24 21:02                 ` Michel Lespinasse
  0 siblings, 0 replies; 7+ messages in thread
From: Michel Lespinasse @ 2008-10-24 21:02 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Fri, Oct 24, 2008 at 08:29:28AM -0400, Trond Myklebust wrote:
> On Thu, 2008-10-23 at 23:57 -0700, Michel Lespinasse wrote:
> > I applied this on top of the previous patch and it worked - but now I'm
> > not sure if you wanted to test this as an independant patch ???
> It was meant to be applied incrementally on top of the other.

OK. Worked fine, then. Thanks again !

> > Can I propose a patch too ? mine looks quite similar to your second patch,
> > but with the reestablish_timeout logic hopefully simplified...
> 
> This would cause a different regression. The current code is there in
> order to ensure that we apply that exponential backoff if and only if
> the _server_ closes the TCP connection since that would usually indicate
> that it is trying to deal with a resource congestion issue. We don't
> need to back off if we were the ones closing the socket.

So the idea is to backoff if the connection is closed by a server FIN but
retry right away if it's closed by a RST ?
I did not realize that, but this makes sense too...

> The issue of UDP exponential backoff is moot: the UDP code doesn't use
> xs_connect() at all.

Hrmmm ? I see the following down the file:
static struct rpc_xprt_ops xs_udp_ops = {
...
        .connect                = xs_connect,
...
}

I'm not sure if/when the callback is actually called in the UDP case,
but it's definitely being set up that way.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-10-24 21:02 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-10-23  4:02 Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client Michel Lespinasse
     [not found] ` <20081023040231.GA13512-Y93EPB1FQwg@public.gmane.org>
2008-10-23 15:36   ` Trond Myklebust
2008-10-23 19:52     ` Michel Lespinasse
     [not found]       ` <20081023195231.GA2090-Y93EPB1FQwg@public.gmane.org>
2008-10-23 23:17         ` Trond Myklebust
2008-10-24  6:57           ` Michel Lespinasse
     [not found]             ` <20081024065759.GA2401-Y93EPB1FQwg@public.gmane.org>
2008-10-24 12:29               ` Trond Myklebust
2008-10-24 21:02                 ` Michel Lespinasse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox