From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Michel Lespinasse <walken-Y93EPB1FQwg@public.gmane.org>
Cc: linux-nfs@vger.kernel.org
Subject: Re: Fwd: NFS 5-minute hangs upon S3 resume using 2.6.27 client
Date: Thu, 23 Oct 2008 19:17:59 -0400
Message-ID: <1224803879.7625.79.camel@localhost>
In-Reply-To: <20081023195231.GA2090-Y93EPB1FQwg@public.gmane.org>

On Thu, 2008-10-23 at 12:52 -0700, Michel Lespinasse wrote:
> Hi,
> 
> On Thu, Oct 23, 2008 at 11:36:47AM -0400, Trond Myklebust wrote:
> > Does the appended patch make a difference?
> > 
> > From: Trond Myklebust <Trond.Myklebust@netapp.com>
> > Date: Thu, 23 Oct 2008 11:33:59 -0400
> > SUNRPC: Respond promptly to server TCP resets
> 
> I applied it over a 2.6.27.3 base, suspended the client for 40 minutes
> and resumed it, logging what happens from the server side. The resume
> went like this:
> 
> 12:38:53.692785 IP client.329262748 > server.nfs: 100 getattr [|nfs]
> 12:38:53.699885 arp who-has client tell server
> 12:38:54.123793 IP client.329262748 > server.nfs: 100 getattr [|nfs]
> 12:38:54.695888 arp who-has client tell server
> 12:38:54.696011 arp reply client is-at 00:19:d1:54:0e:39 (oui Unknown)
> 12:38:54.696020 IP server.nfs > client.882: R 2944642919:2944642919(0) win 0
> 12:38:54.696024 IP server.nfs > client.882: R 2944642919:2944642919(0) win 0
> 
> (I'm still concerned about the 3 second delay here...)
> 
> 12:38:57.695956 IP client.2 > server.nfs: 0 null

Does this patch fix that delay?

Cheers
  Trond

-------------------------------------------------------------------------------------
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Thu, 23 Oct 2008 19:14:55 -0400
SUNRPC: Fix the setting of xprt->reestablish_timeout when reconnecting

If the server aborts an established connection, then we should retry
connecting immediately. Since xprt->reestablish_timeout is not reset unless
we go through a TCP_FIN_WAIT1 state, we may end up waiting.
The fix is to reset xprt->reestablish_timeout in TCP_ESTABLISHED, and then
rely on the fact that we set it to non-zero values in all other cases when
the server closes the connection.

Also fix a race between xs_connect() and xs_tcp_state_change(). The
value of xprt->reestablish_timeout should be updated before we actually
attempt the connection.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/xprtsock.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 0a50361..ac2aa52 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1144,6 +1144,8 @@ static void xs_tcp_state_change(struct sock *sk)
 			struct sock_xprt *transport = container_of(xprt,
 					struct sock_xprt, xprt);
 
+			xprt->reestablish_timeout = 0;
+
 			/* Reset TCP record info */
 			transport->tcp_offset = 0;
 			transport->tcp_reclen = 0;
@@ -1158,7 +1160,6 @@ static void xs_tcp_state_change(struct sock *sk)
 	case TCP_FIN_WAIT1:
 		/* The client initiated a shutdown of the socket */
 		xprt->connect_cookie++;
-		xprt->reestablish_timeout = 0;
 		set_bit(XPRT_CLOSING, &xprt->state);
 		smp_mb__before_clear_bit();
 		clear_bit(XPRT_CONNECTED, &xprt->state);
@@ -1793,6 +1794,7 @@ static void xs_connect(struct rpc_task *task)
 {
 	struct rpc_xprt *xprt = task->tk_xprt;
 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+	unsigned long timeout;
 
 	if (xprt_test_and_set_connecting(xprt))
 		return;
@@ -1801,12 +1803,12 @@ static void xs_connect(struct rpc_task *task)
 		dprintk("RPC:       xs_connect delayed xprt %p for %lu "
 				"seconds\n",
 				xprt, xprt->reestablish_timeout / HZ);
-		queue_delayed_work(rpciod_workqueue,
-				   &transport->connect_worker,
-				   xprt->reestablish_timeout);
+		timeout = xprt->reestablish_timeout;
 		xprt->reestablish_timeout <<= 1;
 		if (xprt->reestablish_timeout > XS_TCP_MAX_REEST_TO)
 			xprt->reestablish_timeout = XS_TCP_MAX_REEST_TO;
+		queue_delayed_work(rpciod_workqueue,
+				   &transport->connect_worker, timeout);
 	} else {
 		dprintk("RPC:       xs_connect scheduled xprt %p\n", xprt);
 		queue_delayed_work(rpciod_workqueue,


