From: Neil Brown <neilb@suse.de>
To: Chuck Lever <chuck.lever@oracle.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH] SUNRPC: reset TCP reconnect exponential back-off on successful connection.
Date: Fri, 17 Jul 2009 17:53:37 +1000
Message-ID: <19040.11777.346898.322780@notabene.brown>


Hi.
 A customer of ours has been testing NFS failover and has been
 experiencing unexpected delays before the client starts writing
 again.  It turns out there are a number of issues here, some on the
 client and some on the server.

 This patch fixes two client issues, one that causes the failover time
 to double on each migration (or each time the NFS server is stopped
 and restarted), and one that causes the client to spam the server
 with SYN requests until it accepts the connection (I have a trace
 showing over 100 SYN requests, each followed by a RST,ACK reply, in
 the space of 300 milliseconds).

 I am able to simulate the first failure and have tested that the
 patch fixes it.  I have not managed to simulate the second failure,
 but I think that fix is clearly safe.

 I'm not sure that the patch fits the original definition for -stable,
 but it seems to fit the current practice and I would appreciate it if
 (assuming the patch passes review) it could be submitted for -stable.

Thanks,
NeilBrown



The sunrpc/TCP transport has an exponential back-off for reconnection,
starting at 3 seconds and with a maximum of 300 seconds.  On every
connection attempt the timeout is doubled.
It is only reset when the client deliberately closes the connection.
If the server closes the connection but a subsequent reconnect
succeeds, the timeout remains elevated.

This means that if the server resets the connection several times, as
can happen with server migration in a clustered environment, each
reconnect takes longer than the previous one - unnecessarily so.

This patch resets the timeout on a successful connection so that every
time the server resets the connection we start with a basic 3 second
timeout.

The reverse problem is also possible.  When the client closes the
connection it sets the timeout to 0 (so that a reconnect - when
required - is instant).  When 0 is doubled it remains at 0, so if the
server refuses the reconnect, the client will try again instantly and
indefinitely.  To avoid this we ensure that, after doubling, the
timeout is at least the minimum (XS_TCP_INIT_REEST_TO).

Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
---
 net/sunrpc/xprtsock.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
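
For reviewers, the back-off arithmetic described above can be sketched
in user space.  This is only an illustration under stated assumptions,
not the kernel code: the constants are expressed in seconds rather than
jiffies, and the names INIT_REEST_TO, MAX_REEST_TO,
next_reestablish_timeout() and timeout_after_connect() are hypothetical
stand-ins for the logic in xs_connect() and xs_tcp_state_change().

```c
/* Hypothetical stand-ins for XS_TCP_INIT_REEST_TO and XS_TCP_MAX_REEST_TO,
 * in seconds instead of jiffies (assumption made for readability). */
#define INIT_REEST_TO 3UL
#define MAX_REEST_TO  300UL

/*
 * Advance the back-off after a connect attempt has been scheduled:
 * double the timeout, then clamp it to [INIT_REEST_TO, MAX_REEST_TO].
 * The lower clamp is what stops a timeout of 0 from staying at 0.
 */
static unsigned long next_reestablish_timeout(unsigned long timeout)
{
	timeout <<= 1;
	if (timeout < INIT_REEST_TO)
		timeout = INIT_REEST_TO;
	if (timeout > MAX_REEST_TO)
		timeout = MAX_REEST_TO;
	return timeout;
}

/*
 * Value the timeout takes once a connection is established, so that a
 * later server-side reset starts again from the basic back-off.
 */
static unsigned long timeout_after_connect(void)
{
	return INIT_REEST_TO;
}
```

Starting from 0 (the value left by a client-initiated close), repeated
calls to next_reestablish_timeout() give 3, 6, 12, 24, 48, 96, 192, 300
and then stay at 300, instead of remaining at 0.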

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 83c73c4..b032e06 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1403,6 +1403,7 @@ static void xs_tcp_state_change(struct sock *sk)
 				TCP_RCV_COPY_FRAGHDR | TCP_RCV_COPY_XID;
 
 			xprt_wake_pending_tasks(xprt, -EAGAIN);
+			xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
 		}
 		spin_unlock_bh(&xprt->transport_lock);
 		break;
@@ -2090,6 +2091,8 @@ static void xs_connect(struct rpc_task *task)
 				   &transport->connect_worker,
 				   xprt->reestablish_timeout);
 		xprt->reestablish_timeout <<= 1;
+		if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
+			xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
 		if (xprt->reestablish_timeout > XS_TCP_MAX_REEST_TO)
 			xprt->reestablish_timeout = XS_TCP_MAX_REEST_TO;
 	} else {
-- 
1.6.3.3

