From: Simon Kirby <sim@hostway.ca>
To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Cc: Yan-Pai Chen <yanpai.chen@gmail.com>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [3.2.5] NFSv3 CLOSE_WAIT hang
Date: Wed, 19 Sep 2012 15:01:48 -0700 [thread overview]
Message-ID: <20120919220148.GA8249@hostway.ca> (raw)
In-Reply-To: <4FA345DA4F4AE44899BD2B03EEEC2FA908F9F734@SACEXCMBX04-PRD.hq.netapp.com>
On Tue, Sep 11, 2012 at 10:17:25PM +0000, Myklebust, Trond wrote:
> Does the "if (test_and_set_bit(XPRT_LOCK) == 0)" condition immediately
> following that succeed so that queue_work() is called?
Yes, it seems to:
[146957.793093] RPC: server initated shutdown -- state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[146957.793418] xprt_force_disconnect(): setting XPRT_CLOSE_WAIT
[146957.799288] xprt_force_disconnect(): setting XPRT_LOCKED worked, calling queue_work()
On Wed, Sep 12, 2012 at 08:54:14PM +0000, Myklebust, Trond wrote:
> On Tue, 2012-09-11 at 18:17 -0400, Trond Myklebust wrote:
>
> Hi Simon,
>
> Can you try the following patch, and see if it addresses the TCP "server
> hangs" case?
>
> Cheers
> Trond
> 8<----------------------------------------------------------------------
> From 99330d09cc1074fbdc64089fa0a3f8dbdc74daaf Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> Date: Wed, 12 Sep 2012 16:49:15 -0400
> Subject: [PATCH] SUNRPC: Ensure that the TCP socket is closed when in
> CLOSE_WAIT
>
> Instead of doing a shutdown() call, we need to do an actual close().
> Ditto if/when the server is sending us junk RPC headers.
>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> net/sunrpc/xprtsock.c | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index a35b8e5..d1988cf 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1025,6 +1025,16 @@ static void xs_udp_data_ready(struct sock *sk, int len)
> read_unlock_bh(&sk->sk_callback_lock);
> }
>
> +/*
> + * Helper function to force a TCP close if the server is sending
> + * junk and/or it has put us in CLOSE_WAIT
> + */
> +static void xs_tcp_force_close(struct rpc_xprt *xprt)
> +{
> + set_bit(XPRT_CONNECTION_CLOSE, &xprt->state);
> + xprt_force_disconnect(xprt);
> +}
> +
> static inline void xs_tcp_read_fraghdr(struct rpc_xprt *xprt, struct xdr_skb_reader *desc)
> {
> struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
> @@ -1051,7 +1061,7 @@ static inline void xs_tcp_read_fraghdr(struct rpc_xprt *xprt, struct xdr_skb_rea
> /* Sanity check of the record length */
> if (unlikely(transport->tcp_reclen < 8)) {
> dprintk("RPC: invalid TCP record fragment length\n");
> - xprt_force_disconnect(xprt);
> + xs_tcp_force_close(xprt);
> return;
> }
> dprintk("RPC: reading TCP record fragment of length %d\n",
> @@ -1132,7 +1142,7 @@ static inline void xs_tcp_read_calldir(struct sock_xprt *transport,
> break;
> default:
> dprintk("RPC: invalid request message type\n");
> - xprt_force_disconnect(&transport->xprt);
> + xs_tcp_force_close(&transport->xprt);
> }
> xs_tcp_check_fraghdr(transport);
> }
> @@ -1455,6 +1465,8 @@ static void xs_tcp_cancel_linger_timeout(struct rpc_xprt *xprt)
> static void xs_sock_mark_closed(struct rpc_xprt *xprt)
> {
> smp_mb__before_clear_bit();
> + clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
> + clear_bit(XPRT_CONNECTION_CLOSE, &xprt->state);
> clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> clear_bit(XPRT_CLOSING, &xprt->state);
> smp_mb__after_clear_bit();
> @@ -1512,8 +1524,8 @@ static void xs_tcp_state_change(struct sock *sk)
> break;
> case TCP_CLOSE_WAIT:
> /* The server initiated a shutdown of the socket */
> - xprt_force_disconnect(xprt);
> xprt->connect_cookie++;
> + xs_tcp_force_close(xprt);
> case TCP_CLOSING:
> /*
> * If the server closed down the connection, make sure that
> @@ -2199,8 +2211,7 @@ static void xs_tcp_setup_socket(struct work_struct *work)
> /* We're probably in TIME_WAIT. Get rid of existing socket,
> * and retry
> */
> - set_bit(XPRT_CONNECTION_CLOSE, &xprt->state);
> - xprt_force_disconnect(xprt);
> + xs_tcp_force_close(xprt);
> break;
> case -ECONNREFUSED:
> case -ECONNRESET:
> --
> 1.7.11.4
Yes, based on data collected today, this seems to fix the problem!
Awesome! :)
Simon-
next prev parent reply other threads:[~2012-09-19 22:01 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-09 19:45 [3.1-rc4] NFSv3 client hang Simon Kirby
2011-09-09 23:18 ` Trond Myklebust
2011-10-20 19:03 ` Simon Kirby
2012-03-01 22:55 ` Simon Kirby
2012-03-02 0:25 ` Simon Kirby
2012-03-02 18:49 ` [3.2.5] NFSv3 CLOSE_WAIT hang Simon Kirby
2012-09-05 7:49 ` Yan-Pai Chen
2012-09-05 15:09 ` Myklebust, Trond
2012-09-07 13:57 ` Dick Streefland, rnews
2012-09-07 14:13 ` Myklebust, Trond
2012-09-07 14:33 ` Dick Streefland, rnews
2012-09-07 15:46 ` Myklebust, Trond
2012-09-08 19:32 ` Dick Streefland, rnews
2012-09-10 9:00 ` Yan-Pai Chen
2012-09-11 19:40 ` Simon Kirby
2012-09-11 22:17 ` Myklebust, Trond
2012-09-13 5:22 ` Yan-Pai Chen
2012-09-13 13:32 ` Myklebust, Trond
2012-09-21 7:30 ` Yan-Pai Chen
[not found] ` <1347401844.15208.17.camel@lade.trondhjem.org>
2012-09-12 20:54 ` Myklebust, Trond
2012-09-19 22:01 ` Simon Kirby [this message]
2012-09-19 22:11 ` Myklebust, Trond
2012-10-12 8:15 ` Simon Kirby
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120919220148.GA8249@hostway.ca \
--to=sim@hostway.ca \
--cc=Trond.Myklebust@netapp.com \
--cc=linux-nfs@vger.kernel.org \
--cc=yanpai.chen@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).