public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andy Chittenden <andyc@bluearc.com>
To: "Linux Kernel Mailing List (linux-kernel@vger.kernel.org)" 
	<linux-kernel@vger.kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: nfs client hang
Date: Tue, 27 Jul 2010 08:25:08 +0100	[thread overview]
Message-ID: <4C4E89D4.8040607@bluearc.com> (raw)
In-Reply-To: <99613C19B13C5D40914FB8930657FA9303365708DE@uk-ex-mbx1.terastack.bluearc.com>

  On 2010-07-23 13:36, Andy Chittenden wrote:
>> IE the client starts a connection and then closes it again without sending data.
> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>
> ... lots of the following nfsv3 WRITE requests:
> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
> [ 7680.520079] RPC: 33550 disabling timer
> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7680.520089] RPC:       __rpc_wake_up_task done
> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
> [ 7680.520108] RPC: 33550 call_transmit (status 0)
> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
I changed that debug to output sk_shutdown too. That has a value of 2 
(IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:

         err = -EPIPE;
         if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
                 goto out_err;

which correlates with the trace "xs_tcp_send_request(32920) = -32". So, 
this looks like a problem in the sockets/tcp layer. The rpc layer issues 
a shutdown and then reconnects using the same socket. So either 
sk_shutdown needs zeroing once the shutdown completes or should be 
zeroed on subsequent connect. The latter sounds safer.

-- 
Andy, BlueArc Engineering


       reply	other threads:[~2010-07-27  7:25 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <99613C19B13C5D40914FB8930657FA9303365708DE@uk-ex-mbx1.terastack.bluearc.com>
2010-07-27  7:25 ` Andy Chittenden [this message]
2010-07-27 10:53   ` nfs client hang Andy Chittenden
2010-07-27 12:21     ` Eric Dumazet
2010-07-27 12:51       ` Andy Chittenden
2010-07-27 17:28       ` Chuck Lever
2010-07-28  7:08         ` Andy Chittenden
2010-07-28  7:24         ` Andy Chittenden
2010-07-28 17:37           ` Chuck Lever
2010-07-29 10:10             ` Andy Chittenden
2011-07-07 17:01             ` General Linux Kernel / User Space License Mitchell Erblich
2011-07-07 20:35               ` Chris Friesen
2010-07-23 12:36 nfs client hang Andy Chittenden
  -- strict thread matches above, loose matches on Subject: below --
2010-07-22 12:19 Andy Chittenden

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C4E89D4.8040607@bluearc.com \
    --to=andyc@bluearc.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox