From: Andy Chittenden <andyc@bluearc.com>
To: "Linux Kernel Mailing List (linux-kernel@vger.kernel.org)" 
	<linux-kernel@vger.kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: nfs client hang
Date: Tue, 27 Jul 2010 08:25:08 +0100
Message-ID: <4C4E89D4.8040607@bluearc.com>
In-Reply-To: <99613C19B13C5D40914FB8930657FA9303365708DE@uk-ex-mbx1.terastack.bluearc.com>

  On 2010-07-23 13:36, Andy Chittenden wrote:
>> IE the client starts a connection and then closes it again without sending data.
> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>
> ... lots of the following nfsv3 WRITE requests:
> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
> [ 7680.520079] RPC: 33550 disabling timer
> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7680.520089] RPC:       __rpc_wake_up_task done
> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
> [ 7680.520108] RPC: 33550 call_transmit (status 0)
> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
I changed that debug to output sk_shutdown too. It has a value of 2
(i.e. SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:

         err = -EPIPE;
         if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
                 goto out_err;

which correlates with the trace "xs_tcp_send_request(32920) = -32" (-32
is -EPIPE). So this looks like a problem in the sockets/tcp layer. The
rpc layer issues a shutdown and then reconnects using the same socket,
but sk_shutdown still has SEND_SHUTDOWN set, so sends on the reused
socket fail. So either sk_shutdown needs to be zeroed once the shutdown
completes, or it should be zeroed on a subsequent connect. The latter
sounds safer.
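
To make the sequence concrete, here is a small user-space sketch of what I
think is happening. It is not kernel code: struct mock_sock and the mock_*
functions are hypothetical stand-ins I made up for illustration, and only
the SEND_SHUTDOWN value and the -EPIPE check are taken from the kernel. The
clear_shutdown flag models the proposed "zero sk_shutdown on connect"
behaviour; it is an assumption about the fix, not an existing option.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same values as the kernel's shutdown flags. */
#define RCV_SHUTDOWN  1
#define SEND_SHUTDOWN 2

/* Hypothetical stand-in for struct sock: only the two fields that the
 * quoted tcp_sendmsg() check looks at. */
struct mock_sock {
    int sk_err;
    unsigned int sk_shutdown;
};

/* Mirrors the quoted check: refuse to send if an error is pending or
 * the send side has been shut down. Returns bytes "sent" otherwise. */
int mock_sendmsg(const struct mock_sock *sk, size_t len)
{
    if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
        return -EPIPE;
    return (int)len;
}

/* Models the rpc layer shutting down the send side of the socket. */
void mock_shutdown_write(struct mock_sock *sk)
{
    sk->sk_shutdown |= SEND_SHUTDOWN;
}

/* Models reconnecting on the same socket. With clear_shutdown == 0 the
 * stale flag survives (the behaviour seen in the trace); with
 * clear_shutdown != 0 it is zeroed (the proposed behaviour). */
void mock_connect(struct mock_sock *sk, int clear_shutdown)
{
    assert(sk != NULL);
    if (clear_shutdown)
        sk->sk_shutdown = 0;
}
```

Calling mock_shutdown_write() followed by mock_connect(&sk, 0) leaves
mock_sendmsg() returning -EPIPE, which matches the -32 in the trace.
With mock_connect(&sk, 1) the send goes through again.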

-- 
Andy, BlueArc Engineering

