nfs client hang

All of lore.kernel.org
 help / color / mirror / Atom feed

* nfs client hang
@ 2010-07-22 12:19 Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-22 12:19 UTC (permalink / raw)
  To: Linux Kernel Mailing List (linux-kernel@vger.kernel.org)

We're encountering a bug similar to http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=578152 but that claims to be fixed in the version we're running:

# dpkg --status linux-image-2.6.32-5-amd64 | grep Version:
Version: 2.6.32-17

If I do this in 4 different xterm windows having cd to the same NFS mounted directory:

xterm1: rm -rf *
xterm2: while true; do     let iter+=1;     echo $iter;     dd if=/dev/zero of=$$ bs=1M count=1000; done
xterm3: while true; do     let iter+=1;     echo $iter;     dd if=/dev/zero of=$$ bs=1M count=1000; done
xterm4: while true; do     let iter+=1;     echo $iter;     dd if=/dev/zero of=$$ bs=1M count=1000; done

then it normally hangs before the 3rd iteration starts. The directory contains loads of information (eg 5 linux source trees).

When it gets into this hang state, here's the packets from the client to server:

4	42.909478	172.18.0.39	10.1.6.102	TCP	1013 > nfs [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=108490 TSER=0 WS=0
5	42.909577	10.1.6.102	172.18.0.39	TCP	nfs > 1013 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460
6	42.909610	172.18.0.39	10.1.6.102	TCP	1013 > nfs [ACK] Seq=1 Ack=1 Win=5840 Len=0
7	42.909672	172.18.0.39	10.1.6.102	TCP	1013 > nfs [FIN, ACK] Seq=1 Ack=1 Win=5840 Len=0
8	42.909767	10.1.6.102	172.18.0.39	TCP	nfs > 1013 [ACK] Seq=1 Ack=2 Win=64240 Len=0
9	43.660083	10.1.6.102	172.18.0.39	TCP	nfs > 1013 [FIN, ACK] Seq=1 Ack=2 Win=64240 Len=0
10	43.660100	172.18.0.39	10.1.6.102	TCP	1013 > nfs [ACK] Seq=2 Ack=2 Win=5840 Len=0

and then repeats after a while.

IE the client starts a connection and then closes it again without sending data.

FWIW I've found it easier to reproduce this problem if Ethernet flow control is off but it still happens with it on as well. This happens with different types of Ethernet hardware too. The rm -rf isn't necessary either but makes the problem easier to reproduce (for me anyway).

The mount options are:

# mount | grep u15
sweet.dev.bluearc.com:/u15 on /u/u15 type nfs (rw,noatime,nodiratime,hard,intr,rsize=32768,wsize=32768,proto=tcp,hard,intr,rsize=32768,wsize=32768,sloppy,addr=10.1.6.102)

I've generated a 2.6.34.1 kernel and that also has the same problem.

So, why would the linux NFS client get into this "non-transmitting data" state? NB 2.6.26 doesn't exhibit this problem.

-- 
Andy, BlueArc Engineering

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
@ 2010-07-23 12:36 Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-23 12:36 UTC (permalink / raw)
  To: Linux Kernel Mailing List (linux-kernel@vger.kernel.org)

> IE the client starts a connection and then closes it again without sending data.

Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:

... lots of the following nfsv3 WRITE requests:
[ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
[ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
[ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
[ 7680.520079] RPC: 33550 disabling timer
[ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7680.520089] RPC:       __rpc_wake_up_task done
[ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
[ 7680.520098] RPC: 33550 xprt_connect_status: retrying
[ 7680.520103] RPC: 33550 call_connect_status (status -11)
[ 7680.520108] RPC: 33550 call_transmit (status 0)
[ 7680.520112] RPC: 33550 xprt_prepare_transmit
[ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
[ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7680.520136] RPC: 33550 xprt_transmit(32920)
[ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
[ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7680.520160] RPC:       disconnected transport ffff88013e62d800
[ 7680.520166] RPC: 33550 call_status (status -32)
[ 7680.520171] RPC: 33550 call_bind (status 0)
[ 7680.520175] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7680.520181] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7680.520187] RPC: 33550 sleep_on(queue "xprt_pending" time 4296812426)
[ 7680.520193] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7680.520198] RPC: 33550 setting alarm for 60000 ms
[ 7680.520203] RPC:       xs_connect delayed xprt ffff88013e62d800 for 48 seconds
[ 7728.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7728.520093] RPC:       ffff88013e62d800 connect status 115 connected 0 sock state 2
[ 7728.520884] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.520889] RPC:       state 1 conn 0 dead 0 zapped 1
[ 7728.520896] RPC: 33550 __rpc_wake_up_task (now 4296824426)
[ 7728.520900] RPC: 33550 disabling timer
[ 7728.520906] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7728.520912] RPC:       __rpc_wake_up_task done
[ 7728.520932] RPC: 33550 __rpc_execute flags=0x1
[ 7728.520937] RPC: 33550 xprt_connect_status: retrying
[ 7728.520942] RPC: 33550 call_connect_status (status -11)
[ 7728.520947] RPC: 33550 call_transmit (status 0)
[ 7728.520951] RPC: 33550 xprt_prepare_transmit
[ 7728.520957] RPC: 33550 rpc_xdr_encode (status 0)
[ 7728.520962] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7728.520969] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7728.520976] RPC: 33550 xprt_transmit(32920)
[ 7728.520984] RPC:       xs_tcp_send_request(32920) = -32
[ 7728.520997] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.521007] RPC:       state 4 conn 1 dead 0 zapped 1
[ 7728.521013] RPC: 33550 call_status (status -32)
[ 7728.521018] RPC: 33550 call_bind (status 0)
[ 7728.521023] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7728.521028] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7728.521035] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
[ 7728.521040] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7728.521045] RPC: 33550 setting alarm for 60000 ms
[ 7728.521439] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.521444] RPC:       state 5 conn 0 dead 0 zapped 1
[ 7728.521602] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.521608] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7728.521612] RPC:       disconnected transport ffff88013e62d800
[ 7728.521617] RPC: 33550 __rpc_wake_up_task (now 4296824426)
[ 7728.521621] RPC: 33550 disabling timer
[ 7728.521626] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7728.521631] RPC:       __rpc_wake_up_task done
[ 7728.521636] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.521641] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7728.521645] RPC:       disconnected transport ffff88013e62d800
[ 7728.521649] RPC:       xs_tcp_data_ready...
[ 7728.521662] RPC: 33550 __rpc_execute flags=0x1
[ 7728.521666] RPC: 33550 xprt_connect_status: retrying
[ 7728.521671] RPC: 33550 call_connect_status (status -11)
[ 7728.521675] RPC: 33550 call_transmit (status 0)
[ 7728.521679] RPC: 33550 xprt_prepare_transmit
[ 7728.521683] RPC: 33550 rpc_xdr_encode (status 0)
[ 7728.521688] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7728.521694] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7728.521699] RPC: 33550 xprt_transmit(32920)
[ 7728.521704] RPC:       xs_tcp_send_request(32920) = -32
[ 7728.521709] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.521714] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7728.521718] RPC:       disconnected transport ffff88013e62d800
[ 7728.521723] RPC: 33550 call_status (status -32)
[ 7728.521727] RPC: 33550 call_bind (status 0)
[ 7728.521732] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7728.521737] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7728.521743] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
[ 7728.521748] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7728.521753] RPC: 33550 setting alarm for 60000 ms
[ 7728.521758] RPC:       xs_connect delayed xprt ffff88013e62d800 for 0 seconds
[ 7728.521763] RPC:       disconnecting xprt ffff88013e62d800 to reuse port
[ 7728.521771] RPC:       xs_error_report client ffff88013e62d800...
[ 7728.521773] RPC:       error 0
[ 7728.521778] RPC: 33550 __rpc_wake_up_task (now 4296824426)
[ 7728.521782] RPC: 33550 disabling timer
[ 7728.521786] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7728.521790] RPC:       __rpc_wake_up_task done
[ 7728.521794] RPC:       disconnected transport ffff88013e62d800
[ 7728.521800] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7728.521809] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
[ 7728.521815] RPC: 33550 __rpc_execute flags=0x1
[ 7728.521819] RPC: 33550 xprt_connect_status: retrying
[ 7728.521823] RPC: 33550 call_connect_status (status -11)
[ 7728.521828] RPC: 33550 call_transmit (status 0)
[ 7728.521831] RPC: 33550 xprt_prepare_transmit
[ 7728.521835] RPC: 33550 rpc_xdr_encode (status 0)
[ 7728.521840] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7728.521845] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7728.521850] RPC: 33550 xprt_transmit(32920)
[ 7728.521855] RPC:       xs_tcp_send_request(32920) = -32
[ 7728.521859] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7728.521864] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7728.521868] RPC:       disconnected transport ffff88013e62d800
[ 7728.521873] RPC: 33550 call_status (status -32)
[ 7728.521877] RPC: 33550 call_bind (status 0)
[ 7728.521882] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7728.521887] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7728.521893] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
[ 7728.521898] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7728.521902] RPC: 33550 setting alarm for 60000 ms
[ 7728.521907] RPC:       xs_connect delayed xprt ffff88013e62d800 for 3 seconds
[ 7731.520043] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7731.520063] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
[ 7731.520069] RPC: 33550 __rpc_wake_up_task (now 4296825176)
[ 7731.520073] RPC: 33550 disabling timer
[ 7731.520078] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7731.520082] RPC:       __rpc_wake_up_task done
[ 7731.520087] RPC: 33550 __rpc_execute flags=0x1
[ 7731.520092] RPC: 33550 xprt_connect_status: retrying
[ 7731.520096] RPC: 33550 call_connect_status (status -11)
[ 7731.520101] RPC: 33550 call_transmit (status 0)
[ 7731.520105] RPC: 33550 xprt_prepare_transmit
[ 7731.520110] RPC: 33550 rpc_xdr_encode (status 0)
[ 7731.520115] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7731.520120] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7731.520126] RPC: 33550 xprt_transmit(32920)
[ 7731.520132] RPC:       xs_tcp_send_request(32920) = -32
[ 7731.520137] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7731.520142] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7731.520146] RPC:       disconnected transport ffff88013e62d800
[ 7731.520152] RPC: 33550 call_status (status -32)
[ 7731.520156] RPC: 33550 call_bind (status 0)
[ 7731.520161] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7731.520167] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7731.520172] RPC: 33550 sleep_on(queue "xprt_pending" time 4296825176)
[ 7731.520178] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7731.520182] RPC: 33550 setting alarm for 60000 ms
[ 7731.520187] RPC:       xs_connect delayed xprt ffff88013e62d800 for 6 seconds
[ 7737.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7737.520063] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
[ 7737.520070] RPC: 33550 __rpc_wake_up_task (now 4296826676)
[ 7737.520075] RPC: 33550 disabling timer
[ 7737.520080] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7737.520085] RPC:       __rpc_wake_up_task done
[ 7737.520089] RPC: 33550 __rpc_execute flags=0x1
[ 7737.520094] RPC: 33550 xprt_connect_status: retrying
[ 7737.520099] RPC: 33550 call_connect_status (status -11)
[ 7737.520104] RPC: 33550 call_transmit (status 0)
[ 7737.520108] RPC: 33550 xprt_prepare_transmit
[ 7737.520113] RPC: 33550 rpc_xdr_encode (status 0)
[ 7737.520118] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7737.520125] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7737.520131] RPC: 33550 xprt_transmit(32920)
[ 7737.520139] RPC:       xs_tcp_send_request(32920) = -32
[ 7737.520144] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7737.520149] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7737.520154] RPC:       disconnected transport ffff88013e62d800
[ 7737.520159] RPC: 33550 call_status (status -32)
[ 7737.520164] RPC: 33550 call_bind (status 0)
[ 7737.520169] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7737.520175] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7737.520181] RPC: 33550 sleep_on(queue "xprt_pending" time 4296826676)
[ 7737.520187] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7737.520191] RPC: 33550 setting alarm for 60000 ms
[ 7737.520197] RPC:       xs_connect delayed xprt ffff88013e62d800 for 12 seconds
[ 7749.520034] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7749.520056] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
[ 7749.520064] RPC: 33550 __rpc_wake_up_task (now 4296829676)
[ 7749.520069] RPC: 33550 disabling timer
[ 7749.520074] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7749.520079] RPC:       __rpc_wake_up_task done
[ 7749.520084] RPC: 33550 __rpc_execute flags=0x1
[ 7749.520088] RPC: 33550 xprt_connect_status: retrying
[ 7749.520093] RPC: 33550 call_connect_status (status -11)
[ 7749.520098] RPC: 33550 call_transmit (status 0)
[ 7749.520102] RPC: 33550 xprt_prepare_transmit
[ 7749.520107] RPC: 33550 rpc_xdr_encode (status 0)
[ 7749.520113] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7749.520119] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7749.520126] RPC: 33550 xprt_transmit(32920)
[ 7749.520134] RPC:       xs_tcp_send_request(32920) = -32
[ 7749.520140] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7749.520145] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7749.520149] RPC:       disconnected transport ffff88013e62d800
[ 7749.520155] RPC: 33550 call_status (status -32)
[ 7749.520160] RPC: 33550 call_bind (status 0)
[ 7749.520165] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7749.520170] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7749.520176] RPC: 33550 sleep_on(queue "xprt_pending" time 4296829676)
[ 7749.520182] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7749.520187] RPC: 33550 setting alarm for 60000 ms
[ 7749.520192] RPC:       xs_connect delayed xprt ffff88013e62d800 for 24 seconds
[ 7773.520041] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
[ 7773.520061] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
[ 7773.520069] RPC: 33550 __rpc_wake_up_task (now 4296835676)
[ 7773.520074] RPC: 33550 disabling timer
[ 7773.520079] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
[ 7773.520083] RPC:       __rpc_wake_up_task done
[ 7773.520088] RPC: 33550 __rpc_execute flags=0x1
[ 7773.520093] RPC: 33550 xprt_connect_status: retrying
[ 7773.520098] RPC: 33550 call_connect_status (status -11)
[ 7773.520103] RPC: 33550 call_transmit (status 0)
[ 7773.520107] RPC: 33550 xprt_prepare_transmit
[ 7773.520112] RPC: 33550 rpc_xdr_encode (status 0)
[ 7773.520118] RPC: 33550 marshaling UNIX cred ffff88012e002300
[ 7773.520124] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
[ 7773.520131] RPC: 33550 xprt_transmit(32920)
[ 7773.520138] RPC:       xs_tcp_send_request(32920) = -32
[ 7773.520144] RPC:       xs_tcp_state_change client ffff88013e62d800...
[ 7773.520149] RPC:       state 7 conn 0 dead 0 zapped 1
[ 7773.520153] RPC:       disconnected transport ffff88013e62d800
[ 7773.520159] RPC: 33550 call_status (status -32)
[ 7773.520164] RPC: 33550 call_bind (status 0)
[ 7773.520168] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
[ 7773.520174] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
[ 7773.520180] RPC: 33550 sleep_on(queue "xprt_pending" time 4296835676)
[ 7773.520186] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
[ 7773.520190] RPC: 33550 setting alarm for 60000 ms
[ 7773.520196] RPC:       xs_connect delayed xprt ffff88013e62d800 for 48 seconds

--
Andy, BlueArc Engineering


-----Original Message-----
From: Andy Chittenden
Sent: 22 July 2010 13:19
To: Linux Kernel Mailing List (linux-kernel@vger.kernel.org)
Subject: nfs client hang

We're encountering a bug similar to http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=578152 but that claims to be fixed in the version we're running:

# dpkg --status linux-image-2.6.32-5-amd64 | grep Version:
Version: 2.6.32-17

If I do this in 4 different xterm windows having cd to the same NFS mounted directory:

xterm1: rm -rf *
xterm2: while true; do     let iter+=1;     echo $iter;     dd if=/dev/zero of=$$ bs=1M count=1000; done
xterm3: while true; do     let iter+=1;     echo $iter;     dd if=/dev/zero of=$$ bs=1M count=1000; done
xterm4: while true; do     let iter+=1;     echo $iter;     dd if=/dev/zero of=$$ bs=1M count=1000; done

then it normally hangs before the 3rd iteration starts. The directory contains loads of information (eg 5 linux source trees).

When it gets into this hang state, here's the packets from the client to server:

4       42.909478       172.18.0.39     10.1.6.102      TCP     1013 > nfs [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=108490 TSER=0 WS=0
5       42.909577       10.1.6.102      172.18.0.39     TCP     nfs > 1013 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460
6       42.909610       172.18.0.39     10.1.6.102      TCP     1013 > nfs [ACK] Seq=1 Ack=1 Win=5840 Len=0
7       42.909672       172.18.0.39     10.1.6.102      TCP     1013 > nfs [FIN, ACK] Seq=1 Ack=1 Win=5840 Len=0
8       42.909767       10.1.6.102      172.18.0.39     TCP     nfs > 1013 [ACK] Seq=1 Ack=2 Win=64240 Len=0
9       43.660083       10.1.6.102      172.18.0.39     TCP     nfs > 1013 [FIN, ACK] Seq=1 Ack=2 Win=64240 Len=0
10      43.660100       172.18.0.39     10.1.6.102      TCP     1013 > nfs [ACK] Seq=2 Ack=2 Win=5840 Len=0

and then repeats after a while.

IE the client starts a connection and then closes it again without sending data.

FWIW I've found it easier to reproduce this problem if Ethernet flow control is off but it still happens with it on as well. This happens with different types of Ethernet hardware too. The rm -rf isn't necessary either but makes the problem easier to reproduce (for me anyway).

The mount options are:

# mount | grep u15
sweet.dev.bluearc.com:/u15 on /u/u15 type nfs (rw,noatime,nodiratime,hard,intr,rsize=32768,wsize=32768,proto=tcp,hard,intr,rsize=32768,wsize=32768,sloppy,addr=10.1.6.102)

I've generated a 2.6.34.1 kernel and that also has the same problem.

So, why would the linux NFS client get into this "non-transmitting data" state? NB 2.6.26 doesn't exhibit this problem.

--
Andy, BlueArc Engineering



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
       [not found] <99613C19B13C5D40914FB8930657FA9303365708DE@uk-ex-mbx1.terastack.bluearc.com>
@ 2010-07-27  7:25 ` Andy Chittenden
  2010-07-27 10:53   ` Andy Chittenden
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Chittenden @ 2010-07-27  7:25 UTC (permalink / raw)
  To: Linux Kernel Mailing List (linux-kernel@vger.kernel.org); +Cc: Trond Myklebust

  On 2010-07-23 13:36, Andy Chittenden wrote:
>> IE the client starts a connection and then closes it again without sending data.
> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>
> ... lots of the following nfsv3 WRITE requests:
> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
> [ 7680.520079] RPC: 33550 disabling timer
> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7680.520089] RPC:       __rpc_wake_up_task done
> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
> [ 7680.520108] RPC: 33550 call_transmit (status 0)
> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
I changed that debug to output sk_shutdown too. That has a value of 2 
(IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:

         err = -EPIPE;
         if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
                 goto out_err;

which correlates with the trace "xs_tcp_send_request(32920) = -32". So, 
this looks like a problem in the sockets/tcp layer. The rpc layer issues 
a shutdown and then reconnects using the same socket. So either 
sk_shutdown needs zeroing once the shutdown completes or should be 
zeroed on subsequent connect. The latter sounds safer.

-- 
Andy, BlueArc Engineering


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
  2010-07-27  7:25 ` nfs client hang Andy Chittenden
@ 2010-07-27 10:53   ` Andy Chittenden
  2010-07-27 12:21     ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Chittenden @ 2010-07-27 10:53 UTC (permalink / raw)
  To: Andy Chittenden,
	Linux Kernel Mailing List (linux-kernel@vger.kernel.org)
  Cc: Trond Myklebust

> >> IE the client starts a connection and then closes it again without sending data.
> > Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
> >
> > ... lots of the following nfsv3 WRITE requests:
> > [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
> > [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
> > [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
> > [ 7680.520079] RPC: 33550 disabling timer
> > [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> > [ 7680.520089] RPC:       __rpc_wake_up_task done
> > [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
> > [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
> > [ 7680.520103] RPC: 33550 call_connect_status (status -11)
> > [ 7680.520108] RPC: 33550 call_transmit (status 0)
> > [ 7680.520112] RPC: 33550 xprt_prepare_transmit
> > [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
> > [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
> > [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> > [ 7680.520136] RPC: 33550 xprt_transmit(32920)
> > [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
> > [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
> > [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1

> I changed that debug to output sk_shutdown too. That has a value of 2 
> (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:

>          err = -EPIPE;
>          if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
>                  goto out_err;

> which correlates with the trace "xs_tcp_send_request(32920) = -32". So, 
> this looks like a problem in the sockets/tcp layer. The rpc layer issues 
> a shutdown and then reconnects using the same socket. So either 
> sk_shutdown needs zeroing once the shutdown completes or should be 
> zeroed on subsequent connect. The latter sounds safer.

This patch for 2.6.34.1 fixes the issue:

--- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
+++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
@@ -2522,6 +2522,13 @@
        struct tcp_sock *tp = tcp_sk(sk);
        __u8 rcv_wscale;
 
+       /* clear down any previous shutdown attempts so that
+        * reconnects on a socket that's been shutdown leave the
+        * socket in a usable state (otherwise tcp_sendmsg() returns
+        * -EPIPE).
+        */
+       sk->sk_shutdown = 0;
+
        /* We'll fix this up when we get a response from the other end.
         * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
         */

As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.

-- 
Andy, BlueArc Engineering


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
  2010-07-27 10:53   ` Andy Chittenden
@ 2010-07-27 12:21     ` Eric Dumazet
  2010-07-27 12:51       ` Andy Chittenden
  2010-07-27 17:28         ` Chuck Lever
  0 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2010-07-27 12:21 UTC (permalink / raw)
  To: Andy Chittenden
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev

Le mardi 27 juillet 2010 à 11:53 +0100, Andy Chittenden a écrit :
> > >> IE the client starts a connection and then closes it again without sending data.
> > > Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
> > >
> > > ... lots of the following nfsv3 WRITE requests:
> > > [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
> > > [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
> > > [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
> > > [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
> > > [ 7680.520079] RPC: 33550 disabling timer
> > > [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> > > [ 7680.520089] RPC:       __rpc_wake_up_task done
> > > [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
> > > [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
> > > [ 7680.520103] RPC: 33550 call_connect_status (status -11)
> > > [ 7680.520108] RPC: 33550 call_transmit (status 0)
> > > [ 7680.520112] RPC: 33550 xprt_prepare_transmit
> > > [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
> > > [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
> > > [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> > > [ 7680.520136] RPC: 33550 xprt_transmit(32920)
> > > [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
> > > [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
> > > [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
> 
> > I changed that debug to output sk_shutdown too. That has a value of 2 
> > (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
> 
> >          err = -EPIPE;
> >          if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> >                  goto out_err;
> 
> > which correlates with the trace "xs_tcp_send_request(32920) = -32". So, 
> > this looks like a problem in the sockets/tcp layer. The rpc layer issues 
> > a shutdown and then reconnects using the same socket. So either 
> > sk_shutdown needs zeroing once the shutdown completes or should be 
> > zeroed on subsequent connect. The latter sounds safer.
> 
> This patch for 2.6.34.1 fixes the issue:
> 
> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
> @@ -2522,6 +2522,13 @@
>         struct tcp_sock *tp = tcp_sk(sk);
>         __u8 rcv_wscale;
>  
> +       /* clear down any previous shutdown attempts so that
> +        * reconnects on a socket that's been shutdown leave the
> +        * socket in a usable state (otherwise tcp_sendmsg() returns
> +        * -EPIPE).
> +        */
> +       sk->sk_shutdown = 0;
> +
>         /* We'll fix this up when we get a response from the other end.
>          * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>          */
> 
> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
> 

CC netdev

This reminds me a similar problem we had in the past, fixed with commit
1fdf475a (tcp: tcp_disconnect() should clear window_clamp)

But tcp_disconnect() already clears sk->sk_shutdown

If NFS calls tcp_disconnect(), then shutdown(), there is a problem.

Maybe xs_tcp_shutdown() should make some sanity tests ?

Following sequence is legal, and your patch might break it.

fd = socket(...);
shutdown(fd, SHUT_WR);
...
connect(fd, ...);




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
  2010-07-27 12:21     ` Eric Dumazet
@ 2010-07-27 12:51       ` Andy Chittenden
  2010-07-27 17:28         ` Chuck Lever
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-27 12:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev

  On 2010-07-27 13:21, Eric Dumazet wrote:
> Le mardi 27 juillet 2010 à 11:53 +0100, Andy Chittenden a écrit :
>>>>> IE the client starts a connection and then closes it again without sending data.
>>>> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>>>>
>>>> ... lots of the following nfsv3 WRITE requests:
>>>> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
>>>> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
>>>> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
>>>> [ 7680.520079] RPC: 33550 disabling timer
>>>> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
>>>> [ 7680.520089] RPC:       __rpc_wake_up_task done
>>>> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
>>>> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
>>>> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
>>>> [ 7680.520108] RPC: 33550 call_transmit (status 0)
>>>> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
>>>> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
>>>> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
>>>> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
>>>> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
>>>> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
>>>> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
>>>> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
>>> I changed that debug to output sk_shutdown too. That has a value of 2
>>> (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
>>>           err = -EPIPE;
>>>           if (sk->sk_err || (sk->sk_shutdown&  SEND_SHUTDOWN))
>>>                   goto out_err;
>>> which correlates with the trace "xs_tcp_send_request(32920) = -32". So,
>>> this looks like a problem in the sockets/tcp layer. The rpc layer issues
>>> a shutdown and then reconnects using the same socket. So either
>>> sk_shutdown needs zeroing once the shutdown completes or should be
>>> zeroed on subsequent connect. The latter sounds safer.
>> This patch for 2.6.34.1 fixes the issue:
>>
>> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
>> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
>> @@ -2522,6 +2522,13 @@
>>          struct tcp_sock *tp = tcp_sk(sk);
>>          __u8 rcv_wscale;
>>
>> +       /* clear down any previous shutdown attempts so that
>> +        * reconnects on a socket that's been shutdown leave the
>> +        * socket in a usable state (otherwise tcp_sendmsg() returns
>> +        * -EPIPE).
>> +        */
>> +       sk->sk_shutdown = 0;
>> +
>>          /* We'll fix this up when we get a response from the other end.
>>           * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>>           */
>>
>> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
>>
> CC netdev
>
> This reminds me a similar problem we had in the past, fixed with commit
> 1fdf475a (tcp: tcp_disconnect() should clear window_clamp)
>
> But tcp_disconnect() already clears sk->sk_shutdown
>
> If NFS calls tcp_disconnect(), then shutdown(), there is a problem.
>
> Maybe xs_tcp_shutdown() should make some sanity tests ?
>
> Following sequence is legal, and your patch might break it.
>
> fd = socket(...);
> shutdown(fd, SHUT_WR);
> ...
> connect(fd, ...);
>
>
>
Thanks for the response. From my reading of the RPC code, because 
nothing clears the sk_shutdown flag, the RPC code goes into a loop when 
recovering from packets being lost:

shutdown
connect
send fails so repeat

My patch stops the NFS client hang that I was seeing but I'm not an 
expert on either the socket layer, RPC code or NFS code so I'm happy for 
someone else to come up with the alternative, correct fix.

-- 
Andy, BlueArc Engineering


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
  2010-07-27 12:21     ` Eric Dumazet
@ 2010-07-27 17:28         ` Chuck Lever
  2010-07-27 17:28         ` Chuck Lever
  1 sibling, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2010-07-27 17:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andy Chittenden,
	Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

Add CC: linux-nfs@vger.kernel.org

On 07/27/10 08:21 AM, Eric Dumazet wrote:
> Le mardi 27 juillet 2010 à 11:53 +0100, Andy Chittenden a écrit :
>>>>> IE the client starts a connection and then closes it again without sending data.
>>>> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>>>>
>>>> ... lots of the following nfsv3 WRITE requests:
>>>> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
>>>> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
>>>> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
>>>> [ 7680.520079] RPC: 33550 disabling timer
>>>> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
>>>> [ 7680.520089] RPC:       __rpc_wake_up_task done
>>>> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
>>>> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
>>>> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
>>>> [ 7680.520108] RPC: 33550 call_transmit (status 0)
>>>> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
>>>> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
>>>> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
>>>> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
>>>> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
>>>> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
>>>> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
>>>> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
>>
>>> I changed that debug to output sk_shutdown too. That has a value of 2
>>> (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
>>
>>>           err = -EPIPE;
>>>           if (sk->sk_err || (sk->sk_shutdown&  SEND_SHUTDOWN))
>>>                   goto out_err;
>>
>>> which correlates with the trace "xs_tcp_send_request(32920) = -32". So,
>>> this looks like a problem in the sockets/tcp layer. The rpc layer issues
>>> a shutdown and then reconnects using the same socket. So either
>>> sk_shutdown needs zeroing once the shutdown completes or should be
>>> zeroed on subsequent connect. The latter sounds safer.

>> This patch for 2.6.34.1 fixes the issue:
>>
>> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
>> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
>> @@ -2522,6 +2522,13 @@
>>          struct tcp_sock *tp = tcp_sk(sk);
>>          __u8 rcv_wscale;
>>
>> +       /* clear down any previous shutdown attempts so that
>> +        * reconnects on a socket that's been shutdown leave the
>> +        * socket in a usable state (otherwise tcp_sendmsg() returns
>> +        * -EPIPE).
>> +        */
>> +       sk->sk_shutdown = 0;
>> +
>>          /* We'll fix this up when we get a response from the other end.
>>           * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>>           */
>>
>> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
>>
>
> CC netdev
>
> This reminds me a similar problem we had in the past, fixed with commit
> 1fdf475a (tcp: tcp_disconnect() should clear window_clamp)
>
> But tcp_disconnect() already clears sk->sk_shutdown
>
> If NFS calls tcp_disconnect(), then shutdown(), there is a problem.

If tcp_disconnect() was called at some point, I would expect to see a 
message from xs_error_report() in the debugging output.  Perhaps 
tcp_disconnect() is not being invoked at all?

> Maybe xs_tcp_shutdown() should make some sanity tests ?
>
> Following sequence is legal, and your patch might break it.
>
> fd = socket(...);
> shutdown(fd, SHUT_WR);
> ...
> connect(fd, ...);


I looked closely at some of Andy's debugging output from the 
linux-kernel mailing list archive.  I basically agree that the network 
layer is returning -EPIPE from tcp_sendmsg(), which the RPC client logic 
does not expect. But it's not clear to me how it gets into this state.

> [ 7728.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
 > [ 7728.520093] RPC:       ffff88013e62d800 connect status 115 
connected 0 sock state 2

"sock state 2" => sk->sk_state == TCP_SYN_SENT

 > [ 7728.520884] RPC:       xs_tcp_state_change client ffff88013e62d800...
 > [ 7728.520889] RPC:       state 1 conn 0 dead 0 zapped 1

"state 1" => sk->sk_state == TCP_ESTABLISHED

RPC client wakes up this RPC task now that the connection is established.

> [ 7728.520896] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.520900] RPC: 33550 disabling timer
> [ 7728.520906] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.520912] RPC:       __rpc_wake_up_task done
> [ 7728.520932] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.520937] RPC: 33550 xprt_connect_status: retrying
> [ 7728.520942] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt.

> [ 7728.520947] RPC: 33550 call_transmit (status 0)
> [ 7728.520951] RPC: 33550 xprt_prepare_transmit
> [ 7728.520957] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.520962] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.520969] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.520976] RPC: 33550 xprt_transmit(32920)

RPC client encodes the request and attempts to send it.

> [ 7728.520984] RPC:       xs_tcp_send_request(32920) = -32

Network layer says -EPIPE, for some reason.  RPC client calls 
kernel_sock_shutdown(SHUT_WR).

> [ 7728.520997] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521007] RPC:       state 4 conn 1 dead 0 zapped 1

"state 4" => sk->sk_state == TCP_FIN_WAIT1

The RPC client sets up a linger timeout.

> [ 7728.521013] RPC: 33550 call_status (status -32)
> [ 7728.521018] RPC: 33550 call_bind (status 0)
> [ 7728.521023] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
> [ 7728.521028] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
> [ 7728.521035] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
> [ 7728.521040] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
> [ 7728.521045] RPC: 33550 setting alarm for 60000 ms

RPC client puts this RPC task to sleep.

> [ 7728.521439] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521444] RPC:       state 5 conn 0 dead 0 zapped 1

"state 5" => sk->sk_state == TCP_FIN_WAIT2

> [ 7728.521602] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521608] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521612] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

The RPC client marks the socket closed. and the linger timeout is 
cancelled.  At this point, sk_shutdown should be set to zero, correct? 
I don't see an xs_error_report() call here, which would confirm that the 
socket took a trip through tcp_disconnect().

> [ 7728.521617] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.521621] RPC: 33550 disabling timer
> [ 7728.521626] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.521631] RPC:       __rpc_wake_up_task done

RPC client wakes up the RPC task.  Meanwhile...

> [ 7728.521636] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521641] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521645] RPC:       disconnected transport ffff88013e62d800
> [ 7728.521649] RPC:       xs_tcp_data_ready...

... network layer calls closed socket's data_ready method... while the 
awoken RPC task gets underway.

> [ 7728.521662] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.521666] RPC: 33550 xprt_connect_status: retrying
> [ 7728.521671] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt, even though there is no established connection.

RPC client barrels on to send the request again.

> [ 7728.521675] RPC: 33550 call_transmit (status 0)
> [ 7728.521679] RPC: 33550 xprt_prepare_transmit
> [ 7728.521683] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.521688] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.521694] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.521699] RPC: 33550 xprt_transmit(32920)
> [ 7728.521704] RPC:       xs_tcp_send_request(32920) = -32

RPC client attempts to send again, gets -EPIPE, and calls 
kernel_sock_shutdown(SHUT_WR).  If there is no connection established, 
the RPC client expects -ENOTCONN, in which case it will attempt to 
reconnect here.

> [ 7728.521709] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521714] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521718] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

Following this, the RPC client attempts to retransmit the request 
repeatedly, but the socket remains in the TCP_CLOSE state.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
@ 2010-07-27 17:28         ` Chuck Lever
  0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2010-07-27 17:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andy Chittenden,
	Linux Kernel Mailing List (linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

Add CC: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 07/27/10 08:21 AM, Eric Dumazet wrote:
> Le mardi 27 juillet 2010 à 11:53 +0100, Andy Chittenden a écrit :
>>>>> IE the client starts a connection and then closes it again without sending data.
>>>> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>>>>
>>>> ... lots of the following nfsv3 WRITE requests:
>>>> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
>>>> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
>>>> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
>>>> [ 7680.520079] RPC: 33550 disabling timer
>>>> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
>>>> [ 7680.520089] RPC:       __rpc_wake_up_task done
>>>> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
>>>> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
>>>> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
>>>> [ 7680.520108] RPC: 33550 call_transmit (status 0)
>>>> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
>>>> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
>>>> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
>>>> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
>>>> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
>>>> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
>>>> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
>>>> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
>>
>>> I changed that debug to output sk_shutdown too. That has a value of 2
>>> (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
>>
>>>           err = -EPIPE;
>>>           if (sk->sk_err || (sk->sk_shutdown&  SEND_SHUTDOWN))
>>>                   goto out_err;
>>
>>> which correlates with the trace "xs_tcp_send_request(32920) = -32". So,
>>> this looks like a problem in the sockets/tcp layer. The rpc layer issues
>>> a shutdown and then reconnects using the same socket. So either
>>> sk_shutdown needs zeroing once the shutdown completes or should be
>>> zeroed on subsequent connect. The latter sounds safer.

>> This patch for 2.6.34.1 fixes the issue:
>>
>> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
>> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
>> @@ -2522,6 +2522,13 @@
>>          struct tcp_sock *tp = tcp_sk(sk);
>>          __u8 rcv_wscale;
>>
>> +       /* clear down any previous shutdown attempts so that
>> +        * reconnects on a socket that's been shutdown leave the
>> +        * socket in a usable state (otherwise tcp_sendmsg() returns
>> +        * -EPIPE).
>> +        */
>> +       sk->sk_shutdown = 0;
>> +
>>          /* We'll fix this up when we get a response from the other end.
>>           * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>>           */
>>
>> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
>>
>
> CC netdev
>
> This reminds me a similar problem we had in the past, fixed with commit
> 1fdf475a (tcp: tcp_disconnect() should clear window_clamp)
>
> But tcp_disconnect() already clears sk->sk_shutdown
>
> If NFS calls tcp_disconnect(), then shutdown(), there is a problem.

If tcp_disconnect() was called at some point, I would expect to see a 
message from xs_error_report() in the debugging output.  Perhaps 
tcp_disconnect() is not being invoked at all?

> Maybe xs_tcp_shutdown() should make some sanity tests ?
>
> Following sequence is legal, and your patch might break it.
>
> fd = socket(...);
> shutdown(fd, SHUT_WR);
> ...
> connect(fd, ...);


I looked closely at some of Andy's debugging output from the 
linux-kernel mailing list archive.  I basically agree that the network 
layer is returning -EPIPE from tcp_sendmsg(), which the RPC client logic 
does not expect. But it's not clear to me how it gets into this state.

> [ 7728.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
 > [ 7728.520093] RPC:       ffff88013e62d800 connect status 115 
connected 0 sock state 2

"sock state 2" => sk->sk_state == TCP_SYN_SENT

 > [ 7728.520884] RPC:       xs_tcp_state_change client ffff88013e62d800...
 > [ 7728.520889] RPC:       state 1 conn 0 dead 0 zapped 1

"state 1" => sk->sk_state == TCP_ESTABLISHED

RPC client wakes up this RPC task now that the connection is established.

> [ 7728.520896] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.520900] RPC: 33550 disabling timer
> [ 7728.520906] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.520912] RPC:       __rpc_wake_up_task done
> [ 7728.520932] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.520937] RPC: 33550 xprt_connect_status: retrying
> [ 7728.520942] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt.

> [ 7728.520947] RPC: 33550 call_transmit (status 0)
> [ 7728.520951] RPC: 33550 xprt_prepare_transmit
> [ 7728.520957] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.520962] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.520969] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.520976] RPC: 33550 xprt_transmit(32920)

RPC client encodes the request and attempts to send it.

> [ 7728.520984] RPC:       xs_tcp_send_request(32920) = -32

Network layer says -EPIPE, for some reason.  RPC client calls 
kernel_sock_shutdown(SHUT_WR).

> [ 7728.520997] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521007] RPC:       state 4 conn 1 dead 0 zapped 1

"state 4" => sk->sk_state == TCP_FIN_WAIT1

The RPC client sets up a linger timeout.

> [ 7728.521013] RPC: 33550 call_status (status -32)
> [ 7728.521018] RPC: 33550 call_bind (status 0)
> [ 7728.521023] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
> [ 7728.521028] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
> [ 7728.521035] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
> [ 7728.521040] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
> [ 7728.521045] RPC: 33550 setting alarm for 60000 ms

RPC client puts this RPC task to sleep.

> [ 7728.521439] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521444] RPC:       state 5 conn 0 dead 0 zapped 1

"state 5" => sk->sk_state == TCP_FIN_WAIT2

> [ 7728.521602] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521608] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521612] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

The RPC client marks the socket closed. and the linger timeout is 
cancelled.  At this point, sk_shutdown should be set to zero, correct? 
I don't see an xs_error_report() call here, which would confirm that the 
socket took a trip through tcp_disconnect().

> [ 7728.521617] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.521621] RPC: 33550 disabling timer
> [ 7728.521626] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.521631] RPC:       __rpc_wake_up_task done

RPC client wakes up the RPC task.  Meanwhile...

> [ 7728.521636] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521641] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521645] RPC:       disconnected transport ffff88013e62d800
> [ 7728.521649] RPC:       xs_tcp_data_ready...

... network layer calls closed socket's data_ready method... while the 
awoken RPC task gets underway.

> [ 7728.521662] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.521666] RPC: 33550 xprt_connect_status: retrying
> [ 7728.521671] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt, even though there is no established connection.

RPC client barrels on to send the request again.

> [ 7728.521675] RPC: 33550 call_transmit (status 0)
> [ 7728.521679] RPC: 33550 xprt_prepare_transmit
> [ 7728.521683] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.521688] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.521694] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.521699] RPC: 33550 xprt_transmit(32920)
> [ 7728.521704] RPC:       xs_tcp_send_request(32920) = -32

RPC client attempts to send again, gets -EPIPE, and calls 
kernel_sock_shutdown(SHUT_WR).  If there is no connection established, 
the RPC client expects -ENOTCONN, in which case it will attempt to 
reconnect here.

> [ 7728.521709] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521714] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521718] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

Following this, the RPC client attempts to retransmit the request 
repeatedly, but the socket remains in the TCP_CLOSE state.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
  2010-07-27 17:28         ` Chuck Lever
  (?)
@ 2010-07-28  7:08           ` Andy Chittenden
  -1 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-28  7:08 UTC (permalink / raw)
  To: Chuck Lever, Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

PiBJIGRvbid0IHNlZSBhbiB4c19lcnJvcl9yZXBvcnQoKSBjYWxsIGhlcmUsIHdoaWNoIHdvdWxk
IGNvbmZpcm0gdGhhdCB0aGUgc29ja2V0IHRvb2sgYSB0cmlwIHRocm91Z2ggdGNwX2Rpc2Nvbm5l
Y3QoKS4NCg0KRnJvbSBteSByZWFkaW5nIG9mIHRjcF9kaXNjb25uZWN0KCksIGl0IGNhbGxzIHNr
LT5za19lcnJvcl9yZXBvcnQoc2spIHVuY29uZGl0aW9uYWxseSBzbyBhcyB0aGVyZSdzIG5vIHhz
X2Vycm9yX3JlcG9ydCgpLCB0aGF0IHN1cmVseSBtZWFucyB0aGUgZXhhY3Qgb3Bwb3NpdGU6IHRj
cF9kaXNjb25uZWN0KCkgd2Fzbid0IGNhbGxlZC4gSWYgaXQncyBub3QgY2FsbGVkLCBza19zaHV0
ZG93biBpcyBub3QgY2xlYXJlZC4gQW5kIG15IHJldmlzZWQgdHJhY2luZyBjb25maXJtZWQgdGhh
dCBpdCB3YXMgc2V0IHRvIA0KU0VORF9TSFVURE9XTi4NCg0KLS0gDQpBbmR5LCBCbHVlQXJjIEVu
Z2luZWVyaW5nDQoNCg0KLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCkZyb206IENodWNrIExl
dmVyIFttYWlsdG86Y2h1Y2subGV2ZXJAb3JhY2xlLmNvbV0gDQpTZW50OiAyNyBKdWx5IDIwMTAg
MTg6MjkNClRvOiBFcmljIER1bWF6ZXQNCkNjOiBBbmR5IENoaXR0ZW5kZW47IExpbnV4IEtlcm5l
bCBNYWlsaW5nIExpc3QgKGxpbnV4LWtlcm5lbEB2Z2VyLmtlcm5lbC5vcmcpOyBUcm9uZCBNeWts
ZWJ1c3Q7IG5ldGRldjsgTGludXggTkZTIE1haWxpbmcgTGlzdA0KU3ViamVjdDogUmU6IG5mcyBj
bGllbnQgaGFuZw0KDQpBZGQgQ0M6IGxpbnV4LW5mc0B2Z2VyLmtlcm5lbC5vcmcNCg0KT24gMDcv
MjcvMTAgMDg6MjEgQU0sIEVyaWMgRHVtYXpldCB3cm90ZToNCj4gTGUgbWFyZGkgMjcganVpbGxl
dCAyMDEwIMOgIDExOjUzICswMTAwLCBBbmR5IENoaXR0ZW5kZW4gYSDDqWNyaXQgOg0KPj4+Pj4g
SUUgdGhlIGNsaWVudCBzdGFydHMgYSBjb25uZWN0aW9uIGFuZCB0aGVuIGNsb3NlcyBpdCBhZ2Fp
biB3aXRob3V0IHNlbmRpbmcgZGF0YS4NCj4+Pj4gT25jZSB0aGlzIGhhcHBlbnMsIGhlcmUncyBz
b21lIHJwY2RlYnVnIGluZm8gZm9yIHRoZSBycGMgbW9kdWxlIHVzaW5nIDIuNi4zNC4xIGtlcm5l
bDoNCj4+Pj4NCj4+Pj4gLi4uIGxvdHMgb2YgdGhlIGZvbGxvd2luZyBuZnN2MyBXUklURSByZXF1
ZXN0czoNCj4+Pj4gWyA3NjcwLjAyNjc0MV0gNTc3OTMgMDAwMSAgICAtMTEgZmZmZjg4MDEyZTMy
YjAwMCAgIChudWxsKSAgICAgICAgMCBmZmZmZmZmZmEwM2JlYjEwIG5mc3YzIFdSSVRFIGE6Y2Fs
bF9yZXNlcnZlcmVzdWx0IHE6eHBydF9iYWNrbG9nDQo+Pj4+IFsgNzY3MC4wMjY3NTldIDU3Nzk0
IDAwMDEgICAgLTExIGZmZmY4ODAxMmUzMmIwMDAgICAobnVsbCkgICAgICAgIDAgZmZmZmZmZmZh
MDNiZWIxMCBuZnN2MyBXUklURSBhOmNhbGxfcmVzZXJ2ZXJlc3VsdCBxOnhwcnRfYmFja2xvZw0K
Pj4+PiBbIDc2NzAuMDI2Nzc4XSA1Nzc5NSAwMDAxICAgIC0xMSBmZmZmODgwMTJlMzJiMDAwICAg
KG51bGwpICAgICAgICAwIGZmZmZmZmZmYTAzYmViMTAgbmZzdjMgV1JJVEUgYTpjYWxsX3Jlc2Vy
dmVyZXN1bHQgcTp4cHJ0X2JhY2tsb2cNCj4+Pj4gWyA3NjcwLjAyNjc5N10gNTc3OTYgMDAwMSAg
ICAtMTEgZmZmZjg4MDEyZTMyYjAwMCAgIChudWxsKSAgICAgICAgMCBmZmZmZmZmZmEwM2JlYjEw
IG5mc3YzIFdSSVRFIGE6Y2FsbF9yZXNlcnZlcmVzdWx0IHE6eHBydF9iYWNrbG9nDQo+Pj4+IFsg
NzY3MC4wMjY4MTVdIDU3Nzk3IDAwMDEgICAgLTExIGZmZmY4ODAxMmUzMmIwMDAgICAobnVsbCkg
ICAgICAgIDAgZmZmZmZmZmZhMDNiZWIxMCBuZnN2MyBXUklURSBhOmNhbGxfcmVzZXJ2ZXJlc3Vs
dCBxOnhwcnRfYmFja2xvZw0KPj4+PiBbIDc2NzAuMDI2ODM0XSA1Nzc5OCAwMDAxICAgIC0xMSBm
ZmZmODgwMTJlMzJiMDAwICAgKG51bGwpICAgICAgICAwIGZmZmZmZmZmYTAzYmViMTAgbmZzdjMg
V1JJVEUgYTpjYWxsX3Jlc2VydmVyZXN1bHQgcTp4cHJ0X2JhY2tsb2cNCj4+Pj4gWyA3NjcwLjAy
Njg1M10gNTc3OTkgMDAwMSAgICAtMTEgZmZmZjg4MDEyZTMyYjAwMCAgIChudWxsKSAgICAgICAg
MCBmZmZmZmZmZmEwM2JlYjEwIG5mc3YzIFdSSVRFIGE6Y2FsbF9yZXNlcnZlcmVzdWx0IHE6eHBy
dF9iYWNrbG9nDQo+Pj4+IFsgNzY3MC4wMjY4NzFdIDU3ODAwIDAwMDEgICAgLTExIGZmZmY4ODAx
MmUzMmIwMDAgICAobnVsbCkgICAgICAgIDAgZmZmZmZmZmZhMDNiZWIxMCBuZnN2MyBXUklURSBh
OmNhbGxfcmVzZXJ2ZXJlc3VsdCBxOnhwcnRfYmFja2xvZw0KPj4+PiBbIDc2NzAuMDI2ODkwXSA1
NzgwMSAwMDAxICAgIC0xMSBmZmZmODgwMTJlMzJiMDAwICAgKG51bGwpICAgICAgICAwIGZmZmZm
ZmZmYTAzYmViMTAgbmZzdjMgV1JJVEUgYTpjYWxsX3Jlc2VydmVyZXN1bHQgcTp4cHJ0X2JhY2ts
b2cNCj4+Pj4gWyA3NjcwLjAyNjkwOV0gNTc4MDIgMDAwMSAgICAtMTEgZmZmZjg4MDEyZTMyYjAw
MCAgIChudWxsKSAgICAgICAgMCBmZmZmZmZmZmEwM2JlYjEwIG5mc3YzIFdSSVRFIGE6Y2FsbF9y
ZXNlcnZlcmVzdWx0IHE6eHBydF9iYWNrbG9nDQo+Pj4+IFsgNzY4MC41MjAwNDJdIFJQQzogICAg
ICAgd29ya2VyIGNvbm5lY3RpbmcgeHBydCBmZmZmODgwMTNlNjJkODAwIHZpYSB0Y3AgdG8gMTAu
MS42LjEwMiAocG9ydCAyMDQ5KQ0KPj4+PiBbIDc2ODAuNTIwMDY2XSBSUEM6ICAgICAgIGZmZmY4
ODAxM2U2MmQ4MDAgY29ubmVjdCBzdGF0dXMgOTkgY29ubmVjdGVkIDAgc29jayBzdGF0ZSA3DQo+
Pj4+IFsgNzY4MC41MjAwNzRdIFJQQzogMzM1NTAgX19ycGNfd2FrZV91cF90YXNrIChub3cgNDI5
NjgxMjQyNikNCj4+Pj4gWyA3NjgwLjUyMDA3OV0gUlBDOiAzMzU1MCBkaXNhYmxpbmcgdGltZXIN
Cj4+Pj4gWyA3NjgwLjUyMDA4NF0gUlBDOiAzMzU1MCByZW1vdmVkIGZyb20gcXVldWUgZmZmZjg4
MDEzZTYyZGIyMCAieHBydF9wZW5kaW5nIg0KPj4+PiBbIDc2ODAuNTIwMDg5XSBSUEM6ICAgICAg
IF9fcnBjX3dha2VfdXBfdGFzayBkb25lDQo+Pj4+IFsgNzY4MC41MjAwOTRdIFJQQzogMzM1NTAg
X19ycGNfZXhlY3V0ZSBmbGFncz0weDENCj4+Pj4gWyA3NjgwLjUyMDA5OF0gUlBDOiAzMzU1MCB4
cHJ0X2Nvbm5lY3Rfc3RhdHVzOiByZXRyeWluZw0KPj4+PiBbIDc2ODAuNTIwMTAzXSBSUEM6IDMz
NTUwIGNhbGxfY29ubmVjdF9zdGF0dXMgKHN0YXR1cyAtMTEpDQo+Pj4+IFsgNzY4MC41MjAxMDhd
IFJQQzogMzM1NTAgY2FsbF90cmFuc21pdCAoc3RhdHVzIDApDQo+Pj4+IFsgNzY4MC41MjAxMTJd
IFJQQzogMzM1NTAgeHBydF9wcmVwYXJlX3RyYW5zbWl0DQo+Pj4+IFsgNzY4MC41MjAxMThdIFJQ
QzogMzM1NTAgcnBjX3hkcl9lbmNvZGUgKHN0YXR1cyAwKQ0KPj4+PiBbIDc2ODAuNTIwMTIzXSBS
UEM6IDMzNTUwIG1hcnNoYWxpbmcgVU5JWCBjcmVkIGZmZmY4ODAxMmUwMDIzMDANCj4+Pj4gWyA3
NjgwLjUyMDEzMF0gUlBDOiAzMzU1MCB1c2luZyBBVVRIX1VOSVggY3JlZCBmZmZmODgwMTJlMDAy
MzAwIHRvIHdyYXAgcnBjIGRhdGENCj4+Pj4gWyA3NjgwLjUyMDEzNl0gUlBDOiAzMzU1MCB4cHJ0
X3RyYW5zbWl0KDMyOTIwKQ0KPj4+PiBbIDc2ODAuNTIwMTQ1XSBSUEM6ICAgICAgIHhzX3RjcF9z
ZW5kX3JlcXVlc3QoMzI5MjApID0gLTMyDQo+Pj4+IFsgNzY4MC41MjAxNTFdIFJQQzogICAgICAg
eHNfdGNwX3N0YXRlX2NoYW5nZSBjbGllbnQgZmZmZjg4MDEzZTYyZDgwMC4uLg0KPj4+PiBbIDc2
ODAuNTIwMTU2XSBSUEM6ICAgICAgIHN0YXRlIDcgY29ubiAwIGRlYWQgMCB6YXBwZWQgMQ0KPj4N
Cj4+PiBJIGNoYW5nZWQgdGhhdCBkZWJ1ZyB0byBvdXRwdXQgc2tfc2h1dGRvd24gdG9vLiBUaGF0
IGhhcyBhIHZhbHVlIG9mIDINCj4+PiAoSUUgU0VORF9TSFVURE9XTikuIExvb2tpbmcgYXQgdGNw
X3NlbmRtc2coKSwgSSBzZWUgdGhpczoNCj4+DQo+Pj4gICAgICAgICAgIGVyciA9IC1FUElQRTsN
Cj4+PiAgICAgICAgICAgaWYgKHNrLT5za19lcnIgfHwgKHNrLT5za19zaHV0ZG93biYgIFNFTkRf
U0hVVERPV04pKQ0KPj4+ICAgICAgICAgICAgICAgICAgIGdvdG8gb3V0X2VycjsNCj4+DQo+Pj4g
d2hpY2ggY29ycmVsYXRlcyB3aXRoIHRoZSB0cmFjZSAieHNfdGNwX3NlbmRfcmVxdWVzdCgzMjky
MCkgPSAtMzIiLiBTbywNCj4+PiB0aGlzIGxvb2tzIGxpa2UgYSBwcm9ibGVtIGluIHRoZSBzb2Nr
ZXRzL3RjcCBsYXllci4gVGhlIHJwYyBsYXllciBpc3N1ZXMNCj4+PiBhIHNodXRkb3duIGFuZCB0
aGVuIHJlY29ubmVjdHMgdXNpbmcgdGhlIHNhbWUgc29ja2V0LiBTbyBlaXRoZXINCj4+PiBza19z
aHV0ZG93biBuZWVkcyB6ZXJvaW5nIG9uY2UgdGhlIHNodXRkb3duIGNvbXBsZXRlcyBvciBzaG91
bGQgYmUNCj4+PiB6ZXJvZWQgb24gc3Vic2VxdWVudCBjb25uZWN0LiBUaGUgbGF0dGVyIHNvdW5k
cyBzYWZlci4NCg0KPj4gVGhpcyBwYXRjaCBmb3IgMi42LjM0LjEgZml4ZXMgdGhlIGlzc3VlOg0K
Pj4NCj4+IC0tLSAvaG9tZS9jb21wYW55L3NvZnR3YXJlL3NyYy9saW51eC0yLjYuMzQuMS9uZXQv
aXB2NC90Y3Bfb3V0cHV0LmMgICAgIDIwMTAtMDctMjcgMDg6NDY6NDYuOTE3MDAwMDAwICswMTAw
DQo+PiArKysgbmV0L2lwdjQvdGNwX291dHB1dC5jICAgICAgIDIwMTAtMDctMjcgMDk6MTk6MTYu
MDAwMDAwMDAwICswMTAwDQo+PiBAQCAtMjUyMiw2ICsyNTIyLDEzIEBADQo+PiAgICAgICAgICBz
dHJ1Y3QgdGNwX3NvY2sgKnRwID0gdGNwX3NrKHNrKTsNCj4+ICAgICAgICAgIF9fdTggcmN2X3dz
Y2FsZTsNCj4+DQo+PiArICAgICAgIC8qIGNsZWFyIGRvd24gYW55IHByZXZpb3VzIHNodXRkb3du
IGF0dGVtcHRzIHNvIHRoYXQNCj4+ICsgICAgICAgICogcmVjb25uZWN0cyBvbiBhIHNvY2tldCB0
aGF0J3MgYmVlbiBzaHV0ZG93biBsZWF2ZSB0aGUNCj4+ICsgICAgICAgICogc29ja2V0IGluIGEg
dXNhYmxlIHN0YXRlIChvdGhlcndpc2UgdGNwX3NlbmRtc2coKSByZXR1cm5zDQo+PiArICAgICAg
ICAqIC1FUElQRSkuDQo+PiArICAgICAgICAqLw0KPj4gKyAgICAgICBzay0+c2tfc2h1dGRvd24g
PSAwOw0KPj4gKw0KPj4gICAgICAgICAgLyogV2UnbGwgZml4IHRoaXMgdXAgd2hlbiB3ZSBnZXQg
YSByZXNwb25zZSBmcm9tIHRoZSBvdGhlciBlbmQuDQo+PiAgICAgICAgICAgKiBTZWUgdGNwX2lu
cHV0LmM6dGNwX3Jjdl9zdGF0ZV9wcm9jZXNzIGNhc2UgVENQX1NZTl9TRU5ULg0KPj4gICAgICAg
ICAgICovDQo+Pg0KPj4gQXMgSSBtZW50aW9uZWQgaW4gbXkgZmlyc3QgbWVzc2FnZSwgd2UgZmly
c3Qgc2F3IHRoaXMgaXNzdWUgaW4gMi42LjMyIGFzIHN1cHBsaWVkIGJ5IGRlYmlhbiAobGludXgt
aW1hZ2UtMi42LjMyLTUtYW1kNjQgVmVyc2lvbjogMi42LjMyLTE3KS4gSXQgbG9va3MgbGlrZSB0
aGUgc2FtZSBwYXRjaCB3b3VsZCBmaXggdGhlIHByb2JsZW0gdGhlcmUgdG9vLg0KPj4NCj4NCj4g
Q0MgbmV0ZGV2DQo+DQo+IFRoaXMgcmVtaW5kcyBtZSBhIHNpbWlsYXIgcHJvYmxlbSB3ZSBoYWQg
aW4gdGhlIHBhc3QsIGZpeGVkIHdpdGggY29tbWl0DQo+IDFmZGY0NzVhICh0Y3A6IHRjcF9kaXNj
b25uZWN0KCkgc2hvdWxkIGNsZWFyIHdpbmRvd19jbGFtcCkNCj4NCj4gQnV0IHRjcF9kaXNjb25u
ZWN0KCkgYWxyZWFkeSBjbGVhcnMgc2stPnNrX3NodXRkb3duDQo+DQo+IElmIE5GUyBjYWxscyB0
Y3BfZGlzY29ubmVjdCgpLCB0aGVuIHNodXRkb3duKCksIHRoZXJlIGlzIGEgcHJvYmxlbS4NCg0K
SWYgdGNwX2Rpc2Nvbm5lY3QoKSB3YXMgY2FsbGVkIGF0IHNvbWUgcG9pbnQsIEkgd291bGQgZXhw
ZWN0IHRvIHNlZSBhIA0KbWVzc2FnZSBmcm9tIHhzX2Vycm9yX3JlcG9ydCgpIGluIHRoZSBkZWJ1
Z2dpbmcgb3V0cHV0LiAgUGVyaGFwcyANCnRjcF9kaXNjb25uZWN0KCkgaXMgbm90IGJlaW5nIGlu
dm9rZWQgYXQgYWxsPw0KDQo+IE1heWJlIHhzX3RjcF9zaHV0ZG93bigpIHNob3VsZCBtYWtlIHNv
bWUgc2FuaXR5IHRlc3RzID8NCj4NCj4gRm9sbG93aW5nIHNlcXVlbmNlIGlzIGxlZ2FsLCBhbmQg
eW91ciBwYXRjaCBtaWdodCBicmVhayBpdC4NCj4NCj4gZmQgPSBzb2NrZXQoLi4uKTsNCj4gc2h1
dGRvd24oZmQsIFNIVVRfV1IpOw0KPiAuLi4NCj4gY29ubmVjdChmZCwgLi4uKTsNCg0KDQpJIGxv
b2tlZCBjbG9zZWx5IGF0IHNvbWUgb2YgQW5keSdzIGRlYnVnZ2luZyBvdXRwdXQgZnJvbSB0aGUg
DQpsaW51eC1rZXJuZWwgbWFpbGluZyBsaXN0IGFyY2hpdmUuICBJIGJhc2ljYWxseSBhZ3JlZSB0
aGF0IHRoZSBuZXR3b3JrIA0KbGF5ZXIgaXMgcmV0dXJuaW5nIC1FUElQRSBmcm9tIHRjcF9zZW5k
bXNnKCksIHdoaWNoIHRoZSBSUEMgY2xpZW50IGxvZ2ljIA0KZG9lcyBub3QgZXhwZWN0LiBCdXQg
aXQncyBub3QgY2xlYXIgdG8gbWUgaG93IGl0IGdldHMgaW50byB0aGlzIHN0YXRlLg0KDQo+IFsg
NzcyOC41MjAwNDJdIFJQQzogICAgICAgd29ya2VyIGNvbm5lY3RpbmcgeHBydCBmZmZmODgwMTNl
NjJkODAwIHZpYSB0Y3AgdG8gMTAuMS42LjEwMiAocG9ydCAyMDQ5KQ0KID4gWyA3NzI4LjUyMDA5
M10gUlBDOiAgICAgICBmZmZmODgwMTNlNjJkODAwIGNvbm5lY3Qgc3RhdHVzIDExNSANCmNvbm5l
Y3RlZCAwIHNvY2sgc3RhdGUgMg0KDQoic29jayBzdGF0ZSAyIiA9PiBzay0+c2tfc3RhdGUgPT0g
VENQX1NZTl9TRU5UDQoNCiA+IFsgNzcyOC41MjA4ODRdIFJQQzogICAgICAgeHNfdGNwX3N0YXRl
X2NoYW5nZSBjbGllbnQgZmZmZjg4MDEzZTYyZDgwMC4uLg0KID4gWyA3NzI4LjUyMDg4OV0gUlBD
OiAgICAgICBzdGF0ZSAxIGNvbm4gMCBkZWFkIDAgemFwcGVkIDENCg0KInN0YXRlIDEiID0+IHNr
LT5za19zdGF0ZSA9PSBUQ1BfRVNUQUJMSVNIRUQNCg0KUlBDIGNsaWVudCB3YWtlcyB1cCB0aGlz
IFJQQyB0YXNrIG5vdyB0aGF0IHRoZSBjb25uZWN0aW9uIGlzIGVzdGFibGlzaGVkLg0KDQo+IFsg
NzcyOC41MjA4OTZdIFJQQzogMzM1NTAgX19ycGNfd2FrZV91cF90YXNrIChub3cgNDI5NjgyNDQy
NikNCj4gWyA3NzI4LjUyMDkwMF0gUlBDOiAzMzU1MCBkaXNhYmxpbmcgdGltZXINCj4gWyA3NzI4
LjUyMDkwNl0gUlBDOiAzMzU1MCByZW1vdmVkIGZyb20gcXVldWUgZmZmZjg4MDEzZTYyZGIyMCAi
eHBydF9wZW5kaW5nIg0KPiBbIDc3MjguNTIwOTEyXSBSUEM6ICAgICAgIF9fcnBjX3dha2VfdXBf
dGFzayBkb25lDQo+IFsgNzcyOC41MjA5MzJdIFJQQzogMzM1NTAgX19ycGNfZXhlY3V0ZSBmbGFn
cz0weDENCj4gWyA3NzI4LjUyMDkzN10gUlBDOiAzMzU1MCB4cHJ0X2Nvbm5lY3Rfc3RhdHVzOiBy
ZXRyeWluZw0KPiBbIDc3MjguNTIwOTQyXSBSUEM6IDMzNTUwIGNhbGxfY29ubmVjdF9zdGF0dXMg
KHN0YXR1cyAtMTEpDQoNClRoZSBhd29rZW4gUlBDIHRhc2sncyBzdGF0dXMgaXMgLUVBR0FJTiwg
d2hpY2ggcHJldmVudHMgYSByZWNvbm5lY3Rpb24gDQphdHRlbXB0Lg0KDQo+IFsgNzcyOC41MjA5
NDddIFJQQzogMzM1NTAgY2FsbF90cmFuc21pdCAoc3RhdHVzIDApDQo+IFsgNzcyOC41MjA5NTFd
IFJQQzogMzM1NTAgeHBydF9wcmVwYXJlX3RyYW5zbWl0DQo+IFsgNzcyOC41MjA5NTddIFJQQzog
MzM1NTAgcnBjX3hkcl9lbmNvZGUgKHN0YXR1cyAwKQ0KPiBbIDc3MjguNTIwOTYyXSBSUEM6IDMz
NTUwIG1hcnNoYWxpbmcgVU5JWCBjcmVkIGZmZmY4ODAxMmUwMDIzMDANCj4gWyA3NzI4LjUyMDk2
OV0gUlBDOiAzMzU1MCB1c2luZyBBVVRIX1VOSVggY3JlZCBmZmZmODgwMTJlMDAyMzAwIHRvIHdy
YXAgcnBjIGRhdGENCj4gWyA3NzI4LjUyMDk3Nl0gUlBDOiAzMzU1MCB4cHJ0X3RyYW5zbWl0KDMy
OTIwKQ0KDQpSUEMgY2xpZW50IGVuY29kZXMgdGhlIHJlcXVlc3QgYW5kIGF0dGVtcHRzIHRvIHNl
bmQgaXQuDQoNCj4gWyA3NzI4LjUyMDk4NF0gUlBDOiAgICAgICB4c190Y3Bfc2VuZF9yZXF1ZXN0
KDMyOTIwKSA9IC0zMg0KDQpOZXR3b3JrIGxheWVyIHNheXMgLUVQSVBFLCBmb3Igc29tZSByZWFz
b24uICBSUEMgY2xpZW50IGNhbGxzIA0Ka2VybmVsX3NvY2tfc2h1dGRvd24oU0hVVF9XUikuDQoN
Cj4gWyA3NzI4LjUyMDk5N10gUlBDOiAgICAgICB4c190Y3Bfc3RhdGVfY2hhbmdlIGNsaWVudCBm
ZmZmODgwMTNlNjJkODAwLi4uDQo+IFsgNzcyOC41MjEwMDddIFJQQzogICAgICAgc3RhdGUgNCBj
b25uIDEgZGVhZCAwIHphcHBlZCAxDQoNCiJzdGF0ZSA0IiA9PiBzay0+c2tfc3RhdGUgPT0gVENQ
X0ZJTl9XQUlUMQ0KDQpUaGUgUlBDIGNsaWVudCBzZXRzIHVwIGEgbGluZ2VyIHRpbWVvdXQuDQoN
Cj4gWyA3NzI4LjUyMTAxM10gUlBDOiAzMzU1MCBjYWxsX3N0YXR1cyAoc3RhdHVzIC0zMikNCj4g
WyA3NzI4LjUyMTAxOF0gUlBDOiAzMzU1MCBjYWxsX2JpbmQgKHN0YXR1cyAwKQ0KPiBbIDc3Mjgu
NTIxMDIzXSBSUEM6IDMzNTUwIGNhbGxfY29ubmVjdCB4cHJ0IGZmZmY4ODAxM2U2MmQ4MDAgaXMg
bm90IGNvbm5lY3RlZA0KPiBbIDc3MjguNTIxMDI4XSBSUEM6IDMzNTUwIHhwcnRfY29ubmVjdCB4
cHJ0IGZmZmY4ODAxM2U2MmQ4MDAgaXMgbm90IGNvbm5lY3RlZA0KPiBbIDc3MjguNTIxMDM1XSBS
UEM6IDMzNTUwIHNsZWVwX29uKHF1ZXVlICJ4cHJ0X3BlbmRpbmciIHRpbWUgNDI5NjgyNDQyNikN
Cj4gWyA3NzI4LjUyMTA0MF0gUlBDOiAzMzU1MCBhZGRlZCB0byBxdWV1ZSBmZmZmODgwMTNlNjJk
YjIwICJ4cHJ0X3BlbmRpbmciDQo+IFsgNzcyOC41MjEwNDVdIFJQQzogMzM1NTAgc2V0dGluZyBh
bGFybSBmb3IgNjAwMDAgbXMNCg0KUlBDIGNsaWVudCBwdXRzIHRoaXMgUlBDIHRhc2sgdG8gc2xl
ZXAuDQoNCj4gWyA3NzI4LjUyMTQzOV0gUlBDOiAgICAgICB4c190Y3Bfc3RhdGVfY2hhbmdlIGNs
aWVudCBmZmZmODgwMTNlNjJkODAwLi4uDQo+IFsgNzcyOC41MjE0NDRdIFJQQzogICAgICAgc3Rh
dGUgNSBjb25uIDAgZGVhZCAwIHphcHBlZCAxDQoNCiJzdGF0ZSA1IiA9PiBzay0+c2tfc3RhdGUg
PT0gVENQX0ZJTl9XQUlUMg0KDQo+IFsgNzcyOC41MjE2MDJdIFJQQzogICAgICAgeHNfdGNwX3N0
YXRlX2NoYW5nZSBjbGllbnQgZmZmZjg4MDEzZTYyZDgwMC4uLg0KPiBbIDc3MjguNTIxNjA4XSBS
UEM6ICAgICAgIHN0YXRlIDcgY29ubiAwIGRlYWQgMCB6YXBwZWQgMQ0KPiBbIDc3MjguNTIxNjEy
XSBSUEM6ICAgICAgIGRpc2Nvbm5lY3RlZCB0cmFuc3BvcnQgZmZmZjg4MDEzZTYyZDgwMA0KDQoi
c3RhdGUgNyIgPT4gc2stPnNrX3N0YXRlID09IFRDUF9DTE9TRQ0KDQpUaGUgUlBDIGNsaWVudCBt
YXJrcyB0aGUgc29ja2V0IGNsb3NlZC4gYW5kIHRoZSBsaW5nZXIgdGltZW91dCBpcyANCmNhbmNl
bGxlZC4gIEF0IHRoaXMgcG9pbnQsIHNrX3NodXRkb3duIHNob3VsZCBiZSBzZXQgdG8gemVybywg
Y29ycmVjdD8gDQpJIGRvbid0IHNlZSBhbiB4c19lcnJvcl9yZXBvcnQoKSBjYWxsIGhlcmUsIHdo
aWNoIHdvdWxkIGNvbmZpcm0gdGhhdCB0aGUgDQpzb2NrZXQgdG9vayBhIHRyaXAgdGhyb3VnaCB0
Y3BfZGlzY29ubmVjdCgpLg0KDQo+IFsgNzcyOC41MjE2MTddIFJQQzogMzM1NTAgX19ycGNfd2Fr
ZV91cF90YXNrIChub3cgNDI5NjgyNDQyNikNCj4gWyA3NzI4LjUyMTYyMV0gUlBDOiAzMzU1MCBk
aXNhYmxpbmcgdGltZXINCj4gWyA3NzI4LjUyMTYyNl0gUlBDOiAzMzU1MCByZW1vdmVkIGZyb20g
cXVldWUgZmZmZjg4MDEzZTYyZGIyMCAieHBydF9wZW5kaW5nIg0KPiBbIDc3MjguNTIxNjMxXSBS
UEM6ICAgICAgIF9fcnBjX3dha2VfdXBfdGFzayBkb25lDQoNClJQQyBjbGllbnQgd2FrZXMgdXAg
dGhlIFJQQyB0YXNrLiAgTWVhbndoaWxlLi4uDQoNCj4gWyA3NzI4LjUyMTYzNl0gUlBDOiAgICAg
ICB4c190Y3Bfc3RhdGVfY2hhbmdlIGNsaWVudCBmZmZmODgwMTNlNjJkODAwLi4uDQo+IFsgNzcy
OC41MjE2NDFdIFJQQzogICAgICAgc3RhdGUgNyBjb25uIDAgZGVhZCAwIHphcHBlZCAxDQo+IFsg
NzcyOC41MjE2NDVdIFJQQzogICAgICAgZGlzY29ubmVjdGVkIHRyYW5zcG9ydCBmZmZmODgwMTNl
NjJkODAwDQo+IFsgNzcyOC41MjE2NDldIFJQQzogICAgICAgeHNfdGNwX2RhdGFfcmVhZHkuLi4N
Cg0KLi4uIG5ldHdvcmsgbGF5ZXIgY2FsbHMgY2xvc2VkIHNvY2tldCdzIGRhdGFfcmVhZHkgbWV0
aG9kLi4uIHdoaWxlIHRoZSANCmF3b2tlbiBSUEMgdGFzayBnZXRzIHVuZGVyd2F5Lg0KDQo+IFsg
NzcyOC41MjE2NjJdIFJQQzogMzM1NTAgX19ycGNfZXhlY3V0ZSBmbGFncz0weDENCj4gWyA3NzI4
LjUyMTY2Nl0gUlBDOiAzMzU1MCB4cHJ0X2Nvbm5lY3Rfc3RhdHVzOiByZXRyeWluZw0KPiBbIDc3
MjguNTIxNjcxXSBSUEM6IDMzNTUwIGNhbGxfY29ubmVjdF9zdGF0dXMgKHN0YXR1cyAtMTEpDQoN
ClRoZSBhd29rZW4gUlBDIHRhc2sncyBzdGF0dXMgaXMgLUVBR0FJTiwgd2hpY2ggcHJldmVudHMg
YSByZWNvbm5lY3Rpb24gDQphdHRlbXB0LCBldmVuIHRob3VnaCB0aGVyZSBpcyBubyBlc3RhYmxp
c2hlZCBjb25uZWN0aW9uLg0KDQpSUEMgY2xpZW50IGJhcnJlbHMgb24gdG8gc2VuZCB0aGUgcmVx
dWVzdCBhZ2Fpbi4NCg0KPiBbIDc3MjguNTIxNjc1XSBSUEM6IDMzNTUwIGNhbGxfdHJhbnNtaXQg
KHN0YXR1cyAwKQ0KPiBbIDc3MjguNTIxNjc5XSBSUEM6IDMzNTUwIHhwcnRfcHJlcGFyZV90cmFu
c21pdA0KPiBbIDc3MjguNTIxNjgzXSBSUEM6IDMzNTUwIHJwY194ZHJfZW5jb2RlIChzdGF0dXMg
MCkNCj4gWyA3NzI4LjUyMTY4OF0gUlBDOiAzMzU1MCBtYXJzaGFsaW5nIFVOSVggY3JlZCBmZmZm
ODgwMTJlMDAyMzAwDQo+IFsgNzcyOC41MjE2OTRdIFJQQzogMzM1NTAgdXNpbmcgQVVUSF9VTklY
IGNyZWQgZmZmZjg4MDEyZTAwMjMwMCB0byB3cmFwIHJwYyBkYXRhDQo+IFsgNzcyOC41MjE2OTld
IFJQQzogMzM1NTAgeHBydF90cmFuc21pdCgzMjkyMCkNCj4gWyA3NzI4LjUyMTcwNF0gUlBDOiAg
ICAgICB4c190Y3Bfc2VuZF9yZXF1ZXN0KDMyOTIwKSA9IC0zMg0KDQpSUEMgY2xpZW50IGF0dGVt
cHRzIHRvIHNlbmQgYWdhaW4sIGdldHMgLUVQSVBFLCBhbmQgY2FsbHMgDQprZXJuZWxfc29ja19z
aHV0ZG93bihTSFVUX1dSKS4gIElmIHRoZXJlIGlzIG5vIGNvbm5lY3Rpb24gZXN0YWJsaXNoZWQs
IA0KdGhlIFJQQyBjbGllbnQgZXhwZWN0cyAtRU5PVENPTk4sIGluIHdoaWNoIGNhc2UgaXQgd2ls
bCBhdHRlbXB0IHRvIA0KcmVjb25uZWN0IGhlcmUuDQoNCj4gWyA3NzI4LjUyMTcwOV0gUlBDOiAg
ICAgICB4c190Y3Bfc3RhdGVfY2hhbmdlIGNsaWVudCBmZmZmODgwMTNlNjJkODAwLi4uDQo+IFsg
NzcyOC41MjE3MTRdIFJQQzogICAgICAgc3RhdGUgNyBjb25uIDAgZGVhZCAwIHphcHBlZCAxDQo+
IFsgNzcyOC41MjE3MThdIFJQQzogICAgICAgZGlzY29ubmVjdGVkIHRyYW5zcG9ydCBmZmZmODgw
MTNlNjJkODAwDQoNCiJzdGF0ZSA3IiA9PiBzay0+c2tfc3RhdGUgPT0gVENQX0NMT1NFDQoNCkZv
bGxvd2luZyB0aGlzLCB0aGUgUlBDIGNsaWVudCBhdHRlbXB0cyB0byByZXRyYW5zbWl0IHRoZSBy
ZXF1ZXN0IA0KcmVwZWF0ZWRseSwgYnV0IHRoZSBzb2NrZXQgcmVtYWlucyBpbiB0aGUgVENQX0NM
T1NFIHN0YXRlLg0KTu+/ve+/ve+/ve+/ve+/vXLvv73vv71577+977+977+9Yu+/vVjvv73vv73H
p3bvv71e77+9Kd66ey5u77+9K++/ve+/ve+/ve+/vXvvv73vv73vv70i77+977+9Xm7vv71y77+9
77+977+9eu+/vRrvv73vv71o77+977+977+977+9Ju+/ve+/vR7vv71H77+977+977+9aO+/vQMo
77+96ZqO77+93aJqIu+/ve+/vRrvv70bbe+/ve+/ve+/ve+/ve+/vXrvv73elu+/ve+/ve+/vWbv
v73vv73vv71o77+977+977+9fu+/vW3vv70=

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
@ 2010-07-28  7:08           ` Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-28  7:08 UTC (permalink / raw)
  To: Chuck Lever, Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 11398 bytes --]

> I don't see an xs_error_report() call here, which would confirm that the socket took a trip through tcp_disconnect().

>From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk) unconditionally so as there's no xs_error_report(), that surely means the exact opposite: tcp_disconnect() wasn't called. If it's not called, sk_shutdown is not cleared. And my revised tracing confirmed that it was set to 
SEND_SHUTDOWN.

-- 
Andy, BlueArc Engineering


-----Original Message-----
From: Chuck Lever [mailto:chuck.lever@oracle.com] 
Sent: 27 July 2010 18:29
To: Eric Dumazet
Cc: Andy Chittenden; Linux Kernel Mailing List (linux-kernel@vger.kernel.org); Trond Myklebust; netdev; Linux NFS Mailing List
Subject: Re: nfs client hang

Add CC: linux-nfs@vger.kernel.org

On 07/27/10 08:21 AM, Eric Dumazet wrote:
> Le mardi 27 juillet 2010 Ã  11:53 +0100, Andy Chittenden a Ã©crit :
>>>>> IE the client starts a connection and then closes it again without sending data.
>>>> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>>>>
>>>> ... lots of the following nfsv3 WRITE requests:
>>>> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
>>>> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
>>>> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
>>>> [ 7680.520079] RPC: 33550 disabling timer
>>>> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
>>>> [ 7680.520089] RPC:       __rpc_wake_up_task done
>>>> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
>>>> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
>>>> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
>>>> [ 7680.520108] RPC: 33550 call_transmit (status 0)
>>>> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
>>>> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
>>>> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
>>>> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
>>>> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
>>>> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
>>>> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
>>>> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
>>
>>> I changed that debug to output sk_shutdown too. That has a value of 2
>>> (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
>>
>>>           err = -EPIPE;
>>>           if (sk->sk_err || (sk->sk_shutdown&  SEND_SHUTDOWN))
>>>                   goto out_err;
>>
>>> which correlates with the trace "xs_tcp_send_request(32920) = -32". So,
>>> this looks like a problem in the sockets/tcp layer. The rpc layer issues
>>> a shutdown and then reconnects using the same socket. So either
>>> sk_shutdown needs zeroing once the shutdown completes or should be
>>> zeroed on subsequent connect. The latter sounds safer.

>> This patch for 2.6.34.1 fixes the issue:
>>
>> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
>> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
>> @@ -2522,6 +2522,13 @@
>>          struct tcp_sock *tp = tcp_sk(sk);
>>          __u8 rcv_wscale;
>>
>> +       /* clear down any previous shutdown attempts so that
>> +        * reconnects on a socket that's been shutdown leave the
>> +        * socket in a usable state (otherwise tcp_sendmsg() returns
>> +        * -EPIPE).
>> +        */
>> +       sk->sk_shutdown = 0;
>> +
>>          /* We'll fix this up when we get a response from the other end.
>>           * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>>           */
>>
>> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
>>
>
> CC netdev
>
> This reminds me a similar problem we had in the past, fixed with commit
> 1fdf475a (tcp: tcp_disconnect() should clear window_clamp)
>
> But tcp_disconnect() already clears sk->sk_shutdown
>
> If NFS calls tcp_disconnect(), then shutdown(), there is a problem.

If tcp_disconnect() was called at some point, I would expect to see a 
message from xs_error_report() in the debugging output.  Perhaps 
tcp_disconnect() is not being invoked at all?

> Maybe xs_tcp_shutdown() should make some sanity tests ?
>
> Following sequence is legal, and your patch might break it.
>
> fd = socket(...);
> shutdown(fd, SHUT_WR);
> ...
> connect(fd, ...);


I looked closely at some of Andy's debugging output from the 
linux-kernel mailing list archive.  I basically agree that the network 
layer is returning -EPIPE from tcp_sendmsg(), which the RPC client logic 
does not expect. But it's not clear to me how it gets into this state.

> [ 7728.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
 > [ 7728.520093] RPC:       ffff88013e62d800 connect status 115 
connected 0 sock state 2

"sock state 2" => sk->sk_state == TCP_SYN_SENT

 > [ 7728.520884] RPC:       xs_tcp_state_change client ffff88013e62d800...
 > [ 7728.520889] RPC:       state 1 conn 0 dead 0 zapped 1

"state 1" => sk->sk_state == TCP_ESTABLISHED

RPC client wakes up this RPC task now that the connection is established.

> [ 7728.520896] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.520900] RPC: 33550 disabling timer
> [ 7728.520906] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.520912] RPC:       __rpc_wake_up_task done
> [ 7728.520932] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.520937] RPC: 33550 xprt_connect_status: retrying
> [ 7728.520942] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt.

> [ 7728.520947] RPC: 33550 call_transmit (status 0)
> [ 7728.520951] RPC: 33550 xprt_prepare_transmit
> [ 7728.520957] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.520962] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.520969] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.520976] RPC: 33550 xprt_transmit(32920)

RPC client encodes the request and attempts to send it.

> [ 7728.520984] RPC:       xs_tcp_send_request(32920) = -32

Network layer says -EPIPE, for some reason.  RPC client calls 
kernel_sock_shutdown(SHUT_WR).

> [ 7728.520997] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521007] RPC:       state 4 conn 1 dead 0 zapped 1

"state 4" => sk->sk_state == TCP_FIN_WAIT1

The RPC client sets up a linger timeout.

> [ 7728.521013] RPC: 33550 call_status (status -32)
> [ 7728.521018] RPC: 33550 call_bind (status 0)
> [ 7728.521023] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
> [ 7728.521028] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
> [ 7728.521035] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
> [ 7728.521040] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
> [ 7728.521045] RPC: 33550 setting alarm for 60000 ms

RPC client puts this RPC task to sleep.

> [ 7728.521439] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521444] RPC:       state 5 conn 0 dead 0 zapped 1

"state 5" => sk->sk_state == TCP_FIN_WAIT2

> [ 7728.521602] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521608] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521612] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

The RPC client marks the socket closed. and the linger timeout is 
cancelled.  At this point, sk_shutdown should be set to zero, correct? 
I don't see an xs_error_report() call here, which would confirm that the 
socket took a trip through tcp_disconnect().

> [ 7728.521617] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.521621] RPC: 33550 disabling timer
> [ 7728.521626] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.521631] RPC:       __rpc_wake_up_task done

RPC client wakes up the RPC task.  Meanwhile...

> [ 7728.521636] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521641] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521645] RPC:       disconnected transport ffff88013e62d800
> [ 7728.521649] RPC:       xs_tcp_data_ready...

... network layer calls closed socket's data_ready method... while the 
awoken RPC task gets underway.

> [ 7728.521662] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.521666] RPC: 33550 xprt_connect_status: retrying
> [ 7728.521671] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt, even though there is no established connection.

RPC client barrels on to send the request again.

> [ 7728.521675] RPC: 33550 call_transmit (status 0)
> [ 7728.521679] RPC: 33550 xprt_prepare_transmit
> [ 7728.521683] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.521688] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.521694] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.521699] RPC: 33550 xprt_transmit(32920)
> [ 7728.521704] RPC:       xs_tcp_send_request(32920) = -32

RPC client attempts to send again, gets -EPIPE, and calls 
kernel_sock_shutdown(SHUT_WR).  If there is no connection established, 
the RPC client expects -ENOTCONN, in which case it will attempt to 
reconnect here.

> [ 7728.521709] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521714] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521718] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

Following this, the RPC client attempts to retransmit the request 
repeatedly, but the socket remains in the TCP_CLOSE state.
ÿôèº{.nÇ+‰·Ÿ®‰†+%ŠËÿ±éÝ¶\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dÊ‡Ú™ë,j\a¢f£¢·hšïêÿ‘êçz_è®\x03(éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨èÚ&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
@ 2010-07-28  7:08           ` Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-28  7:08 UTC (permalink / raw)
  To: Chuck Lever, Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

> I don't see an xs_error_report() call here, which would confirm that the socket took a trip through tcp_disconnect().

From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk) unconditionally so as there's no xs_error_report(), that surely means the exact opposite: tcp_disconnect() wasn't called. If it's not called, sk_shutdown is not cleared. And my revised tracing confirmed that it was set to 
SEND_SHUTDOWN.

-- 
Andy, BlueArc Engineering


-----Original Message-----
From: Chuck Lever [mailto:chuck.lever@oracle.com] 
Sent: 27 July 2010 18:29
To: Eric Dumazet
Cc: Andy Chittenden; Linux Kernel Mailing List (linux-kernel@vger.kernel.org); Trond Myklebust; netdev; Linux NFS Mailing List
Subject: Re: nfs client hang

Add CC: linux-nfs@vger.kernel.org

On 07/27/10 08:21 AM, Eric Dumazet wrote:
> Le mardi 27 juillet 2010 à 11:53 +0100, Andy Chittenden a écrit :
>>>>> IE the client starts a connection and then closes it again without sending data.
>>>> Once this happens, here's some rpcdebug info for the rpc module using 2.6.34.1 kernel:
>>>>
>>>> ... lots of the following nfsv3 WRITE requests:
>>>> [ 7670.026741] 57793 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026759] 57794 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026778] 57795 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026797] 57796 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026815] 57797 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026834] 57798 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026853] 57799 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026871] 57800 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026890] 57801 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7670.026909] 57802 0001    -11 ffff88012e32b000   (null)        0 ffffffffa03beb10 nfsv3 WRITE a:call_reserveresult q:xprt_backlog
>>>> [ 7680.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
>>>> [ 7680.520066] RPC:       ffff88013e62d800 connect status 99 connected 0 sock state 7
>>>> [ 7680.520074] RPC: 33550 __rpc_wake_up_task (now 4296812426)
>>>> [ 7680.520079] RPC: 33550 disabling timer
>>>> [ 7680.520084] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
>>>> [ 7680.520089] RPC:       __rpc_wake_up_task done
>>>> [ 7680.520094] RPC: 33550 __rpc_execute flags=0x1
>>>> [ 7680.520098] RPC: 33550 xprt_connect_status: retrying
>>>> [ 7680.520103] RPC: 33550 call_connect_status (status -11)
>>>> [ 7680.520108] RPC: 33550 call_transmit (status 0)
>>>> [ 7680.520112] RPC: 33550 xprt_prepare_transmit
>>>> [ 7680.520118] RPC: 33550 rpc_xdr_encode (status 0)
>>>> [ 7680.520123] RPC: 33550 marshaling UNIX cred ffff88012e002300
>>>> [ 7680.520130] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
>>>> [ 7680.520136] RPC: 33550 xprt_transmit(32920)
>>>> [ 7680.520145] RPC:       xs_tcp_send_request(32920) = -32
>>>> [ 7680.520151] RPC:       xs_tcp_state_change client ffff88013e62d800...
>>>> [ 7680.520156] RPC:       state 7 conn 0 dead 0 zapped 1
>>
>>> I changed that debug to output sk_shutdown too. That has a value of 2
>>> (IE SEND_SHUTDOWN). Looking at tcp_sendmsg(), I see this:
>>
>>>           err = -EPIPE;
>>>           if (sk->sk_err || (sk->sk_shutdown&  SEND_SHUTDOWN))
>>>                   goto out_err;
>>
>>> which correlates with the trace "xs_tcp_send_request(32920) = -32". So,
>>> this looks like a problem in the sockets/tcp layer. The rpc layer issues
>>> a shutdown and then reconnects using the same socket. So either
>>> sk_shutdown needs zeroing once the shutdown completes or should be
>>> zeroed on subsequent connect. The latter sounds safer.

>> This patch for 2.6.34.1 fixes the issue:
>>
>> --- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
>> +++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
>> @@ -2522,6 +2522,13 @@
>>          struct tcp_sock *tp = tcp_sk(sk);
>>          __u8 rcv_wscale;
>>
>> +       /* clear down any previous shutdown attempts so that
>> +        * reconnects on a socket that's been shutdown leave the
>> +        * socket in a usable state (otherwise tcp_sendmsg() returns
>> +        * -EPIPE).
>> +        */
>> +       sk->sk_shutdown = 0;
>> +
>>          /* We'll fix this up when we get a response from the other end.
>>           * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
>>           */
>>
>> As I mentioned in my first message, we first saw this issue in 2.6.32 as supplied by debian (linux-image-2.6.32-5-amd64 Version: 2.6.32-17). It looks like the same patch would fix the problem there too.
>>
>
> CC netdev
>
> This reminds me a similar problem we had in the past, fixed with commit
> 1fdf475a (tcp: tcp_disconnect() should clear window_clamp)
>
> But tcp_disconnect() already clears sk->sk_shutdown
>
> If NFS calls tcp_disconnect(), then shutdown(), there is a problem.

If tcp_disconnect() was called at some point, I would expect to see a 
message from xs_error_report() in the debugging output.  Perhaps 
tcp_disconnect() is not being invoked at all?

> Maybe xs_tcp_shutdown() should make some sanity tests ?
>
> Following sequence is legal, and your patch might break it.
>
> fd = socket(...);
> shutdown(fd, SHUT_WR);
> ...
> connect(fd, ...);


I looked closely at some of Andy's debugging output from the 
linux-kernel mailing list archive.  I basically agree that the network 
layer is returning -EPIPE from tcp_sendmsg(), which the RPC client logic 
does not expect. But it's not clear to me how it gets into this state.

> [ 7728.520042] RPC:       worker connecting xprt ffff88013e62d800 via tcp to 10.1.6.102 (port 2049)
 > [ 7728.520093] RPC:       ffff88013e62d800 connect status 115 
connected 0 sock state 2

"sock state 2" => sk->sk_state == TCP_SYN_SENT

 > [ 7728.520884] RPC:       xs_tcp_state_change client ffff88013e62d800...
 > [ 7728.520889] RPC:       state 1 conn 0 dead 0 zapped 1

"state 1" => sk->sk_state == TCP_ESTABLISHED

RPC client wakes up this RPC task now that the connection is established.

> [ 7728.520896] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.520900] RPC: 33550 disabling timer
> [ 7728.520906] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.520912] RPC:       __rpc_wake_up_task done
> [ 7728.520932] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.520937] RPC: 33550 xprt_connect_status: retrying
> [ 7728.520942] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt.

> [ 7728.520947] RPC: 33550 call_transmit (status 0)
> [ 7728.520951] RPC: 33550 xprt_prepare_transmit
> [ 7728.520957] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.520962] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.520969] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.520976] RPC: 33550 xprt_transmit(32920)

RPC client encodes the request and attempts to send it.

> [ 7728.520984] RPC:       xs_tcp_send_request(32920) = -32

Network layer says -EPIPE, for some reason.  RPC client calls 
kernel_sock_shutdown(SHUT_WR).

> [ 7728.520997] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521007] RPC:       state 4 conn 1 dead 0 zapped 1

"state 4" => sk->sk_state == TCP_FIN_WAIT1

The RPC client sets up a linger timeout.

> [ 7728.521013] RPC: 33550 call_status (status -32)
> [ 7728.521018] RPC: 33550 call_bind (status 0)
> [ 7728.521023] RPC: 33550 call_connect xprt ffff88013e62d800 is not connected
> [ 7728.521028] RPC: 33550 xprt_connect xprt ffff88013e62d800 is not connected
> [ 7728.521035] RPC: 33550 sleep_on(queue "xprt_pending" time 4296824426)
> [ 7728.521040] RPC: 33550 added to queue ffff88013e62db20 "xprt_pending"
> [ 7728.521045] RPC: 33550 setting alarm for 60000 ms

RPC client puts this RPC task to sleep.

> [ 7728.521439] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521444] RPC:       state 5 conn 0 dead 0 zapped 1

"state 5" => sk->sk_state == TCP_FIN_WAIT2

> [ 7728.521602] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521608] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521612] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

The RPC client marks the socket closed. and the linger timeout is 
cancelled.  At this point, sk_shutdown should be set to zero, correct? 
I don't see an xs_error_report() call here, which would confirm that the 
socket took a trip through tcp_disconnect().

> [ 7728.521617] RPC: 33550 __rpc_wake_up_task (now 4296824426)
> [ 7728.521621] RPC: 33550 disabling timer
> [ 7728.521626] RPC: 33550 removed from queue ffff88013e62db20 "xprt_pending"
> [ 7728.521631] RPC:       __rpc_wake_up_task done

RPC client wakes up the RPC task.  Meanwhile...

> [ 7728.521636] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521641] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521645] RPC:       disconnected transport ffff88013e62d800
> [ 7728.521649] RPC:       xs_tcp_data_ready...

... network layer calls closed socket's data_ready method... while the 
awoken RPC task gets underway.

> [ 7728.521662] RPC: 33550 __rpc_execute flags=0x1
> [ 7728.521666] RPC: 33550 xprt_connect_status: retrying
> [ 7728.521671] RPC: 33550 call_connect_status (status -11)

The awoken RPC task's status is -EAGAIN, which prevents a reconnection 
attempt, even though there is no established connection.

RPC client barrels on to send the request again.

> [ 7728.521675] RPC: 33550 call_transmit (status 0)
> [ 7728.521679] RPC: 33550 xprt_prepare_transmit
> [ 7728.521683] RPC: 33550 rpc_xdr_encode (status 0)
> [ 7728.521688] RPC: 33550 marshaling UNIX cred ffff88012e002300
> [ 7728.521694] RPC: 33550 using AUTH_UNIX cred ffff88012e002300 to wrap rpc data
> [ 7728.521699] RPC: 33550 xprt_transmit(32920)
> [ 7728.521704] RPC:       xs_tcp_send_request(32920) = -32

RPC client attempts to send again, gets -EPIPE, and calls 
kernel_sock_shutdown(SHUT_WR).  If there is no connection established, 
the RPC client expects -ENOTCONN, in which case it will attempt to 
reconnect here.

> [ 7728.521709] RPC:       xs_tcp_state_change client ffff88013e62d800...
> [ 7728.521714] RPC:       state 7 conn 0 dead 0 zapped 1
> [ 7728.521718] RPC:       disconnected transport ffff88013e62d800

"state 7" => sk->sk_state == TCP_CLOSE

Following this, the RPC client attempts to retransmit the request 
repeatedly, but the socket remains in the TCP_CLOSE state.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
  2010-07-27 17:28         ` Chuck Lever
  (?)
@ 2010-07-28  7:24           ` Andy Chittenden
  -1 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-28  7:24 UTC (permalink / raw)
  To: Andy Chittenden, Chuck Lever, Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

cmVzZW5kaW5nIGFzIGl0IHNlZW1zIHRvIGhhdmUgYmVlbiBjb3JydXB0ZWQgb24gTEtNTCENCg0K
PiBUaGUgUlBDIGNsaWVudCBtYXJrcyB0aGUgc29ja2V0IGNsb3NlZC4gYW5kIHRoZSBsaW5nZXIg
dGltZW91dCBpcyANCj4gY2FuY2VsbGVkLiAgQXQgdGhpcyBwb2ludCwgc2tfc2h1dGRvd24gc2hv
dWxkIGJlIHNldCB0byB6ZXJvLCBjb3JyZWN0PyANCj4gSSBkb24ndCBzZWUgYW4geHNfZXJyb3Jf
cmVwb3J0KCkgY2FsbCBoZXJlLCB3aGljaCB3b3VsZCBjb25maXJtIHRoYXQgdGhlIA0KPiBzb2Nr
ZXQgdG9vayBhIHRyaXAgdGhyb3VnaCB0Y3BfZGlzY29ubmVjdCgpLg0KDQpGcm9tIG15IHJlYWRp
bmcgb2YgdGNwX2Rpc2Nvbm5lY3QoKSwgaXQgY2FsbHMgc2stPnNrX2Vycm9yX3JlcG9ydChzaykg
dW5jb25kaXRpb25hbGx5IHNvIGFzIHRoZXJlJ3Mgbm8geHNfZXJyb3JfcmVwb3J0KCksIHRoYXQg
c3VyZWx5IG1lYW5zIHRoZSBleGFjdCBvcHBvc2l0ZTogdGNwX2Rpc2Nvbm5lY3QoKSB3YXNuJ3Qg
Y2FsbGVkLiBJZiBpdCdzIG5vdCBjYWxsZWQsIHNrX3NodXRkb3duIGlzIG5vdCBjbGVhcmVkLiBB
bmQgbXkgcmV2aXNlZCB0cmFjaW5nIGNvbmZpcm1lZCB0aGF0IGl0IHdhcyBzZXQgdG8gDQpTRU5E
X1NIVVRET1dOLg0KDQotLSANCkFuZHksIEJsdWVBcmMgRW5naW5lZXJpbmcNCg==

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
@ 2010-07-28  7:24           ` Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-28  7:24 UTC (permalink / raw)
  To: Andy Chittenden, Chuck Lever, Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 822 bytes --]

resending as it seems to have been corrupted on LKML!

> The RPC client marks the socket closed. and the linger timeout is 
> cancelled.  At this point, sk_shutdown should be set to zero, correct? 
> I don't see an xs_error_report() call here, which would confirm that the 
> socket took a trip through tcp_disconnect().

>From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk) unconditionally so as there's no xs_error_report(), that surely means the exact opposite: tcp_disconnect() wasn't called. If it's not called, sk_shutdown is not cleared. And my revised tracing confirmed that it was set to 
SEND_SHUTDOWN.

-- 
Andy, BlueArc Engineering
ÿôèº{.nÇ+‰·Ÿ®‰†+%ŠËÿ±éÝ¶\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dÊ‡Ú™ë,j\a¢f£¢·hšïêÿ‘êçz_è®\x03(éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨èÚ&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: nfs client hang
@ 2010-07-28  7:24           ` Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-28  7:24 UTC (permalink / raw)
  To: Andy Chittenden, Chuck Lever, Eric Dumazet
  Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

resending as it seems to have been corrupted on LKML!

> The RPC client marks the socket closed. and the linger timeout is 
> cancelled.  At this point, sk_shutdown should be set to zero, correct? 
> I don't see an xs_error_report() call here, which would confirm that the 
> socket took a trip through tcp_disconnect().

From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk) unconditionally so as there's no xs_error_report(), that surely means the exact opposite: tcp_disconnect() wasn't called. If it's not called, sk_shutdown is not cleared. And my revised tracing confirmed that it was set to 
SEND_SHUTDOWN.

-- 
Andy, BlueArc Engineering

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
@ 2010-07-28 17:37             ` Chuck Lever
  0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2010-07-28 17:37 UTC (permalink / raw)
  To: Andy Chittenden
  Cc: Eric Dumazet,
	Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

On 07/28/10 03:24 AM, Andy Chittenden wrote:
> resending as it seems to have been corrupted on LKML!
>
>> The RPC client marks the socket closed. and the linger timeout is
>> cancelled.  At this point, sk_shutdown should be set to zero, correct?
>> I don't see an xs_error_report() call here, which would confirm that the
>> socket took a trip through tcp_disconnect().
>
> From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk)
> unconditionally so as there's no xs_error_report(), that surely means
> the exact opposite: tcp_disconnect() wasn't called. If it's not
> called, sk_shutdown is not cleared. And my revised tracing confirmed
> that it was set to SEND_SHUTDOWN.

Sorry, that's what I meant above.

An xs_error_report() debugging message at that point in the log would 
confirm that the socket took a trip through tcp_disconnect().  But I 
don't see such a message.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
@ 2010-07-28 17:37             ` Chuck Lever
  0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2010-07-28 17:37 UTC (permalink / raw)
  To: Andy Chittenden
  Cc: Eric Dumazet,
	Linux Kernel Mailing List (linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

On 07/28/10 03:24 AM, Andy Chittenden wrote:
> resending as it seems to have been corrupted on LKML!
>
>> The RPC client marks the socket closed. and the linger timeout is
>> cancelled.  At this point, sk_shutdown should be set to zero, correct?
>> I don't see an xs_error_report() call here, which would confirm that the
>> socket took a trip through tcp_disconnect().
>
> From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk)
> unconditionally so as there's no xs_error_report(), that surely means
> the exact opposite: tcp_disconnect() wasn't called. If it's not
> called, sk_shutdown is not cleared. And my revised tracing confirmed
> that it was set to SEND_SHUTDOWN.

Sorry, that's what I meant above.

An xs_error_report() debugging message at that point in the log would 
confirm that the socket took a trip through tcp_disconnect().  But I 
don't see such a message.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
@ 2010-07-29 10:10               ` Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-29 10:10 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Eric Dumazet,
	Linux Kernel Mailing List (linux-kernel@vger.kernel.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

  On 2010-07-28 18:37, Chuck Lever wrote:
> On 07/28/10 03:24 AM, Andy Chittenden wrote:
>> resending as it seems to have been corrupted on LKML!
>>
>>> The RPC client marks the socket closed. and the linger timeout is
>>> cancelled.  At this point, sk_shutdown should be set to zero, correct?
>>> I don't see an xs_error_report() call here, which would confirm that the
>>> socket took a trip through tcp_disconnect().
>>  From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk)
>> unconditionally so as there's no xs_error_report(), that surely means
>> the exact opposite: tcp_disconnect() wasn't called. If it's not
>> called, sk_shutdown is not cleared. And my revised tracing confirmed
>> that it was set to SEND_SHUTDOWN.
> Sorry, that's what I meant above.
>
> An xs_error_report() debugging message at that point in the log would
> confirm that the socket took a trip through tcp_disconnect().  But I
> don't see such a message.
I don't see how tcp_disconnect() gets called if the application does a 
shutdown when the state is TCP_ESTABLISHED (or a myriad of other 
states). It just seems to send a FIN. Should tcp_disconnect() be called? 
If so, how? Alternatively, I wonder whether my patch that set 
sk_shutdown to 0 in tcp_connect_init() is the correct fix after all.

-- 
Andy, BlueArc Engineering


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: nfs client hang
@ 2010-07-29 10:10               ` Andy Chittenden
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Chittenden @ 2010-07-29 10:10 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Eric Dumazet,
	Linux Kernel Mailing List (linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org),
	Trond Myklebust, netdev, Linux NFS Mailing List

  On 2010-07-28 18:37, Chuck Lever wrote:
> On 07/28/10 03:24 AM, Andy Chittenden wrote:
>> resending as it seems to have been corrupted on LKML!
>>
>>> The RPC client marks the socket closed. and the linger timeout is
>>> cancelled.  At this point, sk_shutdown should be set to zero, correct?
>>> I don't see an xs_error_report() call here, which would confirm that the
>>> socket took a trip through tcp_disconnect().
>>  From my reading of tcp_disconnect(), it calls sk->sk_error_report(sk)
>> unconditionally so as there's no xs_error_report(), that surely means
>> the exact opposite: tcp_disconnect() wasn't called. If it's not
>> called, sk_shutdown is not cleared. And my revised tracing confirmed
>> that it was set to SEND_SHUTDOWN.
> Sorry, that's what I meant above.
>
> An xs_error_report() debugging message at that point in the log would
> confirm that the socket took a trip through tcp_disconnect().  But I
> don't see such a message.
I don't see how tcp_disconnect() gets called if the application does a 
shutdown when the state is TCP_ESTABLISHED (or a myriad of other 
states). It just seems to send a FIN. Should tcp_disconnect() be called? 
If so, how? Alternatively, I wonder whether my patch that set 
sk_shutdown to 0 in tcp_connect_init() is the correct fix after all.

-- 
Andy, BlueArc Engineering

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* General Linux Kernel / User Space License
  2010-07-28 17:37             ` Chuck Lever
  (?)
  (?)
@ 2011-07-07 17:01             ` Mitchell Erblich
  2011-07-07 20:35               ` Chris Friesen
  -1 siblings, 1 reply; 20+ messages in thread
From: Mitchell Erblich @ 2011-07-07 17:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List (linux-kernel@vger.kernel.org)


	Please include me in the response.

	I am interested in any legal restriction of copying different parts of the
	    kernel into the user space and then building an app around
	 those changes.

	The app would have a src directory with the copied parts.

	Thanks,     Mitchell Erblich

	
	

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: General Linux Kernel / User Space License
  2011-07-07 17:01             ` General Linux Kernel / User Space License Mitchell Erblich
@ 2011-07-07 20:35               ` Chris Friesen
  0 siblings, 0 replies; 20+ messages in thread
From: Chris Friesen @ 2011-07-07 20:35 UTC (permalink / raw)
  To: Mitchell Erblich; +Cc: Linux Kernel Mailing List (linux-kernel@vger.kernel.org)

On 07/07/2011 11:01 AM, Mitchell Erblich wrote:
>
> 	Please include me in the response.
>
> 	I am interested in any legal restriction of copying different parts of the
> 	    kernel into the user space and then building an app around
> 	 those changes.

The kernel is mostly GPLv2.  Treat the code appropriately.

Chris

Chris Friesen
Software Developer
GENBAND
chris.friesen@genband.com
www.genband.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-07-07 20:36 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <99613C19B13C5D40914FB8930657FA9303365708DE@uk-ex-mbx1.terastack.bluearc.com>
2010-07-27  7:25 ` nfs client hang Andy Chittenden
2010-07-27 10:53   ` Andy Chittenden
2010-07-27 12:21     ` Eric Dumazet
2010-07-27 12:51       ` Andy Chittenden
2010-07-27 17:28       ` Chuck Lever
2010-07-27 17:28         ` Chuck Lever
2010-07-28  7:08         ` Andy Chittenden
2010-07-28  7:08           ` Andy Chittenden
2010-07-28  7:08           ` Andy Chittenden
2010-07-28  7:24         ` Andy Chittenden
2010-07-28  7:24           ` Andy Chittenden
2010-07-28  7:24           ` Andy Chittenden
2010-07-28 17:37           ` Chuck Lever
2010-07-28 17:37             ` Chuck Lever
2010-07-29 10:10             ` Andy Chittenden
2010-07-29 10:10               ` Andy Chittenden
2011-07-07 17:01             ` General Linux Kernel / User Space License Mitchell Erblich
2011-07-07 20:35               ` Chris Friesen
2010-07-23 12:36 nfs client hang Andy Chittenden
  -- strict thread matches above, loose matches on Subject: below --
2010-07-22 12:19 Andy Chittenden

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.