linux-nfs.vger.kernel.org archive mirror
* nfs4 mount hanging suddenly
@ 2012-02-29 22:29 Orion Poplawski
From: Orion Poplawski @ 2012-02-29 22:29 UTC
  To: linux-nfs

Starting today, one of our users' NFS-mounted home directories has begun
locking up.  The client is Fedora 16 32-bit; the server is CentOS 5.7 32-bit.
I have not seen this particular problem elsewhere (yet).

I captured this trace on the server after the hang:

http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap

   1   0.000000  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
   2   0.000133   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 1) <EMPTY> PUTFH;GETATTR GETATTR
   3   0.000421  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=137 Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
   4   0.000519  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR
   5   0.000587   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 4) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
   6   0.040522  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=289 Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
   7   0.451636  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown
   8   0.451892   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 7) <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)
   9   0.452164  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=529 Ack=529 Win=17738 Len=0 TSV=3585105 TSER=2438333648
.....
120  53.161949  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
121  53.162281   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 120) <EMPTY> PUTFH;GETATTR GETATTR
122  53.162596  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=8205 Ack=10341 Win=17738 Len=0 TSV=3637816 TSER=2438386366
123  53.162680  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
124  53.162748   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 123) <EMPTY> PUTFH;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
125  53.163245  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
126  53.163418   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 125) <EMPTY> PUTFH;GETATTR GETATTR
127  53.203530  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=8493 Ack=10685 Win=17738 Len=0 TSV=3637857 TSER=2438386368
128  53.450308  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR
129  53.450457   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 128) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
130  53.450671  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=8645 Ack=10925 Win=17738 Len=0 TSV=3638104 TSER=2438386655
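
For a longer capture it may help to pull out only the replies that carry a
non-zero status.  The following is a minimal sketch, not a definitive tool.
It assumes each frame sits on a single line (as tshark prints it, before mail
wrapping) and that a number in parentheses directly after an operation name,
such as OPEN(10008), is the NFSv4 status code.  The function name failed_ops
and the small status table are my own, with the codes taken from RFC 3530.

```python
import re

# Small subset of NFSv4 status codes from RFC 3530; extend as needed.
NFS4_STATUS = {
    10008: "NFS4ERR_DELAY",
    10011: "NFS4ERR_EXPIRED",
    10013: "NFS4ERR_GRACE",
    10025: "NFS4ERR_BAD_STATEID",
}

# An operation name immediately followed by a numeric status, e.g. OPEN(10008).
# "Reply (Call In 7)" does not match: there is a space before the parenthesis.
OP_STATUS = re.compile(r"\b([A-Z_]+)\((\d+)\)")
# Leading frame number of a one-line tshark summary.
FRAME = re.compile(r"^\s*(\d+)\s")


def failed_ops(summary_lines):
    """Yield (frame, op, code, name) for each operation shown with a status."""
    for line in summary_lines:
        frame = FRAME.match(line)
        if not frame:
            continue  # separator lines such as "....."
        for op, code in OP_STATUS.findall(line):
            code = int(code)
            yield int(frame.group(1)), op, code, NFS4_STATUS.get(code, "unknown")


sample = [
    "   7   0.451636  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> "
    "PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown",
    "   8   0.451892   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 7) "
    "<EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)",
]

for hit in failed_ops(sample):
    print(hit)  # (8, 'OPEN', 10008, 'NFS4ERR_DELAY')
```

On the two sample frames this flags only frame 8, where the OPEN comes back
with 10008.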


I was not able to find any error messages anywhere.  The server has been up 28
days.  The client was up for 14 days before the first hang, then hung 2 more
times today.  Home directories are automounted, and I was able to access a
different home directory that is served off the same server and filesystem.

client kernels: 3.2.3-2.fc16.i68, 3.2.7-1.fc16.i68
server kernel: 2.6.18-274.17.1.el5

earth:/export/home/lwang on /home/lwang type nfs4 (rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,local_lock=none,addr=10.10.10.1)
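
To read the timeout-related options out of that line, here is a small sketch.
It assumes the option string is copied verbatim from the mount output above;
the helper name parse_mount_options is hypothetical.  Per nfs(5), timeo is
given in tenths of a second, and on a hard mount requests are retried
indefinitely, which would be consistent with seeing a hang rather than an
I/O error.

```python
def parse_mount_options(opts):
    """Split a comma-separated mount option string into a dict.

    Flags without '=' (for example 'hard') map to True.
    """
    parsed = {}
    for item in opts.strip("()").split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Option string copied from the mount output above.
opts = parse_mount_options(
    "(rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,"
    "acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,"
    "retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,"
    "local_lock=none,addr=10.10.10.1)"
)

timeout_s = int(opts["timeo"]) / 10  # timeo is in tenths of a second
print("hard mount:", opts.get("hard", False))
print("timeout per attempt (s):", timeout_s)
print("retrans:", opts["retrans"])
```

With these options each attempt waits 60 seconds before the client retries.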

There is a newer nfs-utils:
Jan 24 03:34:43 Updated: 1:nfs-utils-1.2.5-4.fc16.i686

I may try backing that out, but it doesn't seem like a big change:

* Mon Jan 16 2012 Steve Dickson <steved@redhat.com> 1.2.5-4
- Reworked how the nfsd service requires the rpcbind service (bz 768550)

and it seems to only affect nfs-server.

Anything else to check?

TIA,

  Orion

-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                  orion@cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com


Thread overview: 6+ messages
2012-02-29 22:29 nfs4 mount hanging suddenly Orion Poplawski
2012-02-29 23:17 ` J. Bruce Fields
2012-02-29 23:21   ` Orion Poplawski
2012-03-01 13:50     ` Myklebust, Trond
2012-03-01 15:34       ` Orion Poplawski
2012-03-01 19:28         ` J. Bruce Fields
