From: Orion Poplawski <orion@cora.nwra.com>
To: linux-nfs@vger.kernel.org
Subject: nfs4 mount hanging suddenly
Date: Wed, 29 Feb 2012 15:29:36 -0700
Message-ID: <4F4EA6D0.30606@cora.nwra.com>
Starting today, one user's NFS-mounted home directory has begun locking up.
The client is Fedora 16 32-bit; the server is CentOS 5.7 32-bit. We have
not seen this particular problem elsewhere (yet).
I captured this trace on the server after the hang:
http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap
1 0.000000 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;GETATTR GETATTR
2 0.000133 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 1)
<EMPTY> PUTFH;GETATTR GETATTR
3 0.000421 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK] Seq=137
Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
4 0.000519 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;ACCESS ACCESS;GETATTR GETATTR
5 0.000587 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 4)
<EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled Packet [incorrect
TCP checksum]]
6 0.040522 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK] Seq=289
Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
7 0.451636 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown
8 0.451892 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 7)
<EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)
9 0.452164 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK] Seq=529
Ack=529 Win=17738 Len=0 TSV=3585105 TSER=2438333648
.....
120 53.161949 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;GETATTR GETATTR
121 53.162281 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 120)
<EMPTY> PUTFH;GETATTR GETATTR
122 53.162596 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK] Seq=8205
Ack=10341 Win=17738 Len=0 TSV=3637816 TSER=2438386366
123 53.162680 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;GETATTR GETATTR
124 53.162748 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 123)
<EMPTY> PUTFH;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
125 53.163245 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;GETATTR GETATTR
126 53.163418 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 125)
<EMPTY> PUTFH;GETATTR GETATTR
127 53.203530 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK] Seq=8493
Ack=10685 Win=17738 Len=0 TSV=3637857 TSER=2438386368
128 53.450308 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
PUTFH;ACCESS ACCESS;GETATTR GETATTR
129 53.450457 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call In 128)
<EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled Packet [incorrect
TCP checksum]]
130 53.450671 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK] Seq=8645
Ack=10925 Win=17738 Len=0 TSV=3638104 TSER=2438386655
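The only non-zero status visible in the summary above is OPEN(10008) in frame 8; in the NFSv4 status list (RFC 3530) 10008 is NFS4ERR_DELAY. A minimal sketch for pulling such statuses out of the text summary follows. It assumes the "OP(status)" form shown in frame 8; the helper name failed_ops and the small lookup table are illustrative only, not part of any tool.

```python
import re

# Small subset of NFSv4 status codes (RFC 3530); extend as needed.
NFS4_STATUS = {
    10008: "NFS4ERR_DELAY",
    10011: "NFS4ERR_EXPIRED",
    10013: "NFS4ERR_GRACE",
    10025: "NFS4ERR_BAD_STATEID",
}

# An operation name immediately followed by a numeric status, e.g. "OPEN(10008)".
# "Reply (Call In 7)" does not match because of the space before the parenthesis.
OP_STATUS = re.compile(r"\b([A-Z_]+)\((\d+)\)")


def failed_ops(summary_text):
    """Return (operation, status code, status name) for each OP(status) found."""
    results = []
    for op, code in OP_STATUS.findall(summary_text):
        code = int(code)
        results.append((op, code, NFS4_STATUS.get(code, "unknown")))
    return results


# Summary text of frame 8 from the trace.
sample = "<EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)"
for op, code, name in failed_ops(sample):
    print(f"{op} failed with status {code} ({name})")
```

Running the same function over the full summary would show whether the OPEN in frame 7 keeps getting the same status on retry.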
I was not able to find any error messages anywhere. The server has been up 28
days. The client was up for 14 days before the first hang, then hung 2 more
times today. Home directories are automounted, and I was able to access a
different home directory that is served off the same server and filesystem.
client kernels: 3.2.3-2.fc16.i686, 3.2.7-1.fc16.i686
server kernel: 2.6.18-274.17.1.el5
earth:/export/home/lwang on /home/lwang type nfs4
(rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,local_lock=none,addr=10.10.10.1)
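For readers checking the timeout behaviour, the option string above can be split into key/value pairs. This is a minimal sketch; parse_mount_options is a hypothetical helper, and the only outside fact used is that nfs(5) documents timeo in tenths of a second.

```python
def parse_mount_options(opts):
    """Split a mount option string like "(rw,timeo=600)" into a dict.

    Flags without a value (rw, hard, ...) are stored as True.
    """
    parsed = {}
    for item in opts.strip("()").split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Option string copied from the mount line above.
opts = (
    "(rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,"
    "acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,"
    "retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,"
    "local_lock=none,addr=10.10.10.1)"
)

parsed = parse_mount_options(opts)
timeout_seconds = int(parsed["timeo"]) / 10  # timeo is in deciseconds
print("hard mount:", parsed.get("hard", False))
print("timeout per attempt (s):", timeout_seconds)
print("retrans:", parsed["retrans"])
```

With hard set, the client keeps retrying indefinitely rather than returning an error to the application, which is consistent with a hang instead of an I/O error.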
There is a newer nfs-utils:
Jan 24 03:34:43 Updated: 1:nfs-utils-1.2.5-4.fc16.i686
I may try backing that out, but it doesn't seem like a big change:
* Mon Jan 16 2012 Steve Dickson <steved@redhat.com> 1.2.5-4
- Reworked how the nfsd service requires the rpcbind service (bz 768550)
and it seems to affect only nfs-server.
Anything else to check?
TIA,
Orion
--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder Office FAX: 303-415-9702
3380 Mitchell Lane orion@cora.nwra.com
Boulder, CO 80301 http://www.cora.nwra.com