From: "J. Bruce Fields" <bfields@fieldses.org>
To: Orion Poplawski <orion@cora.nwra.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: nfs4 mount hanging suddenly
Date: Wed, 29 Feb 2012 18:17:32 -0500
Message-ID: <20120229231732.GD6506@fieldses.org>
In-Reply-To: <4F4EA6D0.30606@cora.nwra.com>
On Wed, Feb 29, 2012 at 03:29:36PM -0700, Orion Poplawski wrote:
> Just starting today, one of our user's nfs mounted home directory
> has started locking up. Client is Fedora 16 32-bit, server is
> CentOS 5.7 32-bit. Have not seen this particular problem elsewhere
> (yet).
>
> I captured this trace on the server after the hang:
>
> http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap
>
> 1 0.000000 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 2 0.000133 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 1) <EMPTY> PUTFH;GETATTR GETATTR
> 3 0.000421 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK]
> Seq=137 Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
> 4 0.000519 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;ACCESS ACCESS;GETATTR GETATTR
> 5 0.000587 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 4) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled
> Packet [incorrect TCP checksum]]
> 6 0.040522 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK]
> Seq=289 Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
> 7 0.451636 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown
That looks weird. Looking at the pcap--ok, the "delegreturn" is a
mistake, there's no delegreturn there.
> 8 0.451892 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 7) <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)
Status 10008 is NFS4ERR_DELAY. On an OPEN that probably means the
server is waiting for the client to return a delegation.
Either the server's confused about there being a delegation, or the
client's failing to return one it should?
--b.
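
For anyone reading the trace summary by hand, the following is a rough
sketch (not part of the original exchange) of how the "OP(number)"
suffixes in the quoted tshark output could be mapped to NFSv4 status
names. It assumes the parenthesized number is the operation's nfsstat4
value, as in "OPEN(10008)" above. The table covers only a few codes from
the NFSv4 specification, and the names NFS4_STATUS and failed_ops are
made up for illustration.

```python
import re

# A few nfsstat4 values from the NFSv4 specification (RFC 3530).
# Deliberately incomplete: extend as needed.
NFS4_STATUS = {
    0: "NFS4_OK",
    10008: "NFS4ERR_DELAY",
    10011: "NFS4ERR_EXPIRED",
    10013: "NFS4ERR_GRACE",
}

# Matches an operation name followed by a numeric status, e.g. "OPEN(10008)".
# Assumption: this is how the summary line reports a non-OK status.
STATUS_RE = re.compile(r"\b([A-Z_]+)\((\d+)\)")


def failed_ops(summary):
    """Return (operation, code, name) for each OP(code) found in a summary line."""
    return [
        (op, int(code), NFS4_STATUS.get(int(code), "unknown"))
        for op, code in STATUS_RE.findall(summary)
    ]


if __name__ == "__main__":
    line = "In 7) <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)"
    print(failed_ops(line))
```

Run against the reply to call 7, this reports OPEN failing with
NFS4ERR_DELAY, which matches the reading given above.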
> 9 0.452164 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK]
> Seq=529 Ack=529 Win=17738 Len=0 TSV=3585105 TSER=2438333648
> .....
> 120 53.161949 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 121 53.162281 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 120) <EMPTY> PUTFH;GETATTR GETATTR
> 122 53.162596 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK]
> Seq=8205 Ack=10341 Win=17738 Len=0 TSV=3637816 TSER=2438386366
> 123 53.162680 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 124 53.162748 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 123) <EMPTY> PUTFH;GETATTR GETATTR[Unreassembled Packet
> [incorrect TCP checksum]]
> 125 53.163245 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 126 53.163418 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 125) <EMPTY> PUTFH;GETATTR GETATTR
> 127 53.203530 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK]
> Seq=8493 Ack=10685 Win=17738 Len=0 TSV=3637857 TSER=2438386368
> 128 53.450308 10.10.20.15 -> 10.10.10.1 NFS V4 COMP Call <EMPTY>
> PUTFH;ACCESS ACCESS;GETATTR GETATTR
> 129 53.450457 10.10.10.1 -> 10.10.20.15 NFS V4 COMP Reply (Call
> In 128) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled
> Packet [incorrect TCP checksum]]
> 130 53.450671 10.10.20.15 -> 10.10.10.1 TCP 879 > nfs [ACK]
> Seq=8645 Ack=10925 Win=17738 Len=0 TSV=3638104 TSER=2438386655
>
>
> I was not able to find any error messages anywhere. Server has been
> up 28 days. Client was up for 14 days before first hang, then 2
> more today. Home directories are automounted and I was able to
> access a different home directory that is served off the same server
> and filesystem.
>
> client kernels: 3.2.3-2.fc16.i68, 3.2.7-1.fc16.i68
> server kernel: 2.6.18-274.17.1.el5
>
> earth:/export/home/lwang on /home/lwang type nfs4 (rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,local_lock=none,addr=10.10.10.1)
>
> There is a newer nfs-utils:
> Jan 24 03:34:43 Updated: 1:nfs-utils-1.2.5-4.fc16.i686
>
> may try backing that off, but doesn't seem like a big change:
>
> * Mon Jan 16 2012 Steve Dickson <steved@redhat.com> 1.2.5-4
> - Reworked how the nfsd service requires the rpcbind service (bz 768550)
>
> and seems to only affect nfs-server.
>
> Anything else to check?
>
> TIA,
>
> Orion
>
> --
> Orion Poplawski
> Technical Manager 303-415-9701 x222
> NWRA, Boulder Office FAX: 303-415-9702
> 3380 Mitchell Lane orion@cora.nwra.com
> Boulder, CO 80301 http://www.cora.nwra.com