linux-nfs.vger.kernel.org archive mirror
* nfs4 mount hanging suddenly
@ 2012-02-29 22:29 Orion Poplawski
  2012-02-29 23:17 ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: Orion Poplawski @ 2012-02-29 22:29 UTC (permalink / raw)
  To: linux-nfs

Starting today, one of our users' NFS-mounted home directories has started
locking up.  The client is Fedora 16 32-bit; the server is CentOS 5.7 32-bit.
I have not seen this particular problem elsewhere (yet).

I captured this trace on the server after the hang:

http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap

   1   0.000000  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
   2   0.000133   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 1) <EMPTY> PUTFH;GETATTR GETATTR
   3   0.000421  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=137 Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
   4   0.000519  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR
   5   0.000587   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 4) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
   6   0.040522  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=289 Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
   7   0.451636  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown
   8   0.451892   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 7) <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)
   9   0.452164  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=529 Ack=529 Win=17738 Len=0 TSV=3585105 TSER=2438333648
.....
120  53.161949  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
121  53.162281   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 120) <EMPTY> PUTFH;GETATTR GETATTR
122  53.162596  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=8205 Ack=10341 Win=17738 Len=0 TSV=3637816 TSER=2438386366
123  53.162680  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
124  53.162748   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 123) <EMPTY> PUTFH;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
125  53.163245  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;GETATTR GETATTR
126  53.163418   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 125) <EMPTY> PUTFH;GETATTR GETATTR
127  53.203530  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=8493 Ack=10685 Win=17738 Len=0 TSV=3637857 TSER=2438386368
128  53.450308  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR
129  53.450457   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call In 128) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled Packet [incorrect TCP checksum]]
130  53.450671  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK] Seq=8645 Ack=10925 Win=17738 Len=0 TSV=3638104 TSER=2438386655


I was not able to find any error messages anywhere.  The server has been up 28
days.  The client was up for 14 days before the first hang, then hung 2 more
times today.  Home directories are automounted, and I was able to access a
different home directory that is served off the same server and filesystem.

client kernels: 3.2.3-2.fc16.i686, 3.2.7-1.fc16.i686
server kernel: 2.6.18-274.17.1.el5

earth:/export/home/lwang on /home/lwang type nfs4 
(rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,local_lock=none,addr=10.10.10.1)

There is a newer nfs-utils:
Jan 24 03:34:43 Updated: 1:nfs-utils-1.2.5-4.fc16.i686

may try backing that off, but doesn't seem like a big change:

* Mon Jan 16 2012 Steve Dickson <steved@redhat.com> 1.2.5-4
- Reworked how the nfsd service requires the rpcbind service (bz 768550)

and seems to only affect nfs-server.

Anything else to check?

TIA,

  Orion

-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                  orion@cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com


* Re: nfs4 mount hanging suddenly
  2012-02-29 22:29 nfs4 mount hanging suddenly Orion Poplawski
@ 2012-02-29 23:17 ` J. Bruce Fields
  2012-02-29 23:21   ` Orion Poplawski
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2012-02-29 23:17 UTC (permalink / raw)
  To: Orion Poplawski; +Cc: linux-nfs

On Wed, Feb 29, 2012 at 03:29:36PM -0700, Orion Poplawski wrote:
> Just starting today, one of our user's nfs mounted home directory
> has started locking up.  Client is Fedora 16 32-bit, server is
> CentOS 5.7 32-bit.  Have not seen this particular problem elsewhere
> (yet).
> 
> I captured this trace on the server after the hang:
> 
> http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap
> 
>   1   0.000000  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
>   2   0.000133   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 1) <EMPTY> PUTFH;GETATTR GETATTR
>   3   0.000421  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK]
> Seq=137 Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
>   4   0.000519  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;ACCESS ACCESS;GETATTR GETATTR
>   5   0.000587   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 4) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled
> Packet [incorrect TCP checksum]]
>   6   0.040522  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK]
> Seq=289 Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
>   7   0.451636  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown

That looks weird.  Looking at the pcap--ok, the "delegreturn" is a
mistake, there's no delegreturn there.

>   8   0.451892   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 7) <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)

That probably means the server is waiting for the client to return a
delegation.

Either the server's confused about there being a delegation, or the
client's failing to return one it should?

--b.

>   9   0.452164  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK]
> Seq=529 Ack=529 Win=17738 Len=0 TSV=3585105 TSER=2438333648
> .....
> 120  53.161949  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 121  53.162281   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 120) <EMPTY> PUTFH;GETATTR GETATTR
> 122  53.162596  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK]
> Seq=8205 Ack=10341 Win=17738 Len=0 TSV=3637816 TSER=2438386366
> 123  53.162680  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 124  53.162748   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 123) <EMPTY> PUTFH;GETATTR GETATTR[Unreassembled Packet
> [incorrect TCP checksum]]
> 125  53.163245  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;GETATTR GETATTR
> 126  53.163418   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 125) <EMPTY> PUTFH;GETATTR GETATTR
> 127  53.203530  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK]
> Seq=8493 Ack=10685 Win=17738 Len=0 TSV=3637857 TSER=2438386368
> 128  53.450308  10.10.20.15 -> 10.10.10.1   NFS V4 COMP Call <EMPTY>
> PUTFH;ACCESS ACCESS;GETATTR GETATTR
> 129  53.450457   10.10.10.1 -> 10.10.20.15  NFS V4 COMP Reply (Call
> In 128) <EMPTY> PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled
> Packet [incorrect TCP checksum]]
> 130  53.450671  10.10.20.15 -> 10.10.10.1   TCP 879 > nfs [ACK]
> Seq=8645 Ack=10925 Win=17738 Len=0 TSV=3638104 TSER=2438386655
> 
> 
> I was not able to find any error messages anywhere.  Server has been
> up 28 days.  Client was up for 14 days before first hang, then 2
> more today.  Home directories are automounted and I was able to
> access a different home directory that is served off the save server
> and filesystem.
> 
> client kernels: 3.2.3-2.fc16.i68, 3.2.7-1.fc16.i68
> server kernel: 2.6.18-274.17.1.el5
> 
> earth:/export/home/lwang on /home/lwang type nfs4 (rw,noatime,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=1,acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.20.15,minorversion=0,local_lock=none,addr=10.10.10.1)
> 
> There is a newer nfs-utils:
> Jan 24 03:34:43 Updated: 1:nfs-utils-1.2.5-4.fc16.i686
> 
> may try backing that off, but doesn't seem like a big change:
> 
> * Mon Jan 16 2012 Steve Dickson <steved@redhat.com> 1.2.5-4
> - Reworked how the nfsd service requires the rpcbind service (bz 768550)
> 
> and seems to only affect nfs-server.
> 
> Anything else to check?
> 
> TIA,
> 
>  Orion
> 
> -- 
> Orion Poplawski
> Technical Manager                     303-415-9701 x222
> NWRA, Boulder Office                  FAX: 303-415-9702
> 3380 Mitchell Lane                  orion@cora.nwra.com
> Boulder, CO 80301              http://www.cora.nwra.com


* Re: nfs4 mount hanging suddenly
  2012-02-29 23:17 ` J. Bruce Fields
@ 2012-02-29 23:21   ` Orion Poplawski
  2012-03-01 13:50     ` Myklebust, Trond
  0 siblings, 1 reply; 6+ messages in thread
From: Orion Poplawski @ 2012-02-29 23:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On 02/29/2012 04:17 PM, J. Bruce Fields wrote:
> On Wed, Feb 29, 2012 at 03:29:36PM -0700, Orion Poplawski wrote:
>> Just starting today, one of our user's nfs mounted home directory
>> has started locking up.  Client is Fedora 16 32-bit, server is
>> CentOS 5.7 32-bit.  Have not seen this particular problem elsewhere
>> (yet).
>>
>> I captured this trace on the server after the hang:
>>
>> http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap
>>
>>    1   0.000000  10.10.20.15 ->  10.10.10.1   NFS V4 COMP Call<EMPTY>
>> PUTFH;GETATTR GETATTR
>>    2   0.000133   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
>> In 1)<EMPTY>  PUTFH;GETATTR GETATTR
>>    3   0.000421  10.10.20.15 ->  10.10.10.1   TCP 879>  nfs [ACK]
>> Seq=137 Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
>>    4   0.000519  10.10.20.15 ->  10.10.10.1   NFS V4 COMP Call<EMPTY>
>> PUTFH;ACCESS ACCESS;GETATTR GETATTR
>>    5   0.000587   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
>> In 4)<EMPTY>  PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled
>> Packet [incorrect TCP checksum]]
>>    6   0.040522  10.10.20.15 ->  10.10.10.1   TCP 879>  nfs [ACK]
>> Seq=289 Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
>>    7   0.451636  10.10.20.15 ->  10.10.10.1   NFS V4 COMP Call<EMPTY>
>> PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown
>
> That looks weird.  Looking at the pcap--ok, the "delegreturn" is a
> mistake, there's no delegreturn there.
>
>>    8   0.451892   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
>> In 7)<EMPTY>  PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)
>
> That probably means the server is waiting for the client to return a
> delegation.
>
> Either the server's confused about their being a delegation, or the
> client's failing to return one it should?
>
> --b.

This is all way over my head.  Is there any way to check in more detail?

thanks.


-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                  orion@cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com


* Re: nfs4 mount hanging suddenly
  2012-02-29 23:21   ` Orion Poplawski
@ 2012-03-01 13:50     ` Myklebust, Trond
  2012-03-01 15:34       ` Orion Poplawski
  0 siblings, 1 reply; 6+ messages in thread
From: Myklebust, Trond @ 2012-03-01 13:50 UTC (permalink / raw)
  To: Orion Poplawski; +Cc: J. Bruce Fields, linux-nfs@vger.kernel.org

On Wed, 2012-02-29 at 16:21 -0700, Orion Poplawski wrote:
> On 02/29/2012 04:17 PM, J. Bruce Fields wrote:
> > On Wed, Feb 29, 2012 at 03:29:36PM -0700, Orion Poplawski wrote:
> >> Just starting today, one of our user's nfs mounted home directory
> >> has started locking up.  Client is Fedora 16 32-bit, server is
> >> CentOS 5.7 32-bit.  Have not seen this particular problem elsewhere
> >> (yet).
> >>
> >> I captured this trace on the server after the hang:
> >>
> >> http://sw.cora.nwra.com/tmp/marie-nfs-home-lwang-hang.pcap
> >>
> >>    1   0.000000  10.10.20.15 ->  10.10.10.1   NFS V4 COMP Call<EMPTY>
> >> PUTFH;GETATTR GETATTR
> >>    2   0.000133   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
> >> In 1)<EMPTY>  PUTFH;GETATTR GETATTR
> >>    3   0.000421  10.10.20.15 ->  10.10.10.1   TCP 879>  nfs [ACK]
> >> Seq=137 Ack=225 Win=17738 Len=0 TSV=3584653 TSER=2438333196
> >>    4   0.000519  10.10.20.15 ->  10.10.10.1   NFS V4 COMP Call<EMPTY>
> >> PUTFH;ACCESS ACCESS;GETATTR GETATTR
> >>    5   0.000587   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
> >> In 4)<EMPTY>  PUTFH;ACCESS ACCESS;GETATTR GETATTR[Unreassembled
> >> Packet [incorrect TCP checksum]]
> >>    6   0.040522  10.10.20.15 ->  10.10.10.1   TCP 879>  nfs [ACK]
> >> Seq=289 Ack=465 Win=17738 Len=0 TSV=3584694 TSER=2438333196
> >>    7   0.451636  10.10.20.15 ->  10.10.10.1   NFS V4 COMP Call<EMPTY>
> >> PUTFH;SAVEFH SAVEFH;OPEN OPEN;DELEGRETURN DELEGRETURN;Unknown
> >
> > That looks weird.  Looking at the pcap--ok, the "delegreturn" is a
> > mistake, there's no delegreturn there.
> >
> >>    8   0.451892   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
> >> In 7)<EMPTY>  PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)
> >
> > That probably means the server is waiting for the client to return a
> > delegation.
> >
> > Either the server's confused about their being a delegation, or the
> > client's failing to return one it should?
> >

As far as I can see from your trace, the client keeps trying to open the
file '.history', and the server keeps replying with NFS4ERR_DELAY. There
is nothing in the trace itself that can tell us why the server is
delaying.

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
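
As a rough illustration of the pattern described in this message, the
following Python sketch scans a text summary of the capture for replies that
carry a non-zero NFSv4 status.  It assumes the summary format quoted in this
thread, where the status is printed in parentheses after the operation name
(as in OPEN(10008)); status 10008 is NFS4ERR_DELAY in RFC 3530.  The names
failed_ops, PACKET_RE and STATUS_RE are hypothetical and exist only for this
example.

```python
import re

# NFSv4 status values used in this thread (numbers from RFC 3530).
NFS4_STATUS = {0: "NFS4_OK", 10008: "NFS4ERR_DELAY"}

# Start of a packet summary line: frame number, timestamp, "src -> dst", rest.
PACKET_RE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+->\s+(\S+)\s+(.*)$")
# An operation name directly followed by a numeric status, e.g. "OPEN(10008)".
STATUS_RE = re.compile(r"(\w+)\((\d+)\)")


def failed_ops(trace_text):
    """Return (frame, time, op, status_name) for every op printed with a status."""
    packets = []
    for line in trace_text.splitlines():
        m = PACKET_RE.match(line)
        if m:
            packets.append([int(m.group(1)), float(m.group(2)), m.group(5)])
        elif packets:
            # Mail clients wrap long summary lines; glue the tail back on.
            packets[-1][2] += " " + line.strip()

    results = []
    for frame, ts, summary in packets:
        for op, code in STATUS_RE.findall(summary):
            code = int(code)
            results.append((frame, ts, op, NFS4_STATUS.get(code, str(code))))
    return results
```

Run against the full summary, a long list of OPEN entries with
NFS4ERR_DELAY would match the retry loop described above; it would not, by
itself, say why the server is delaying.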


* Re: nfs4 mount hanging suddenly
  2012-03-01 13:50     ` Myklebust, Trond
@ 2012-03-01 15:34       ` Orion Poplawski
  2012-03-01 19:28         ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: Orion Poplawski @ 2012-03-01 15:34 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: J. Bruce Fields, linux-nfs@vger.kernel.org

On 03/01/2012 06:50 AM, Myklebust, Trond wrote:
> On Wed, 2012-02-29 at 16:21 -0700, Orion Poplawski wrote:
>> On 02/29/2012 04:17 PM, J. Bruce Fields wrote:
>>> On Wed, Feb 29, 2012 at 03:29:36PM -0700, Orion Poplawski wrote:
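>>>>    8   0.451892   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
>>>> In 7)<EMPTY>  PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)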
>>>
>>> That probably means the server is waiting for the client to return a
>>> delegation.
>>>
>>> Either the server's confused about their being a delegation, or the
>>> client's failing to return one it should?
>>>
>
> As far as I can see from your trace, the client keeps trying to open the
> file '.history', and the server keeps replying with NFS4ERR_DELAY. There
> is nothing in the trace itself that can tell us why the server is
> delaying.

So, it seems to be on the server side?  I didn't get any relief 
restarting nfs on the server.  Any way to trace what is happening on the 
server side?

Thanks again.


-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  orion@cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com


* Re: nfs4 mount hanging suddenly
  2012-03-01 15:34       ` Orion Poplawski
@ 2012-03-01 19:28         ` J. Bruce Fields
  0 siblings, 0 replies; 6+ messages in thread
From: J. Bruce Fields @ 2012-03-01 19:28 UTC (permalink / raw)
  To: Orion Poplawski; +Cc: Myklebust, Trond, linux-nfs@vger.kernel.org

On Thu, Mar 01, 2012 at 08:34:02AM -0700, Orion Poplawski wrote:
> On 03/01/2012 06:50 AM, Myklebust, Trond wrote:
> >On Wed, 2012-02-29 at 16:21 -0700, Orion Poplawski wrote:
> >>On 02/29/2012 04:17 PM, J. Bruce Fields wrote:
> >>>On Wed, Feb 29, 2012 at 03:29:36PM -0700, Orion Poplawski wrote:
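> >>>>    8   0.451892   10.10.10.1 ->  10.10.20.15  NFS V4 COMP Reply (Call
> >>>> In 7)<EMPTY>  PUTFH;SAVEFH SAVEFH;OPEN OPEN(10008)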
> >>>
> >>>That probably means the server is waiting for the client to return a
> >>>delegation.
> >>>
> >>>Either the server's confused about their being a delegation, or the
> >>>client's failing to return one it should?
> >>>
> >
> >As far as I can see from your trace, the client keeps trying to open the
> >file '.history', and the server keeps replying with NFS4ERR_DELAY. There
> >is nothing in the trace itself that can tell us why the server is
> >delaying.
> 
> So, it seems to be on the server side?  I didn't get any relief
> restarting nfs on the server.  Any way to trace what is happening on
> the server side?

Hm.  /proc/locks should be able to tell you whether there's a
delegation.  Look for a lock of type LEASE with an inode number
matching that of .history (or whichever file is getting the DELAY
return).

But honestly it's a fairly old kernel.

--b.
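
The /proc/locks check suggested above could be sketched in Python roughly as
follows.  This is only an illustration under stated assumptions: it assumes
each /proc/locks line has the layout "ID: TYPE STATE ACCESS PID
MAJOR:MINOR:INODE START END" with the inode in decimal, which may not hold
exactly on every kernel, and the helper names parse_locks and
leases_for_inode are hypothetical.

```python
import os


def parse_locks(text):
    """Parse the text of /proc/locks into a list of dicts (assumed layout)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Blocked waiters carry a "->" marker after the id; drop it.
        if len(fields) > 1 and fields[1] == "->":
            fields.pop(1)
        if len(fields) < 6:
            continue
        entries.append({
            "type": fields[1],      # e.g. POSIX, FLOCK, LEASE
            "state": fields[2],     # e.g. ADVISORY, ACTIVE, BREAKING
            "access": fields[3],    # e.g. READ, WRITE
            "pid": int(fields[4]),
            # Device/inode field: the inode is the part after the last colon.
            "inode": int(fields[5].rsplit(":", 1)[1]),
        })
    return entries


def leases_for_inode(locks_text, inode):
    """Return the LEASE entries whose inode number matches `inode`."""
    return [e for e in parse_locks(locks_text)
            if e["type"] == "LEASE" and e["inode"] == inode]


# Example use on the server (the path is an assumption based on the export
# shown in the mount line and the file name mentioned earlier in the thread):
#
#   inode = os.stat("/export/home/lwang/.history").st_ino
#   with open("/proc/locks") as f:
#       print(leases_for_inode(f.read(), inode))
```

An empty result would suggest the server holds no lease on that file, while
a LEASE entry would point toward a delegation that has not been returned.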


