* NFS lock reclaiming not working on SLES9 SP2
@ 2006-02-17  4:44 asha  yr
  2006-02-17  9:25 ` Olaf Kirch
  0 siblings, 1 reply; 6+ messages in thread
From: asha  yr @ 2006-02-17  4:44 UTC (permalink / raw)
  To: nfs


[-- Attachment #1.1: Type: text/plain, Size: 908 bytes --]

Hi,

NFS lock reclaiming is not working on SLES9 SP2. After the server reboots, sm-notify sends reboot notifications, but the clients fail to reclaim their locks.

Both my NFS server and my NFS client run SLES9 SP2. The server is sgmlx1 (15.70.191.172) and the client is sgmlx2 (15.70.191.173). I mounted the NFS file system on the client and then, on the client, acquired a lock on a file in the NFS share using fcntl. On the server, an entry for the client was created in the sm directory. I then restarted the NFS server and started the sm-notify application. The client failed to reclaim the lock, and I could acquire the lock on the same file from another client.
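
For readers who want to reproduce the locking step, here is a minimal sketch of taking a whole-file lock with fcntl. The original test program is not shown in this thread, so the function name `lock_whole_file` and the choice of a non-blocking exclusive lock are assumptions; the `pos:0-0` in the attached trace is at least consistent with a lock covering the whole file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): open `path`
 * read-write and request an exclusive lock on the whole file without
 * blocking. Returns the open descriptor on success, or -1 if the open
 * or the lock request fails, with errno set by the failing call.
 * The lock is held until the descriptor is closed or the process exits.
 */
int lock_whole_file(const char *path)
{
	struct flock fl = {0};
	int fd, saved;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	fl.l_type = F_WRLCK;    /* exclusive (write) lock */
	fl.l_whence = SEEK_SET; /* l_start is relative to the start of file */
	fl.l_start = 0;
	fl.l_len = 0;           /* 0 means "through end of file" */

	if (fcntl(fd, F_SETLK, &fl) < 0) {
		saved = errno;      /* close() may overwrite errno */
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

On an NFSv3 mount, a request like this is what the client forwards to the server as an NLM LOCK call. Keeping the returned descriptor open across the server restart is what gives the client a lock to reclaim.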

The reclaiming of locks was working fine on base SLES9. It started failing after I updated SLES9 with SP2.

I have captured lockd debugging messages and a tethereal network trace on the client and am attaching both.

Thanks for your help in advance.

Regards,
Asha



[-- Attachment #1.2: Type: text/html, Size: 1310 bytes --]

[-- Attachment #2: debugging_messages.txt --]
[-- Type: text/plain, Size: 12586 bytes --]

Debugging Messages of lockd on client:

Feb 16 11:04:44 sgmlx2 kernel: NFS lockd/statd started (ver 0.5).
Feb 16 11:04:50 sgmlx2 kernel: lockd: nlm_lookup_host(0f46bfac, p=6, v=4, my role=client, name=15.70.191.172)
Feb 16 11:04:50 sgmlx2 kernel: lockd: host garbage collection
Feb 16 11:04:50 sgmlx2 kernel: lockd: nlmsvc_mark_resources
Feb 16 11:04:50 sgmlx2 kernel: lockd: nlm_bind_host(0f46bfac)
Feb 16 11:05:35 sgmlx2 last message repeated 4 times
Feb 16 11:05:35 sgmlx2 kernel: lockd: release host 15.70.191.172
Feb 16 11:05:35 sgmlx2 kernel: lockd: get host 15.70.191.172
Feb 16 11:06:34 sgmlx2 kernel: lockd: request from 0f46bfac
Feb 16 11:06:34 sgmlx2 kernel: lockd: nlm_host_rebooted("sgmlx1")
Feb 16 11:06:34 sgmlx2 kernel: nlmsvc_retry_blocked(00000000, when=0)
Feb 16 11:11:43 sgmlx2 kernel: device eth0 left promiscuous mode
Feb 16 11:11:46 sgmlx2 kernel: lockd: nlm_lookup_host(0f46bfac, p=6, v=4, my role=client, name=15.70.191.172)
Feb 16 11:11:46 sgmlx2 kernel: lockd: host garbage collection
Feb 16 11:11:46 sgmlx2 kernel: lockd: nlmsvc_mark_resources
Feb 16 11:11:46 sgmlx2 kernel: nlm_gc_hosts skipping 15.70.191.172 (cnt 1 use 0 exp 2355525)
Feb 16 11:11:46 sgmlx2 kernel: lockd: get host 15.70.191.172
Feb 16 11:11:46 sgmlx2 kernel: lockd: nlm_bind_host(0f46bfac)
Feb 16 11:12:01 sgmlx2 kernel: lockd: release host 15.70.191.172
Feb 16 11:12:01 sgmlx2 kernel: lockd: release host 15.70.191.172

Tethereal network trace of what the client is seeing:

  1   0.000000 15.70.191.172 -> 224.0.0.251  IGMP V2 Membership Report
  2   0.792683 15.70.191.172 -> 224.0.1.22   IGMP V2 Membership Report
  3   1.386484 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
  4  12.614506 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
  5  36.068268 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [SYN]
Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2183646 TSER=0 WS=0
  6  36.068577 15.70.191.172 -> 15.70.191.173 TCP sunrpc > phonebook [SYN, ACK]
Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=226815 TSER=2183646 WS=0
  7  36.068599 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK]
Seq=1 Ack=1 Win=5840 Len=0 TSV=2183646 TSER=226815
  8  36.068670 15.70.191.173 -> 15.70.191.172 Portmap V2 DUMP Call
  9  36.068824 15.70.191.172 -> 15.70.191.173 TCP sunrpc > phonebook [ACK]
Seq=1 Ack=45 Win=5792 Len=0 TSV=226815 TSER=2183646
 10  36.069313 15.70.191.172 -> 15.70.191.173 Portmap V2 DUMP Reply (Call In
8)[Unreassembled Packet]
 11  36.069320 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK]
Seq=45 Ack=401 Win=6432 Len=0 TSV=2183647 TSER=226815
 12  36.069481 15.70.191.172 -> 15.70.191.173 RPC Continuation
 13  36.069486 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK]
Seq=45 Ack=497 Win=6432 Len=0 TSV=2183647 TSER=226816
 14  36.069508 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [FIN, ACK]
Seq=45 Ack=497 Win=6432 Len=0 TSV=2183647 TSER=226816
 15  36.069560 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [SYN] Seq=0 Ack=0
Win=5840 Len=0 MSS=1460 TSV=2183647 TSER=0 WS=0
 16  36.069782 15.70.191.172 -> 15.70.191.173 TCP 812 > 768 [SYN, ACK] Seq=0
Ack=1 Win=5792 Len=0 MSS=1460 TSV=226816 TSER=2183647 WS=0
 17  36.069793 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [ACK] Seq=1 Ack=1
Win=5840 Len=0 TSV=2183647 TSER=226816
 18  36.069810 15.70.191.172 -> 15.70.191.173 TCP sunrpc > phonebook [FIN, ACK]
Seq=497 Ack=46 Win=5792 Len=0 TSV=226816 TSER=2183647
 19  36.069820 15.70.191.173 -> 15.70.191.172 TCP phonebook > sunrpc [ACK]
Seq=46 Ack=498 Win=6432 Len=0 TSV=2183647 TSER=226816
 20  36.069848 15.70.191.173 -> 15.70.191.172 MOUNT V3 MNT Call
 21  36.070005 15.70.191.172 -> 15.70.191.173 TCP 812 > 768 [ACK] Seq=1 Ack=89
Win=5792 Len=0 TSV=226816 TSER=2183647
 22  36.119549 15.70.191.172 -> 15.70.191.173 MOUNT V3 MNT Reply (Call In 20)
 23  36.119556 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [ACK] Seq=89 Ack=61
Win=5840 Len=0 TSV=2183697 TSER=226866
 24  36.119648 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [SYN] Seq=0
Ack=0 Win=5840 Len=0 MSS=1460 TSV=2183697 TSER=0 WS=0
 25  36.119865 15.70.191.172 -> 15.70.191.173 TCP sunrpc > cadlock [SYN, ACK]
Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=226866 TSER=2183697 WS=0
 26  36.119877 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [ACK] Seq=1
Ack=1 Win=5840 Len=0 TSV=2183697 TSER=226866
 27  36.119908 15.70.191.173 -> 15.70.191.172 Portmap V2 GETPORT Call
NFS(100003) V:3 TCP
 28  36.120067 15.70.191.172 -> 15.70.191.173 TCP sunrpc > cadlock [ACK] Seq=1
Ack=61 Win=5792 Len=0 TSV=226866 TSER=2183698
 29  36.120273 15.70.191.172 -> 15.70.191.173 Portmap V2 GETPORT Reply (Call In
27) Port:2049
 30  36.120278 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [ACK] Seq=61
Ack=33 Win=5840 Len=0 TSV=2183698 TSER=226866
 31  36.120298 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [FIN, ACK]
Seq=61 Ack=33 Win=5840 Len=0 TSV=2183698 TSER=226866
 32  36.120335 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [FIN, ACK] Seq=89
Ack=61 Win=5840 Len=0 TSV=2183698 TSER=226866
 33  36.120509 15.70.191.172 -> 15.70.191.173 TCP sunrpc > cadlock [FIN, ACK]
Seq=33 Ack=62 Win=5792 Len=0 TSV=226867 TSER=2183698
 34  36.120527 15.70.191.173 -> 15.70.191.172 TCP cadlock > sunrpc [ACK] Seq=62
Ack=34 Win=5840 Len=0 TSV=2183698 TSER=226867
 35  36.120553 15.70.191.172 -> 15.70.191.173 TCP 812 > 768 [FIN, ACK] Seq=61
Ack=90 Win=5792 Len=0 TSV=226867 TSER=2183698
 36  36.120559 15.70.191.173 -> 15.70.191.172 TCP 768 > 812 [ACK] Seq=90 Ack=62
Win=5840 Len=0 TSV=2183698 TSER=226867
 37  36.167056 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [SYN] Seq=0 Ack=0
Win=5840 Len=0 MSS=1460 TSV=2183745 TSER=0 WS=0
 38  36.167287 15.70.191.172 -> 15.70.191.173 TCP nfs > 1023 [SYN, ACK] Seq=0
Ack=1 Win=5792 Len=0 MSS=1460 TSV=226913 TSER=2183745 WS=0
 39  36.167311 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=1 Ack=1
Win=5840 Len=0 TSV=2183745 TSER=226913
 40  36.167323 15.70.191.173 -> 15.70.191.172 NFS V3 FSINFO Call, FH:0x14d80402
 41  36.167482 15.70.191.172 -> 15.70.191.173 TCP nfs > 1023 [ACK] Seq=1 Ack=93
Win=5792 Len=0 TSV=226914 TSER=2183745
 42  36.187813 15.70.191.172 -> 15.70.191.173 NFS V3 FSINFO Reply (Call In 40)
 43  36.187820 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=93
Ack=85 Win=5840 Len=0 TSV=2183765 TSER=226934
 44  36.187848 15.70.191.173 -> 15.70.191.172 NFS V3 GETATTR Call,
FH:0x14d80402
 45  36.188058 15.70.191.172 -> 15.70.191.173 NFS V3 GETATTR Reply (Call In 44)
 46  36.228420 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=185
Ack=201 Win=5840 Len=0 TSV=2183806 TSER=226934
 47  42.005867 15.70.191.173 -> 15.70.191.172 NFS V3 ACCESS Call, FH:0x14d80402
 48  42.006151 15.70.191.172 -> 15.70.191.173 NFS V3 ACCESS Reply (Call In 47)
 49  42.006170 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=281
Ack=325 Win=5840 Len=0 TSV=2189585 TSER=232753
 50  42.006210 15.70.191.173 -> 15.70.191.172 NFS V3 LOOKUP Call,
DH:0x14d80402/hi
 51  42.016430 15.70.191.172 -> 15.70.191.173 NFS V3 LOOKUP Reply (Call In 50),
FH:0x7608060e
 52  42.016461 15.70.191.173 -> 15.70.191.172 NFS V3 ACCESS Call, FH:0x7608060e
 53  42.016683 15.70.191.172 -> 15.70.191.173 NFS V3 ACCESS Reply (Call In 52)
 54  42.056037 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=501
Ack=697 Win=6432 Len=0 TSV=2189635 TSER=232764
 55  42.810338 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [SYN] Seq=0
Ack=0 Win=5840 Len=0 MSS=1460 TSV=2190389 TSER=0 WS=0
 56  42.810617 15.70.191.172 -> 15.70.191.173 TCP sunrpc > 1022 [SYN, ACK]
Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=233558 TSER=2190389 WS=0
 57  42.810642 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [ACK] Seq=1
Ack=1 Win=5840 Len=0 TSV=2190389 TSER=233558
 58  42.810668 15.70.191.173 -> 15.70.191.172 Portmap V2 GETPORT Call
NLM(100021) V:4 TCP
 59  42.810823 15.70.191.172 -> 15.70.191.173 TCP sunrpc > 1022 [ACK] Seq=1
Ack=61 Win=5792 Len=0 TSV=233558 TSER=2190389
 60  42.811094 15.70.191.172 -> 15.70.191.173 Portmap V2 GETPORT Reply (Call In
58) Port:32769
 61  42.811103 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [ACK] Seq=61
Ack=33 Win=5840 Len=0 TSV=2190390 TSER=233558
 62  42.811143 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [FIN, ACK]
Seq=61 Ack=33 Win=5840 Len=0 TSV=2190390 TSER=233558
 63  42.811247 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [SYN]
Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=2190390 TSER=0 WS=0
 64  42.811454 15.70.191.172 -> 15.70.191.173 TCP sunrpc > 1022 [FIN, ACK]
Seq=33 Ack=62 Win=5792 Len=0 TSV=233559 TSER=2190390
 65  42.811469 15.70.191.173 -> 15.70.191.172 TCP 1022 > sunrpc [ACK] Seq=62
Ack=34 Win=5840 Len=0 TSV=2190390 TSER=233559
 66  42.811490 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [SYN, ACK]
Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=233559 TSER=2190390 WS=0
 67  42.811503 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK]
Seq=1 Ack=1 Win=5840 Len=0 TSV=2190390 TSER=233559
 68  42.811519 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e
svid:6954 pos:0-0
 69  42.811702 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [ACK]
Seq=1 Ack=189 Win=5604 Len=0 TSV=233559 TSER=2190390
 70  42.811746 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 68)
NLM_DENIED_GRACE_PERIOD
 71  42.811752 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK]
Seq=189 Ack=41 Win=5840 Len=0 TSV=2190390 TSER=233559
 72  50.328078 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
 73  57.809375 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e
svid:6954 pos:0-0
 74  57.809672 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 73)
NLM_DENIED_GRACE_PERIOD
 75  57.809688 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK]
Seq=377 Ack=81 Win=5840 Len=0 TSV=2205391 TSER=248560
 76  65.135773 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
 77  72.807840 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e
svid:6954 pos:0-0
 78  72.808138 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 77)
NLM_DENIED_GRACE_PERIOD
 79  72.808154 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK]
Seq=565 Ack=121 Win=5840 Len=0 TSV=2220391 TSER=263561
 80  87.805307 15.70.191.173 -> 15.70.191.172 NLM V4 LOCK Call FH:0x7608060e
svid:6954 pos:0-0
 81  87.845402 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [ACK]
Seq=121 Ack=753 Win=5604 Len=0 TSV=278601 TSER=2235391
 82  87.939016 15.70.191.172 -> 15.70.191.173 NLM V4 LOCK Reply (Call In 80)
 83  87.939036 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK]
Seq=753 Ack=161 Win=5840 Len=0 TSV=2235525 TSER=278694
 84  99.789128 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
 85 100.787646 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
 86 121.309035 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
 87 125.271852 15.70.191.172 -> 224.0.1.22   IGMP V2 Membership Report
 88 132.657551 15.70.191.172 -> 15.70.191.173 TCP filenet-rpc > 1021 [FIN, ACK]
Seq=161 Ack=753 Win=5604 Len=0 TSV=323421 TSER=2235525
 89 132.657595 15.70.191.172 -> 15.70.191.173 TCP nfs > 1023 [FIN, ACK] Seq=697
Ack=501 Win=5792 Len=0 TSV=323421 TSER=2189635
 90 132.697683 15.70.191.173 -> 15.70.191.172 TCP 1021 > filenet-rpc [ACK]
Seq=753 Ack=162 Win=5840 Len=0 TSV=2280290 TSER=323421
 91 132.697688 15.70.191.173 -> 15.70.191.172 TCP 1023 > nfs [ACK] Seq=501
Ack=698 Win=6432 Len=0 TSV=2280290 TSER=323421
 92 137.696503 CompaqHp_50:5f:d3 -> CompaqHp_2d:70:6e ARP Who has
15.70.191.172?  Tell 15.70.191.173
 93 137.696634 CompaqHp_2d:70:6e -> CompaqHp_50:5f:d3 ARP 15.70.191.172 is at
00:0b:cd:2d:70:6e
 94 142.173489 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
 95 146.171834 15.70.191.172 -> 15.70.191.173 Portmap V2 GETPORT Call
STAT(100024) V:1 UDP
 96 146.172124 15.70.191.173 -> 15.70.191.172 Portmap V2 GETPORT Reply (Call In
95) Port:32775
 97 146.172373 15.70.191.172 -> 15.70.191.173 STAT V1 NOTIFY Call
 98 146.172426 15.70.191.173 -> 15.70.191.172 STAT V1 NOTIFY Reply (Call In 97)
 99 155.659246 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1
100 203.565599 FoundryN_70:ab:00 -> Broadcast    ARP Who has 15.70.191.172? 
Tell 15.70.191.1


* Re: NFS lock reclaiming not working on SLES9 SP2
  2006-02-17  4:44 NFS lock reclaiming not working on SLES9 SP2 asha  yr
@ 2006-02-17  9:25 ` Olaf Kirch
  2006-02-17  9:37   ` Greg Banks
  0 siblings, 1 reply; 6+ messages in thread
From: Olaf Kirch @ 2006-02-17  9:25 UTC (permalink / raw)
  To: asha yr; +Cc: nfs

Hi,

On Fri, Feb 17, 2006 at 04:44:48AM -0000, asha  yr wrote:
> The reclaiming of locks was working fine on base SLES9. It started failing after I updated SLES9 with SP2.

Did you try SLES9 SP3? I remember we had some problems with lock reclaim,
but I thought they were fixed in SP2.

It would be useful to get a lockd trace on the server side, what it
receives and what it sends back. The ethereal traces aren't very useful
though; it seems the snaplen is too small (and binary dumps are usually
much more helpful than the "helpful" ASCII packet representation that
ethereal or tcpdump generate)

Thanks,
Olaf
-- 
Olaf Kirch   |  --- o --- Nous sommes du soleil we love when we play
okir@suse.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: NFS lock reclaiming not working on SLES9 SP2
  2006-02-17  9:25 ` Olaf Kirch
@ 2006-02-17  9:37   ` Greg Banks
  2006-02-17  9:56     ` Olaf Kirch
  0 siblings, 1 reply; 6+ messages in thread
From: Greg Banks @ 2006-02-17  9:37 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: asha yr, Linux NFS Mailing List

On Fri, 2006-02-17 at 20:25, Olaf Kirch wrote:
> Hi,
> 
> On Fri, Feb 17, 2006 at 04:44:48AM -0000, asha  yr wrote:
> > The reclaiming of locks was working fine on base SLES9. It started failing after I updated SLES9 with SP2.
> 
> Did you try SLES9 SP3? I remember we had some problems with lock reclaim,
> but I thought they were fixed in SP2.
>
> It would be useful to get a lockd trace on the server side, what it
> receives and what it sends back. The ethereal traces aren't very useful
> though; it seems the snaplen is too small (and binary dumps are usually
> much more helpful than the "helpful" ASCII packet representation that
> ethereal or tcpdump generate)

In his trace, the client wasn't sending any LOCK calls at all
for multiple minutes after receiving the NOTIFY.

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.






* Re: NFS lock reclaiming not working on SLES9 SP2
  2006-02-17  9:37   ` Greg Banks
@ 2006-02-17  9:56     ` Olaf Kirch
  2006-02-17 10:03       ` Greg Banks
  2006-02-17 19:08       ` Marc Eshel
  0 siblings, 2 replies; 6+ messages in thread
From: Olaf Kirch @ 2006-02-17  9:56 UTC (permalink / raw)
  To: Greg Banks; +Cc: asha yr, Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 572 bytes --]

On Fri, Feb 17, 2006 at 08:37:05PM +1100, Greg Banks wrote:
> In his trace, the client wasn't sending any LOCK calls at all
> for multiple minutes after receiving the NOTIFY.

Ah, you're right. I wasn't paying attention to the time stamps.

It seems the problem is that we're now using hostnames to identify lockd
peers, but you mounted the file system using the ipaddr:/path.
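
The mismatch can be illustrated with a small userspace sketch. It is not the kernel code: `struct peer` and `peer_note_reboot` are hypothetical stand-ins, and `strcmp` is only assumed to approximate the name comparison lockd performs. The sketch shows why a peer registered under its IP address is missed when only the name in the NOTIFY is compared, and how an additional address comparison catches it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a lockd peer entry. */
struct peer {
	const char *name;     /* name the peer was registered under */
	uint32_t    addr;     /* IPv4 address as a plain 32-bit value */
	int         rebooted; /* set once a reboot has been noted */
};

/*
 * Mark `host` as rebooted if a reboot notification refers to it.
 * `mon_name` is the name carried in the notification; `sender` points
 * to the address the notification came from, or is NULL when no
 * address is available. Returns 1 on a match, 0 otherwise.
 */
int peer_note_reboot(struct peer *host, const char *mon_name,
		     const uint32_t *sender)
{
	assert(host != NULL && mon_name != NULL);

	if (strcmp(host->name, mon_name) == 0 ||
	    (sender != NULL && host->addr == *sender)) {
		host->rebooted = 1;
		return 1;
	}
	return 0;
}
```

With a peer registered as "15.70.191.172" at address 0x0f46bfac (the values in the client log) and a notification naming "sgmlx1", calling `peer_note_reboot` with a NULL sender finds no match. Passing the sender address 0x0f46bfac does match and marks the peer as rebooted.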

Could you please try the attached patch?

Thanks
Olaf
-- 
Olaf Kirch   |  --- o --- Nous sommes du soleil we love when we play
okir@suse.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

[-- Attachment #2: statd-hostname-fix --]
[-- Type: text/plain, Size: 3020 bytes --]

 fs/lockd/host.c             |    5 +++--
 fs/lockd/statd.c            |    2 +-
 fs/lockd/svc4proc.c         |    2 +-
 fs/lockd/svcproc.c          |    2 +-
 include/linux/lockd/lockd.h |    2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

Index: build/fs/lockd/host.c
===================================================================
--- build.orig/fs/lockd/host.c
+++ build/fs/lockd/host.c
@@ -274,7 +274,7 @@ void nlm_release_host(struct nlm_host *h
  * Given an IP address, initiate recovery and ditch all locks.
  */
 void
-nlm_host_rebooted(const char *hostname, u32 new_state)
+nlm_host_rebooted(struct sockaddr_in *addr, const char *hostname, u32 new_state)
 {
 	struct nlm_host	*host, **hp;
 	int		hash;
@@ -287,7 +287,8 @@ nlm_host_rebooted(const char *hostname, 
 	/* Mark all matching hosts as having rebooted */
 	for (hash = 0; hash < NLM_HOST_NRHASH; hash++) {
 		for (hp = &nlm_hosts[hash]; (host = *hp); hp = &host->h_next) {
-			if (nlm_cmp_name(host->h_name, hostname)) {
+			if (nlm_cmp_name(host->h_name, hostname)
+			 || (addr && nlm_cmp_addr(&host->h_addr, addr))) {
 				if (host->h_nsmhandle)
 					host->h_nsmhandle->sm_monitored = 0;
 				host->h_rebooted = 1;
Index: build/fs/lockd/statd.c
===================================================================
--- build.orig/fs/lockd/statd.c
+++ build/fs/lockd/statd.c
@@ -311,7 +311,7 @@ nsmsvc_proc_notify(struct svc_rqst *rqst
 				           struct nsm_res  *resp)
 {
 	dprintk("statd: NOTIFY        called\n");
-	nlm_host_rebooted(argp->mon_name, argp->state);
+	nlm_host_rebooted(&rqstp->rq_addr, argp->mon_name, argp->state);
 	return rpc_success;
 }
 
Index: build/fs/lockd/svc4proc.c
===================================================================
--- build.orig/fs/lockd/svc4proc.c
+++ build/fs/lockd/svc4proc.c
@@ -427,7 +427,7 @@ nlm4svc_proc_sm_notify(struct svc_rqst *
 		return rpc_system_err;
 	}
 
-	nlm_host_rebooted(argp->mon, argp->state);
+	nlm_host_rebooted(NULL, argp->mon, argp->state);
 	return rpc_success;
 }
 
Index: build/fs/lockd/svcproc.c
===================================================================
--- build.orig/fs/lockd/svcproc.c
+++ build/fs/lockd/svcproc.c
@@ -455,7 +455,7 @@ nlmsvc_proc_sm_notify(struct svc_rqst *r
 		return rpc_system_err;
 	}
 
-	nlm_host_rebooted(argp->mon, argp->state);
+	nlm_host_rebooted(NULL, argp->mon, argp->state);
 	return rpc_success;
 }
 
Index: build/include/linux/lockd/lockd.h
===================================================================
--- build.orig/include/linux/lockd/lockd.h
+++ build/include/linux/lockd/lockd.h
@@ -166,7 +166,7 @@ struct nlm_host * nlm_get_host(struct nl
 void		  nlm_release_host(struct nlm_host *);
 void		  nlm_shutdown_hosts(void);
 extern struct nlm_host *nlm_find_client(void);
-extern void	  nlm_host_rebooted(const char *, u32);
+extern void	  nlm_host_rebooted(struct sockaddr_in *, const char *, u32);
 struct nsm_handle *nsm_find(const char *, int);
 void             nsm_release(struct nsm_handle *);
 


* Re: NFS lock reclaiming not working on SLES9 SP2
  2006-02-17  9:56     ` Olaf Kirch
@ 2006-02-17 10:03       ` Greg Banks
  2006-02-17 19:08       ` Marc Eshel
  1 sibling, 0 replies; 6+ messages in thread
From: Greg Banks @ 2006-02-17 10:03 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: asha yr, Linux NFS Mailing List

On Fri, 2006-02-17 at 20:56, Olaf Kirch wrote:
> It seems the problem is that we're now using hostnames to identify lockd
> peers, but you mounted the file system using the ipaddr:/path.

Indeed, well caught.  I missed the significance of name=...

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.






* Re: NFS lock reclaiming not working on SLES9 SP2
  2006-02-17  9:56     ` Olaf Kirch
  2006-02-17 10:03       ` Greg Banks
@ 2006-02-17 19:08       ` Marc Eshel
  1 sibling, 0 replies; 6+ messages in thread
From: Marc Eshel @ 2006-02-17 19:08 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: asha yr, Greg Banks, Linux NFS Mailing List, nfs-admin

Hi Olaf,
This fix will not work for HA-NFS, which allows a failover node to send the
SM_NOTIFY on behalf of the failed node.
The -v option was added for this purpose.
Marc. 

nfs-admin@lists.sourceforge.net wrote on 02/17/2006 01:56:18 AM:

> On Fri, Feb 17, 2006 at 08:37:05PM +1100, Greg Banks wrote:
> > In his trace, the client wasn't sending any LOCK calls at all
> > for multiple minutes after receiving the NOTIFY.
> 
> Ah, you're right. I wasn't paying attention to the time stamps.
> 
> It seems the problem is that we're now using hostnames to identify lockd
> peers, but you mounted the file system using the ipaddr:/path.
> 
> Could you please try the attached patch?
> 
> Thanks
> Olaf
> -- 
> Olaf Kirch   |  --- o --- Nous sommes du soleil we love when we play
> okir@suse.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
> [attachment "statd-hostname-fix" deleted by Marc Eshel/Almaden/IBM] 


