* [PATCH] fix recursive nlm_file_mutex deadlock
@ 2006-08-09 18:13 Wendy Cheng
  2006-08-09 18:32 ` Wendy Cheng
  2006-08-09 21:45 ` Trond Myklebust
  0 siblings, 2 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-08-09 18:13 UTC (permalink / raw)
  To: Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 1068 bytes --]

I was testing the NLM failover patches this morning and found that the
command hangs. It looks like nlm_traverse_files(), which grabs
nlm_file_mutex early in the call, can end up calling nlm_release_file()
via nlmsvc_free_block() inside kref_put(). nlm_release_file() wants
nlm_file_mutex too, which produces a deadlock like the following:

dhcp59-234 kernel: Call Trace:
[<c02dd749>] __mutex_lock_slowpath+0x4c/0x7e
[<c02dd78a>] .text.lock.mutex+0xf/0x14
[<f8afeacd>] nlm_release_file+0x2b/0xdf [lockd]
[<f8afda90>] nlmsvc_free_block+0x8c/0x9d [lockd]
[<f8afda04>] nlmsvc_free_block+0x0/0x9d [lockd]
[<c01be98d>] kref_put+0x4e/0x58
[<f8afd175>] nlmsvc_traverse_blocks+0xaf/0xc6 [lockd]
[<f8afe960>] nlm_traverse_files+0x108/0x1cd [lockd]
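
For readers who want to see the shape of the problem outside the kernel,
here is a minimal userspace sketch of the same call pattern. It is an
analogy only: it assumes a POSIX error-checking pthread mutex in place of
nlm_file_mutex, and every demo_* name is hypothetical and does not exist
in lockd. With an error-checking mutex the second lock attempt by the same
thread returns EDEADLK instead of hanging the way the kernel mutex does.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for nlm_file_mutex (not the kernel object). */
static pthread_mutex_t demo_file_mutex;

/* Create the mutex as error-checking so a relock is reported, not hung. */
static void demo_mutex_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&demo_file_mutex, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);
}

/* Plays the role of nlm_release_file(): it needs the global mutex. */
static int demo_release_file(void)
{
	int rc = pthread_mutex_lock(&demo_file_mutex);

	if (rc != 0)
		return rc;	/* EDEADLK if this thread already holds it */
	pthread_mutex_unlock(&demo_file_mutex);
	return 0;
}

/*
 * Plays the role of nlm_traverse_files(): it holds the global mutex
 * and, through the last reference drop, reaches the release path.
 */
static int demo_traverse_files(void)
{
	int rc;

	pthread_mutex_lock(&demo_file_mutex);
	rc = demo_release_file();	/* second lock by the same thread */
	pthread_mutex_unlock(&demo_file_mutex);
	return rc;
}
```

After calling demo_mutex_init(), demo_traverse_files() returns EDEADLK,
while a direct call to demo_release_file() with the mutex not held
returns 0.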

The attached patch seems to fix the issue: it skips (defers) the file
removal. Eventually either nlm_gc_hosts (some time later, when the client
is unmonitored) or nlmsvc_traverse_files will finish the cleanup. Note
that this is only ten minutes of work and I am not sure of its
ramifications at this moment. Could you take a look?
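
The deferral idea can also be sketched in userspace. This is a toy model
under stated assumptions, not the lockd code: toy_file, toy_put_deferred,
toy_traverse and toy_reap are hypothetical names, a pthread mutex stands
in for f_sema, and a "hashed" flag stands in for the file still being in
the table. The point is only that the reference drop touches the per-file
lock and never the table-wide one, so it is safe to call while the
table-wide lock is held, and a later pass reaps the unreferenced file.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct toy_file {
	pthread_mutex_t lock;	/* stand-in for f_sema */
	int count;		/* stand-in for f_count */
	int hashed;		/* 1 while the file is still in the table */
};

/* Stand-in for the table-wide nlm_file_mutex. */
static pthread_mutex_t toy_table_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Drop one reference without taking the table-wide mutex. */
static void toy_put_deferred(struct toy_file *f)
{
	pthread_mutex_lock(&f->lock);
	f->count--;
	pthread_mutex_unlock(&f->lock);
}

/* Traversal: holds the table-wide mutex while dropping a reference. */
static void toy_traverse(struct toy_file *f)
{
	pthread_mutex_lock(&toy_table_mutex);
	toy_put_deferred(f);	/* never re-takes toy_table_mutex */
	pthread_mutex_unlock(&toy_table_mutex);
}

/* Later cleanup pass: unhash files nobody references; returns the count. */
static int toy_reap(struct toy_file **files, size_t n)
{
	int reaped = 0;

	pthread_mutex_lock(&toy_table_mutex);
	for (size_t i = 0; i < n; i++) {
		struct toy_file *f = files[i];

		/* Same lock order as toy_traverse: table first, then file. */
		pthread_mutex_lock(&f->lock);
		if (f->hashed && f->count == 0) {
			f->hashed = 0;
			reaped++;
		}
		pthread_mutex_unlock(&f->lock);
	}
	pthread_mutex_unlock(&toy_table_mutex);
	return reaped;
}
```

In this model a file with one reference survives toy_traverse() with
count 0 and is still hashed; the next toy_reap() removes it. The cost is
that an unreferenced file lingers until that later pass runs.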

-- Wendy


[-- Attachment #2: gfs_nlm_deadlock.patch --]
[-- Type: text/plain, Size: 388 bytes --]

--- linux-2/fs/lockd/svclock.c	2006-08-08 10:20:16.000000000 -0400
+++ linux/fs/lockd/svclock.c	2006-08-09 10:28:35.000000000 -0400
@@ -264,7 +264,9 @@ static void nlmsvc_free_block(struct kre
 
 	nlmsvc_freegrantargs(block->b_call);
 	nlm_release_call(block->b_call);
-	nlm_release_file(block->b_file);
+	down(&file->f_sema);
+	file->f_count--;
+	up(&file->f_sema);
 	kfree(block);
 }
 


Thread overview: 9+ messages
2006-08-09 18:13 [PATCH] fix recursive nlm_file_mutex deadlock Wendy Cheng
2006-08-09 18:32 ` Wendy Cheng
2006-08-09 21:45 ` Trond Myklebust
2006-08-09 22:07   ` Wendy Cheng
2006-08-09 22:41     ` Trond Myklebust
2006-08-09 23:57     ` Trond Myklebust
2006-08-10 15:24       ` Wendy Cheng
2006-08-10 15:40         ` Trond Myklebust
2006-08-10 16:05           ` Trond Myklebust
