All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fix recursive nlm_file_mutex deadlock
@ 2006-08-09 18:13 Wendy Cheng
  2006-08-09 18:32 ` Wendy Cheng
  2006-08-09 21:45 ` Trond Myklebust
  0 siblings, 2 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-08-09 18:13 UTC (permalink / raw)
  To: Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 1068 bytes --]

I was testing NLM failover patches this morning and found the command 
hangs. Look like nlm_traverse_files(), where it grabs nlm_file_mutex 
early in the call, will have a chance to call nlm_release_file() via 
nlmsvc_free_block() inside kref_put(). The nlm_release_file() wants 
nlm_file_mutex too - this would generate a deadlock as the following:

dhcp59-234 kernel: Call Trace:
[<c02dd749>] __mutex_lock_slowpath+0x4c/0x7e
[<c02dd78a>] .text.lock.mutex+0xf/0x14
[<f8afeacd>] nlm_release_file+0x2b/0xdf [lockd]
[<f8afda90>] nlmsvc_free_block+0x8c/0x9d [lockd]
[<f8afda04>] nlmsvc_free_block+0x0/0x9d [lockd]
[<c01be98d>] kref_put+0x4e/0x58
[<f8afd175>] nlmsvc_traverse_blocks+0xaf/0xc6 [lockd]
[<f8afe960>] nlm_traverse_files+0x108/0x1cd [lockd]

The attached patch seems to fix the issue - it skips (defers) the file 
removal. Eventually either nlm_gc_hosts (some time later when client is 
unmonitored) or nlmsvc_traverse_files will finish the clean up.  Note 
that this is a 10-minutes  work - not sure its ramification at this 
moment. Take a look ?

-- Wendy


[-- Attachment #2: gfs_nlm_deadlock.patch --]
[-- Type: text/plain, Size: 388 bytes --]

--- linux-2/fs/lockd/svclock.c	2006-08-08 10:20:16.000000000 -0400
+++ linux/fs/lockd/svclock.c	2006-08-09 10:28:35.000000000 -0400
@@ -264,7 +264,9 @@ static void nlmsvc_free_block(struct kre
 
 	nlmsvc_freegrantargs(block->b_call);
 	nlm_release_call(block->b_call);
-	nlm_release_file(block->b_file);
+	down(&file->f_sema);
+	file->f_count--;
+	up(&file->f_sema);
 	kfree(block);
 }
 

[-- Attachment #3: Type: text/plain, Size: 373 bytes --]

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-09 18:13 [PATCH] fix recursive nlm_file_mutex deadlock Wendy Cheng
@ 2006-08-09 18:32 ` Wendy Cheng
  2006-08-09 21:45 ` Trond Myklebust
  1 sibling, 0 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-08-09 18:32 UTC (permalink / raw)
  To: Linux NFS Mailing List

Wendy Cheng wrote:

>
> The attached patch seems to fix the issue - it skips (defers) the file 
> removal. Eventually either nlm_gc_hosts (some time later when client 
> is unmonitored) or nlmsvc_traverse_files will finish the clean up.  
> Note that this is a 10-minutes  work - not sure its ramification at 
> this moment. Take a look ?
>
Well, I never really understand what this "signed-off" line is all 
about. But if it is required, here it is:

Signed-off-by: S. Wendy Cheng (wcheng@redhat.com)


>
>------------------------------------------------------------------------
>
>--- linux-2/fs/lockd/svclock.c	2006-08-08 10:20:16.000000000 -0400
>+++ linux/fs/lockd/svclock.c	2006-08-09 10:28:35.000000000 -0400
>@@ -264,7 +264,9 @@ static void nlmsvc_free_block(struct kre
> 
> 	nlmsvc_freegrantargs(block->b_call);
> 	nlm_release_call(block->b_call);
>-	nlm_release_file(block->b_file);
>+	down(&file->f_sema);
>+	file->f_count--;
>+	up(&file->f_sema);
> 	kfree(block);
> }
> 
>  
>

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-09 18:13 [PATCH] fix recursive nlm_file_mutex deadlock Wendy Cheng
  2006-08-09 18:32 ` Wendy Cheng
@ 2006-08-09 21:45 ` Trond Myklebust
  2006-08-09 22:07   ` Wendy Cheng
  1 sibling, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2006-08-09 21:45 UTC (permalink / raw)
  To: Wendy Cheng; +Cc: Linux NFS Mailing List

On Wed, 2006-08-09 at 14:13 -0400, Wendy Cheng wrote:
> I was testing NLM failover patches this morning and found the command 
> hangs. Look like nlm_traverse_files(), where it grabs nlm_file_mutex 
> early in the call, will have a chance to call nlm_release_file() via 
> nlmsvc_free_block() inside kref_put(). The nlm_release_file() wants 
> nlm_file_mutex too - this would generate a deadlock as the following:
> 
> dhcp59-234 kernel: Call Trace:
> [<c02dd749>] __mutex_lock_slowpath+0x4c/0x7e
> [<c02dd78a>] .text.lock.mutex+0xf/0x14
> [<f8afeacd>] nlm_release_file+0x2b/0xdf [lockd]
> [<f8afda90>] nlmsvc_free_block+0x8c/0x9d [lockd]
> [<f8afda04>] nlmsvc_free_block+0x0/0x9d [lockd]
> [<c01be98d>] kref_put+0x4e/0x58
> [<f8afd175>] nlmsvc_traverse_blocks+0xaf/0xc6 [lockd]
> [<f8afe960>] nlm_traverse_files+0x108/0x1cd [lockd]
> 
> The attached patch seems to fix the issue - it skips (defers) the file 
> removal. Eventually either nlm_gc_hosts (some time later when client is 
> unmonitored) or nlmsvc_traverse_files will finish the clean up.  Note 
> that this is a 10-minutes  work - not sure its ramification at this 
> moment. Take a look ?
> 
> -- Wendy
> 
> plain text document attachment (gfs_nlm_deadlock.patch)
> --- linux-2/fs/lockd/svclock.c	2006-08-08 10:20:16.000000000 -0400
> +++ linux/fs/lockd/svclock.c	2006-08-09 10:28:35.000000000 -0400
> @@ -264,7 +264,9 @@ static void nlmsvc_free_block(struct kre
>  
>  	nlmsvc_freegrantargs(block->b_call);
>  	nlm_release_call(block->b_call);
> -	nlm_release_file(block->b_file);
> +	down(&file->f_sema);
> +	file->f_count--;
> +	up(&file->f_sema);
>  	kfree(block);
>  }

Vetoed. The block holds a reference to the file. It _must_ call
nlm_release_file() in order to release that reference. It is in any case
a bug to grab file->f_sema without holding a reference to the file.

I suspect, rather, that the problem is due to nlmsvc_create_block()
incrementing file->f_count without holding the nlm_file_mutex. If we
convert it to an atomic_t instead, then that problem should be solved.

aside: Note also that we want to get rid of all that mark and sweep
braindamage in nlm_traverse_*() with all the silly counting of f_lock,
f_blocks, f_shares,.... and replace those variables with proper
references to the struct nlm_file by the locks, blocks (is already the
case?), and shares.

Cheers,
  Trond


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-09 21:45 ` Trond Myklebust
@ 2006-08-09 22:07   ` Wendy Cheng
  2006-08-09 22:41     ` Trond Myklebust
  2006-08-09 23:57     ` Trond Myklebust
  0 siblings, 2 replies; 9+ messages in thread
From: Wendy Cheng @ 2006-08-09 22:07 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List

Trond Myklebust wrote:

>On Wed, 2006-08-09 at 14:13 -0400, Wendy Cheng wrote:
>  
>
>>I was testing NLM failover patches this morning and found the command 
>>hangs. Look like nlm_traverse_files(), where it grabs nlm_file_mutex 
>>early in the call, will have a chance to call nlm_release_file() via 
>>nlmsvc_free_block() inside kref_put(). The nlm_release_file() wants 
>>nlm_file_mutex too - this would generate a deadlock as the following:
>>
>>dhcp59-234 kernel: Call Trace:
>>[<c02dd749>] __mutex_lock_slowpath+0x4c/0x7e
>>[<c02dd78a>] .text.lock.mutex+0xf/0x14
>>[<f8afeacd>] nlm_release_file+0x2b/0xdf [lockd]
>>[<f8afda90>] nlmsvc_free_block+0x8c/0x9d [lockd]
>>[<f8afda04>] nlmsvc_free_block+0x0/0x9d [lockd]
>>[<c01be98d>] kref_put+0x4e/0x58
>>[<f8afd175>] nlmsvc_traverse_blocks+0xaf/0xc6 [lockd]
>>[<f8afe960>] nlm_traverse_files+0x108/0x1cd [lockd]
>>
>>The attached patch seems to fix the issue - it skips (defers) the file 
>>removal. Eventually either nlm_gc_hosts (some time later when client is 
>>unmonitored) or nlmsvc_traverse_files will finish the clean up.  Note 
>>that this is a 10-minutes  work - not sure its ramification at this 
>>moment. Take a look ?
>>
>>-- Wendy
>>
>>plain text document attachment (gfs_nlm_deadlock.patch)
>>--- linux-2/fs/lockd/svclock.c	2006-08-08 10:20:16.000000000 -0400
>>+++ linux/fs/lockd/svclock.c	2006-08-09 10:28:35.000000000 -0400
>>@@ -264,7 +264,9 @@ static void nlmsvc_free_block(struct kre
>> 
>> 	nlmsvc_freegrantargs(block->b_call);
>> 	nlm_release_call(block->b_call);
>>-	nlm_release_file(block->b_file);
>>+	down(&file->f_sema);
>>+	file->f_count--;
>>+	up(&file->f_sema);
>> 	kfree(block);
>> }
>>    
>>
>
>Vetoed. The block holds a reference to the file. It _must_ call
>nlm_release_file() in order to release that reference. It is in any case
>a bug to grab file->f_sema without holding a reference to the file.
>
>I suspect, rather, that the problem is due to nlmsvc_create_block()
>incrementing file->f_count without holding the nlm_file_mutex. If we
>convert it to an atomic_t instead, then that problem should be solved.
>  
>

Disagree ! :)

The whole thing is about deadlock, not about reference count. Look at 
the logic ... nlm_traverse_files grabs the nlm_file_mutex, then comes 
down to nlm_release_file where it tries to get nlm_file_mutex lock 
again. I'll need to run now - we can discuss this tomorrow morning. 
Please re-read the issue and we'll discuss later.

-- Wendy

>aside: Note also that we want to get rid of all that mark and sweep
>braindamage in nlm_traverse_*() with all the silly counting of f_lock,
>f_blocks, f_shares,.... and replace those variables with proper
>references to the struct nlm_file by the locks, blocks (is already the
>case?), and shares.
>
>Cheers,
>  Trond
>
>  
>

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-09 22:07   ` Wendy Cheng
@ 2006-08-09 22:41     ` Trond Myklebust
  2006-08-09 23:57     ` Trond Myklebust
  1 sibling, 0 replies; 9+ messages in thread
From: Trond Myklebust @ 2006-08-09 22:41 UTC (permalink / raw)
  To: Wendy Cheng; +Cc: Linux NFS Mailing List

On Wed, 2006-08-09 at 18:07 -0400, Wendy Cheng wrote:

> 
> Disagree ! :)
> 
> The whole thing is about deadlock, not about reference count. Look at 
> the logic ... nlm_traverse_files grabs the nlm_file_mutex, then comes 
> down to nlm_release_file where it tries to get nlm_file_mutex lock 
> again. I'll need to run now - we can discuss this tomorrow morning. 
> Please re-read the issue and we'll discuss later.

OK. The we need to fix the borken mark/sweep crap. Trying to circumvent
refcounting rules is unacceptable.

Trond


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-09 22:07   ` Wendy Cheng
  2006-08-09 22:41     ` Trond Myklebust
@ 2006-08-09 23:57     ` Trond Myklebust
  2006-08-10 15:24       ` Wendy Cheng
  1 sibling, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2006-08-09 23:57 UTC (permalink / raw)
  To: Wendy Cheng; +Cc: Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 617 bytes --]

On Wed, 2006-08-09 at 18:07 -0400, Wendy Cheng wrote:
> The whole thing is about deadlock, not about reference count. Look at 
> the logic ... nlm_traverse_files grabs the nlm_file_mutex, then comes 
> down to nlm_release_file where it tries to get nlm_file_mutex lock 
> again. I'll need to run now - we can discuss this tomorrow morning. 
> Please re-read the issue and we'll discuss later.

Here you go. Assuming this compiles, it ought to prevent the recursive
call to mutex_lock(&nlm_file_mutex), _and_ speed up nlm_release_file()
in the case where we know we're not going to free up the file.

Cheers,
  Trond


[-- Attachment #2: linux-2.6.18-009-make_nlm_file_f_count_atomic.dif --]
[-- Type: text/plain, Size: 4152 bytes --]

LOCKD: Make nlm_file->f_count an atomic

From: Trond Myklebust <Trond.Myklebust@netapp.com>

This allows us to grab the mutex in nlm_release_file only if we suspect
that the file needs to be freed.

The latter fixes a deadlock in which nlm_traverse_files() ends up calling
mutex_lock(&nlm_file_mutex) recursively (the second time being by means of
the call to nlm_release_file() in nlmsvc_free_block()).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/lockd/svclock.c          |    2 +-
 fs/lockd/svcsubs.c          |   26 +++++++++++++++-----------
 include/linux/lockd/lockd.h |    2 +-
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index c9d4197..57dcb2d 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -207,7 +207,7 @@ nlmsvc_create_block(struct svc_rqst *rqs
 	block->b_daemon = rqstp->rq_server;
 	block->b_host   = host;
 	block->b_file   = file;
-	file->f_count++;
+	atomic_inc(&file->f_count);
 
 	/* Add to file's list of blocks */
 	block->b_fnext  = file->f_blocks;
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index 2a4df9b..dfca9e7 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -94,8 +94,10 @@ nlm_lookup_file(struct svc_rqst *rqstp, 
 	mutex_lock(&nlm_file_mutex);
 
 	for (file = nlm_files[hash]; file; file = file->f_next)
-		if (!nfs_compare_fh(&file->f_handle, f))
+		if (!nfs_compare_fh(&file->f_handle, f)) {
+			atomic_inc(&file->f_count);
 			goto found;
+		}
 
 	nlm_debug_print_fh("creating file for", f);
 
@@ -108,6 +110,7 @@ nlm_lookup_file(struct svc_rqst *rqstp, 
 	memcpy(&file->f_handle, f, sizeof(struct nfs_fh));
 	file->f_hash = hash;
 	init_MUTEX(&file->f_sema);
+	atomic_set(&file->f_count, 1);
 
 	/* Open the file. Note that this must not sleep for too long, else
 	 * we would lock up lockd:-) So no NFS re-exports, folks.
@@ -124,9 +127,8 @@ nlm_lookup_file(struct svc_rqst *rqstp, 
 	nlm_files[hash] = file;
 
 found:
-	dprintk("lockd: found file %p (count %d)\n", file, file->f_count);
+	dprintk("lockd: found file %p (count %d)\n", file, atomic_read(&file->f_count));
 	*result = file;
-	file->f_count++;
 	nfserr = 0;
 
 out_unlock:
@@ -221,7 +223,7 @@ nlm_inspect_file(struct nlm_host *host, 
 {
 	if (action == NLM_ACT_CHECK) {
 		/* Fast path for mark and sweep garbage collection */
-		if (file->f_count || file->f_blocks || file->f_shares)
+		if (atomic_read(&file->f_count) || file->f_blocks || file->f_shares)
 			return 1;
 	} else {
 		nlmsvc_traverse_blocks(host, file, action);
@@ -252,7 +254,7 @@ nlm_traverse_files(struct nlm_host *host
 
 			/* No more references to this file. Let go of it. */
 			if (!file->f_blocks && !file->f_locks
-			 && !file->f_shares && !file->f_count) {
+			 && !file->f_shares && !atomic_read(&file->f_count)) {
 				*fp = file->f_next;
 				nlmsvc_ops->fclose(file->f_file);
 				kfree(file);
@@ -278,18 +280,20 @@ void
 nlm_release_file(struct nlm_file *file)
 {
 	dprintk("lockd: nlm_release_file(%p, ct = %d)\n",
-				file, file->f_count);
-
-	/* Lock file table */
-	mutex_lock(&nlm_file_mutex);
+				file, atomic_read(&file->f_count));
 
 	/* If there are no more locks etc, delete the file */
-	if(--file->f_count == 0) {
+	if (atomic_dec_and_test(&file->f_count)) {
+		/*
+		 * Note! nlm_inspect_file() will check file->f_count
+		 * to see if we raced while taking nlm_file_mutex
+		 */
+		mutex_lock(&nlm_file_mutex);
 		if(!nlm_inspect_file(NULL, file, NLM_ACT_CHECK))
 			nlm_delete_file(file);
+		mutex_unlock(&nlm_file_mutex);
 	}
 
-	mutex_unlock(&nlm_file_mutex);
 }
 
 /*
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 0d92c46..59b63a1 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -102,7 +102,7 @@ struct nlm_file {
 	struct nlm_share *	f_shares;	/* DOS shares */
 	struct nlm_block *	f_blocks;	/* blocked locks */
 	unsigned int		f_locks;	/* guesstimate # of locks */
-	unsigned int		f_count;	/* reference count */
+	atomic_t		f_count;	/* reference count */
 	struct semaphore	f_sema;		/* avoid concurrent access */
 	int		       	f_hash;		/* hash of f_handle */
 };

[-- Attachment #3: Type: text/plain, Size: 373 bytes --]

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-09 23:57     ` Trond Myklebust
@ 2006-08-10 15:24       ` Wendy Cheng
  2006-08-10 15:40         ` Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Wendy Cheng @ 2006-08-10 15:24 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List

On Wed, 2006-08-09 at 19:57 -0400, Trond Myklebust wrote:

> Here you go. Assuming this compiles, it ought to prevent the recursive
> call to mutex_lock(&nlm_file_mutex), _and_ speed up nlm_release_file()
> in the case where we know we're not going to free up the file.

Look like this issue can't be patched by any 10-minutes code (that
applies to both of us). 

We simply can't call nlm_release_file() as long as it requires
nlm_file_mutex (no matter where you move the locking line) within
nlm_traverse_files() code path. I'm ok with atomic variable though.

I always think nlm_file_mutex is used to protect nlm_files *list* (which
is a list of individual nlm_file) and file->f_sema is used to protect
the *individual* nlm_file. But I can be wrong (and will go to read the
code more).

In short, you patch will deadlock again.

-- Wendy

> 
> Cheers,
>   Trond
> 
> plain text document attachment (linux-2.6.18-009-
> make_nlm_file_f_count_atomic.dif)
> LOCKD: Make nlm_file->f_count an atomic
> 
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> This allows us to grab the mutex in nlm_release_file only if we suspect
> that the file needs to be freed.
> 
> The latter fixes a deadlock in which nlm_traverse_files() ends up calling
> mutex_lock(&nlm_file_mutex) recursively (the second time being by means of
> the call to nlm_release_file() in nlmsvc_free_block()).
> 
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  fs/lockd/svclock.c          |    2 +-
>  fs/lockd/svcsubs.c          |   26 +++++++++++++++-----------
>  include/linux/lockd/lockd.h |    2 +-
>  3 files changed, 17 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> index c9d4197..57dcb2d 100644
> --- a/fs/lockd/svclock.c
> +++ b/fs/lockd/svclock.c
> @@ -207,7 +207,7 @@ nlmsvc_create_block(struct svc_rqst *rqs
>  	block->b_daemon = rqstp->rq_server;
>  	block->b_host   = host;
>  	block->b_file   = file;
> -	file->f_count++;
> +	atomic_inc(&file->f_count);
>  
>  	/* Add to file's list of blocks */
>  	block->b_fnext  = file->f_blocks;
> diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
> index 2a4df9b..dfca9e7 100644
> --- a/fs/lockd/svcsubs.c
> +++ b/fs/lockd/svcsubs.c
> @@ -94,8 +94,10 @@ nlm_lookup_file(struct svc_rqst *rqstp, 
>  	mutex_lock(&nlm_file_mutex);
>  
>  	for (file = nlm_files[hash]; file; file = file->f_next)
> -		if (!nfs_compare_fh(&file->f_handle, f))
> +		if (!nfs_compare_fh(&file->f_handle, f)) {
> +			atomic_inc(&file->f_count);
>  			goto found;
> +		}
>  
>  	nlm_debug_print_fh("creating file for", f);
>  
> @@ -108,6 +110,7 @@ nlm_lookup_file(struct svc_rqst *rqstp, 
>  	memcpy(&file->f_handle, f, sizeof(struct nfs_fh));
>  	file->f_hash = hash;
>  	init_MUTEX(&file->f_sema);
> +	atomic_set(&file->f_count, 1);
>  
>  	/* Open the file. Note that this must not sleep for too long, else
>  	 * we would lock up lockd:-) So no NFS re-exports, folks.
> @@ -124,9 +127,8 @@ nlm_lookup_file(struct svc_rqst *rqstp, 
>  	nlm_files[hash] = file;
>  
>  found:
> -	dprintk("lockd: found file %p (count %d)\n", file, file->f_count);
> +	dprintk("lockd: found file %p (count %d)\n", file, atomic_read(&file->f_count));
>  	*result = file;
> -	file->f_count++;
>  	nfserr = 0;
>  
>  out_unlock:
> @@ -221,7 +223,7 @@ nlm_inspect_file(struct nlm_host *host, 
>  {
>  	if (action == NLM_ACT_CHECK) {
>  		/* Fast path for mark and sweep garbage collection */
> -		if (file->f_count || file->f_blocks || file->f_shares)
> +		if (atomic_read(&file->f_count) || file->f_blocks || file->f_shares)
>  			return 1;
>  	} else {
>  		nlmsvc_traverse_blocks(host, file, action);
> @@ -252,7 +254,7 @@ nlm_traverse_files(struct nlm_host *host
>  
>  			/* No more references to this file. Let go of it. */
>  			if (!file->f_blocks && !file->f_locks
> -			 && !file->f_shares && !file->f_count) {
> +			 && !file->f_shares && !atomic_read(&file->f_count)) {
>  				*fp = file->f_next;
>  				nlmsvc_ops->fclose(file->f_file);
>  				kfree(file);
> @@ -278,18 +280,20 @@ void
>  nlm_release_file(struct nlm_file *file)
>  {
>  	dprintk("lockd: nlm_release_file(%p, ct = %d)\n",
> -				file, file->f_count);
> -
> -	/* Lock file table */
> -	mutex_lock(&nlm_file_mutex);
> +				file, atomic_read(&file->f_count));
>  
>  	/* If there are no more locks etc, delete the file */
> -	if(--file->f_count == 0) {
> +	if (atomic_dec_and_test(&file->f_count)) {
> +		/*
> +		 * Note! nlm_inspect_file() will check file->f_count
> +		 * to see if we raced while taking nlm_file_mutex
> +		 */
> +		mutex_lock(&nlm_file_mutex);
>  		if(!nlm_inspect_file(NULL, file, NLM_ACT_CHECK))
>  			nlm_delete_file(file);
> +		mutex_unlock(&nlm_file_mutex);
>  	}
>  
> -	mutex_unlock(&nlm_file_mutex);
>  }
>  
>  /*
> diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
> index 0d92c46..59b63a1 100644
> --- a/include/linux/lockd/lockd.h
> +++ b/include/linux/lockd/lockd.h
> @@ -102,7 +102,7 @@ struct nlm_file {
>  	struct nlm_share *	f_shares;	/* DOS shares */
>  	struct nlm_block *	f_blocks;	/* blocked locks */
>  	unsigned int		f_locks;	/* guesstimate # of locks */
> -	unsigned int		f_count;	/* reference count */
> +	atomic_t		f_count;	/* reference count */
>  	struct semaphore	f_sema;		/* avoid concurrent access */
>  	int		       	f_hash;		/* hash of f_handle */
>  };
> -------------------------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-10 15:24       ` Wendy Cheng
@ 2006-08-10 15:40         ` Trond Myklebust
  2006-08-10 16:05           ` Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2006-08-10 15:40 UTC (permalink / raw)
  To: Wendy Cheng; +Cc: Linux NFS Mailing List

On Thu, 2006-08-10 at 11:24 -0400, Wendy Cheng wrote:

> We simply can't call nlm_release_file() as long as it requires
> nlm_file_mutex (no matter where you move the locking line) within
> nlm_traverse_files() code path. I'm ok with atomic variable though.

As long as we can guarantee that file->f_count doesn't drop to zero (and
we can make that guarantee in nlm_traverse_file()), then the code is
correct.

  Trond


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] fix recursive nlm_file_mutex deadlock
  2006-08-10 15:40         ` Trond Myklebust
@ 2006-08-10 16:05           ` Trond Myklebust
  0 siblings, 0 replies; 9+ messages in thread
From: Trond Myklebust @ 2006-08-10 16:05 UTC (permalink / raw)
  To: Wendy Cheng; +Cc: Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 754 bytes --]

On Thu, 2006-08-10 at 11:40 -0400, Trond Myklebust wrote:
> On Thu, 2006-08-10 at 11:24 -0400, Wendy Cheng wrote:
> 
> > We simply can't call nlm_release_file() as long as it requires
> > nlm_file_mutex (no matter where you move the locking line) within
> > nlm_traverse_files() code path. I'm ok with atomic variable though.
> 
> As long as we can guarantee that file->f_count doesn't drop to zero (and
> we can make that guarantee in nlm_traverse_file()), then the code is
> correct.

Doh! There is no reason whatsoever to be holding nlm_file_mutex across
the call to nlm_inspect_file. As you said, the only thing it does is
protect the list, and we don't care about that.

The simplest fix is just to do something like the following.

Cheers,
  Trond

[-- Attachment #2: linux-2.6.18-009-fix_nlm_traverse_files_deadlock.dif --]
[-- Type: message/rfc822, Size: 1653 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: No Subject
Date: Thu, 10 Aug 2006 12:01:51 -0400
Message-ID: <1155225711.10547.49.camel@localhost>

nlm_traverse_files() is not allowed to hold the nlm_file_mutex while calling
nlm_inspect file, since it may end up calling nlm_release_file() when
releaseing the blocks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/lockd/svcsubs.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index 2a4df9b..01b4db9 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -237,19 +237,22 @@ static int
 nlm_traverse_files(struct nlm_host *host, int action)
 {
 	struct nlm_file	*file, **fp;
-	int		i;
+	int i, ret = 0;
 
 	mutex_lock(&nlm_file_mutex);
 	for (i = 0; i < FILE_NRHASH; i++) {
 		fp = nlm_files + i;
 		while ((file = *fp) != NULL) {
+			file->f_count++;
+			mutex_unlock(&nlm_file_mutex);
+
 			/* Traverse locks, blocks and shares of this file
 			 * and update file->f_locks count */
-			if (nlm_inspect_file(host, file, action)) {
-				mutex_unlock(&nlm_file_mutex);
-				return 1;
-			}
+			if (nlm_inspect_file(host, file, action))
+				ret = 1;
 
+			mutex_lock(&nlm_file_mutex);
+			file->f_count--;
 			/* No more references to this file. Let go of it. */
 			if (!file->f_blocks && !file->f_locks
 			 && !file->f_shares && !file->f_count) {
@@ -262,7 +265,7 @@ nlm_traverse_files(struct nlm_host *host
 		}
 	}
 	mutex_unlock(&nlm_file_mutex);
-	return 0;
+	return ret;
 }
 
 /*

[-- Attachment #3: Type: text/plain, Size: 373 bytes --]

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-08-10 16:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-08-09 18:13 [PATCH] fix recursive nlm_file_mutex deadlock Wendy Cheng
2006-08-09 18:32 ` Wendy Cheng
2006-08-09 21:45 ` Trond Myklebust
2006-08-09 22:07   ` Wendy Cheng
2006-08-09 22:41     ` Trond Myklebust
2006-08-09 23:57     ` Trond Myklebust
2006-08-10 15:24       ` Wendy Cheng
2006-08-10 15:40         ` Trond Myklebust
2006-08-10 16:05           ` Trond Myklebust

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.