From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Wendy Cheng <wcheng@redhat.com>
Cc: Linux NFS Mailing List <nfs@lists.sourceforge.net>
Subject: Re: [PATCH] fix recursive nlm_file_mutex deadlock
Date: Wed, 09 Aug 2006 17:45:36 -0400
Message-ID: <1155159937.15624.22.camel@localhost>
In-Reply-To: <44DA25D3.3010003@redhat.com>

On Wed, 2006-08-09 at 14:13 -0400, Wendy Cheng wrote:
> I was testing the NLM failover patches this morning and found that the
> command hangs. It looks like nlm_traverse_files(), which grabs
> nlm_file_mutex early in the call, can end up calling nlm_release_file()
> via nlmsvc_free_block() inside kref_put(). nlm_release_file() wants
> nlm_file_mutex too, so this deadlocks as follows:
>
> dhcp59-234 kernel: Call Trace:
> [<c02dd749>] __mutex_lock_slowpath+0x4c/0x7e
> [<c02dd78a>] .text.lock.mutex+0xf/0x14
> [<f8afeacd>] nlm_release_file+0x2b/0xdf [lockd]
> [<f8afda90>] nlmsvc_free_block+0x8c/0x9d [lockd]
> [<f8afda04>] nlmsvc_free_block+0x0/0x9d [lockd]
> [<c01be98d>] kref_put+0x4e/0x58
> [<f8afd175>] nlmsvc_traverse_blocks+0xaf/0xc6 [lockd]
> [<f8afe960>] nlm_traverse_files+0x108/0x1cd [lockd]
>
> The attached patch seems to fix the issue: it skips (defers) the file
> removal. Eventually either nlm_gc_hosts (some time later, when the client
> is unmonitored) or nlmsvc_traverse_files will finish the cleanup. Note
> that this is ten minutes' work, so I am not sure of its ramifications at
> this moment. Take a look?
>
> -- Wendy
>
> plain text document attachment (gfs_nlm_deadlock.patch)
> --- linux-2/fs/lockd/svclock.c 2006-08-08 10:20:16.000000000 -0400
> +++ linux/fs/lockd/svclock.c 2006-08-09 10:28:35.000000000 -0400
> @@ -264,7 +264,9 @@ static void nlmsvc_free_block(struct kre
>
> nlmsvc_freegrantargs(block->b_call);
> nlm_release_call(block->b_call);
> - nlm_release_file(block->b_file);
> + down(&file->f_sema);
> + file->f_count--;
> + up(&file->f_sema);
> kfree(block);
> }

Vetoed. The block holds a reference to the file. It _must_ call
nlm_release_file() in order to release that reference. It is in any case
a bug to grab file->f_sema without holding a reference to the file.

I suspect, rather, that the problem is due to nlmsvc_create_block()
incrementing file->f_count without holding the nlm_file_mutex. If we
convert it to an atomic_t instead, then that problem should be solved.
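
To illustrate the atomic_t idea, here is a minimal userspace sketch using
C11 atomics. It is not lockd code: struct file_ref and the file_ref_*()
helpers are hypothetical names invented for this example, and only the
reference count is modelled. The point it tries to show is that taking an
extra reference on an object you already hold needs no mutex once the
counter itself is atomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nlm_file: only the reference count
 * is modelled here, none of the real lockd fields. */
struct file_ref {
	atomic_int count;	/* plays the role of an atomic f_count */
};

/* Allocate an object whose creator owns the first reference. */
static struct file_ref *file_ref_new(void)
{
	struct file_ref *f = malloc(sizeof(*f));

	if (f)
		atomic_init(&f->count, 1);
	return f;
}

/* Take an extra reference. The caller must already hold one, so the
 * count is never zero on entry. No mutex is needed. */
static void file_ref_get(struct file_ref *f)
{
	int old = atomic_fetch_add(&f->count, 1);

	assert(old > 0);
	(void)old;	/* silence unused warning when NDEBUG is set */
}

/* Drop a reference. Returns true if it was the last one, in which
 * case the object has been freed and must not be touched again. */
static bool file_ref_put(struct file_ref *f)
{
	if (atomic_fetch_sub(&f->count, 1) == 1) {
		free(f);
		return true;
	}
	return false;
}
```

In this model a block would call file_ref_get() when it is created and
file_ref_put() when it is freed. The kernel counterparts would be
atomic_inc() and atomic_dec_and_test() on an atomic_t. Whether the final
put still has to take nlm_file_mutex to unhash the file is a separate
question that this sketch does not address.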

Aside: note also that we want to get rid of all that mark and sweep
braindamage in nlm_traverse_*(), with all the silly counting of f_lock,
f_blocks, f_shares, ..., and replace those variables with proper
references to the struct nlm_file held by the locks, the blocks (is that
already the case?), and the shares.

Cheers,
  Trond