* [PATCH] Fix NLM reference count panic
@ 2008-01-04 22:48 Wendy Cheng
2008-01-04 22:58 ` Wendy Cheng
2008-01-04 23:24 ` J. Bruce Fields
0 siblings, 2 replies; 7+ messages in thread
From: Wendy Cheng @ 2008-01-04 22:48 UTC (permalink / raw)
To: NFS list
[-- Attachment #1: Type: text/plain, Size: 483 bytes --]
This is not exciting as Jeff Garzik's new NFSv4 server but a painful
chore needs to be done. Will start to push the patches for NLM lock
failover code next week. However, instead of doing large amount of
patches all at once, would like to do it one by one, if they can be
functionally separated. This is to avoid the tedious re-base works I
have been doing for the past, (oh no), two years ??!!!
First one should be simple enough without too much explanation ....
-- Wendy
[-- Attachment #2: 001-nlm_f_count.patch --]
[-- Type: text/x-patch, Size: 955 bytes --]
This fixes the incorrect fclose call inside nlm_traverse_files() where
a posix lock could still be held by NFS client. Problem was found in a
kernel panic inside locks_remove_flock() (fs/locks.c:2034) as part of
the fclose call due to NFS-NLM locks still hanging on inode->i_flock list.
Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt
svcsubs.c | 3 +--
1 files changed, 1 insertion(+), 2 deletions(-)
--- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400
+++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400
@@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host
mutex_lock(&nlm_file_mutex);
file->f_count--;
/* No more references to this file. Let go of it. */
- if (list_empty(&file->f_blocks) && !file->f_locks
- && !file->f_shares && !file->f_count) {
+ if (!nlm_file_inuse(file)) {
hlist_del(&file->f_list);
nlmsvc_ops->fclose(file->f_file);
kfree(file);
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] Fix NLM reference count panic 2008-01-04 22:48 [PATCH] Fix NLM reference count panic Wendy Cheng @ 2008-01-04 22:58 ` Wendy Cheng 2008-01-04 23:24 ` J. Bruce Fields 1 sibling, 0 replies; 7+ messages in thread From: Wendy Cheng @ 2008-01-04 22:58 UTC (permalink / raw) To: NFS list Forgot the signed-off line .. here it is ... Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> > > ------------------------------------------------------------------------ > > This fixes the incorrect fclose call inside nlm_traverse_files() where > a posix lock could still be held by NFS client. Problem was found in a > kernel panic inside locks_remove_flock() (fs/locks.c:2034) as part of > the fclose call due to NFS-NLM locks still hanging on inode->i_flock list. > > Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt > > svcsubs.c | 3 +-- > 1 files changed, 1 insertion(+), 2 deletions(-) > > --- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400 > +++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400 > @@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host > mutex_lock(&nlm_file_mutex); > file->f_count--; > /* No more references to this file. Let go of it. */ > - if (list_empty(&file->f_blocks) && !file->f_locks > - && !file->f_shares && !file->f_count) { > + if (!nlm_file_inuse(file)) { > hlist_del(&file->f_list); > nlmsvc_ops->fclose(file->f_file); > kfree(file); > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix NLM reference count panic 2008-01-04 22:48 [PATCH] Fix NLM reference count panic Wendy Cheng 2008-01-04 22:58 ` Wendy Cheng @ 2008-01-04 23:24 ` J. Bruce Fields 2008-01-05 5:03 ` Wendy Cheng 1 sibling, 1 reply; 7+ messages in thread From: J. Bruce Fields @ 2008-01-04 23:24 UTC (permalink / raw) To: Wendy Cheng; +Cc: NFS list On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote: > This fixes the incorrect fclose call inside nlm_traverse_files() where > a posix lock could still be held by NFS client. Problem was found in a > kernel panic inside locks_remove_flock() (fs/locks.c:2034) as part of > the fclose call due to NFS-NLM locks still hanging on inode->i_flock list. > > Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt Next time it'd be best just to include that into the referred-to text into the message. > svcsubs.c | 3 +-- > 1 files changed, 1 insertion(+), 2 deletions(-) > > --- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400 > +++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400 > @@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host > mutex_lock(&nlm_file_mutex); > file->f_count--; > /* No more references to this file. Let go of it. */ > - if (list_empty(&file->f_blocks) && !file->f_locks > - && !file->f_shares && !file->f_count) { > + if (!nlm_file_inuse(file)) { > hlist_del(&file->f_list); > nlmsvc_ops->fclose(file->f_file); > kfree(file); This just replaces the file->f_locks check by a search of the inode's lock list. What confuses me here is that the nlm_inspect_file() call just above already did that search, and set file->f_locks accordingly. The only difference is that now we've acquired the nlm_file_mutex. I don't understand yet how that makes a difference. --b. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix NLM reference count panic 2008-01-04 23:24 ` J. Bruce Fields @ 2008-01-05 5:03 ` Wendy Cheng 2008-01-05 6:05 ` J. Bruce Fields 0 siblings, 1 reply; 7+ messages in thread From: Wendy Cheng @ 2008-01-05 5:03 UTC (permalink / raw) To: J. Bruce Fields; +Cc: NFS list J. Bruce Fields wrote: >On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote: > > >>This fixes the incorrect fclose call inside nlm_traverse_files() where >>a posix lock could still be held by NFS client. Problem was found in a >>kernel panic inside locks_remove_flock() (fs/locks.c:2034) as part of >>the fclose call due to NFS-NLM locks still hanging on inode->i_flock list. >> >>Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt >> >> > >Next time it'd be best just to include that into the referred-to text >into the message. > > Sorry, forgot to remove it - it was for internal review purpose. I was a little bit careless with this patch. > > >> svcsubs.c | 3 +-- >> 1 files changed, 1 insertion(+), 2 deletions(-) >> >>--- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400 >>+++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400 >>@@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host >> mutex_lock(&nlm_file_mutex); >> file->f_count--; >> /* No more references to this file. Let go of it. */ >>- if (list_empty(&file->f_blocks) && !file->f_locks >>- && !file->f_shares && !file->f_count) { >>+ if (!nlm_file_inuse(file)) { >> hlist_del(&file->f_list); >> nlmsvc_ops->fclose(file->f_file); >> kfree(file); >> >> > >This just replaces the file->f_locks check by a search of the inode's >lock list. > >What confuses me here is that the nlm_inspect_file() call just above >already did that search, and set file->f_locks accordingly. The only >difference is that now we've acquired the nlm_file_mutex. I don't >understand yet how that makes a difference. > > > You're right. I got the patch sequence wrong. It will cause panic only when we selectively unlock nlm locks under my "unlock" patch as the following. See how it returns without doing nlm_traverse_locks() below ... Let's combine this patch into the big unlock patch so we won't have any confusion. The unlock patch will submitted on Monday after this weekend's sanity check test run. In short, I withdraw this patch... Wendy nlm_inspect_file(struct nlm_host *host, struct nlm_file *file, nlm_host_match_fn_t match) { + /* Cluster failover has timing constraints. There is a slight + * performance hit if nlm_fo_unlock_match() is implemented as + * a match fn (since it will be invoked for each block, share, + * and lock later when the lists are traversed). Instead, we + * add path-matching logic into the following unlikely clause. + * If matches, the dummy nlmsvc_fo_match will always return + * true. + */ + dprintk("nlm_inspect_files: file=%p\n", file); + if (unlikely(match == nlmsvc_fo_match)) { + if (!nlmsvc_fo_unlock_match((void *)host, file)) + return 0; + fo_printk("nlm_fo find lock file entry (0x%p)\n", file); + } + nlmsvc_traverse_blocks(host, file, match); nlmsvc_traverse_shares(host, file, match); return nlm_traverse_locks(host, file, match); @@ -369,3 +451,35 @@ nlmsvc_invalidate_all(void) */ nlm_traverse_files(NULL, nlmsvc_is_client); } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix NLM reference count panic 2008-01-05 5:03 ` Wendy Cheng @ 2008-01-05 6:05 ` J. Bruce Fields 2008-01-05 17:46 ` Wendy Cheng 0 siblings, 1 reply; 7+ messages in thread From: J. Bruce Fields @ 2008-01-05 6:05 UTC (permalink / raw) To: Wendy Cheng; +Cc: NFS list On Sat, Jan 05, 2008 at 12:03:07AM -0500, Wendy Cheng wrote: > J. Bruce Fields wrote: > >> On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote: >> This just replaces the file->f_locks check by a search of the inode's >> lock list. >> >> What confuses me here is that the nlm_inspect_file() call just above >> already did that search, and set file->f_locks accordingly. The only >> difference is that now we've acquired the nlm_file_mutex. I don't >> understand yet how that makes a difference. >> >> > You're right. I got the patch sequence wrong. It will cause panic only when > we selectively unlock nlm locks under my "unlock" patch as the following. > See how it returns without doing nlm_traverse_locks() below ... Let's > combine this patch into the big unlock patch so we won't have any > confusion. The unlock patch will submitted on Monday after this weekend's > sanity check test run. In short, I withdraw this patch... Wendy OK. I'll admit to still being a little confused--but with luck it'll all make sense when I see the whole thing. Have a good weekend! --b. > nlm_inspect_file(struct nlm_host *host, struct nlm_file *file, > nlm_host_match_fn_t match) > { > + /* Cluster failover has timing constraints. There is a slight > + * performance hit if nlm_fo_unlock_match() is implemented as > + * a match fn (since it will be invoked for each block, share, > + * and lock later when the lists are traversed). Instead, we > + * add path-matching logic into the following unlikely clause. > + * If matches, the dummy nlmsvc_fo_match will always return > + * true. > + */ > + dprintk("nlm_inspect_files: file=%p\n", file); > + if (unlikely(match == nlmsvc_fo_match)) { > + if (!nlmsvc_fo_unlock_match((void *)host, file)) > + return 0; > + fo_printk("nlm_fo find lock file entry (0x%p)\n", file); > + } > + > nlmsvc_traverse_blocks(host, file, match); > nlmsvc_traverse_shares(host, file, match); > return nlm_traverse_locks(host, file, match); > @@ -369,3 +451,35 @@ nlmsvc_invalidate_all(void) > */ > nlm_traverse_files(NULL, nlmsvc_is_client); > } > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix NLM reference count panic 2008-01-05 6:05 ` J. Bruce Fields @ 2008-01-05 17:46 ` Wendy Cheng 2008-01-05 17:36 ` J. Bruce Fields 0 siblings, 1 reply; 7+ messages in thread From: Wendy Cheng @ 2008-01-05 17:46 UTC (permalink / raw) To: J. Bruce Fields; +Cc: NFS list J. Bruce Fields wrote: >On Sat, Jan 05, 2008 at 12:03:07AM -0500, Wendy Cheng wrote: > > >>J. Bruce Fields wrote: >> >> >> >>>On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote: >>>This just replaces the file->f_locks check by a search of the inode's >>>lock list. >>> >>>What confuses me here is that the nlm_inspect_file() call just above >>>already did that search, and set file->f_locks accordingly. The only >>>difference is that now we've acquired the nlm_file_mutex. I don't >>>understand yet how that makes a difference. >>> >>> >>> >>> >>You're right. I got the patch sequence wrong. It will cause panic only when >>we selectively unlock nlm locks under my "unlock" patch as the following. >>See how it returns without doing nlm_traverse_locks() below ... Let's >>combine this patch into the big unlock patch so we won't have any >>confusion. The unlock patch will submitted on Monday after this weekend's >>sanity check test run. In short, I withdraw this patch... Wendy >> >> > >OK. I'll admit to still being a little confused--but with luck it'll >all make sense when I see the whole thing. Have a good weekend! > > > Actually the *existing* logic (implying without my changes) is awkward. It uses file->f_locks to carry the result of unlock operation. Leave the non-zero f_locks value hanging there if unlock fails, It only resets it back to zero when next time nlm_inspect_file is called. Code like this is a good place to generate subtle race conditions that could result with file close failure. Just a complaint - I don't have a solid proof that it has caused troubles though. Anyway, will combine this small patch into the big unlock patch. You have a good weekend too ! -- Wendy ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix NLM reference count panic 2008-01-05 17:46 ` Wendy Cheng @ 2008-01-05 17:36 ` J. Bruce Fields 0 siblings, 0 replies; 7+ messages in thread From: J. Bruce Fields @ 2008-01-05 17:36 UTC (permalink / raw) To: Wendy Cheng; +Cc: NFS list On Sat, Jan 05, 2008 at 12:46:21PM -0500, Wendy Cheng wrote: > Actually the *existing* logic (implying without my changes) is awkward. It > uses file->f_locks to carry the result of unlock operation. Leave the > non-zero f_locks value hanging there if unlock fails, It only resets it > back to zero when next time nlm_inspect_file is called. Code like this is a > good place to generate subtle race conditions that could result with file > close failure. Just a complaint - I don't have a solid proof that it has > caused troubles though. Absolutely agreed. It looks very fragile to me. --b. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-01-05 17:36 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-01-04 22:48 [PATCH] Fix NLM reference count panic Wendy Cheng 2008-01-04 22:58 ` Wendy Cheng 2008-01-04 23:24 ` J. Bruce Fields 2008-01-05 5:03 ` Wendy Cheng 2008-01-05 6:05 ` J. Bruce Fields 2008-01-05 17:46 ` Wendy Cheng 2008-01-05 17:36 ` J. Bruce Fields
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.