* [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Greg Banks @ 2004-02-04 7:12 UTC
To: Trond Myklebust; +Cc: Linux NFS Mailing List

G'day,

This patch fixes a bug where the forced killing of pending asynchronous
unlink rpc_tasks during unmount leaks inode reference counts for the
parent of the silly-renamed file and all its ancestor directories,
resulting in the message

    VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...

and a few seconds later an oops with a stack trace ending in
prune_dcache -> nfs_dentry_iput -> iput.

This is probably also the bug discussed last September on the autofs
mailing list. The patch posted by Olaf Hering then has no effect
at all, but it did put me on the right track (thanks Olaf).

The first part makes sure that dput() will unhash and kill dentries
and their parents if called while the unmount is underway.

The second part moves the dput() call from the tk_exit callback of
the async unlink rpc_task to the tk_release callback so that it
will be called if the rpc_task is killed by rpc_killall_tasks()
instead of completing normally.
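
A rough user-space sketch of what the first part is meant to achieve may help when reading the dir.c hunk. This is not kernel code: the `model_*` structures, `MODEL_MS_ACTIVE` and both functions are hypothetical stand-ins, and only the new "superblock no longer active" check is modelled (the other conditions in nfs_dentry_delete() are left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MS_ACTIVE in the superblock's s_flags. */
#define MODEL_MS_ACTIVE 0x1u

struct model_sb {
	unsigned int s_flags;
};

struct model_dentry {
	struct model_sb *d_sb;
	struct model_dentry *d_parent;	/* NULL for the root in this model */
	int d_count;			/* references held on this dentry */
	bool hashed;			/* still present in the dentry cache */
};

/* Models the added check: ask for unhashing once umount is underway. */
int model_dentry_delete(const struct model_dentry *dentry)
{
	if (!(dentry->d_sb->s_flags & MODEL_MS_ACTIVE))
		return 1;
	return 0;
}

/*
 * Models the final-reference path of dput(): a dentry that is unhashed
 * and killed also drops the reference it held on its parent, so the
 * cleanup walks up the ancestor chain.
 */
void model_dput(struct model_dentry *d)
{
	while (d != NULL) {
		assert(d->d_count > 0);
		if (--d->d_count > 0)
			return;		/* someone else still uses it */
		if (!model_dentry_delete(d))
			return;		/* stays cached, keeps its parent pinned */
		d->hashed = false;
		d = d->d_parent;	/* release the parent reference too */
	}
}
```

With the superblock still active, the last model_dput() leaves the directory cached and its parent pinned; once the active flag is cleared, the same call unhashes the directory and every ancestor it was holding.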
===========================================================================
linux/linux/fs/nfs/dir.c
===========================================================================
--- /usr/tmp/TmpDir.27555-0/linux/linux/fs/nfs/dir.c_1.32 Wed Feb 4 17:57:23 2004
+++ linux/linux/fs/nfs/dir.c Wed Feb 4 17:52:20 2004
@@ -551,6 +551,11 @@ static int nfs_dentry_delete(struct dent
 		/* Unhash it, so that ->d_iput() would be called */
 		return 1;
 	}
+	if (!(dentry->d_sb->s_flags & MS_ACTIVE)) {
+		/* Unhash it, so that ancestors of killed async unlink
+		 * files will be cleaned up during umount */
+		return 1;
+	}
 	return 0;
 
 }
===========================================================================
linux/linux/fs/nfs/unlink.c
===========================================================================
--- /usr/tmp/TmpDir.27555-0/linux/linux/fs/nfs/unlink.c_1.6 Wed Feb 4 17:57:23 2004
+++ linux/linux/fs/nfs/unlink.c Wed Feb 4 17:56:57 2004
@@ -51,6 +51,7 @@ static void
 nfs_put_unlinkdata(struct nfs_unlinkdata *data)
 {
 	if (--data->count == 0) {
+		dput(data->dir);
 		nfs_detach_unlinkdata(data);
 		if (data->name.name != NULL)
 			kfree(data->name.name);
@@ -132,7 +133,6 @@ nfs_async_unlink_done(struct rpc_task *t
 	NFS_PROTO(dir_i)->unlink_done(dir, &task->tk_msg);
 	put_rpccred(data->cred);
 	data->cred = NULL;
-	dput(dir);
 }
 
 /**
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.
-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Olaf Kirch @ 2004-02-04 10:42 UTC
To: Greg Banks; +Cc: Trond Myklebust, Linux NFS Mailing List

Hi Greg,

On Wed, Feb 04, 2004 at 06:12:45PM +1100, Greg Banks wrote:
> This patch fixes a bug where the forced killing of pending asynchronous
> unlink rpc_tasks during unmount leaks inode reference counts for the
> parent of the silly-renamed file and all its ancestor directories,
> resulting in the message
>
> VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...

Finally! This bug was driving me nuts. I wasn't even able to reproduce
this under lab conditions :-/

> The first part makes sure that dput() will unhash and kill dentries
> and their parents if called while the unmount is underway.
>
> The second part moves the dput() call from the tk_exit callback of
> the async unlink rpc_task to the tk_release callback so that it
> will be called if the rpc_task is killed by rpc_killall_tasks()
> instead of completing normally.

Yes, that looks as if this could fix the problem.

Thanks again,
Olaf
--
Olaf Kirch     |  Stop wasting entropy - start using predictable
okir@suse.de   |  tempfile names today!
---------------+

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Greg Banks @ 2004-02-04 22:59 UTC
To: Olaf Kirch; +Cc: Trond Myklebust, Linux NFS Mailing List

Olaf Kirch wrote:
>
> > VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
>
> Finally! This bug was driving me nuts. I wasn't even able to reproduce
> this under lab conditions :-/

I stumbled on it by running the Connectathon test suite inside a shell
wrapper which did a mount/umount for every test. I also have a small
shellscript+C program which reproduces it reliably.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Olaf Kirch @ 2004-02-04 12:09 UTC
To: Greg Banks; +Cc: Trond Myklebust, Linux NFS Mailing List

Hi Greg,

I've been looking at your analysis a little closer, and am
trying to understand how the bug was triggered. Here's what
I think happened:

 -	the unlink code keeps a reference to the dentry of
	the parent directory, but not to the vfsmount

 -	this allows the umount to proceed, because there
	don't seem to be any more references to the mount

 -	rpc_shutdown_client calls rpc_killall_tasks, which
	terminates the async unlink task.

 -	rpciod is woken up and schedules the async task,
	calling __rpc_execute

 -	__rpc_execute notices the task is dead (no tk_action),
	leaves the loop and invokes task->tk_exit == nfs_async_unlink_done

 -	nfs_async_unlink_done calls dput() on the parent dentry,
	but the dentry is not unhashed.

Now looking at kill_super(), the sequence of calls there looks
like this:

	shrink_dcache_parent(root);
	...
	sop->put_super(sb);
	...
	if (invalidate_inodes(sb)) {
		printk(KERN_ERR "VFS: Busy inodes after unmount. "
			"Self-destruct in 5 seconds. Have a nice day...\n");
	}

So the real problem is that the dentry isn't unhashed. The other
part of the patch isn't required because the tk_exit() function
is called always, even when rpc_killall_tasks triggers the demise
of an async task.

Do you agree with this analysis?

Olaf
--
Olaf Kirch     |  Stop wasting entropy - start using predictable
okir@suse.de   |  tempfile names today!
---------------+
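
The ordering argument in the message above can be modelled in user space. The sketch below is only an illustration under simplifying assumptions: one directory dentry, one inode, and hypothetical `model_*` names throughout; none of it is the kernel's own code. It shows why a dput() that arrives after shrink_dcache_parent() leaves the inode busy unless the last put also kills the dentry.

```c
#include <assert.h>
#include <stdbool.h>

struct model_inode {
	int i_count;			/* references held on the inode */
};

struct model_dentry {
	int d_count;			/* references held on the dentry */
	bool cached;			/* dentry still exists in the dcache */
	struct model_inode *d_inode;
};

/* Killing a dentry removes it from the cache and drops its inode reference. */
void model_kill(struct model_dentry *d)
{
	d->cached = false;
	d->d_inode->i_count--;
}

/* Models shrink_dcache_parent(): only unused cached dentries are pruned. */
void model_shrink(struct model_dentry *d)
{
	if (d->cached && d->d_count == 0)
		model_kill(d);
}

/* Models the dput() issued on behalf of the killed async unlink task. */
void model_dput(struct model_dentry *d, bool unhash_on_last_put)
{
	if (--d->d_count == 0 && unhash_on_last_put)
		model_kill(d);
}

/* Returns true if the inode is still busy when invalidate_inodes() runs. */
bool model_umount(bool unhash_on_last_put)
{
	struct model_inode inode = { 1 };		/* pinned by the dentry */
	struct model_dentry dir = { 1, true, &inode };	/* pinned by the unlink */

	model_shrink(&dir);			/* too early: d_count is still 1 */
	model_dput(&dir, unhash_on_last_put);	/* happens during put_super() */
	return inode.i_count != 0;		/* the "Busy inodes" condition */
}
```

In this model, model_umount(false) reports a busy inode and model_umount(true) does not, which matches the claim that the missing unhash is the real problem.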

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Greg Banks @ 2004-02-04 22:41 UTC
To: Olaf Kirch; +Cc: Trond Myklebust, Linux NFS Mailing List

Olaf Kirch wrote:
>
> Hi Greg,
>
> I've been looking at your analysis a little closer, and am
> trying to understand how the bug was triggered. Here's what
> I think happened:
>
> -	the unlink code keeps a reference to the dentry of
>	the parent directory, but not to the vfsmount

Yes.

> -	this allows the umount to proceed, because there
>	don't seem to be any more references to the mount

Yes.

BTW (not directly related to this bug) I found by experiment that I
could umount an NFS mount when there were open file descriptors for
unlinked files in the mount, and even keep writing. All the NFS
and RPC structures stay alive until the last file descriptor closes,
thanks to the magic of refcounts. All this despite the vfsmount
reference taken in struct file, which I thought was supposed to
prevent umount.

> -	rpc_shutdown_client calls rpc_killall_tasks, which
>	terminates the async unlink task.

Yes.

> -	rpciod is woken up and schedules the async task,
>	calling __rpc_execute

Yes.

> -	__rpc_execute notices the task is dead (no tk_action),
>	leaves the loop and invokes task->tk_exit == nfs_async_unlink_done

No. In a crash dump taken after the umount has completed, the dir dentry has
1 leaked d_count for every async unlink present at umount, even though the
async unlink tasks have been cleaned up. This indicates that task->tk_exit
is not being called but task->tk_release is, so the dput is not happening.

The change in unlink.c moves the dput so that it happens in task->tk_release.
After the change, the dir dentry d_count is decremented to zero reliably
(I walked data structures before during and after umount in the debugger).

It's not entirely clear to me how __rpc_execute can do that, but the evidence
is that it does so.

> -	nfs_async_unlink_done calls dput() on the parent dentry,
>	but the dentry is not unhashed.
>
> Now looking at kill_super(), the sequence of calls there looks
> like this:
>
>	shrink_dcache_parent(root);
>	...
>	sop->put_super(sb);
>	...
>	if (invalidate_inodes(sb)) {
>		printk(KERN_ERR "VFS: Busy inodes after unmount. "
>			"Self-destruct in 5 seconds. Have a nice day...\n");
>	}
>
> So the real problem is that the dentry isn't unhashed.

Yes, this was precisely the problem I encountered when I had the
unlink.c change but not the dir.c change.

> The other
> part of the patch isn't required because the tk_exit() function
> is called always, even when rpc_killall_tasks triggers the demise
> of an async task.

Surprisingly, it isn't.

> Do you agree with this analysis?

Almost.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Olaf Kirch @ 2004-02-05 16:15 UTC
To: Greg Banks; +Cc: Trond Myklebust, Linux NFS Mailing List

Hi Greg,

On Thu, Feb 05, 2004 at 09:41:32AM +1100, Greg Banks wrote:
> BTW (not directly related to this bug) I found by experiment that I
> could umount an NFS mount when there were open file descriptors for
> unlinked files in the mount, and even keep writing. All the NFS
> and RPC structures stay alive until the last file descriptor closes,
> thanks to the magic of refcounts. All this despite the vfsmount
> reference taken in struct file, which I thought was supposed to
> prevent umount.

Then something else must be wrong big time.

> > -	__rpc_execute notices the task is dead (no tk_action),
> >	leaves the loop and invokes task->tk_exit == nfs_async_unlink_done
>
> No. In a crash dump taken after the umount has completed, the dir dentry has
> 1 leaked d_count for every async unlink present at umount, even though the
> async unlink tasks have been cleaned up. This indicates that task->tk_exit
> is not being called but task->tk_release is, so the dput is not happening.

But then prune_dcache shouldn't touch these dentries at all, because their
refcount is still 1. They would be leaked, but there would be no crash.

> It's not entirely clear to me how __rpc_execute can do that, but the evidence
> is that it does so.

Very strange... maybe we have a refcounting problem elsewhere, and the
refcount was 2 before calling tk_exit? But somehow I doubt this... I
think we'd see far more massive problems in this case.

Would you share your test case?

Olaf
--
Olaf Kirch     |  Stop wasting entropy - start using predictable
okir@suse.de   |  tempfile names today!
---------------+

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Greg Banks @ 2004-02-05 22:23 UTC
To: Olaf Kirch; +Cc: Trond Myklebust, Linux NFS Mailing List

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

Olaf Kirch wrote:
>
> Hi Greg,
>
> On Thu, Feb 05, 2004 at 09:41:32AM +1100, Greg Banks wrote:
> > BTW (not directly related to this bug) I found by experiment that I
> > could umount an NFS mount when there were open file descriptors for
> > unlinked files in the mount,[...]
>
> Then something else must be wrong big time.

Yes, I found that a surprising behaviour.

> > > -	__rpc_execute notices the task is dead (no tk_action),
> > >	leaves the loop and invokes task->tk_exit == nfs_async_unlink_done
> >
> > No. In a crash dump taken after the umount has completed, the dir dentry has
> > 1 leaked d_count for every async unlink present at umount, even though the
> > async unlink tasks have been cleaned up. This indicates that task->tk_exit
> > is not being called but task->tk_release is, so the dput is not happening.
>
> But then prune_dcache shouldn't touch these dentries at all, because their
> refcount is still 1. They would be leaked, but there would be no crash.

That makes sense. I'll go back and recheck my forensics.

> > It's not entirely clear to me how __rpc_execute can do that, but the evidence
> > is that it does so.
>
> Very strange... maybe we have a refcounting problem elsewhere, and the
> refcount was 2 before calling tk_exit? But somehow I doubt this... I
> think we'd see far more massive problems in this case.
>
> Would you share your test case?

Sure, attached is a C program and a shell script wrapper. You need
to adjust $SERVER and possibly $UDELAY in the shell script. Then run
the shell script and watch /var/log/messages.

The C program is fairly generic, you can use it to test the other
case (umount allowed with open file descriptors) also.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

[-- Attachment #2: fmeh.sh --]
[-- Type: application/x-sh, Size: 673 bytes --]

[-- Attachment #3: dangle.c --]
[-- Type: text/plain, Size: 1923 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <fcntl.h>

struct state {
	const char **cmd;
	int fd;
	const char *path;
};

static void
do_one_command(const char *cmd, struct state *state)
{
	if (!strcmp(cmd, "unlink")) {
		fprintf(stderr, "dangle: unlink\n");
		unlink(state->path);
	} else if (!strncmp(cmd, "write", 5)) {
		const char *data;

		data = cmd+5;
		if (!*data)
			data = "X";
		fprintf(stderr, "dangle: write(\"%s\")\n", data);
		write(state->fd, data, strlen(data));
	} else if (!strcmp(cmd, "fsync")) {
		fprintf(stderr, "dangle: fsync\n");
		fsync(state->fd);
	} else if (!strcmp(cmd, "fdatasync")) {
		fprintf(stderr, "dangle: fdatasync\n");
		fdatasync(state->fd);
	} else if (!strcmp(cmd, "sigpause")) {
		fprintf(stderr, "dangle: sigpause\n");
		sigpause(0);
	} else if (!strncmp(cmd, "loop", 4)) {
		int i, N;

		N = atoi(cmd+4);
		fprintf(stderr, "dangle: looping, N=%d\n", N);
		state->cmd++;
		if (N == 0) {
			for (;;)
				do_one_command(*state->cmd, state);
		} else {
			for (i = 0 ; i < N ; i++)
				do_one_command(*state->cmd, state);
		}
	} else {
		fprintf(stderr, "dangle: unknown command \"%s\"\n", cmd);
		exit(1);
	}
}

static void
do_commands(struct state *state)
{
	for ( ; *state->cmd != NULL ; state->cmd++) {
		do_one_command(*state->cmd, state);
	}
}

int
main(int argc, char **argv)
{
	struct state state;

	if (argc < 2) {
		fprintf(stderr, "Usage: dangle filename [command...]\n");
		exit(1);
	}
	state.path = argv[1];
	state.cmd = (const char **)argv+2;

	fprintf(stderr, "dangle: open(\"%s\")\n", state.path);
	state.fd = open(state.path, O_RDWR|O_CREAT, 0);
	if (state.fd < 0) {
		perror(state.path);
		exit(1);
	}

	do_commands(&state);

	fprintf(stderr, "dangle: exiting\n");
	return 0;
}

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Greg Banks @ 2004-02-06 5:50 UTC
To: Olaf Kirch, Trond Myklebust, Linux NFS Mailing List

Greg Banks wrote:
>
> Olaf Kirch wrote:
> >
> > > > -	__rpc_execute notices the task is dead (no tk_action),
> > > >	leaves the loop and invokes task->tk_exit == nfs_async_unlink_done
> > >
> > > No. In a crash dump taken after the umount has completed, the dir dentry has
> > > 1 leaked d_count for every async unlink present at umount, even though the
> > > async unlink tasks have been cleaned up. This indicates that task->tk_exit
> > > is not being called but task->tk_release is, so the dput is not happening.
> >
> > But then prune_dcache shouldn't touch these dentries at all, because their
> > refcount is still 1. They would be leaked, but there would be no crash.
>
> That makes sense. I'll go back and recheck my forensics.

I've checked the crash dumps again, and they don't actually have
any evidence either way. So I ran an experiment.

Feb 6 16:15:37 3X:budgie root: fmeh.sh: /tmp/fmeh-mounts/00007.d UDELAY=500
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 3X:budgie root: fmeh.sh: /tmp/fmeh-mounts/00008.d UDELAY=500
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 3X:budgie root: fmeh.sh: /tmp/fmeh-mounts/00009.d UDELAY=500
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb 6 16:15:38 4A:budgie kernel: nfs_async_unlink_done

There's some more matched done/release pairs which didn't make it to
syslog by the time the kernel hit my BUG() in invalidate_list.

So I was wrong, and the unlink.c part of the patch is worthless.
Well caught, Olaf.

Here's the latest version of the proposed patch.

This patch fixes a bug where the forced killing of pending asynchronous
unlink rpc_tasks during unmount leaks inode reference counts for the
parent of the silly-renamed file and all its ancestor directories,
resulting in the message

    VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...

and a few seconds later an oops with a stack trace ending in
prune_dcache -> nfs_dentry_iput -> iput.

Patch against 2.4.25-rc1, also applies to 2.6.2-rc2 (with an offset).

===========================================================================
linux/linux/fs/nfs/dir.c
===========================================================================

--- /usr/tmp/TmpDir.27555-0/linux/linux/fs/nfs/dir.c_1.32 Wed Feb 4 17:57:23 2004
+++ linux/linux/fs/nfs/dir.c Wed Feb 4 17:52:20 2004
@@ -551,6 +551,11 @@ static int nfs_dentry_delete(struct dent
 		/* Unhash it, so that ->d_iput() would be called */
 		return 1;
 	}
+	if (!(dentry->d_sb->s_flags & MS_ACTIVE)) {
+		/* Unhash it, so that ancestors of killed async unlink
+		 * files will be cleaned up during umount */
+		return 1;
+	}
 	return 0;
 
 }

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.
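
The done/release pairing in the syslog excerpt of the message above can also be checked mechanically rather than by eye. The following is a minimal sketch, assuming the log has already been split into one string per line; `count_unmatched_done` is a hypothetical helper written for this illustration, not part of the test case posted in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Counts "nfs_async_unlink_done" lines that are not followed by an
 * "nfs_async_unlink_release" line before the next "done" line or the
 * end of the log. Lines containing neither marker are ignored.
 */
int count_unmatched_done(const char *const *lines, size_t n)
{
	int unmatched = 0;
	int pending = 0;	/* 1 while a "done" awaits its "release" */
	size_t i;

	for (i = 0; i < n; i++) {
		if (strstr(lines[i], "nfs_async_unlink_done") != NULL) {
			unmatched += pending;	/* earlier "done" had no release */
			pending = 1;
		} else if (strstr(lines[i], "nfs_async_unlink_release") != NULL) {
			pending = 0;
		}
	}
	return unmatched + pending;
}
```

Applied to a log that is cut off right after a "done" line, as the excerpt is, the helper reports one unmatched entry, which is consistent with the remaining pairs simply not having reached syslog before the BUG().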

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: canon @ 2004-02-13 16:26 UTC
To: Linux NFS Mailing List

Greetings,

I'm seeing a similar oops and it's starting to become a serious issue.
I'm losing dozens of nodes a day to it. See the oops below.
I'm not seeing any errors before the oops, so I'm not sure if it
is this VFS unmount sequence. I'll try to get more info and I
will test out the patch to see if it makes a difference.

Thanks,

--Shane

20:45:06 pdsflx301 login: Unable to handle kernel NULL pointer dereference at virtual address 00000004
20:45:06 *pde = 00000000
20:45:06 Oops: 0000
20:45:06 nfs lockd sunrpc autofs e1000 ipchains ext3 jbd raid0
20:45:06 CPU: 1
20:45:06 EIP: 0010:[<c015b5a1>] Not tainted
20:45:06 EFLAGS: 00010246
20:45:06
20:45:06 EIP is at destroy_inode [kernel] 0x21 (2.4.20-28.7smp)
20:45:06 eax: 00000000 ebx: f6f1f6c0 ecx: f6f1f6c0 edx: f6f1f6c0
20:45:06 esi: f6f1f6c0 edi: f6f1f6c0 ebp: 00000295 esp: c2c2df3c
20:45:06 ds: 0018 es: 0018 ss: 0018
20:45:06 Process kswapd (pid: 5, stackpage=c2c2d000)
20:45:06 Stack: c03085d4 c015d096 f6f1f6c0 c0159f27 c47541c0 000002c9 f8a54cf9 e672ac58
20:45:06 e672ac40 f6f1f6c0 c015a320 f6f1f6c0 f6f1f6c0 c2c2df84 01186213 c0118bb3
20:45:06 c2c2df84 c2c2df84 00000000 00000000 01186213 ffff82a2 c0305820 000001d0
20:45:06 Call Trace: [<c015d096>] iput [kernel] 0x2a6 (0xc2c2df40))
20:45:06 [<c0159f27>] dput [kernel] 0x47 (0xc2c2df48))
20:45:06 [<f8a54cf9>] nfs_dentry_iput [nfs] 0x59 (0xc2c2df54))
20:45:06 [<c015a320>] prune_dcache [kernel] 0xe0 (0xc2c2df64))
20:45:06 [<c0118bb3>] schedule_timeout [kernel] 0x83 (0xc2c2df78))
20:45:06 [<c015a720>] shrink_dcache_memory [kernel] 0x20 (0xc2c2dfa0))
20:45:06 [<c013c213>] do_try_to_free_pages_kswapd [kernel] 0x13 (0xc2c2dfa8))
20:45:06 [<c013c6d8>] kswapd [kernel] 0x138 (0xc2c2dfd4))
20:45:06 [<c0105000>] stext [kernel] 0x0 (0xc2c2dfe8))
20:45:06 [<c0107216>] arch_kernel_thread [kernel] 0x26 (0xc2c2dff0))
20:45:06 [<c013c5a0>] kswapd [kernel] 0x0 (0xc2c2dff8))
20:45:06
20:45:06
20:45:06 Code: 8b 40 04 85 c0 74 08 53 ff d0 5a eb 10 89 f6 53 ff 35 b8 91

------------------------------------------------------------------------
Shane Canon                             voice: 510-486-6981
PSDF Project Lead                       fax:   510-486-7520
National Energy Research Scientific Computing Center
1 Cyclotron Road Mailstop 943-256
Berkeley, CA 94720                      canon@nersc.gov
------------------------------------------------------------------------

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: raven @ 2004-02-04 14:24 UTC
To: Greg Banks; +Cc: Trond Myklebust, Linux NFS Mailing List

Excellent work Greg.

On Wed, 4 Feb 2004, Greg Banks wrote:

> G'day,
>
> This patch fixes a bug where the forced killing of pending asynchronous
> unlink rpc_tasks during unmount leaks inode reference counts for the
> parent of the silly-renamed file and all its ancestor directories,
> resulting in the message
>
> VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
>
> and a few seconds later an oops with a stack trace ending in
> prune_dcache -> nfs_dentry_iput -> iput.
>
> This is probably also the bug discussed last September on the autofs
> mailing list. The patch posted by Olaf Hering then has no effect
> at all, but it did put me on the right track (thanks Olaf).

This message has occasionally been seen for a very long time when using
autofs but I've never seen an oops follow it???

Until now I have assumed that it is a bug in the daemon code caused by not
removing one or more directories before the umount. I've caught several bugs
in the cleanup code but can't seem to find what else might be causing it.

About the only thing that I'm sure of is that this happens at umount time
in autofs, and with my latest code, only when there is an accompanying
directory removal.

Do you guys think this might really be the same problem?

This is very hard to trigger so testing is difficult.

Ian

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: Greg Banks @ 2004-02-04 22:56 UTC
To: raven; +Cc: Trond Myklebust, Linux NFS Mailing List

raven@themaw.net wrote:
>
> This message has occasionally been seen for a very long time when using
> autofs but I've never seen an oops follow it???

Good luck, I guess. The oopses aren't deterministic, but on an Altix
I get them seconds to minutes after one of every few dozen umounts
which give the message.

The failure mode is that the NFS inode of the parent dir of the silly
renamed file remains used (i_count=1 for the leaked dentry) but i_sb
points to freed memory. This memory gets reused and overwritten with
what appears to be ASCII strings. Later, prune_dcache comes along and
tries to get rid of the inode, and iput dies while trying to call
inode->i_sb->s_op->put_inode. I've seen three different stack traces
in oopses, all going through prune_dcache.

> About the only thing that I'm sure of is that this happens at umount time
> in autofs, and with my latest code, only when there is an accompanying
> directory removal.

There may well be another similar bug as well.

> This is very hard to trigger so testing is difficult.

You could try turning the autofs expiry timer down to <5 sec and
doing lots of automount/create/delete/umount cycles.

FWIW, it may be relevant that for my tests the NFS server was
a lot slower than the client and was only connected by 100BaseT.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
From: James Pearson @ 2004-02-05 12:40 UTC
To: Greg Banks; +Cc: Trond Myklebust, Linux NFS Mailing List

I notice this patch doesn't apply cleanly with Trond's fix_unlink patch
- can the two patches live together?

Thanks

James Pearson

Greg Banks wrote:
>
> G'day,
>
> This patch fixes a bug where the forced killing of pending asynchronous
> unlink rpc_tasks during unmount leaks inode reference counts for the
> parent of the silly-renamed file and all its ancestor directories,
> resulting in the message
>
> VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
>
> and a few seconds later an oops with a stack trace ending in
> prune_dcache -> nfs_dentry_iput -> iput.
>
> This is probably also the bug discussed last September on the autofs
> mailing list. The patch posted by Olaf Hering then has no effect
> at all, but it did put me on the right track (thanks Olaf).
>
> The first part makes sure that dput() will unhash and kill dentries
> and their parents if called while the unmount is underway.
>
> The second part moves the dput() call from the tk_exit callback of
> the async unlink rpc_task to the tk_release callback so that it
> will be called if the rpc_task is killed by rpc_killall_tasks()
> instead of completing normally.
>
> ===========================================================================
> linux/linux/fs/nfs/dir.c
> ===========================================================================
>
> --- /usr/tmp/TmpDir.27555-0/linux/linux/fs/nfs/dir.c_1.32 Wed Feb 4 17:57:23 2004
> +++ linux/linux/fs/nfs/dir.c Wed Feb 4 17:52:20 2004
> @@ -551,6 +551,11 @@ static int nfs_dentry_delete(struct dent
>  		/* Unhash it, so that ->d_iput() would be called */
>  		return 1;
>  	}
> +	if (!(dentry->d_sb->s_flags & MS_ACTIVE)) {
> +		/* Unhash it, so that ancestors of killed async unlink
> +		 * files will be cleaned up during umount */
> +		return 1;
> +	}
>  	return 0;
>
>  }
>
> ===========================================================================
> linux/linux/fs/nfs/unlink.c
> ===========================================================================
>
> --- /usr/tmp/TmpDir.27555-0/linux/linux/fs/nfs/unlink.c_1.6 Wed Feb 4 17:57:23 2004
> +++ linux/linux/fs/nfs/unlink.c Wed Feb 4 17:56:57 2004
> @@ -51,6 +51,7 @@ static void
>  nfs_put_unlinkdata(struct nfs_unlinkdata *data)
>  {
>  	if (--data->count == 0) {
> +		dput(data->dir);
>  		nfs_detach_unlinkdata(data);
>  		if (data->name.name != NULL)
>  			kfree(data->name.name);
> @@ -132,7 +133,6 @@ nfs_async_unlink_done(struct rpc_task *t
>  	NFS_PROTO(dir_i)->unlink_done(dir, &task->tk_msg);
>  	put_rpccred(data->cred);
>  	data->cred = NULL;
> -	dput(dir);
>  }
>
>  /**
>
> Greg.
> --
> Greg Banks, R&D Software Engineer, SGI Australian Software Group.
> I don't speak for SGI.
* Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
2004-02-05 12:40 ` James Pearson
@ 2004-02-09 7:46 ` Greg Banks
0 siblings, 0 replies; 13+ messages in thread
From: Greg Banks @ 2004-02-09 7:46 UTC (permalink / raw)
To: James Pearson; +Cc: Trond Myklebust, Linux NFS Mailing List

James Pearson wrote:
>
> I notice this patch doesn't apply cleanly with Trond's fix_unlink patch

The second smaller version applies cleanly over fix_unlink.dif
(with a 6 line offset).

> - can the two patches live together?

Yes. But...

If all you want is to prevent busy inodes and the subsequent oops, my
patch will do that in all cases. As discussed earlier, Trond's patch
still allows busy inodes if the mount is "intr" and the last close is
interrupted. I have a small test case which triggers this on a kernel
with only Trond's patch.

OTOH, Trond's patch serialises the last close so that umount cannot
proceed until the .nfsXXX file is removed, which greatly reduces the
chances of short-lived mounts leaving turds on the server. This is
obviously a good thing.

So I would recommend using both patches.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

^ permalink raw reply [flat|nested] 13+ messages in thread
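The reference leak discussed in this thread can be modelled in user space. The sketch below is a toy, not kernel code: it assumes, as the patch description implies, that the tk_exit callback is skipped when rpc_killall_tasks() kills the task while tk_release still runs. Every name in it (toy_dentry, toy_task, toy_run_unlink, the put_in_release flag) is hypothetical and only stands in for the parent directory's dentry reference, the two rpc_task callbacks, and the before/after state of the unlink.c change.

```c
/* Toy user-space model of the leaked directory reference; NOT kernel code.
 * All identifiers here are hypothetical stand-ins for illustration. */
#include <assert.h>
#include <stdbool.h>

struct toy_dentry {
    int refcount;                 /* models the dentry reference count */
};

static void toy_dget(struct toy_dentry *d)
{
    d->refcount++;
}

static void toy_dput(struct toy_dentry *d)
{
    assert(d->refcount > 0);      /* dropping a reference we do not hold is a bug */
    d->refcount--;
}

struct toy_task {
    struct toy_dentry *dir;       /* parent directory pinned by the async unlink */
    bool put_in_release;          /* false: pre-patch layout, true: patched layout */
};

/* Models tk_exit: assumed to run only when the RPC completes normally. */
static void toy_exit(struct toy_task *t)
{
    if (!t->put_in_release)
        toy_dput(t->dir);
}

/* Models tk_release: assumed to run on every teardown, killed or not. */
static void toy_release(struct toy_task *t)
{
    if (t->put_in_release)
        toy_dput(t->dir);
}

/* Runs one simulated async unlink and returns the number of references
 * still held on the directory after the task has been torn down. */
int toy_run_unlink(bool put_in_release, bool killed)
{
    struct toy_dentry dir = { .refcount = 0 };
    struct toy_task task = { .dir = &dir, .put_in_release = put_in_release };

    toy_dget(&dir);               /* the unlink takes its reference on the parent */
    if (!killed)
        toy_exit(&task);          /* skipped when the task is killed at umount */
    toy_release(&task);           /* always runs */
    return dir.refcount;
}
```

In this model, toy_run_unlink(false, true) returns 1, which corresponds to the reference left on the parent directory when the task is killed, while toy_run_unlink(true, true) returns 0 because the put now happens on the release path. Both layouts return 0 when the task completes normally.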
end of thread, other threads:[~2004-02-13 16:28 UTC | newest]
Thread overview: 13+ messages
2004-02-04 7:12 [PATCH] SGI 882960: Busy inodes after unmount, oops Greg Banks
2004-02-04 10:42 ` Olaf Kirch
2004-02-04 22:59 ` Greg Banks
2004-02-04 12:09 ` Olaf Kirch
2004-02-04 22:41 ` Greg Banks
2004-02-05 16:15 ` Olaf Kirch
2004-02-05 22:23 ` Greg Banks
2004-02-06 5:50 ` Greg Banks
2004-02-13 16:26 ` canon
2004-02-04 14:24 ` raven
2004-02-04 22:56 ` Greg Banks
2004-02-05 12:40 ` James Pearson
2004-02-09 7:46 ` Greg Banks