Linux NFS development
From: Greg Banks <gnb@melbourne.sgi.com>
To: Olaf Kirch <okir@suse.de>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Linux NFS Mailing List <nfs@lists.sourceforge.net>
Subject: Re: [PATCH] SGI 882960: Busy inodes after unmount, oops
Date: Fri, 06 Feb 2004 16:50:44 +1100
Message-ID: <40232B34.2AB2307A@melbourne.sgi.com>
In-Reply-To: 4022C248.D4FF17CD@melbourne.sgi.com

Greg Banks wrote:
> 
> Olaf Kirch wrote:
> >
> > > >  -      __rpc_execute notices the task is dead (no tk_action),
> > > >         leaves the loop and invokes task->tk_exit == nfs_async_unlink_done
> > >
> > > No.  In a crash dump taken after the umount has completed, the dir dentry has
> > > 1 leaked d_count for every async unlink present at umount, even though the
> > > async unlink tasks have been cleaned up.  This indicates that task->tk_exit
> > > is not being called but task->tk_release is, so the dput is not happening.
> >
> > But then prune_dcache shouldn't touch these dentries at all, because their
> > refcount is still 1. They would be leaked, but there would be no crash.
> 
> That makes sense.  I'll go back and recheck my forensics.

I've checked the crash dumps again, and they don't actually have any evidence
either way.  So I ran an experiment.

Feb  6 16:15:37 3X:budgie root: fmeh.sh: /tmp/fmeh-mounts/00007.d UDELAY=500
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 3X:budgie root: fmeh.sh: /tmp/fmeh-mounts/00008.d UDELAY=500
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 3X:budgie root: fmeh.sh: /tmp/fmeh-mounts/00009.d UDELAY=500
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_release
Feb  6 16:15:38 4A:budgie kernel: nfs_async_unlink_done

There were some more matched done/release pairs that hadn't made it
to syslog by the time the kernel hit my BUG() in invalidate_list.
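
For anyone repeating the experiment, the done/release pairing can be checked
mechanically rather than by eye. What follows is only a minimal userspace
sketch, assuming the two printk strings appear verbatim in the log text as
they do above. The function names count_occurrences and unmatched_unlinks
are made up for illustration and are not part of the patch or the kernel.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: count non-overlapping occurrences of needle in text. */
static int count_occurrences(const char *text, const char *needle)
{
    size_t len = strlen(needle);
    int n = 0;

    assert(len > 0);
    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle))
        n++;
    return n;
}

/*
 * Hypothetical check: number of nfs_async_unlink_done messages that have
 * no nfs_async_unlink_release message to go with them.  Zero means every
 * done was followed up by a release somewhere in the log text.
 */
int unmatched_unlinks(const char *log)
{
    assert(log != NULL);
    return count_occurrences(log, "nfs_async_unlink_done")
         - count_occurrences(log, "nfs_async_unlink_release");
}
```

Fed the excerpt above, this returns 1 (nine done lines against eight release
lines), which is consistent with the log simply being cut off at the BUG().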

So I was wrong, and the unlink.c part of the patch is worthless.
Well caught, Olaf.

Here's the latest version of the proposed patch.





This patch fixes a bug where the forced killing of pending asynchronous
unlink rpc_tasks during unmount leaks inode reference counts for the
parent of the silly-renamed file and all its ancestor directories,
resulting in the message

VFS: Busy inodes after unmount. Self-destruct in 5 seconds.  Have a nice day...

and a few seconds later an oops with a stack trace ending in
prune_dcache -> nfs_dentry_iput -> iput.

Patch against 2.4.25-rc1, also applies to 2.6.2-rc2 (with an offset).

===========================================================================
linux/linux/fs/nfs/dir.c
===========================================================================

--- /usr/tmp/TmpDir.27555-0/linux/linux/fs/nfs/dir.c_1.32	Wed Feb  4 17:57:23 2004
+++ linux/linux/fs/nfs/dir.c	Wed Feb  4 17:52:20 2004
@@ -551,6 +551,11 @@ static int nfs_dentry_delete(struct dent
 		/* Unhash it, so that ->d_iput() would be called */
 		return 1;
 	}
+	if (!(dentry->d_sb->s_flags & MS_ACTIVE)) {
+		/* Unhash it, so that ancestors of killed async unlink
+		 * files will be cleaned up during umount */
+		return 1;
+	}
 	return 0;
 
 }




Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.


