public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Benny Halevy <bhalevy@panasas.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Steve Dickson <SteveD@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>,
	nfs@lists.sourceforge.net,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Trond Myklebust <trond.myklebust@fys.uio.no>
Subject: Re: [PATCH] NFS: Stop sillyname renames and unmounts from racing
Date: Tue, 06 Nov 2007 10:24:50 +0200	[thread overview]
Message-ID: <473024D2.6080105@panasas.com> (raw)
In-Reply-To: <20071105210636.2fc72e14.akpm@linux-foundation.org>

On Nov. 06, 2007, 7:06 +0200, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Sat, 03 Nov 2007 07:09:25 -0400 Steve Dickson <SteveD@redhat.com> wrote:
> 
>> The following patch stops NFS sillyname renames and umounts from racing.
> 
> (appropriate cc's added)
> 
>> I have a test script does the following:
>>      1) start nfs server
>>       2) mount loopback
>>       3) open file in background
>>       4) remove file
>>       5) stop nfs server
>>       6) kill -9 process which has file open
>>       7) restart nfs server
>>       8) umount looback mount.
>>
>> After umount I got the "VFS: Busy inodes after unmount" message
>> because the processing of the rename has not finished.
>>
>> Below is a patch that the uses the new silly_count mechanism to
>> synchronize sillyname processing and umounts. The patch introduces a
>> nfs_put_super() routine that waits until the nfsi->silly_count count
>> is zero.
>>
>> A side-effect of finding and waiting for all the inode to
>> find the sillyname processing, is I need to traverse
>> the sb->s_inodes list in the supper block. To do that
>> safely the inode_lock spin lock has to be held. So for
>> modules to be able to "see" that lock I needed to
>> EXPORT_SYMBOL_GPL() it.
>>
>> Any objections to exporting the inode_lock spin lock?
>> If so, how should modules _safely_ access the s_inode list?
>>
>> steved.
>>
>>
>> Author: Steve Dickson <steved@redhat.com>
>> Date:   Wed Oct 31 12:19:26 2007 -0400
>>
>>      Close a unlink/sillyname rename and umount race by added a
>>      nfs_put_super routine that will run through all the inode
>>      currently on the super block, waiting for those that are
>>      in the middle of a sillyname rename or removal.
>>
>>      This patch stop the infamous "VFS: Busy inodes after unmount... "
>>      warning during umounts.
>>
>>      Signed-off-by: Steve Dickson <steved@redhat.com>
>>
>> diff --git a/fs/inode.c b/fs/inode.c
>> index ed35383..da9034a 100644
>> --- a/fs/inode.c
>> +++ b/fs/inode.c
>> @@ -81,6 +81,7 @@ static struct hlist_head *inode_hashtable __read_mostly;
>>    * the i_state of an inode while it is in use..
>>    */
>>   DEFINE_SPINLOCK(inode_lock);
>> +EXPORT_SYMBOL_GPL(inode_lock);
> 
> That's going to make hch unhappy.
> 
> Your email client is performing space-stuffing.
> See http://mbligh.org/linuxdocs/Email/Clients/Thunderbird
> 
>>   static struct file_system_type nfs_fs_type = {
>>   	.owner		= THIS_MODULE,
>> @@ -223,6 +225,7 @@ static const struct super_operations nfs_sops = {
>>   	.alloc_inode	= nfs_alloc_inode,
>>   	.destroy_inode	= nfs_destroy_inode,
>>   	.write_inode	= nfs_write_inode,
>> +	.put_super	= nfs_put_super,
>>   	.statfs		= nfs_statfs,
>>   	.clear_inode	= nfs_clear_inode,
>>   	.umount_begin	= nfs_umount_begin,
>> @@ -1767,6 +1770,30 @@ static void nfs4_kill_super(struct super_block *sb)
>>   	nfs_free_server(server);
>>   }
>>
>> +void nfs_put_super(struct super_block *sb)
> 
> This was (correctly) declared to be static.  We should define it that way
> too (I didn't know you could do this, actually).
> 
>> +{
>> +	struct inode *inode;
>> +	struct nfs_inode *nfsi;
>> +	/*
>> +	 * Make sure there are no outstanding renames
>> +	 */
>> +relock:
>> +	spin_lock(&inode_lock);
>> +	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
>> +		nfsi = NFS_I(inode);
>> +		if (atomic_read(&nfsi->silly_count) > 0) {
>> +			/* Keep this inode around  during the wait */
>> +			atomic_inc(&inode->i_count);
>> +			spin_unlock(&inode_lock);
>> +			wait_event(nfsi->waitqueue,
>> +				atomic_read(&nfsi->silly_count) == 1);
>> +			iput(inode);
>> +			goto relock;
>> +		}
>> +	}
>> +	spin_unlock(&inode_lock);
>> +}
> 
> That's an O(n^2) search.  If it is at all possible to hit a catastrophic
> slowdown in here, you can bet that someone out there will indeed hit it in
> real life.
> 
> I'm too lazy to look, but we might need to check things like I_FREEING
> and I_CLEAR before taking a ref on this inode.

It'd be very nice if the silly renamed inodes (with silly_count > 1) were moved
to a different list in the first pass, under the inode_lock, and then waited on
until silly_count <= 1 in a second pass only on the filtered list.  This will
provide you with O(1).


  parent reply	other threads:[~2007-11-06  8:25 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-03 11:09 [PATCH] NFS: Stop sillyname renames and unmounts from racing Steve Dickson
2007-11-06  5:06 ` Andrew Morton
2007-11-06  5:15   ` Alexander Viro
2007-11-08  9:05     ` Steve Dickson
2007-11-06  8:24   ` Benny Halevy [this message]
2007-11-06  8:50     ` Alexander Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=473024D2.6080105@panasas.com \
    --to=bhalevy@panasas.com \
    --cc=SteveD@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=aviro@redhat.com \
    --cc=bfields@fieldses.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nfs@lists.sourceforge.net \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox