From: Boaz Harrosh <bharrosh@panasas.com>
To: Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Nick Piggin <npiggin@kernel.dk>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 06/18] exofs: use iput() for inode reference count decrements
Date: Sun, 24 Oct 2010 20:06:06 +0200
Message-ID: <4CC4758E.9020102@panasas.com>
In-Reply-To: <20101017012437.GA4227@infradead.org>

On 10/17/2010 03:24 AM, Christoph Hellwig wrote:
> On Wed, Oct 13, 2010 at 10:49:46AM -0400, Boaz Harrosh wrote:
>> I suspect it's not a bug but a useless inc/dec because in all my testing
>> I have not seen an inode leak. Let me investigate if it can be removed.
>>
>> So I do not think we need it for 2.6.36.
>>
>> I'll take this patch into my 2.6.37-rcX merge window. It should appear
>> in linux-next by tomorrow. Hopefully followed by a removal patch later.
> 
> It's a very real bug.  If an inode goes away in-core before the creation
> on the OSD has finished, e.g. by using the drop_caches file, the
> atomic_dec instead of the iput means you will never call iput_final
> and thus leak all resources associated with the inode, as well as
> leaving it on all lists.  It's not easy to hit, but very nasty when
> it is hit.
> 

Hi Christoph, Dave

As I suspected, this fix is not good, for a simple reason: create_done()
is called from scsi_done(), which runs with irqs disabled. So if that iput()
drops the last reference and evict() is needed, we BUG on trying to take
the i_mutex.

> Another option to fix it might be to drop the refcount games and just
> add a wait for the object creation in the evict_inode method to
> make sure we never remove the inode before the object creation
> has finished.
> 

This solution, on the other hand, works perfectly. In fact there
was already a "wait for the object creation" in exofs_evict_inode(),
which is why I've never seen an inode leak. Below is the patch I'm
putting in -next for push to 2.6.37. (So there was no bug in exofs after
all, and I'm not CCing stable@.)
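
For illustration only, the handshake described above can be sketched in
userspace. This is not exofs code: pthreads stand in for the kernel's
wait_event()/wake_up(), and struct fake_oi, fake_create_done(),
fake_evict() and simulate_create_then_evict() are hypothetical names made
up for the sketch. The point it shows is that the completion callback only
sets a flag and wakes waiters, while the evict side blocks until the flag
is set before tearing anything down.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the exofs per-inode info; only the fields the
 * create/evict handshake needs. */
struct fake_oi {
	pthread_mutex_t lock;
	pthread_cond_t wq;	/* plays the role of oi->i_wq */
	bool created;		/* plays the role of obj_created(oi) */
	bool freed;		/* set once "evict" has torn the inode down */
};

/* Analog of create_done(): runs asynchronously, sets the flag and wakes
 * waiters. It takes no reference and drops none. */
static void *fake_create_done(void *p)
{
	struct fake_oi *oi = p;

	pthread_mutex_lock(&oi->lock);
	assert(!oi->freed);	/* evict must not have completed before us */
	oi->created = true;
	pthread_cond_broadcast(&oi->wq);
	pthread_mutex_unlock(&oi->lock);
	return NULL;
}

/* Analog of the wait in exofs_evict_inode(): block until the object has
 * been created, and only then tear the inode down. */
static void fake_evict(struct fake_oi *oi)
{
	pthread_mutex_lock(&oi->lock);
	while (!oi->created)
		pthread_cond_wait(&oi->wq, &oi->lock);
	oi->freed = true;
	pthread_mutex_unlock(&oi->lock);
}

/* Returns 1 if creation completed before teardown, -1 on thread error. */
int simulate_create_then_evict(void)
{
	struct fake_oi oi = { .created = false, .freed = false };
	pthread_t t;
	int ok;

	pthread_mutex_init(&oi.lock, NULL);
	pthread_cond_init(&oi.wq, NULL);
	if (pthread_create(&t, NULL, fake_create_done, &oi) != 0) {
		pthread_cond_destroy(&oi.wq);
		pthread_mutex_destroy(&oi.lock);
		return -1;
	}
	fake_evict(&oi);	/* may start before the "create" has finished */
	pthread_join(t, NULL);
	ok = oi.created && oi.freed;
	pthread_cond_destroy(&oi.wq);
	pthread_mutex_destroy(&oi.lock);
	return ok ? 1 : 0;
}
```

Calling simulate_create_then_evict() from a small main() (built with
-pthread) exercises both orderings over repeated runs; in either order the
assert in fake_create_done() should hold, because fake_evict() cannot set
freed until created is true.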

Boaz
---
From: Boaz Harrosh <bharrosh@panasas.com>
Subject: [PATCH] exofs: remove inode->i_count ref/deref in exofs_new_inode/create_done

exofs_new_inode was incrementing the inode->i_count and
decrementing it in create_done, in a bad attempt to make
sure the inode will still be there when the asynchronous create_done
finally arrives. This was stupid because iput() was not called,
so if that was the final ref, the inode could leak.

However, none of this is needed, because in exofs_evict_inode()
we already wait for create_done to return by waiting for the
create_object event. Therefore remove the extra ref counting
and just thicken the comment at exofs_evict_inode() a bit.
(Also use the ready-made __exofs_wait_obj_created instead of
open-coding it.)

CC: Dave Chinner <dchinner@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 fs/exofs/inode.c |   19 ++++++-------------
 1 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index 0ba9886..31e9164 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -1102,7 +1102,6 @@ static void create_done(struct exofs_io_state *ios, void *p)
 
 	set_obj_created(oi);
 
-	atomic_dec(&inode->i_count);
 	wake_up(&oi->i_wq);
 }
 
@@ -1153,17 +1152,11 @@ struct inode *exofs_new_inode(struct inode *dir, int mode)
 	ios->obj.id = exofs_oi_objno(oi);
 	exofs_make_credential(oi->i_cred, &ios->obj);
 
-	/* increment the refcount so that the inode will still be around when we
-	 * reach the callback
-	 */
-	atomic_inc(&inode->i_count);
-
 	ios->done = create_done;
 	ios->private = inode;
 	ios->cred = oi->i_cred;
 	ret = exofs_sbi_create(ios);
 	if (ret) {
-		atomic_dec(&inode->i_count);
 		exofs_put_io_state(ios);
 		return ERR_PTR(ret);
 	}
@@ -1321,12 +1314,12 @@ void exofs_evict_inode(struct inode *inode)
 	inode->i_size = 0;
 	end_writeback(inode);
 
-	/* if we are deleting an obj that hasn't been created yet, wait */
-	if (!obj_created(oi)) {
-		BUG_ON(!obj_2bcreated(oi));
-		wait_event(oi->i_wq, obj_created(oi));
-		/* ignore the error attempt a remove anyway */
-	}
+	/* if we are deleting an obj that hasn't been created yet, wait
+	 * This also makes sure that create_done cannot be called with an
+	 * already deleted inode.
+	 */
+	__exofs_wait_obj_created(oi);
+	/* ignore the error attempt a remove anyway */
 
 	/* Now Remove the OSD objects */
 	ret = exofs_get_io_state(&sbi->layout, &ios);
-- 
1.7.2.3

