public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@osdl.org>, Chris Mason <mason@suse.com>
Subject: [PATCH] fix nr_unused accounting, and avoid recursing in iput with I_WILL_FREE set
Date: Tue, 18 Oct 2005 10:26:09 +0200	[thread overview]
Message-ID: <20051018082609.GC15717@x30.random> (raw)

Hello,

@@ -183,6 +183,7 @@ __sync_single_inode(struct inode *inode,
 			list_move(&inode->i_list, &inode_in_use);
 		} else {
 			list_move(&inode->i_list, &inode_unused);
+			inodes_stat.nr_unused++;
 		}
 	}
 	wake_up_inode(inode);

Are you sure the above diff is correct? It was added somewhere between
2.6.5 and 2.6.8. I think it's wrong.

The only way I can imagine the i_count to be zero in the above path, is
that I_WILL_FREE is set. And if I_WILL_FREE is set, then we must not
increase nr_unused. So I believe the above change is buggy and it will
definitely overstate the number of unused inodes and it should be backed
out.

Note that __writeback_single_inode before calling __sync_single_inode, can
drop the spinlock and we can have both the dirty and locked bitflags
clear here:

		spin_unlock(&inode_lock);
		__wait_on_inode(inode);
		iput(inode);
XXXXXXX
		spin_lock(&inode_lock);
	}
	use inode again here

a construct like the above makes zero sense from a reference counting
standpoint.

Either we don't ever use the inode again after the iput, or the
inode_lock should be taken _before_ executing the iput (i.e. a __iput
would be required). Taking the inode_lock after iput means the iget was
useless if we keep using the inode after the iput.

So the only chance the 2.6 was safe to call __writeback_single_inode
with the i_count == 0, is that I_WILL_FREE is set (I_WILL_FREE will
prevent the VM to free the inode in XXXXX).

Potentially calling the above iput with I_WILL_FREE was also wrong
because it would recurse in iput_final (the second mainline bug).

The below (untested) patch fixes the nr_unused accounting, avoids
recursing in iput when I_WILL_FREE is set and makes sure (with the
BUG_ON) that we don't corrupt memory and that all holders that don't set
I_WILL_FREE, keeps a reference on the inode!

Comments welcome, thanks.

Signed-off-by: Andrea Arcangeli <andrea@suse.de>

 fs-writeback.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

Index: linux-2.6/fs/fs-writeback.c
--- linux-2.6/fs/fs-writeback.c.~1~	2005-07-28 17:08:53.000000000 +0200
+++ linux-2.6/fs/fs-writeback.c	2005-10-17 15:43:53.000000000 +0200
@@ -230,7 +230,6 @@ __sync_single_inode(struct inode *inode,
 			 * The inode is clean, unused
 			 */
 			list_move(&inode->i_list, &inode_unused);
-			inodes_stat.nr_unused++;
 		}
 	}
 	wake_up_inode(inode);
@@ -246,6 +245,8 @@ __writeback_single_inode(struct inode *i
 {
 	wait_queue_head_t *wqh;
 
+	BUG_ON(!atomic_read(&inode->i_count) ^ !!(inode->i_state & I_WILL_FREE));
+
 	if ((wbc->sync_mode != WB_SYNC_ALL) && (inode->i_state & I_LOCK)) {
 		list_move(&inode->i_list, &inode->i_sb->s_dirty);
 		return 0;
@@ -259,11 +260,9 @@ __writeback_single_inode(struct inode *i
 
 		wqh = bit_waitqueue(&inode->i_state, __I_LOCK);
 		do {
-			__iget(inode);
 			spin_unlock(&inode_lock);
 			__wait_on_bit(wqh, &wq, inode_wait,
 							TASK_UNINTERRUPTIBLE);
-			iput(inode);
 			spin_lock(&inode_lock);
 		} while (inode->i_state & I_LOCK);
 	}

             reply	other threads:[~2005-10-18  8:27 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-18  8:26 Andrea Arcangeli [this message]
2005-10-19  0:13 ` [PATCH] fix nr_unused accounting, and avoid recursing in iput with I_WILL_FREE set Andrew Morton
2005-10-19  0:40   ` Chris Mason
2005-10-19  1:15     ` Andrew Morton
2005-10-19  1:58       ` Chris Mason
2005-10-19  2:26         ` Andrew Morton
2005-10-19  2:58           ` Chris Mason
2005-10-25  2:21           ` Chris Mason
2005-10-25 14:14             ` Andrea Arcangeli
2005-10-19  7:47   ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20051018082609.GC15717@x30.random \
    --to=andrea@suse.de \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mason@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox