linux-fsdevel.vger.kernel.org archive mirror
From: Christian Stroetmann <stroetmann@ontolinux.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 19/21] fs: icache remove inode_lock
Date: Thu, 21 Oct 2010 04:14:02 +0200	[thread overview]
Message-ID: <4CBFA1EA.70109@ontolinux.com> (raw)
In-Reply-To: <1287622186-1935-20-git-send-email-david@fromorbit.com>

  Aloha;

On 21.10.2010 02:49, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> All the functionality that the inode_lock protected has now been
> wrapped up in new independent locks and/or functionality. Hence the
> inode_lock does not serve a purpose any longer and hence can now be
> removed.
>
> Based on work originally done by Nick Piggin.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>   Documentation/filesystems/Locking |    2 +-
>   Documentation/filesystems/porting |    8 ++-
>   Documentation/filesystems/vfs.txt |    2 +-
>   fs/block_dev.c                    |    2 -
>   fs/buffer.c                       |    2 +-
>   fs/drop_caches.c                  |    4 -
>   fs/fs-writeback.c                 |   85 ++++++----------------
>   fs/inode.c                        |  147 ++++++++-----------------------------
>   fs/logfs/inode.c                  |    2 +-
>   fs/notify/inode_mark.c            |   10 +--
>   fs/notify/mark.c                  |    1 -
>   fs/notify/vfsmount_mark.c         |    1 -
>   fs/ntfs/inode.c                   |    4 +-
>   fs/ocfs2/inode.c                  |    2 +-
>   fs/quota/dquot.c                  |   12 +--
>   include/linux/fs.h                |    2 +-
>   include/linux/writeback.h         |    2 -
>   mm/backing-dev.c                  |    4 -
>   mm/filemap.c                      |    6 +-
>   mm/rmap.c                         |    6 +-
>   20 files changed, 81 insertions(+), 223 deletions(-)
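
For readers following along: with i_state and i_ref now protected by
inode->i_lock alone, most hunks below reduce to this pattern (a
simplified before/after sketch of mine, not a hunk from the patch):

        /* before this series: the global inode_lock wrapped every change */
        spin_lock(&inode_lock);
        spin_lock(&inode->i_lock);
        inode->i_state |= I_DIRTY_SYNC;
        spin_unlock(&inode->i_lock);
        spin_unlock(&inode_lock);

        /* after: inode->i_lock alone protects i_state and i_ref */
        spin_lock(&inode->i_lock);
        inode->i_state |= I_DIRTY_SYNC;
        spin_unlock(&inode->i_lock);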
>
> diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
> index 2db4283..7f98cd5 100644
> --- a/Documentation/filesystems/Locking
> +++ b/Documentation/filesystems/Locking
> @@ -114,7 +114,7 @@ alloc_inode:
>   destroy_inode:
>   dirty_inode:				(must not sleep)
>   write_inode:
> -drop_inode:				!!!inode_lock!!!
> +drop_inode:				!!!i_lock!!!
>   evict_inode:
>   put_super:		write
>   write_super:		read
> diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
> index b12c895..f182795 100644
> --- a/Documentation/filesystems/porting
> +++ b/Documentation/filesystems/porting
> @@ -299,7 +299,7 @@ be used instead.  It gets called whenever the inode is evicted, whether it has
>   remaining links or not.  Caller does *not* evict the pagecache or inode-associated
>   metadata buffers; getting rid of those is responsibility of method, as it had
>   been for ->delete_inode().
> -	->drop_inode() returns int now; it's called on final iput() with inode_lock
> +	->drop_inode() returns int now; it's called on final iput() with i_lock
>   held and it returns true if filesystems wants the inode to be dropped.  As before,
The typo still exists :-) and it should read:
filesystems want
>   generic_drop_inode() is still the default and it's been updated appropriately.
>   generic_delete_inode() is also alive and it consists simply of return 1.  Note that
> @@ -318,3 +318,9 @@ if it's zero is not *and* *never* *had* *been* enough.  Final unlink() and iput(
>   may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly
>   free the on-disk inode, you may end up doing that while ->write_inode() is writing
>   to it.
> +
> +[mandatory]
> +        The i_count field in the inode has been replaced with i_ref, which is
> +a regular integer instead of an atomic_t.  Filesystems should not manipulate
> +it directly but use helpers like igrab(), iref() and iput().
> +
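
A small usage sketch might help here (my example, not from the patch):

        /* i_ref is a plain int protected by inode->i_lock, so never
         * touch it directly.  If a reference is already held, take
         * another with iref(): */
        iref(inode);
        /* ... */
        iput(inode);

        /* If the inode may be concurrently freed, igrab() checks
         * I_FREEING/I_WILL_FREE under i_lock and may return NULL: */
        if (igrab(inode)) {
                /* ... */
                iput(inode);
        }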
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index 0dbbbe4..cc0fd79 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -246,7 +246,7 @@ or bottom half).
>   	should be synchronous or not, not all filesystems check this flag.
>
>     drop_inode: called when the last access to the inode is dropped,
> -	with the inode_lock spinlock held.
> +	with the i_lock spinlock held.
>
>   	This method should be either NULL (normal UNIX filesystem
>   	semantics) or "generic_delete_inode" (for filesystems that do not
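
To illustrate (a hypothetical filesystem of mine, not from the patch):

        /* .drop_inode runs with inode->i_lock held and must not sleep;
         * generic_delete_inode means unreferenced inodes are never
         * cached */
        static const struct super_operations myfs_super_ops = {
                .statfs     = simple_statfs,
                .drop_inode = generic_delete_inode,
        };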
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 7909775..dae9871 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -58,14 +58,12 @@ static void bdev_inode_switch_bdi(struct inode *inode,
>   {
>   	struct backing_dev_info *old = inode->i_data.backing_dev_info;
>
> -	spin_lock(&inode_lock);
>   	bdi_lock_two(old, dst);
>   	inode->i_data.backing_dev_info = dst;
>   	if (!list_empty(&inode->i_wb_list))
>   		list_move(&inode->i_wb_list, &dst->wb.b_dirty);
>   	spin_unlock(&old->wb.b_lock);
>   	spin_unlock(&dst->wb.b_lock);
> -	spin_unlock(&inode_lock);
>   }
>
>   static sector_t max_block(struct block_device *bdev)
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 3e7dca2..66f7afd 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1145,7 +1145,7 @@ __getblk_slow(struct block_device *bdev, sector_t block, int size)
>    * inode list.
>    *
>    * mark_buffer_dirty() is atomic.  It takes bh->b_page->mapping->private_lock,
> - * mapping->tree_lock and the global inode_lock.
> + * and mapping->tree_lock.
>    */
>   void mark_buffer_dirty(struct buffer_head *bh)
>   {
> diff --git a/fs/drop_caches.c b/fs/drop_caches.c
> index f958dd8..bd39f65 100644
> --- a/fs/drop_caches.c
> +++ b/fs/drop_caches.c
> @@ -16,7 +16,6 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
>   {
>   	struct inode *inode, *toput_inode = NULL;
>
> -	spin_lock(&inode_lock);
>   	spin_lock(&sb->s_inodes_lock);
>   	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
>   		spin_lock(&inode->i_lock);
> @@ -28,15 +27,12 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
>   		inode->i_ref++;
>   		spin_unlock(&inode->i_lock);
>   		spin_unlock(&sb->s_inodes_lock);
> -		spin_unlock(&inode_lock);
>   		invalidate_mapping_pages(inode->i_mapping, 0, -1);
>   		iput(toput_inode);
>   		toput_inode = inode;
> -		spin_lock(&inode_lock);
>   		spin_lock(&sb->s_inodes_lock);
>   	}
>   	spin_unlock(&sb->s_inodes_lock);
> -	spin_unlock(&inode_lock);
>   	iput(toput_inode);
>   }
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 807d936..f0f5ca0 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -206,7 +206,7 @@ static void requeue_io(struct inode *inode)
>   static void inode_sync_complete(struct inode *inode)
>   {
>   	/*
> -	 * Prevent speculative execution through spin_unlock(&inode_lock);
> +	 * Prevent speculative execution through spin_unlock(&inode->i_lock);
>   	 */
>   	smp_mb();
>   	wake_up_bit(&inode->i_state, __I_SYNC);
> @@ -306,27 +306,30 @@ static void inode_wait_for_writeback(struct inode *inode)
>   	wqh = bit_waitqueue(&inode->i_state, __I_SYNC);
>   	while (inode->i_state & I_SYNC) {
>   		spin_unlock(&inode->i_lock);
> -		spin_unlock(&inode_lock);
> 		__wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE);
> -		spin_lock(&inode_lock);
>   		spin_lock(&inode->i_lock);
>   	}
>   }
>
> -/*
> - * Write out an inode's dirty pages.  Called under inode_lock.  Either the
> - * caller has a reference on the inode or the inode has I_WILL_FREE set.
> +/**
> + * sync_inode - write an inode and its pages to disk.
> + * @inode: the inode to sync
> + * @wbc: controls the writeback mode
> + *
> + * sync_inode() will write an inode and its pages to disk.  It will also
> + * correctly update the inode on its superblock's dirty inode lists and will
> + * update inode->i_state.
>    *
> - * If `wait' is set, wait on the writeout.
> + * The caller must have a ref on the inode or the inode has I_WILL_FREE set.
> + *
> + * If @wbc->sync_mode == WB_SYNC_ALL then we are doing a data integrity
> + * operation so we need to wait on the writeout.
>    *
>    * The whole writeout design is quite complex and fragile.  We want to avoid
>    * starvation of particular inodes when others are being redirtied, prevent
>    * livelocks, etc.
> - *
> - * Called under inode_lock.
>    */
> -static int
> -writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
> +int sync_inode(struct inode *inode, struct writeback_control *wbc)
>   {
>   	struct backing_dev_info *bdi = inode_to_bdi(inode);
>   	struct address_space *mapping = inode->i_mapping;
> @@ -368,7 +371,6 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>   	inode->i_state |= I_SYNC;
>   	inode->i_state &= ~I_DIRTY_PAGES;
>   	spin_unlock(&inode->i_lock);
> -	spin_unlock(&inode_lock);
>
>   	ret = do_writepages(mapping, wbc);
>
> @@ -388,12 +390,10 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>   	 * due to delalloc, clear dirty metadata flags right before
>   	 * write_inode()
>   	 */
> -	spin_lock(&inode_lock);
>   	spin_lock(&inode->i_lock);
>   	dirty = inode->i_state & I_DIRTY;
>   	inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_DATASYNC);
>   	spin_unlock(&inode->i_lock);
> -	spin_unlock(&inode_lock);
>   	/* Don't write the inode if only I_DIRTY_PAGES was set */
>   	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
>   		int err = write_inode(inode, wbc);
> @@ -401,7 +401,6 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>   			ret = err;
>   	}
>
> -	spin_lock(&inode_lock);
>   	spin_lock(&inode->i_lock);
>   	inode->i_state &= ~I_SYNC;
>   	if (!(inode->i_state & I_FREEING)) {
> @@ -460,6 +459,7 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>   	inode_sync_complete(inode);
>   	return ret;
>   }
> +EXPORT_SYMBOL(sync_inode);
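
With sync_inode() as the exported entry point now, a caller would look
roughly like this (hypothetical names, mirroring what write_inode_now()
further down does):

        static int myfs_flush_inode(struct inode *inode)
        {
                struct writeback_control wbc = {
                        .sync_mode   = WB_SYNC_ALL, /* integrity: wait on writeout */
                        .nr_to_write = LONG_MAX,
                };

                /* caller must hold a reference, or I_WILL_FREE is set */
                return sync_inode(inode, &wbc);
        }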
>
>   /*
>    * For background writeback the caller does not have the sb pinned
> @@ -552,7 +552,7 @@ static int writeback_sb_inodes(struct super_block *sb, struct bdi_writeback *wb,
>   		spin_unlock(&wb->b_lock);
>
>   		pages_skipped = wbc->pages_skipped;
> -		writeback_single_inode(inode, wbc);
> +		sync_inode(inode, wbc);
>   		if (wbc->pages_skipped != pages_skipped) {
>   			/*
>   			 * writeback is not making progress due to locked
> @@ -562,10 +562,8 @@ static int writeback_sb_inodes(struct super_block *sb, struct bdi_writeback *wb,
>   			redirty_tail(inode);
>   			spin_unlock(&wb->b_lock);
>   		}
> -		spin_unlock(&inode_lock);
>   		iput(inode);
>   		cond_resched();
> -		spin_lock(&inode_lock);
>   		spin_lock(&wb->b_lock);
>   		if (wbc->nr_to_write <= 0) {
>   			wbc->more_io = 1;
> @@ -585,9 +583,7 @@ void writeback_inodes_wb(struct bdi_writeback *wb,
>
>   	if (!wbc->wb_start)
>   		wbc->wb_start = jiffies; /* livelock avoidance */
> -	spin_lock(&inode_lock);
>   	spin_lock(&wb->b_lock);
> -
>   	if (!wbc->for_kupdate || list_empty(&wb->b_io))
>   		queue_io(wb, wbc->older_than_this);
>
> @@ -607,7 +603,6 @@ void writeback_inodes_wb(struct bdi_writeback *wb,
>   			break;
>   	}
>   	spin_unlock(&wb->b_lock);
> -	spin_unlock(&inode_lock);
>   	/* Leave any unwritten inodes on b_io */
>   }
>
> @@ -616,13 +611,11 @@ static void __writeback_inodes_sb(struct super_block *sb,
>   {
>   	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>
> -	spin_lock(&inode_lock);
>   	spin_lock(&wb->b_lock);
>   	if (!wbc->for_kupdate || list_empty(&wb->b_io))
>   		queue_io(wb, wbc->older_than_this);
>   	writeback_sb_inodes(sb, wb, wbc, true);
>   	spin_unlock(&wb->b_lock);
> -	spin_unlock(&inode_lock);
>   }
>
>   /*
> @@ -732,7 +725,6 @@ static long wb_writeback(struct bdi_writeback *wb,
>   		 * become available for writeback. Otherwise
>   		 * we'll just busyloop.
>   		 */
> -		spin_lock(&inode_lock);
>   		if (!list_empty(&wb->b_more_io))  {
>   			spin_lock(&wb->b_lock);
>   			inode = list_entry(wb->b_more_io.prev,
> @@ -743,7 +735,6 @@ static long wb_writeback(struct bdi_writeback *wb,
>   			inode_wait_for_writeback(inode);
>   			spin_unlock(&inode->i_lock);
>   		}
> -		spin_unlock(&inode_lock);
>   	}
>
>   	return wrote;
> @@ -1006,7 +997,6 @@ void __mark_inode_dirty(struct inode *inode, int flags)
>   	if (unlikely(block_dump))
>   		block_dump___mark_inode_dirty(inode);
>
> -	spin_lock(&inode_lock);
>   	spin_lock(&inode->i_lock);
>   	if ((inode->i_state & flags) != flags) {
>   		const int was_dirty = inode->i_state & I_DIRTY;
> @@ -1064,8 +1054,6 @@ void __mark_inode_dirty(struct inode *inode, int flags)
>   out_unlock:
>   	spin_unlock(&inode->i_lock);
>   out:
> -	spin_unlock(&inode_lock);
> -
>   	if (wakeup_bdi)
>   		bdi_wakeup_thread_delayed(bdi);
>   }
> @@ -1098,7 +1086,6 @@ static void wait_sb_inodes(struct super_block *sb)
>   	 */
>   	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>
> -	spin_lock(&inode_lock);
>   	spin_lock(&sb->s_inodes_lock);
>
>   	/*
> @@ -1121,14 +1108,12 @@ static void wait_sb_inodes(struct super_block *sb)
>   		inode->i_ref++;
>   		spin_unlock(&inode->i_lock);
>   		spin_unlock(&sb->s_inodes_lock);
> -		spin_unlock(&inode_lock);
>   		/*
> -		 * We hold a reference to 'inode' so it couldn't have
> -		 * been removed from s_inodes list while we dropped the
> -		 * inode_lock.  We cannot iput the inode now as we can
> -		 * be holding the last reference and we cannot iput it
> -		 * under inode_lock. So we keep the reference and iput
> -		 * it later.
> +		 * We hold a reference to 'inode' so it couldn't have been
> +		 * removed from s_inodes list while we dropped the
> +		 * s_inodes_lock.  We cannot iput the inode now as we can be
> +		 * holding the last reference and we cannot iput it under
> +		 * s_inodes_lock. So we keep the reference and iput it later.
>   		 */
>   		iput(old_inode);
>   		old_inode = inode;
> @@ -1137,11 +1122,9 @@ static void wait_sb_inodes(struct super_block *sb)
>
>   		cond_resched();
>
> -		spin_lock(&inode_lock);
>   		spin_lock(&sb->s_inodes_lock);
>   	}
>   	spin_unlock(&sb->s_inodes_lock);
> -	spin_unlock(&inode_lock);
>   	iput(old_inode);
>   }
>
> @@ -1244,33 +1227,9 @@ int write_inode_now(struct inode *inode, int sync)
>   		wbc.nr_to_write = 0;
>
>   	might_sleep();
> -	spin_lock(&inode_lock);
> -	ret = writeback_single_inode(inode, &wbc);
> -	spin_unlock(&inode_lock);
> +	ret = sync_inode(inode, &wbc);
>   	if (sync)
>   		inode_sync_wait(inode);
>   	return ret;
>   }
>   EXPORT_SYMBOL(write_inode_now);
> -
> -/**
> - * sync_inode - write an inode and its pages to disk.
> - * @inode: the inode to sync
> - * @wbc: controls the writeback mode
> - *
> - * sync_inode() will write an inode and its pages to disk.  It will also
> - * correctly update the inode on its superblock's dirty inode lists and will
> - * update inode->i_state.
> - *
> - * The caller must have a ref on the inode.
> - */
> -int sync_inode(struct inode *inode, struct writeback_control *wbc)
> -{
> -	int ret;
> -
> -	spin_lock(&inode_lock);
> -	ret = writeback_single_inode(inode, wbc);
> -	spin_unlock(&inode_lock);
> -	return ret;
> -}
> -EXPORT_SYMBOL(sync_inode);
> diff --git a/fs/inode.c b/fs/inode.c
> index b33b57c..0046ea8 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -44,24 +44,20 @@
>    *   inode_lru, i_lru
>    *
>    * Lock orders
> - * inode_lock
> - *   inode hash bucket lock
> - *     inode->i_lock
> + * inode hash bucket lock
> + *   inode->i_lock
>    *
> - * inode_lock
> - *   sb inode lock
> - *     inode_lru_lock
> - *       wb->b_lock
> - *         inode->i_lock
> + * sb inode lock
> + *   inode_lru_lock
> + *     wb->b_lock
> + *       inode->i_lock
>    *
> - * inode_lock
> - *   wb->b_lock
> - *     sb_lock (pin sb for writeback)
> - *     inode->i_lock
> + * wb->b_lock
> + *   sb_lock (pin sb for writeback)
> + *   inode->i_lock
>    *
> - * inode_lock
> - *   inode_lru
> - *     inode->i_lock
> + * inode_lru
> + *   inode->i_lock
>    */
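
For what it's worth, the list walkers follow that ordering the same way;
condensed from drop_pagecache_sb() above (simplified sketch, the real
code keeps the pinned inode around and iputs it only after retaking the
list lock):

        spin_lock(&sb->s_inodes_lock);
        list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
                spin_lock(&inode->i_lock);
                if (inode->i_state & (I_FREEING | I_WILL_FREE)) {
                        spin_unlock(&inode->i_lock);
                        continue;
                }
                inode->i_ref++;                 /* pin it under i_lock */
                spin_unlock(&inode->i_lock);
                spin_unlock(&sb->s_inodes_lock);

                /* blocking work only with all spinlocks dropped */

                spin_lock(&sb->s_inodes_lock);
        }
        spin_unlock(&sb->s_inodes_lock);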
>   /*
>    * This is needed for the following functions:
> @@ -114,14 +110,6 @@ static LIST_HEAD(inode_lru);
>   static DEFINE_SPINLOCK(inode_lru_lock);
>
>   /*
> - * A simple spinlock to protect the list manipulations.
> - *
> - * NOTE! You also have to own the lock if you change
> - * the i_state of an inode while it is in use..
> - */
> -DEFINE_SPINLOCK(inode_lock);
> -
> -/*
>    * iprune_sem provides exclusion between the kswapd or try_to_free_pages
>    * icache shrinking path, and the umount path.  Without this exclusion,
>    * by the time prune_icache calls iput for the inode whose pages it has
> @@ -355,11 +343,9 @@ static void init_once(void *foo)
>   void iref(struct inode *inode)
>   {
>   	WARN_ON(inode->i_ref < 1);
> -	spin_lock(&inode_lock);
>   	spin_lock(&inode->i_lock);
>   	inode->i_ref++;
>   	spin_unlock(&inode->i_lock);
> -	spin_unlock(&inode_lock);
>   }
>   EXPORT_SYMBOL_GPL(iref);
>
> @@ -387,28 +373,21 @@ void inode_lru_list_del(struct inode *inode)
>   	spin_unlock(&inode_lru_lock);
>   }
>
> -static void __inode_sb_list_add(struct inode *inode)
> -{
> -	struct super_block *sb = inode->i_sb;
> -
> -	spin_lock(&sb->s_inodes_lock);
> -	list_add(&inode->i_sb_list, &sb->s_inodes);
> -	spin_unlock(&sb->s_inodes_lock);
> -}
> -
>   /**
>    * inode_sb_list_add - add inode to the superblock list of inodes
>    * @inode: inode to add
>    */
>   void inode_sb_list_add(struct inode *inode)
>   {
> -	spin_lock(&inode_lock);
> -	__inode_sb_list_add(inode);
> -	spin_unlock(&inode_lock);
> +	struct super_block *sb = inode->i_sb;
> +
> +	spin_lock(&sb->s_inodes_lock);
> +	list_add(&inode->i_sb_list, &sb->s_inodes);
> +	spin_unlock(&sb->s_inodes_lock);
>   }
>   EXPORT_SYMBOL_GPL(inode_sb_list_add);
>
> -static void __inode_sb_list_del(struct inode *inode)
> +static void inode_sb_list_del(struct inode *inode)
>   {
>   	struct super_block *sb = inode->i_sb;
>
> @@ -439,22 +418,17 @@ void __insert_inode_hash(struct inode *inode, unsigned long hashval)
>   {
>   	struct hlist_bl_head *b = inode_hashtable + hash(inode->i_sb, hashval);
>
> -	spin_lock(&inode_lock);
>   	hlist_bl_lock(b);
>   	hlist_bl_add_head(&inode->i_hash, b);
>   	hlist_bl_unlock(b);
> -	spin_unlock(&inode_lock);
>   }
>   EXPORT_SYMBOL(__insert_inode_hash);
>
>   /**
> - *	__remove_inode_hash - remove an inode from the hash
> + *	remove_inode_hash - remove an inode from the hash
>    *	@inode: inode to unhash
> - *
> - *	Remove an inode from the superblock. inode->i_lock must be
> - *	held.
>    */
> -static void __remove_inode_hash(struct inode *inode)
> +void remove_inode_hash(struct inode *inode)
>   {
>   	struct hlist_bl_head *b;
>
> @@ -463,19 +437,6 @@ static void __remove_inode_hash(struct inode *inode)
>   	hlist_bl_del_init(&inode->i_hash);
>   	hlist_bl_unlock(b);
>   }
> -
> -/**
> - *	remove_inode_hash - remove an inode from the hash
> - *	@inode: inode to unhash
> - *
> - *	Remove an inode from the superblock.
> - */
> -void remove_inode_hash(struct inode *inode)
> -{
> -	spin_lock(&inode_lock);
> -	__remove_inode_hash(inode);
> -	spin_unlock(&inode_lock);
> -}
>   EXPORT_SYMBOL(remove_inode_hash);
>
>   void end_writeback(struct inode *inode)
> @@ -526,10 +487,8 @@ static void dispose_list(struct list_head *head)
>
>   		evict(inode);
>
> -		spin_lock(&inode_lock);
> -		__remove_inode_hash(inode);
> -		__inode_sb_list_del(inode);
> -		spin_unlock(&inode_lock);
> +		remove_inode_hash(inode);
> +		inode_sb_list_del(inode);
>
>   		spin_lock(&inode->i_lock);
>   		wake_up_bit(&inode->i_state, __I_NEW);
> @@ -558,7 +517,6 @@ static int invalidate_list(struct super_block *sb, struct list_head *head,
>   		 * change during umount anymore, and because iprune_sem keeps
>   		 * shrink_icache_memory() away.
>   		 */
> -		cond_resched_lock(&inode_lock);
>   		cond_resched_lock(&sb->s_inodes_lock);
>
>   		next = next->next;
> @@ -609,12 +567,10 @@ int invalidate_inodes(struct super_block *sb)
>   	LIST_HEAD(throw_away);
>
>   	down_write(&iprune_sem);
> -	spin_lock(&inode_lock);
>   	spin_lock(&sb->s_inodes_lock);
>   	fsnotify_unmount_inodes(&sb->s_inodes);
>   	busy = invalidate_list(sb, &sb->s_inodes, &throw_away);
>   	spin_unlock(&sb->s_inodes_lock);
> -	spin_unlock(&inode_lock);
>
>   	dispose_list(&throw_away);
>   	up_write(&iprune_sem);
> @@ -625,7 +581,7 @@ EXPORT_SYMBOL(invalidate_inodes);
>
>   /*
>    * Scan `goal' inodes on the unused list for freeable ones. They are moved to a
> - * temporary list and then are freed outside inode_lock by dispose_list().
> + * temporary list and then are freed outside locks by dispose_list().
>    *
>    * Any inodes which are pinned purely because of attached pagecache have their
>    * pagecache removed.  If the inode has metadata buffers attached to
> @@ -646,7 +602,6 @@ static void prune_icache(int nr_to_scan)
>   	unsigned long reap = 0;
>
>   	down_read(&iprune_sem);
> -	spin_lock(&inode_lock);
>   	spin_lock(&inode_lru_lock);
>   	for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++) {
>   		struct inode *inode;
> @@ -679,7 +634,6 @@ static void prune_icache(int nr_to_scan)
>   			inode->i_ref++;
>   			spin_unlock(&inode->i_lock);
>   			spin_unlock(&inode_lru_lock);
> -			spin_unlock(&inode_lock);
>   			if (remove_inode_buffers(inode))
>   				reap += invalidate_mapping_pages(&inode->i_data,
>   								0, -1);
> @@ -695,7 +649,6 @@ static void prune_icache(int nr_to_scan)
>   			 * the I_REFERENCED flag on the next pass and do the
>   			 * same. Either way, we won't spin on it in this loop.
>   			 */
> -			spin_lock(&inode_lock);
>   			spin_lock(&inode_lru_lock);
>   			continue;
>   		}
> @@ -718,7 +671,6 @@ static void prune_icache(int nr_to_scan)
>   	else
>   		__count_vm_events(PGINODESTEAL, reap);
>   	spin_unlock(&inode_lru_lock);
> -	spin_unlock(&inode_lock);
>
>   	dispose_list(&freeable);
>   	up_read(&iprune_sem);
> @@ -868,19 +820,15 @@ struct inode *new_inode(struct super_block *sb)
>   {
>   	struct inode *inode;
>
> -	spin_lock_prefetch(&inode_lock);
> -
>   	inode = alloc_inode(sb);
>   	if (inode) {
> -		spin_lock(&inode_lock);
>   		/*
>   		 * set the inode state before we make the inode accessible to
>   		 * the outside world.
>   		 */
>   		inode->i_ino = get_next_ino();
>   		inode->i_state = 0;
> -		__inode_sb_list_add(inode);
> -		spin_unlock(&inode_lock);
> +		inode_sb_list_add(inode);
>   	}
>   	return inode;
>   }
> @@ -938,7 +886,6 @@ static struct inode *get_new_inode(struct super_block *sb,
>   	if (inode) {
>   		struct inode *old;
>
> -		spin_lock(&inode_lock);
>   		hlist_bl_lock(b);
>   		/* We released the lock, so.. */
>   		old = find_inode(sb, b, test, data);
> @@ -953,8 +900,7 @@ static struct inode *get_new_inode(struct super_block *sb,
>   			inode->i_state = I_NEW;
>   			hlist_bl_add_head(&inode->i_hash, b);
>   			hlist_bl_unlock(b);
> -			__inode_sb_list_add(inode);
> -			spin_unlock(&inode_lock);
> +			inode_sb_list_add(inode);
>
>   			/* Return the locked inode with I_NEW set, the
>   			 * caller is responsible for filling in the contents
> @@ -968,7 +914,6 @@ static struct inode *get_new_inode(struct super_block *sb,
>   		 * allocated.
>   		 */
>   		hlist_bl_unlock(b);
> -		spin_unlock(&inode_lock);
>   		destroy_inode(inode);
>   		inode = old;
>   		wait_on_inode(inode);
> @@ -977,7 +922,6 @@ static struct inode *get_new_inode(struct super_block *sb,
>
>   set_failed:
>   	hlist_bl_unlock(b);
> -	spin_unlock(&inode_lock);
>   	destroy_inode(inode);
>   	return NULL;
>   }
> @@ -995,7 +939,6 @@ static struct inode *get_new_inode_fast(struct super_block *sb,
>   	if (inode) {
>   		struct inode *old;
>
> -		spin_lock(&inode_lock);
>   		hlist_bl_lock(b);
>   		/* We released the lock, so.. */
>   		old = find_inode_fast(sb, b, ino);
> @@ -1008,8 +951,7 @@ static struct inode *get_new_inode_fast(struct super_block *sb,
>   			inode->i_state = I_NEW;
>   			hlist_bl_add_head(&inode->i_hash, b);
>   			hlist_bl_unlock(b);
> -			__inode_sb_list_add(inode);
> -			spin_unlock(&inode_lock);
> +			inode_sb_list_add(inode);
>
>   			/* Return the locked inode with I_NEW set, the
>   			 * caller is responsible for filling in the contents
> @@ -1023,7 +965,6 @@ static struct inode *get_new_inode_fast(struct super_block *sb,
>   		 * allocated.
>   		 */
>   		hlist_bl_unlock(b);
> -		spin_unlock(&inode_lock);
>   		destroy_inode(inode);
>   		inode = old;
>   		wait_on_inode(inode);
> @@ -1081,7 +1022,6 @@ ino_t iunique(struct super_block *sb, ino_t max_reserved)
>   	static unsigned int counter;
>   	ino_t res;
>
> -	spin_lock(&inode_lock);
>   	spin_lock(&iunique_lock);
>   	do {
>   		if (counter <= max_reserved)
> @@ -1089,7 +1029,6 @@ ino_t iunique(struct super_block *sb, ino_t max_reserved)
>   		res = counter++;
>   	} while (!test_inode_iunique(sb, res));
>   	spin_unlock(&iunique_lock);
> -	spin_unlock(&inode_lock);
>
>   	return res;
>   }
> @@ -1097,7 +1036,6 @@ EXPORT_SYMBOL(iunique);
>
>   struct inode *igrab(struct inode *inode)
>   {
> -	spin_lock(&inode_lock);
>   	spin_lock(&inode->i_lock);
>   	if (!(inode->i_state & (I_FREEING|I_WILL_FREE))) {
>   		inode->i_ref++;
> @@ -1111,7 +1049,6 @@ struct inode *igrab(struct inode *inode)
>   		 */
>   		inode = NULL;
>   	}
> -	spin_unlock(&inode_lock);
>   	return inode;
>   }
>   EXPORT_SYMBOL(igrab);
> @@ -1133,7 +1070,7 @@ EXPORT_SYMBOL(igrab);
>    *
>    * Otherwise NULL is returned.
>    *
> - * Note, @test is called with the inode_lock held, so can't sleep.
> + * Note, @test is called with the i_lock held, so can't sleep.
>    */
>   static struct inode *ifind(struct super_block *sb,
>   		struct hlist_bl_head *b,
> @@ -1142,11 +1079,9 @@ static struct inode *ifind(struct super_block *sb,
>   {
>   	struct inode *inode;
>
> -	spin_lock(&inode_lock);
>   	hlist_bl_lock(b);
>   	inode = find_inode(sb, b, test, data);
>   	hlist_bl_unlock(b);
> -	spin_unlock(&inode_lock);
>
>   	if (inode && likely(wait))
>   		wait_on_inode(inode);
> @@ -1174,11 +1109,9 @@ static struct inode *ifind_fast(struct super_block *sb,
>   {
>   	struct inode *inode;
>
> -	spin_lock(&inode_lock);
>   	hlist_bl_lock(b);
>   	inode = find_inode_fast(sb, b, ino);
>   	hlist_bl_unlock(b);
> -	spin_unlock(&inode_lock);
>
>   	if (inode)
>   		wait_on_inode(inode);
> @@ -1204,7 +1137,7 @@ static struct inode *ifind_fast(struct super_block *sb,
>    *
>    * Otherwise NULL is returned.
>    *
> - * Note, @test is called with the inode_lock held, so can't sleep.
> + * Note, @test is called with the i_lock held, so can't sleep.
>    */
>   struct inode *ilookup5_nowait(struct super_block *sb, unsigned long hashval,
>   		int (*test)(struct inode *, void *), void *data)
> @@ -1232,7 +1165,7 @@ EXPORT_SYMBOL(ilookup5_nowait);
>    *
>    * Otherwise NULL is returned.
>    *
> - * Note, @test is called with the inode_lock held, so can't sleep.
> + * Note, @test is called with the i_lock held, so can't sleep.
>    */
>   struct inode *ilookup5(struct super_block *sb, unsigned long hashval,
>   		int (*test)(struct inode *, void *), void *data)
> @@ -1283,7 +1216,7 @@ EXPORT_SYMBOL(ilookup);
>    * inode and this is returned locked, hashed, and with the I_NEW flag set. The
>    * file system gets to fill it in before unlocking it via unlock_new_inode().
>    *
> - * Note both @test and @set are called with the inode_lock held, so can't sleep.
> + * Note both @test and @set are called with the i_lock held, so can't sleep.
As mentioned before, maybe this would be more consistent with the other
notes below:
NOTE: Both @test and @set are called with the i_lock held, so can't sleep.
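
Since both callbacks run under i_lock, they end up having this shape
(hypothetical myfs, with MYFS_I() being the usual container_of() helper):

        /* neither callback may sleep or allocate with GFP_KERNEL */
        static int myfs_iget_test(struct inode *inode, void *opaque)
        {
                return MYFS_I(inode)->object_id == *(u64 *)opaque;
        }

        static int myfs_iget_set(struct inode *inode, void *opaque)
        {
                MYFS_I(inode)->object_id = *(u64 *)opaque;
                return 0;
        }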

[...]
> - * NOTE: This function runs with the inode_lock spin lock held so it is not
> + * NOTE: This function runs with the i_lock spin lock held so it is not
>    * allowed to sleep.
>    */
>   int ntfs_test_inode(struct inode *vi, ntfs_attr *na)
> @@ -98,7 +98,7 @@ int ntfs_test_inode(struct inode *vi, ntfs_attr *na)
>    *
>    * Return 0 on success and -errno on error.
>    *
> - * NOTE: This function runs with the inode_lock spin lock held so it is not
> + * NOTE: This function runs with the i_lock spin lock held so it is not
>    * allowed to sleep. (Hence the GFP_ATOMIC allocation.)
>    */
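
Which is a good reminder that the only allocation pattern allowed in such
callbacks is the atomic one (generic sketch, not the actual ntfs code):

        /* under a spinlock, only non-sleeping allocations are allowed */
        na = kmalloc(sizeof(*na), GFP_ATOMIC);
        if (!na)
                return -ENOMEM;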

Have fun
Christian


Thread overview: 58+ messages
2010-10-21  0:49 Inode Lock Scalability V6 Dave Chinner
2010-10-21  0:49 ` [PATCH 01/21] fs: switch bdev inode bdi's correctly Dave Chinner
2010-10-21  0:49 ` [PATCH 02/21] kernel: add bl_list Dave Chinner
2010-10-21  0:49 ` [PATCH 03/21] fs: Convert nr_inodes and nr_unused to per-cpu counters Dave Chinner
2010-10-21  0:49 ` [PATCH 04/21] fs: Implement lazy LRU updates for inodes Dave Chinner
2010-10-21  2:14   ` Christian Stroetmann
2010-10-21 10:07   ` Nick Piggin
2010-10-21 12:22     ` Christoph Hellwig
2010-10-23  9:32   ` Al Viro
2010-10-21  0:49 ` [PATCH 05/21] fs: inode split IO and LRU lists Dave Chinner
2010-10-21  0:49 ` [PATCH 06/21] fs: Clean up inode reference counting Dave Chinner
2010-10-21  1:41   ` Christoph Hellwig
2010-10-21  0:49 ` [PATCH 07/21] exofs: use iput() for inode reference count decrements Dave Chinner
2010-10-21  0:49 ` [PATCH 08/21] fs: rework icount to be a locked variable Dave Chinner
2010-10-21 19:40   ` Al Viro
2010-10-21 22:32     ` Dave Chinner
2010-10-21  0:49 ` [PATCH 09/21] fs: Factor inode hash operations into functions Dave Chinner
2010-10-21  0:49 ` [PATCH 10/21] fs: Stop abusing find_inode_fast in iunique Dave Chinner
2010-10-21  0:49 ` [PATCH 11/21] fs: move i_ref increments into find_inode/find_inode_fast Dave Chinner
2010-10-21  0:49 ` [PATCH 12/21] fs: remove inode_add_to_list/__inode_add_to_list Dave Chinner
2010-10-21  0:49 ` [PATCH 13/21] fs: Introduce per-bucket inode hash locks Dave Chinner
2010-10-21  0:49 ` [PATCH 14/21] fs: add a per-superblock lock for the inode list Dave Chinner
2010-10-21  0:49 ` [PATCH 15/21] fs: split locking of inode writeback and LRU lists Dave Chinner
2010-10-21  0:49 ` [PATCH 16/21] fs: Protect inode->i_state with the inode->i_lock Dave Chinner
2010-10-22  1:56   ` Al Viro
2010-10-22  2:26     ` Nick Piggin
2010-10-22  3:14     ` Dave Chinner
2010-10-22 10:37       ` Al Viro
2010-10-22 11:40         ` Christoph Hellwig
2010-10-23 21:40           ` Al Viro
2010-10-23 21:37         ` Al Viro
2010-10-24 14:13           ` Christoph Hellwig
2010-10-24 16:21             ` Christoph Hellwig
2010-10-24 19:17               ` Al Viro
2010-10-24 20:04                 ` Christoph Hellwig
2010-10-24 20:36                   ` Al Viro
2010-10-24  2:18       ` Nick Piggin
2010-10-21  0:49 ` [PATCH 17/21] fs: protect wake_up_inode with inode->i_lock Dave Chinner
2010-10-21  2:17   ` Christoph Hellwig
2010-10-21 13:16     ` Nick Piggin
2010-10-21  0:49 ` [PATCH 18/21] fs: introduce a per-cpu last_ino allocator Dave Chinner
2010-10-21  0:49 ` [PATCH 19/21] fs: icache remove inode_lock Dave Chinner
2010-10-21  2:14   ` Christian Stroetmann [this message]
2010-10-21  0:49 ` [PATCH 20/21] fs: Reduce inode I_FREEING and factor inode disposal Dave Chinner
2010-10-21  0:49 ` [PATCH 21/21] fs: do not assign default i_ino in new_inode Dave Chinner
2010-10-21  5:04 ` Inode Lock Scalability V7 (was V6) Dave Chinner
2010-10-21 13:20   ` Nick Piggin
2010-10-21 23:52     ` Dave Chinner
2010-10-22  0:45       ` Nick Piggin
2010-10-22  2:20         ` Al Viro
2010-10-22  2:34           ` Nick Piggin
2010-10-22  2:41             ` Nick Piggin
2010-10-22  2:48               ` Nick Piggin
2010-10-22  3:12                 ` Al Viro
2010-10-22  4:48                   ` Nick Piggin
2010-10-22  3:07             ` Al Viro
2010-10-22  4:46               ` Nick Piggin
2010-10-22  5:01                 ` Nick Piggin
