All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
	Andreas Dilger <adilger@dilger.ca>,
	Linux Filesystem Development List <linux-fsdevel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Linux btrfs Developers List <linux-btrfs@vger.kernel.org>,
	XFS Developers <xfs@oss.sgi.com>
Subject: Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option
Date: Thu, 27 Nov 2014 16:41:59 +0100	[thread overview]
Message-ID: <20141127154159.GA11922@quack.suse.cz> (raw)
In-Reply-To: <20141127152524.GB14091@thunk.org>

On Thu 27-11-14 10:25:24, Ted Tso wrote:
> This is what I'm currently playing with which I believe fixes the iput()
> problem.  In fs/ext4/inode.c:
> 
> struct other_inode {
> 	unsigned long		orig_ino;
> 	struct ext4_inode	*raw_inode;
> };
> static int other_inode_match(struct inode * inode, unsigned long ino,
> 			     void *data);
> 
> /*
>  * Opportunistically update the other time fields for other inodes in
>  * the same inode table block.
>  */
> static void ext4_update_other_inodes_time(struct super_block *sb,
> 					  unsigned long orig_ino, char *buf)
> {
> 	struct other_inode oi;
> 	unsigned long ino;
> 	int i, inodes_per_block = EXT4_SB(sb)->s_inodes_per_block;
> 	int inode_size = EXT4_INODE_SIZE(sb);
> 
> 	oi.orig_ino = orig_ino;
> 	ino = orig_ino & ~(inodes_per_block - 1);
> 	for (i = 0; i < inodes_per_block; i++, ino++, buf += inode_size) {
> 		if (ino == orig_ino)
> 			continue;
> 		oi.raw_inode = (struct ext4_inode *) buf;
> 		(void) find_inode_nowait(sb, ino, other_inode_match, &oi);
> 	}
> }
> 
> static int other_inode_match(struct inode * inode, unsigned long ino,
> 			     void *data)
> {
> 	struct other_inode *oi = (struct other_inode *) data;
> 
> 	if ((inode->i_ino != ino) ||
> 	    (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) ||
> 	    ((inode->i_state & I_DIRTY_TIME) == 0))
> 		return 0;
> 	spin_lock(&inode->i_lock);
> 	if (((inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) == 0) &&
> 	    (inode->i_state & I_DIRTY_TIME)) {
> 		struct ext4_inode_info	*ei = EXT4_I(inode);
> 
> 		inode->i_state &= ~I_DIRTY_TIME;
> 		inode->i_ts_dirty_day = 0;
> 		spin_unlock(&inode->i_lock);
> 		inode_requeue_dirtytime(inode);
> 
> 		spin_lock(&ei->i_raw_lock);
> 		EXT4_INODE_SET_XTIME(i_ctime, inode, oi->raw_inode);
> 		EXT4_INODE_SET_XTIME(i_mtime, inode, oi->raw_inode);
> 		EXT4_INODE_SET_XTIME(i_atime, inode, oi->raw_inode);
> 		ext4_inode_csum_set(inode, oi->raw_inode, ei);
> 		spin_unlock(&ei->i_raw_lock);
> 		trace_ext4_other_inode_update_time(inode, oi->orig_ino);
> 		return -1;
> 	}
> 	spin_unlock(&inode->i_lock);
> 	return -1;
> }
  Hum, but this puts lots of stuff under inode_hash_lock, including
writeback list lock. I don't like this too much. I understand that getting
handle for each inode is rather more CPU intensive but it should still be a
clear win over the current situation and avoids entangling locks like this.

								Honza

> The above uses the following in fs/inode.c (which gets added instead of
> find_active_inode_nowait):
> 
> /**
>  * find_inode_nowait - find an inode in the inode cache
>  * @sb:		super block of file system to search
>  * @hashval:	hash value (usually inode number) to search for
>  * @match:	callback used for comparisons between inodes
>  * @data:	opaque data pointer to pass to @match
>  *
>  * Search for the inode specified by @hashval and @data in the inode
>  * cache, where the helper function @match will return 0 if the inode
>  * does not match, 1 if the inode does match, and -1 if the search
>  * should be stopped.  The @match function must be responsible for
>  * taking the i_lock spin_lock and checking i_state for an inode being
>  * freed or being initialized, and incrementing the reference count
>  * before returning 1.  It also must not sleep, since it is called with
>  * the inode_hash_lock spinlock held.
>  *
>  * This is a even more generalized version of ilookup5() when the
>  * function must never block --- find_inode() can block in
>  * __wait_on_freeing_inode() --- or when the caller can not increment
>  * the reference count because the resulting iput() might cause an
>  * inode eviction().  The tradeoff is that the @match funtion must be
>  * very carefully implemented.
>  */
> struct inode *find_inode_nowait(struct super_block *sb,
> 				unsigned long hashval,
> 				int (*match)(struct inode *, unsigned long,
> 					     void *),
> 				void *data)
> {
> 	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
> 	struct inode *inode, *ret_inode = NULL;
> 	int mval;
> 
> 	spin_lock(&inode_hash_lock);
> 	hlist_for_each_entry(inode, head, i_hash) {
> 		if (inode->i_sb != sb)
> 			continue;
> 		mval = match(inode, hashval, data);
> 		if (mval == 0)
> 			continue;
> 		if (mval == 1)
> 			ret_inode = inode;
> 		goto out;
> 	}
> out:
> 	spin_unlock(&inode_hash_lock);
> 	return ret_inode;
> }
> EXPORT_SYMBOL(find_inode_nowait);
> 
> Comments?
> 
> 							- Ted
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

WARNING: multiple messages have this Message-ID (diff)
From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger@dilger.ca>, Jan Kara <jack@suse.cz>,
	XFS Developers <xfs@oss.sgi.com>,
	Linux Filesystem Development List <linux-fsdevel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Linux btrfs Developers List <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH-v4 6/7] ext4: add support for a lazytime mount option
Date: Thu, 27 Nov 2014 16:41:59 +0100	[thread overview]
Message-ID: <20141127154159.GA11922@quack.suse.cz> (raw)
In-Reply-To: <20141127152524.GB14091@thunk.org>

On Thu 27-11-14 10:25:24, Ted Tso wrote:
> This is what I'm currently playing with which I believe fixes the iput()
> problem.  In fs/ext4/inode.c:
> 
> struct other_inode {
> 	unsigned long		orig_ino;
> 	struct ext4_inode	*raw_inode;
> };
> static int other_inode_match(struct inode * inode, unsigned long ino,
> 			     void *data);
> 
> /*
>  * Opportunistically update the other time fields for other inodes in
>  * the same inode table block.
>  */
> static void ext4_update_other_inodes_time(struct super_block *sb,
> 					  unsigned long orig_ino, char *buf)
> {
> 	struct other_inode oi;
> 	unsigned long ino;
> 	int i, inodes_per_block = EXT4_SB(sb)->s_inodes_per_block;
> 	int inode_size = EXT4_INODE_SIZE(sb);
> 
> 	oi.orig_ino = orig_ino;
> 	ino = orig_ino & ~(inodes_per_block - 1);
> 	for (i = 0; i < inodes_per_block; i++, ino++, buf += inode_size) {
> 		if (ino == orig_ino)
> 			continue;
> 		oi.raw_inode = (struct ext4_inode *) buf;
> 		(void) find_inode_nowait(sb, ino, other_inode_match, &oi);
> 	}
> }
> 
> static int other_inode_match(struct inode * inode, unsigned long ino,
> 			     void *data)
> {
> 	struct other_inode *oi = (struct other_inode *) data;
> 
> 	if ((inode->i_ino != ino) ||
> 	    (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) ||
> 	    ((inode->i_state & I_DIRTY_TIME) == 0))
> 		return 0;
> 	spin_lock(&inode->i_lock);
> 	if (((inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) == 0) &&
> 	    (inode->i_state & I_DIRTY_TIME)) {
> 		struct ext4_inode_info	*ei = EXT4_I(inode);
> 
> 		inode->i_state &= ~I_DIRTY_TIME;
> 		inode->i_ts_dirty_day = 0;
> 		spin_unlock(&inode->i_lock);
> 		inode_requeue_dirtytime(inode);
> 
> 		spin_lock(&ei->i_raw_lock);
> 		EXT4_INODE_SET_XTIME(i_ctime, inode, oi->raw_inode);
> 		EXT4_INODE_SET_XTIME(i_mtime, inode, oi->raw_inode);
> 		EXT4_INODE_SET_XTIME(i_atime, inode, oi->raw_inode);
> 		ext4_inode_csum_set(inode, oi->raw_inode, ei);
> 		spin_unlock(&ei->i_raw_lock);
> 		trace_ext4_other_inode_update_time(inode, oi->orig_ino);
> 		return -1;
> 	}
> 	spin_unlock(&inode->i_lock);
> 	return -1;
> }
  Hum, but this puts lots of stuff under inode_hash_lock, including
writeback list lock. I don't like this too much. I understand that getting
handle for each inode is rather more CPU intensive but it should still be a
clear win over the current situation and avoids entangling locks like this.

								Honza

> The above uses the following in fs/inode.c (which gets added instead of
> find_active_inode_nowait):
> 
> /**
>  * find_inode_nowait - find an inode in the inode cache
>  * @sb:		super block of file system to search
>  * @hashval:	hash value (usually inode number) to search for
>  * @match:	callback used for comparisons between inodes
>  * @data:	opaque data pointer to pass to @match
>  *
>  * Search for the inode specified by @hashval and @data in the inode
>  * cache, where the helper function @match will return 0 if the inode
>  * does not match, 1 if the inode does match, and -1 if the search
>  * should be stopped.  The @match function must be responsible for
>  * taking the i_lock spin_lock and checking i_state for an inode being
>  * freed or being initialized, and incrementing the reference count
>  * before returning 1.  It also must not sleep, since it is called with
>  * the inode_hash_lock spinlock held.
>  *
>  * This is a even more generalized version of ilookup5() when the
>  * function must never block --- find_inode() can block in
>  * __wait_on_freeing_inode() --- or when the caller can not increment
>  * the reference count because the resulting iput() might cause an
>  * inode eviction().  The tradeoff is that the @match funtion must be
>  * very carefully implemented.
>  */
> struct inode *find_inode_nowait(struct super_block *sb,
> 				unsigned long hashval,
> 				int (*match)(struct inode *, unsigned long,
> 					     void *),
> 				void *data)
> {
> 	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
> 	struct inode *inode, *ret_inode = NULL;
> 	int mval;
> 
> 	spin_lock(&inode_hash_lock);
> 	hlist_for_each_entry(inode, head, i_hash) {
> 		if (inode->i_sb != sb)
> 			continue;
> 		mval = match(inode, hashval, data);
> 		if (mval == 0)
> 			continue;
> 		if (mval == 1)
> 			ret_inode = inode;
> 		goto out;
> 	}
> out:
> 	spin_unlock(&inode_hash_lock);
> 	return ret_inode;
> }
> EXPORT_SYMBOL(find_inode_nowait);
> 
> Comments?
> 
> 							- Ted
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2014-11-27 15:42 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-26 10:23 [PATCH-v4 0/7] add support for a lazytime mount option Theodore Ts'o
2014-11-26 10:23 ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 1/7] vfs: split update_time() into update_time() and write_time() Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o
2014-11-26 19:23   ` Christoph Hellwig
2014-11-26 19:23     ` Christoph Hellwig
2014-11-27 12:34     ` Jan Kara
2014-11-27 12:34       ` Jan Kara
2014-11-27 15:25       ` Christoph Hellwig
2014-11-27 15:25         ` Christoph Hellwig
2014-11-27 14:41     ` Theodore Ts'o
2014-11-27 14:41       ` Theodore Ts'o
2014-11-27 15:28       ` Christoph Hellwig
2014-11-27 15:28         ` Christoph Hellwig
2014-11-27 15:33       ` Theodore Ts'o
2014-11-27 15:33         ` Theodore Ts'o
2014-11-27 16:49         ` Christoph Hellwig
2014-11-27 16:49           ` Christoph Hellwig
2014-11-27 20:27           ` Theodore Ts'o
2014-11-27 20:27             ` Theodore Ts'o
2014-12-01  9:28             ` Christoph Hellwig
2014-12-01  9:28               ` Christoph Hellwig
2014-12-01 15:04               ` Theodore Ts'o
2014-12-01 15:04                 ` Theodore Ts'o
2014-12-01 17:18                 ` David Sterba
2014-12-01 17:18                   ` David Sterba
2014-12-02  9:20                 ` Christoph Hellwig
2014-12-02  9:20                   ` Christoph Hellwig
2014-12-02 15:09                   ` Theodore Ts'o
2014-12-02 15:09                     ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 2/7] vfs: add support for a lazytime mount option Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o
2014-11-27 13:14   ` Jan Kara
2014-11-27 13:14     ` Jan Kara
2014-11-27 20:19     ` Theodore Ts'o
2014-11-27 20:19       ` Theodore Ts'o
2014-11-28 12:41       ` Jan Kara
2014-11-28 12:41         ` Jan Kara
2014-11-27 23:00     ` Theodore Ts'o
2014-11-27 23:00       ` Theodore Ts'o
2014-11-28  5:36       ` Theodore Ts'o
2014-11-28  5:36         ` Theodore Ts'o
2014-11-28 16:24       ` Jan Kara
2014-11-28 16:24         ` Jan Kara
2014-11-26 10:23 ` [PATCH-v4 3/7] vfs: don't let the dirty time inodes get more than a day stale Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 4/7] vfs: add lazytime tracepoints for better debugging Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 5/7] vfs: add find_active_inode_nowait() function Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 6/7] ext4: add support for a lazytime mount option Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o
2014-11-26 19:24   ` Christoph Hellwig
2014-11-26 19:24     ` Christoph Hellwig
2014-11-26 22:48   ` Dave Chinner
2014-11-26 22:48     ` Dave Chinner
2014-11-26 23:10     ` Andreas Dilger
2014-11-26 23:10       ` Andreas Dilger
2014-11-26 23:35       ` Dave Chinner
2014-11-26 23:35         ` Dave Chinner
2014-11-27 13:27         ` Jan Kara
2014-11-27 13:27           ` Jan Kara
2014-11-27 13:32           ` Jan Kara
2014-11-27 13:32             ` Jan Kara
2014-11-27 15:25             ` Theodore Ts'o
2014-11-27 15:25               ` Theodore Ts'o
2014-11-27 15:41               ` Jan Kara [this message]
2014-11-27 15:41                 ` Jan Kara
2014-11-27 20:13                 ` Theodore Ts'o
2014-11-27 20:13                   ` Theodore Ts'o
2014-11-26 10:23 ` [PATCH-v4 7/7] btrfs: add an is_readonly() so btrfs can use common code for update_time() Theodore Ts'o
2014-11-26 10:23   ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141127154159.GA11922@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger@dilger.ca \
    --cc=david@fromorbit.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.