From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, davej@redhat.com,
	viro@zeniv.linux.org.uk, jack@suse.cz, glommer@parallels.com
Subject: Re: [PATCH 04/11] sync: serialise per-superblock sync operations
Date: Wed, 31 Jul 2013 17:12:49 +0200
Message-ID: <20130731151249.GH22930@quack.suse.cz>
In-Reply-To: <1375244150-27296-5-git-send-email-david@fromorbit.com>

On Wed 31-07-13 14:15:43, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When competing sync(2) calls walk the same filesystem, they need to
> walk the list of inodes on the superblock to find all the inodes
> that we need to wait for IO completion on. However, when multiple
> wait_sb_inodes() calls do this at the same time, they contend on the
> inode_sb_list_lock and the contention causes system-wide
> slowdowns. In effect, concurrent sync(2) calls can take longer and
> burn more CPU than if they were serialised.
> 
> Stop the worst of the contention by adding a per-sb mutex to wrap
> around wait_sb_inodes() so that we only execute one sync(2) IO
> completion walk per superblock at a time and hence avoid
> contention being triggered by concurrent sync(2) calls.
  The patch looks OK. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/fs-writeback.c  | 11 +++++++++++
>  fs/super.c         |  1 +
>  include/linux/fs.h |  2 ++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index ca66dc8..56272ec 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1207,6 +1207,15 @@ out_unlock_inode:
>  }
>  EXPORT_SYMBOL(__mark_inode_dirty);
>  
> +/*
> + * The @s_sync_lock is used to serialise concurrent sync operations
> + * to avoid lock contention problems with concurrent wait_sb_inodes() calls.
> + * Concurrent callers will block on the s_sync_lock rather than doing contending
> + * walks. The queueing maintains sync(2) required behaviour as all the IO that
> + * has been issued up to the time this function is entered is guaranteed to be
> + * completed by the time we have gained the lock and waited for all IO that is
> + * in progress regardless of the order callers are granted the lock.
> + */
>  static void wait_sb_inodes(struct super_block *sb)
>  {
>  	struct inode *inode, *old_inode = NULL;
> @@ -1217,6 +1226,7 @@ static void wait_sb_inodes(struct super_block *sb)
>  	 */
>  	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>  
> +	mutex_lock(&sb->s_sync_lock);
>  	spin_lock(&sb->s_inode_list_lock);
>  
>  	/*
> @@ -1258,6 +1268,7 @@ static void wait_sb_inodes(struct super_block *sb)
>  	}
>  	spin_unlock(&sb->s_inode_list_lock);
>  	iput(old_inode);
> +	mutex_unlock(&sb->s_sync_lock);
>  }
>  
>  /**
> diff --git a/fs/super.c b/fs/super.c
> index d4d753e..7f98fd6 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -200,6 +200,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
>  		s->s_bdi = &default_backing_dev_info;
>  		INIT_HLIST_NODE(&s->s_instances);
>  		INIT_HLIST_BL_HEAD(&s->s_anon);
> +		mutex_init(&s->s_sync_lock);
>  		INIT_LIST_HEAD(&s->s_inodes);
>  		spin_lock_init(&s->s_inode_list_lock);
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 923b465..971e8be 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1321,6 +1321,8 @@ struct super_block {
>  	/* Being remounted read-only */
>  	int s_readonly_remount;
>  
> +	struct mutex		s_sync_lock;	/* sync serialisation lock */
> +
>  	/*
>  	 * Keep the lru lists last in the structure so they always sit on their
>  	 * own individual cachelines.
> -- 
> 1.8.3.2
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
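
For readers who want to see the serialisation idea outside the kernel, here
is a minimal userspace sketch using pthreads. It is an illustration only and
not the patch's code: struct demo_sb, demo_wait_sb_inodes() and demo_run()
are hypothetical names, pthread mutexes stand in for both s_sync_lock and
s_inode_list_lock, and the inode walk itself is reduced to a counter. What it
shows is that with the outer lock held across the walk, at most one walker is
ever inside it, however many threads call in concurrently.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 16
#define DEMO_WALKS_PER_THREAD 1000

/* Hypothetical userspace stand-in for struct super_block. */
struct demo_sb {
	pthread_mutex_t sync_lock;	/* plays the role of s_sync_lock */
	pthread_mutex_t list_lock;	/* plays the role of s_inode_list_lock */
	int walkers;			/* walks currently in progress */
	int max_walkers;		/* highest concurrency observed */
};

static void demo_sb_init(struct demo_sb *sb)
{
	pthread_mutex_init(&sb->sync_lock, NULL);
	pthread_mutex_init(&sb->list_lock, NULL);
	sb->walkers = 0;
	sb->max_walkers = 0;
}

/* Analogue of wait_sb_inodes(): the whole walk runs under sync_lock. */
static void demo_wait_sb_inodes(struct demo_sb *sb)
{
	pthread_mutex_lock(&sb->sync_lock);

	pthread_mutex_lock(&sb->list_lock);
	sb->walkers++;
	if (sb->walkers > sb->max_walkers)
		sb->max_walkers = sb->walkers;
	pthread_mutex_unlock(&sb->list_lock);

	/* The real function walks the inode list here and waits on IO. */

	pthread_mutex_lock(&sb->list_lock);
	sb->walkers--;
	pthread_mutex_unlock(&sb->list_lock);

	pthread_mutex_unlock(&sb->sync_lock);
}

/* Each thread models one process issuing sync(2) repeatedly. */
static void *demo_sync_thread(void *arg)
{
	struct demo_sb *sb = arg;
	int i;

	for (i = 0; i < DEMO_WALKS_PER_THREAD; i++)
		demo_wait_sb_inodes(sb);
	return NULL;
}

/* Run nthreads concurrent callers; return the peak number of walkers. */
int demo_run(int nthreads)
{
	struct demo_sb sb;
	pthread_t tids[DEMO_MAX_THREADS];
	int i, ret;

	assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
	demo_sb_init(&sb);

	for (i = 0; i < nthreads; i++) {
		ret = pthread_create(&tids[i], NULL, demo_sync_thread, &sb);
		assert(ret == 0);
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	pthread_mutex_destroy(&sb.list_lock);
	pthread_mutex_destroy(&sb.sync_lock);
	return sb.max_walkers;
}
```

Built with something like "cc -pthread", calling demo_run(4) from a small
main() should report a peak of one walker. Dropping the sync_lock pair from
demo_wait_sb_inodes() lets walkers overlap and pile up on list_lock, which is
the contention pattern the commit message describes.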
