From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mel@linux.vnet.ibm.com>, Mel Gorman <mel@csn.ul.ie>,
	Dave Chinner <david@fromorbit.com>,
	Itaru Kitayama <kitayama@cl.bb4u.ne.jp>,
	Minchan Kim <minchan.kim@gmail.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Li, Shaohua" <shaohua.li@intel.com>
Subject: [RFC][PATCH v2] writeback: limit number of moved inodes in queue_io()
Date: Fri, 6 May 2011 18:06:48 +0800
Message-ID: <20110506100648.GA3435@localhost>
In-Reply-To: <20110506084238.GA487@localhost>

On Fri, May 06, 2011 at 04:42:38PM +0800, Wu Fengguang wrote:
> > patched trace-tar-dd-ext4-2.6.39-rc3+
> 
> >        flush-8:0-3048  [004]  1929.981734: writeback_queue_io: bdi 8:0: older=4296600898 age=2 enqueue=13227
> 
> > vanilla trace-tar-dd-ext4-2.6.39-rc3
> 
> >        flush-8:0-2911  [004]    77.158312: writeback_queue_io: bdi 8:0: older=0 age=-1 enqueue=18938
> 
> >        flush-8:0-2911  [000]    82.461064: writeback_queue_io: bdi 8:0: older=0 age=-1 enqueue=6957
> 
> It looks too much to move 13227 and 18938 inodes at once. So I tried
> arbitrarily limiting the max move number to 1000 and it helps reduce
> the lock hold time and contentions a lot.

Oh, it seems 1000 is too small, at least for this workload: it hurts
the dd+tar+sync total elapsed time.

no limit:
                avg        167.486 
                stddev       8.996 
limit=1000:
                avg        171.222 
                stddev       5.588 
limit=3000:
                avg        165.335 
                stddev       5.503 

So use 3000 as the new limit.
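
To put the averages above in proportion, here is a minimal sketch that
expresses each limit's average as a percentage change against the
no-limit run. The function names are hypothetical and used only for
illustration; the numbers are the ones quoted above. Note that the
differences (about 1-2%) are smaller than the reported stddev values,
so the ranking is suggestive rather than conclusive.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of `value` against `baseline`, in percent. */
double pct_change(double baseline, double value)
{
	assert(baseline != 0.0);
	return (value - baseline) / baseline * 100.0;
}

/* Print each limit's average elapsed time relative to the no-limit run. */
void print_elapsed_summary(void)
{
	const double no_limit = 167.486;	/* avg, no limit */
	const double limit_1000 = 171.222;	/* avg, limit=1000 */
	const double limit_3000 = 165.335;	/* avg, limit=3000 */

	printf("limit=1000: %+.2f%%\n", pct_change(no_limit, limit_1000));
	printf("limit=3000: %+.2f%%\n", pct_change(no_limit, limit_3000));
}
```

Calling print_elapsed_summary() from a small main() is enough to
reproduce the comparison.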

Thanks,
Fengguang
---
Subject: writeback: limit number of moved inodes in queue_io()
Date: Fri May 06 13:34:08 CST 2011

Move at most 3000 inodes from b_dirty to b_io at a time. In a simple
dd+tar workload on an 8p test box, this cuts the maximum lock hold time
about 7x (2704.76 -> 384.53) and roughly halves lock contentions
(2065 -> 1092). Without the limit, this workload was observed to move
10000+ inodes in one shot on ext4, which is clearly too much.

                              class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-b
ounces   acquisitions   holdtime-min   holdtime-max holdtime-total
----------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------
vanilla 2.6.39-rc3:
                      inode_wb_list_lock:          2063           2065           0.12        2648.66        5948.99          27475         943778           0.09        2704.76      498340.24
                      ------------------
                      inode_wb_list_lock             89          [<ffffffff8115cf3a>] sync_inode+0x28/0x5f
                      inode_wb_list_lock             38          [<ffffffff8115ccab>] inode_wait_for_writeback+0xa8/0xc6
                      inode_wb_list_lock            629          [<ffffffff8115da35>] __mark_inode_dirty+0x170/0x1d0
                      inode_wb_list_lock            842          [<ffffffff8115d334>] writeback_sb_inodes+0x10f/0x157
                      ------------------
                      inode_wb_list_lock            891          [<ffffffff8115ce3e>] writeback_single_inode+0x175/0x249
                      inode_wb_list_lock             13          [<ffffffff8115dc4e>] writeback_inodes_wb+0x3a/0x143
                      inode_wb_list_lock            499          [<ffffffff8115da35>] __mark_inode_dirty+0x170/0x1d0
                      inode_wb_list_lock            617          [<ffffffff8115d334>] writeback_sb_inodes+0x10f/0x157

limit=1000:

dd+tar+sync total elapsed time (10 runs):
				avg        171.222 
				stddev       5.588 

                &(&wb->list_lock)->rlock:           842            842           0.14         101.10        1013.34          20489         970892           0.09         234.11      509829.79
                ------------------------
                &(&wb->list_lock)->rlock            275          [<ffffffff8115db09>] __mark_inode_dirty+0x173/0x1cf
                &(&wb->list_lock)->rlock            114          [<ffffffff8115cdd3>] writeback_single_inode+0x18a/0x27e
                &(&wb->list_lock)->rlock             56          [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
                &(&wb->list_lock)->rlock            132          [<ffffffff8115cf2a>] sync_inode+0x63/0xa2
                ------------------------
                &(&wb->list_lock)->rlock              2          [<ffffffff8115dfea>] inode_wb_list_del+0x5f/0x85
                &(&wb->list_lock)->rlock             33          [<ffffffff8115cf2a>] sync_inode+0x63/0xa2
                &(&wb->list_lock)->rlock              9          [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
                &(&wb->list_lock)->rlock            430          [<ffffffff8115cdd3>] writeback_single_inode+0x18a/0x27e

limit=3000:

dd+tar+sync total elapsed time (10 runs):
				avg        165.335
				stddev       5.503

                &(&wb->list_lock)->rlock:          1088           1092           0.11         245.08        3268.75          21124        1718636           0.09         384.53      849827.20
                ------------------------
                &(&wb->list_lock)->rlock            518          [<ffffffff8115db09>] __mark_inode_dirty+0x173/0x1cf
                &(&wb->list_lock)->rlock              3          [<ffffffff8115dfea>] inode_wb_list_del+0x5f/0x85
                &(&wb->list_lock)->rlock             54          [<ffffffff8115cf2a>] sync_inode+0x63/0xa2
                &(&wb->list_lock)->rlock             10          [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
                ------------------------
                &(&wb->list_lock)->rlock              4          [<ffffffff8115dfea>] inode_wb_list_del+0x5f/0x85
                &(&wb->list_lock)->rlock            379          [<ffffffff8115db09>] __mark_inode_dirty+0x173/0x1cf
                &(&wb->list_lock)->rlock              4          [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
                &(&wb->list_lock)->rlock            446          [<ffffffff8115cdd3>] writeback_single_inode+0x18a/0x27e
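
The three lock_stat summary rows can be compared more directly by
deriving a few ratios from them. The sketch below is only an
illustration: the struct and function names are hypothetical, and the
values are copied from the summary rows above. Treat the results as
rough, since the vanilla row is for inode_wb_list_lock while the
patched rows are for wb->list_lock, and the acquisition counts differ
between runs.

```c
#include <assert.h>
#include <stdio.h>

/* One lock_stat summary row, reduced to the columns used below. */
struct lock_row {
	const char *label;
	double contentions;
	double acquisitions;
	double holdtime_max;
	double holdtime_total;
};

/* Values copied from the three lock_stat summaries above. */
const struct lock_row lock_rows[] = {
	{ "vanilla",    2065,  943778, 2704.76, 498340.24 },
	{ "limit=1000",  842,  970892,  234.11, 509829.79 },
	{ "limit=3000", 1092, 1718636,  384.53, 849827.20 },
};

/* Mean hold time per acquisition, in lock_stat time units. */
double avg_hold(const struct lock_row *r)
{
	assert(r->acquisitions > 0);
	return r->holdtime_total / r->acquisitions;
}

/* How many times shorter the worst-case hold is than the baseline's. */
double max_hold_reduction(const struct lock_row *base,
			  const struct lock_row *r)
{
	assert(r->holdtime_max > 0);
	return base->holdtime_max / r->holdtime_max;
}

/* Print the derived figures for every row, using vanilla as baseline. */
void print_lock_summary(void)
{
	size_t n = sizeof(lock_rows) / sizeof(lock_rows[0]);

	for (size_t i = 0; i < n; i++) {
		const struct lock_row *r = &lock_rows[i];

		printf("%-10s avg hold %.3f, max hold %.1fx shorter, "
		       "%.2f contentions per 1000 acquisitions\n",
		       r->label, avg_hold(r),
		       max_hold_reduction(&lock_rows[0], r),
		       r->contentions / r->acquisitions * 1000.0);
	}
}
```

The mean hold time barely moves between runs; the limit mainly trims
the worst-case hold, which is what the holdtime-max column shows.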

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/fs-writeback.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-next.orig/fs/fs-writeback.c	2011-05-06 13:32:41.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2011-05-06 16:44:58.000000000 +0800
@@ -279,6 +279,8 @@ static int move_expired_inodes(struct li
 		sb = inode->i_sb;
 		list_move(&inode->i_wb_list, &tmp);
 		moved++;
+		if (unlikely(moved >= 3000))	/* limit spinlock hold time */
+			break;
 	}
 
 	/* just one sb in list, splice to dispatch_queue and we're done */
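
For readers who want to try the idea outside the kernel, here is a
minimal userspace sketch of a capped batch move between two circular
doubly linked lists. It is not the kernel code: struct node, the list
helpers and move_bounded() are hypothetical stand-ins, the expiry-time
check done by move_expired_inodes() is left out, and the caller is
assumed to hold whatever lock protects both lists.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static int list_empty(const struct node *head)
{
	return head->next == head;
}

/* Insert `n` right after `head`. */
static void list_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink `n` from whatever list it is on. */
static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Unlink `n` and re-insert it right after `head`. */
static void list_move(struct node *n, struct node *head)
{
	list_del(n);
	list_add(n, head);
}

/*
 * Move at most `max` entries from the tail of `src` to `dst` and return
 * the number moved. Capping the batch bounds how long the caller's lock
 * is held; anything left on `src` waits for a later call.
 */
size_t move_bounded(struct node *src, struct node *dst, size_t max)
{
	size_t moved = 0;

	while (moved < max && !list_empty(src)) {
		list_move(src->prev, dst);	/* take from the tail */
		moved++;
	}
	return moved;
}
```

With five queued entries and max = 3, the first call moves three and a
second call moves the remaining two. The cap trades one long critical
section for several shorter ones.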

