From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mel@linux.vnet.ibm.com>, Mel Gorman <mel@csn.ul.ie>,
Dave Chinner <david@fromorbit.com>,
Itaru Kitayama <kitayama@cl.bb4u.ne.jp>,
Minchan Kim <minchan.kim@gmail.com>,
Linux Memory Management List <linux-mm@kvack.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
"Li, Shaohua" <shaohua.li@intel.com>
Subject: [RFC][PATCH v2] writeback: limit number of moved inodes in queue_io()
Date: Fri, 6 May 2011 18:06:48 +0800
Message-ID: <20110506100648.GA3435@localhost>
In-Reply-To: <20110506084238.GA487@localhost>
On Fri, May 06, 2011 at 04:42:38PM +0800, Wu Fengguang wrote:
> > patched trace-tar-dd-ext4-2.6.39-rc3+
>
> > flush-8:0-3048 [004] 1929.981734: writeback_queue_io: bdi 8:0: older=4296600898 age=2 enqueue=13227
>
> > vanilla trace-tar-dd-ext4-2.6.39-rc3
>
> > flush-8:0-2911 [004] 77.158312: writeback_queue_io: bdi 8:0: older=0 age=-1 enqueue=18938
>
> > flush-8:0-2911 [000] 82.461064: writeback_queue_io: bdi 8:0: older=0 age=-1 enqueue=6957
>
> It looks too much to move 13227 and 18938 inodes at once. So I tried
> arbitrarily limiting the max move number to 1000 and it helps reduce
> the lock hold time and contentions a lot.
Oh, it seems 1000 is too small, at least for this workload: it hurts
the dd+tar+sync total elapsed time.
no limit:
        avg     167.486
        stddev    8.996
limit=1000:
        avg     171.222
        stddev    5.588
limit=3000:
        avg     165.335
        stddev    5.503
So use 3000 as the new limit.
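To put the three averages next to their spread, here is a minimal sketch
in C that expresses each limit's change against the no-limit run, both in
percent and in units of the no-limit stddev. The struct and function names
are made up for illustration; only the numbers come from the results above.

```c
#include <assert.h>
#include <stdio.h>

/* Mean and standard deviation of dd+tar+sync elapsed time for one setup. */
struct run_stats {
	const char *name;
	double avg;
	double stddev;
};

/* Figures copied from the results above; index 0 is the baseline. */
static const struct run_stats runs[] = {
	{ "no limit",   167.486, 8.996 },
	{ "limit=1000", 171.222, 5.588 },
	{ "limit=3000", 165.335, 5.503 },
};

/* Change of r->avg relative to the baseline average, in percent. */
static double rel_change_pct(const struct run_stats *base,
			     const struct run_stats *r)
{
	assert(base->avg > 0.0);
	return (r->avg - base->avg) / base->avg * 100.0;
}

/* Same difference, expressed in units of the baseline stddev. */
static double delta_in_stddevs(const struct run_stats *base,
			       const struct run_stats *r)
{
	assert(base->stddev > 0.0);
	return (r->avg - base->avg) / base->stddev;
}

/* Print one line per non-baseline setup; returns the number of lines. */
static int report(void)
{
	int i, n = (int)(sizeof(runs) / sizeof(runs[0]));

	for (i = 1; i < n; i++)
		printf("%-10s %+6.2f%%  %+5.2f stddev\n", runs[i].name,
		       rel_change_pct(&runs[0], &runs[i]),
		       delta_in_stddevs(&runs[0], &runs[i]));
	return n - 1;
}
```

Calling report() from a main() shows limit=1000 roughly 2.2% slower and
limit=3000 roughly 1.3% faster than no limit, both within one no-limit
stddev.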
Thanks,
Fengguang
---
Subject: writeback: limit number of moved inodes in queue_io()
Date: Fri May 06 13:34:08 CST 2011
Only move 3000 inodes from b_dirty to b_io at one time. This cuts the
maximum lock hold time from 2704.76 to 384.53 and the number of lock
contentions from 2065 to 1092 in a simple dd+tar workload on an 8p test
box (lock_stat figures below). This workload was observed to move 10000+
inodes in one shot on ext4, which is clearly too much.
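
The idea of capping a list move can be sketched in userspace C as follows.
This is an illustration only, not the kernel code: the list helpers are a
cut-down imitation of the kernel's list_head, move_batch() is a
hypothetical name, and no real locking is done. The sketch assumes the
oldest entries sit at the tail of the source list.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink entry e from whatever list it is on. */
static void list_del_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Insert entry e right after the list head h. */
static void list_add_head(struct list_head *e, struct list_head *h)
{
	e->next = h->next;
	e->prev = h;
	h->next->prev = e;
	h->next = e;
}

/*
 * Move at most `limit` entries from the tail of `src` to the head of `dst`.
 * A real caller would hold the list lock across this loop, so `limit`
 * bounds how long the lock is held. Returns the number of entries moved.
 */
static int move_batch(struct list_head *src, struct list_head *dst, int limit)
{
	int moved = 0;

	assert(limit > 0);
	while (!list_empty(src)) {
		struct list_head *e = src->prev;	/* assumed oldest entry */

		list_del_entry(e);
		list_add_head(e, dst);
		if (++moved >= limit)
			break;
	}
	return moved;
}
```

Entries left behind on `src` are simply picked up by the next call, so the
cap trades one long critical section for several shorter ones.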
class name                 con-bounces  contentions  waittime-min  waittime-max  waittime-total  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
vanilla 2.6.39-rc3:
inode_wb_list_lock:               2063         2065          0.12       2648.66         5948.99        27475        943778          0.09       2704.76       498340.24
------------------
inode_wb_list_lock 89 [<ffffffff8115cf3a>] sync_inode+0x28/0x5f
inode_wb_list_lock 38 [<ffffffff8115ccab>] inode_wait_for_writeback+0xa8/0xc6
inode_wb_list_lock 629 [<ffffffff8115da35>] __mark_inode_dirty+0x170/0x1d0
inode_wb_list_lock 842 [<ffffffff8115d334>] writeback_sb_inodes+0x10f/0x157
------------------
inode_wb_list_lock 891 [<ffffffff8115ce3e>] writeback_single_inode+0x175/0x249
inode_wb_list_lock 13 [<ffffffff8115dc4e>] writeback_inodes_wb+0x3a/0x143
inode_wb_list_lock 499 [<ffffffff8115da35>] __mark_inode_dirty+0x170/0x1d0
inode_wb_list_lock 617 [<ffffffff8115d334>] writeback_sb_inodes+0x10f/0x157
limit=1000:
dd+tar+sync total elapsed time (10 runs):
avg 171.222
stddev 5.588
&(&wb->list_lock)->rlock:          842          842          0.14        101.10         1013.34        20489        970892          0.09        234.11       509829.79
------------------------
&(&wb->list_lock)->rlock 275 [<ffffffff8115db09>] __mark_inode_dirty+0x173/0x1cf
&(&wb->list_lock)->rlock 114 [<ffffffff8115cdd3>] writeback_single_inode+0x18a/0x27e
&(&wb->list_lock)->rlock 56 [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
&(&wb->list_lock)->rlock 132 [<ffffffff8115cf2a>] sync_inode+0x63/0xa2
------------------------
&(&wb->list_lock)->rlock 2 [<ffffffff8115dfea>] inode_wb_list_del+0x5f/0x85
&(&wb->list_lock)->rlock 33 [<ffffffff8115cf2a>] sync_inode+0x63/0xa2
&(&wb->list_lock)->rlock 9 [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
&(&wb->list_lock)->rlock 430 [<ffffffff8115cdd3>] writeback_single_inode+0x18a/0x27e
limit=3000:
dd+tar+sync total elapsed time (10 runs):
avg 165.335
stddev 5.503
&(&wb->list_lock)->rlock:         1088         1092          0.11        245.08         3268.75        21124       1718636          0.09        384.53       849827.20
------------------------
&(&wb->list_lock)->rlock 518 [<ffffffff8115db09>] __mark_inode_dirty+0x173/0x1cf
&(&wb->list_lock)->rlock 3 [<ffffffff8115dfea>] inode_wb_list_del+0x5f/0x85
&(&wb->list_lock)->rlock 54 [<ffffffff8115cf2a>] sync_inode+0x63/0xa2
&(&wb->list_lock)->rlock 10 [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
------------------------
&(&wb->list_lock)->rlock 4 [<ffffffff8115dfea>] inode_wb_list_del+0x5f/0x85
&(&wb->list_lock)->rlock 379 [<ffffffff8115db09>] __mark_inode_dirty+0x173/0x1cf
&(&wb->list_lock)->rlock 4 [<ffffffff8115cc29>] inode_wait_for_writeback+0xac/0xcc
&(&wb->list_lock)->rlock 446 [<ffffffff8115cdd3>] writeback_single_inode+0x18a/0x27e
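
For a quick comparison of the three lock_stat rows, the following small C
sketch computes how many times smaller the contentions, waittime-max and
holdtime-max values are with a limit than on vanilla. The struct and
function names are hypothetical; the numbers are copied from the tables
above.

```c
#include <assert.h>
#include <stdio.h>

/* A few columns of one lock_stat row, copied from the tables above. */
struct lock_row {
	const char *setup;
	double contentions;
	double waittime_max;
	double holdtime_max;
};

/* Index 0 is the vanilla baseline. */
static const struct lock_row rows[] = {
	{ "vanilla",    2065.0, 2648.66, 2704.76 },
	{ "limit=1000",  842.0,  101.10,  234.11 },
	{ "limit=3000", 1092.0,  245.08,  384.53 },
};

/* How many times smaller `after` is than `before`. */
static double reduction(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}

/* Print reductions against rows[0]; returns the number of lines printed. */
static int print_reductions(void)
{
	int i, n = (int)(sizeof(rows) / sizeof(rows[0]));

	for (i = 1; i < n; i++)
		printf("%-10s contentions %.1fx  waittime-max %.1fx  holdtime-max %.1fx\n",
		       rows[i].setup,
		       reduction(rows[0].contentions, rows[i].contentions),
		       reduction(rows[0].waittime_max, rows[i].waittime_max),
		       reduction(rows[0].holdtime_max, rows[i].holdtime_max));
	return n - 1;
}
```

Note that the limit=3000 row covers nearly twice as many acquisitions as
the vanilla row (1718636 vs 943778), so the total columns are not directly
comparable; the max columns are the more meaningful ones here.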
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/fs-writeback.c | 2 ++
1 file changed, 2 insertions(+)
--- linux-next.orig/fs/fs-writeback.c 2011-05-06 13:32:41.000000000 +0800
+++ linux-next/fs/fs-writeback.c 2011-05-06 16:44:58.000000000 +0800
@@ -279,6 +279,8 @@ static int move_expired_inodes(struct li
 		sb = inode->i_sb;
 		list_move(&inode->i_wb_list, &tmp);
 		moved++;
+		if (unlikely(moved >= 3000))	/* limit spinlock hold time */
+			break;
 	}
 
 	/* just one sb in list, splice to dispatch_queue and we're done */