From: Wu Fengguang <fengguang.wu@intel.com>
To: "Li, Shaohua" <shaohua.li@intel.com>
Cc: lkml <linux-kernel@vger.kernel.org>,
"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Andrew Morton <akpm@linux-foundation.org>,
Chris Mason <chris.mason@oracle.com>, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org
Subject: Re: [RFC] page-writeback: move indoes from one superblock together
Date: Thu, 24 Sep 2009 15:14:15 +0800
Message-ID: <20090924071415.GA20808@localhost>
In-Reply-To: <1253775260.10618.10.camel@sli10-desk.sh.intel.com>
On Thu, Sep 24, 2009 at 02:54:20PM +0800, Li, Shaohua wrote:
> __mark_inode_dirty adds inodes to the wb dirty list in random order. If a disk
> has several partitions, writeback might keep the spindle moving between
> partitions. To reduce the movement, it is better to write a big chunk of one
> partition and then move to another. Inodes from one fs are usually in one
> partition, so ideally moving inodes from one fs together should reduce spindle
> movement. This patch tries to address this. Before per-bdi writeback was
> added, the behavior was to write inodes from one fs first and then another, so
> the patch restores the previous behavior. The loop in the patch is a bit ugly;
> should we add a dirty list for each superblock in bdi_writeback?
>
> A test on a two-partition disk with the attached fio script shows about a
> 3% to 6% improvement.
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Good idea! The optimization looks good to me; it addresses one
weakness of per-bdi writeback.

One problem, though: Jan Kara and I are planning to remove b_io, and
with it this move_expired_inodes() function. I'm not sure how to do
this optimization without b_io.
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 8e1e5e1..fc87730 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -324,13 +324,29 @@ static void move_expired_inodes(struct list_head *delaying_queue,
> struct list_head *dispatch_queue,
> unsigned long *older_than_this)
> {
> + LIST_HEAD(tmp);
> + struct list_head *pos, *node;
> + struct super_block *sb;
> + struct inode *inode;
> +
> while (!list_empty(delaying_queue)) {
> - struct inode *inode = list_entry(delaying_queue->prev,
> - struct inode, i_list);
> + inode = list_entry(delaying_queue->prev, struct inode, i_list);
> if (older_than_this &&
> inode_dirtied_after(inode, *older_than_this))
> break;
> - list_move(&inode->i_list, dispatch_queue);
> + list_move(&inode->i_list, &tmp);
> + }
> +
> + /* Move inodes from one superblock together */
> + while (!list_empty(&tmp)) {
> + inode = list_entry(tmp.prev, struct inode, i_list);
> + sb = inode->i_sb;
> + list_for_each_prev_safe(pos, node, &tmp) {
We are holding the spin lock here, so is the safe version really necessary?
> + struct inode *inode = list_entry(pos,
This could just reuse the inode variable declared above.
Thanks,
Fengguang
> + struct inode, i_list);
> + if (inode->i_sb == sb)
> + list_move(&inode->i_list, dispatch_queue);
> + }
> }
> }
>
>
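For readers who want to try the grouping loop outside the kernel, here is a
minimal userspace sketch of the second loop in the patch. It is not the kernel
code: struct toy_inode, its sb_id field (standing in for i_sb), and the small
list helpers are hypothetical names written for this illustration, modelled
loosely on list_head and list_move. Note that the inner loop saves the previous
node before each move, because a move relinks the node being visited; that is a
property of the traversal itself rather than of locking.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

/* Unlink e from whatever list it is on. */
static void list_del_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Insert e right after head, i.e. at the front of the list. */
static void list_add_head(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* Unlink e and put it at the front of another list. */
static void list_move_head(struct list_head *e, struct list_head *head)
{
	list_del_entry(e);
	list_add_head(e, head);
}

/* Hypothetical stand-in for struct inode; sb_id plays the role of i_sb. */
struct toy_inode {
	int ino;
	int sb_id;
	struct list_head i_list;
};

#define toy_entry(ptr) \
	((struct toy_inode *)((char *)(ptr) - offsetof(struct toy_inode, i_list)))

/*
 * Drain src into dst so that entries sharing an sb_id end up adjacent.
 * Each outer pass picks the superblock of the oldest remaining entry
 * (the tail of src) and moves every entry of that superblock.
 */
static void move_grouped_by_sb(struct list_head *src, struct list_head *dst)
{
	while (!list_empty(src)) {
		int sb = toy_entry(src->prev)->sb_id;
		struct list_head *pos, *prev;

		for (pos = src->prev; pos != src; pos = prev) {
			/* Save the predecessor first: the move rewrites pos->prev. */
			prev = pos->prev;
			if (toy_entry(pos)->sb_id == sb)
				list_move_head(&toy_entry(pos)->i_list, dst);
		}
	}
}
```

With entries interleaved across two superblocks on src, walking dst from its
tail afterwards visits all entries of the first superblock before any entry of
the second, while keeping the original order within each superblock. The cost
is one full pass over the temporary list per distinct superblock.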
Content-Description: newfio
> [global]
> runtime=120
> ioscheduler=cfq
> size=2G
> ioengine=sync
> rw=write
> file_service_type=random:256
> overwrite=1
>
> [sdb1]
> directory=/mnt/b1
> nrfiles=10
> numjobs=4
>
> [sdb2]
> directory=/mnt/b2
> nrfiles=10
> numjobs=4