From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: Chris Mason <chris.mason@oracle.com>
Cc: Andrew Morton <akpm@osdl.org>, Ken Chen <kenchen@google.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Jens Axboe <jens.axboe@oracle.com>
Subject: Re: [PATCH 0/6] writeback time order/delay fixes take 3
Date: Wed, 22 Aug 2007 09:18:41 +0800
Message-ID: <387745522.02814@ustc.edu.cn>
Message-ID: <20070822011841.GA8090@mail.ustc.edu.cn>
In-Reply-To: <20070821202314.335e86ec@think.oraclecorp.com>
On Tue, Aug 21, 2007 at 08:23:14PM -0400, Chris Mason wrote:
> On Sun, 12 Aug 2007 17:11:20 +0800
> Fengguang Wu <wfg@mail.ustc.edu.cn> wrote:
>
> > Andrew and Ken,
> >
> > Here are some more experiments on the writeback stuff.
> > Comments are highly welcome~
>
> I've been doing benchmarks lately to try and trigger fragmentation, and
> one of them is a simulation of make -j N. It takes a list of all
> the .o files in the kernel tree, randomly sorts them and then
> creates bogus files with the same names and sizes in clean kernel trees.
>
> This is basically creating a whole bunch of files in random order in a
> whole bunch of subdirectories.
>
> The results aren't pretty:
>
> http://oss.oracle.com/~mason/compilebench/makej/compare-compile-dirs-0.png
>
> The top graph shows one dot for each write over time. It shows that
> ext3 is basically writing all over the place the whole time. But, ext3
> actually wins the read phase, so the layout isn't horrible. My guess
> is that if we introduce some write clustering by sending a group of
> inodes down at the same time, it'll go much much better.
>
> Andrew has mentioned bringing a few radix trees into the writeback paths
> before, it seems like file servers and other general uses will benefit
> from better clustering here.
>
> I'm hoping to talk you into trying it out ;)

Thank you for the description of the problem. So far I have a similar one
in mind: if we are to delay writeback of atime-dirty-only inodes to
above 1 hour, some grouping/piggy-backing scheme would be
beneficial. (Which I guess does not deserve the complexity now that
we have Ingo's make-relatime-default patch.)

My vague idea is to
- keep the s_io/s_more_io as a FIFO/cyclic writeback dispatching queue.
- convert s_dirty to some radix-tree/rbtree based data structure.
  It would have dual functions: delayed-writeback and clustered-writeback.

clustered-writeback:
- Use inode number as clue of locality, hence the key for the sorted
  tree.
- Drain some more s_dirty inodes into s_io on every kupdate wakeup,
  but do it in the ascending order of inode number instead of
  ->dirtied_when.

delayed-writeback:
- Make sure that a full scan of the s_dirty tree takes <=30s, i.e.
  dirty_expire_interval.

Notes:
(1) I'm not sure inode number is correlated to disk location in
    filesystems other than ext2/3/4. Or parent dir?
(2) It duplicates some function of elevators. Why is it necessary?
    Maybe we have no clue on the exact data location at this time?
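
To make the idea above a bit more concrete, here is a rough userspace
sketch, not kernel code. A small sorted array stands in for the
radix-tree/rbtree, and every name in it (struct dirty_set, dirty_add,
batch_size, drain_batch, MAX_DIRTY) is hypothetical. It assumes one
drain per kupdate wakeup and sizes each batch so that a full ascending
sweep finishes within the expire interval; the wakeup period is passed
in as a parameter rather than taken from any real tunable.

```c
/* Userspace sketch only: a sorted array stands in for the proposed
 * radix-tree/rbtree, and all names here are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define MAX_DIRTY 64

struct dirty_set {
	unsigned long ino[MAX_DIRTY];	/* sorted ascending by inode number */
	size_t nr;			/* number of dirty inodes held */
	unsigned long cursor;		/* sweep resumes at first ino >= cursor */
};

/* Insert an inode number, keeping the array sorted; duplicates are ignored.
 * Returns 0 on success, -1 if the set is full. */
static int dirty_add(struct dirty_set *s, unsigned long ino)
{
	size_t i, pos = 0;

	while (pos < s->nr && s->ino[pos] < ino)
		pos++;
	if (pos < s->nr && s->ino[pos] == ino)
		return 0;
	if (s->nr == MAX_DIRTY)
		return -1;
	for (i = s->nr; i > pos; i--)
		s->ino[i] = s->ino[i - 1];
	s->ino[pos] = ino;
	s->nr++;
	return 0;
}

/* Batch size such that a full sweep needs at most expire/wakeup wakeups. */
static size_t batch_size(const struct dirty_set *s,
			 unsigned int expire_secs, unsigned int wakeup_secs)
{
	size_t wakeups;

	assert(wakeup_secs > 0 && expire_secs >= wakeup_secs);
	wakeups = expire_secs / wakeup_secs;
	return (s->nr + wakeups - 1) / wakeups;	/* round up */
}

/* One wakeup: move up to max inodes into out[] (standing in for s_io),
 * in ascending inode order starting at the cursor. Wraps to the lowest
 * inode number once the sweep has passed the end. Returns the count. */
static size_t drain_batch(struct dirty_set *s, size_t max, unsigned long *out)
{
	size_t start = 0, n, i;

	while (start < s->nr && s->ino[start] < s->cursor)
		start++;
	if (start == s->nr)
		start = 0;		/* end of sweep: start over */

	n = s->nr - start;
	if (n > max)
		n = max;
	for (i = 0; i < n; i++)
		out[i] = s->ino[start + i];

	/* Close the gap left by the drained entries. */
	for (i = start + n; i < s->nr; i++)
		s->ino[i - n] = s->ino[i];
	s->nr -= n;
	s->cursor = n ? out[n - 1] + 1 : 0;
	return n;
}
```

An inode dirtied behind the cursor simply waits for the next sweep, so
the bound on a full scan is what limits its delay. Whether ascending
inode order buys any locality is exactly the open question in note (1).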
Fengguang