From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: Chris Mason <chris.mason@oracle.com>
Cc: Andrew Morton <akpm@osdl.org>, Ken Chen <kenchen@google.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Jens Axboe <jens.axboe@oracle.com>
Subject: Re: [PATCH 0/6] writeback time order/delay fixes take 3
Date: Fri, 24 Aug 2007 21:24:58 +0800
Message-ID: <387961898.15210@ustc.edu.cn>
Message-ID: <20070824132458.GC7933@mail.ustc.edu.cn>
In-Reply-To: <20070822084201.2c4eceb6@think.oraclecorp.com>

On Wed, Aug 22, 2007 at 08:42:01AM -0400, Chris Mason wrote:
> > My vague idea is to
> > - keep the s_io/s_more_io as a FIFO/cyclic writeback dispatching
> > queue.
> > - convert s_dirty to some radix-tree/rbtree based data structure.
> > It would have dual functions: delayed-writeback and
> > clustered-writeback.
> > clustered-writeback:
> > - Use inode number as clue of locality, hence the key for the sorted
> > tree.
> > - Drain some more s_dirty inodes into s_io on every kupdate wakeup,
> > but do it in the ascending order of inode number instead of
> > ->dirtied_when.
> >
> > delayed-writeback:
> > - Make sure that a full scan of the s_dirty tree takes <=30s, i.e.
> > dirty_expire_interval.
>
> I think we should assume a full scan of s_dirty is impossible in the
> presence of concurrent writers. We want to be able to pick a start
> time (right now) and find all the inodes older than that start time.
> New things will come in while we're scanning. But perhaps that's what
> you're saying...

Yeah, I was thinking about elevators :)
Or call it sweeping based on an address hint (the inode number).

> At any rate, we've got two types of lists now. One keeps track of age
> and the other two keep track of what is currently being written. I
> would try two things:
>
> 1) s_dirty stays a list for FIFO. s_io becomes a radix tree that
> indexes by inode number (or some arbitrary field the FS can set in the
> inode). Radix tree tags are used to indicate which things in s_io are
> already in progress or are pending (hand waving because I'm not sure
> exactly).
>
> inodes are pulled off s_dirty and the corresponding slot in s_io is
> tagged to indicate IO has started. Any nearby inodes in s_io are also
> sent down.
>
> 2) s_dirty and s_io both become radix trees. s_dirty is indexed by a
> sequence number that corresponds to age. It is treated as a big
> circular indexed list that can wrap around over time. Radix tree tags
> are used both on s_dirty and s_io to flag which inodes are in progress.

It's pointless to convert s_io to a radix tree, because inodes on s_io
will normally be sent to the block layer elevators at the same time.

Also, s_dirty holds 30 seconds' worth of inodes, while s_io holds only
5 seconds' worth. As a general rule, the more inodes there are, the
better the chance of good clustering.

So s_dirty is the right place to do address clustering.

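To make the clustering idea concrete, here is a rough user-space sketch
in Python. It is not kernel code. It assumes a sorted list as a stand-in
for the radix-tree/rbtree, and the names DirtySweep, mark_dirty and
drain are made up for illustration. Each drain() call picks up where the
previous one stopped and moves inodes out in ascending inode-number
order, wrapping around at the end like an elevator sweep.

```python
import bisect


class DirtySweep:
    """Toy model of an s_dirty set keyed by inode number (hypothetical)."""

    def __init__(self):
        self.inos = []    # dirty inode numbers, kept sorted
        self.cursor = 0   # lowest inode number the sweep has not passed yet

    def mark_dirty(self, ino):
        """Insert an inode number, ignoring duplicates."""
        i = bisect.bisect_left(self.inos, ino)
        if i == len(self.inos) or self.inos[i] != ino:
            self.inos.insert(i, ino)

    def drain(self, count):
        """Return up to `count` inodes in ascending order from the cursor."""
        batch = []
        while self.inos and len(batch) < count:
            i = bisect.bisect_left(self.inos, self.cursor)
            if i == len(self.inos):
                i = 0     # passed the highest inode: wrap to the lowest
            ino = self.inos.pop(i)
            batch.append(ino)
            self.cursor = ino + 1
        return batch
```

An inode dirtied behind the cursor simply waits for the next pass, which
is the behaviour one would expect from an elevator.
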
As for the dirty_expire_interval limit on dirty age, we can apply a
simple rule: do one full scan/sweep over the fs address space every
30s, syncing all inodes encountered except those dirtied less than 5s
ago. With that rule, any inode will get synced 5-35 seconds after
being dirtied.

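The 5-35 second bound can be checked with a small Python sketch. It
assumes an idealized sweep that revisits each inode number exactly once
every 30s. The names sync_delay and visit_phase are made up for
illustration, and visit_phase stands for the offset within the 30s cycle
at which the sweep passes a given inode.

```python
import math


def sync_delay(dirtied_at, visit_phase, period=30, min_age=5):
    """Seconds between dirtying an inode and the sweep syncing it.

    The sweep passes the inode at visit_phase, visit_phase + period, ...
    and skips it on any pass where it has been dirty for under min_age.
    """
    earliest = dirtied_at + min_age
    # Index of the first pass that happens at or after `earliest`.
    k = math.ceil((earliest - visit_phase) / period)
    return visit_phase + k * period - dirtied_at
```

In this model the best case is an inode that is exactly 5s old when the
sweep arrives. The worst case is one that is just under 5s old, so it is
skipped and waits a further 30s, giving a delay just under 35s.
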
-fengguang