From: Quinn Harris <lists@qutek.net>
To: reiserfs-list@namesys.com
Subject: Re: Relocating files for faster boot/start-up on reiser(fs/4)
Date: Fri, 15 Sep 2006 15:20:36 -0600
Message-ID: <200609151520.39826.lists@qutek.net>
In-Reply-To: <54E99998-88D2-4E0D-8970-1C3DA365DBE0@smartgames.ca>
On Thursday 14 September 2006 23:15, Toby Thain wrote:
> On 14-Sep-06, at 6:23 PM, David Masover wrote:
> > Quinn Harris wrote:
> >> On Thursday 14 September 2006 13:55, David Masover wrote:
> >>> ...
> >>
> >> That is a good point. Recording the disk layout before and after
> >> to compare relative fragmentation would be a good idea. As well
> >> as randomizing the sequence as a sanity check.
> >> Also note that during boot I was using readahead on all 3885
> >> files. So the kernel has a good opportunity to rearrange the
> >> reads. And the read sequence doesn't necessary match the order
> >> its needed (though I tried to get that).
> >
> > Speaking of which, did you parallize the boot process at all?
>
> Just off the top of my head, wouldn't that make the access sequence
> asynchronous & thereby less predictable? (Although I'm sure it's a
> net win.)
It could, but the kernel will try to reorder the outstanding block requests to
reduce seeking. Whether that is an overall win, I don't know. In addition,
early in the boot, readahead-list or similar will tell the kernel to start
reading most of the files needed for the complete boot so they are already in
memory when needed. Ubuntu does the readahead now, and all my tests were with
readahead.
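To illustrate the readahead step described above, here is a minimal sketch,
not the readahead-list implementation. It assumes Linux and Python 3.3+, and
it swaps in posix_fadvise(POSIX_FADV_WILLNEED) as the hint to the kernel; the
function name readahead() is made up for this example.

```python
import os


def readahead(paths):
    """Hint the kernel to start loading each file into the page cache.

    Returns the number of files that were successfully hinted.
    """
    hinted = 0
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # file is missing or unreadable; skip it
        try:
            # Offset 0 with length 0 covers the whole file. The call only
            # queues the reads; it does not wait for the data to arrive.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            hinted += 1
        except OSError:
            pass  # the filesystem rejected the hint; nothing to do
        finally:
            os.close(fd)
    return hinted
```

Because the hints are queued without waiting, the kernel is free to reorder
the resulting block requests, which is the behaviour relied on above.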
>
> > I'd estimate my system easily spent more than 50% of its boot time
> > not touching the disk at all before I did that. Gentoo can do
> > this, I'm not sure what else, as it kind of needs your init system
> > to understand dependencies.
>
> ...
My first test turned out to be on a heavily fragmented file system. I
reinstalled Ubuntu Dapper with a fresh reiserfs file system and it booted in
1:07 (grub to desktop background appearing). After extending the time
readahead-watch monitors files and running the reallocate script it now boots
in 0:50.
I wrote a little Python script that uses the FIBMAP ioctl to check the blocks
the files are using. From this I know the reallocate script on this fresh file
system is doing exactly what it was intended to do. I am also able to
estimate how much it will improve performance by comparing the fragmentation
before and after it runs. I have learned that the delays on disk I/O for
Ubuntu boot are dominated by rotational latency and not head seeks. The
current readahead implementation orders the files by on-disk location,
substantially mitigating head seek time. But rotational latency can easily
double the time needed to load the same data.
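For readers unfamiliar with FIBMAP, the following is a rough sketch of how
a file's physical blocks can be listed and its fragments counted. It is not
the measure.py script; file_blocks() and count_extents() are names invented
here. The ioctl numbers are the ones from <linux/fs.h>, and FIBMAP normally
requires root.

```python
import fcntl
import os
import struct

FIBMAP = 1    # _IO(0x00, 1) in <linux/fs.h>
FIGETBSZ = 2  # _IO(0x00, 2) in <linux/fs.h>


def file_blocks(path):
    """Return the physical block number of each logical block of path."""
    with open(path, "rb") as f:
        fd = f.fileno()
        raw = fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0))
        block_size = struct.unpack("i", raw)[0]
        n_blocks = (os.fstat(fd).st_size + block_size - 1) // block_size
        blocks = []
        for logical in range(n_blocks):
            # FIBMAP takes a logical block index and overwrites it with
            # the physical block number (0 if the block is not mapped).
            raw = fcntl.ioctl(fd, FIBMAP, struct.pack("i", logical))
            blocks.append(struct.unpack("i", raw)[0])
        return blocks


def count_extents(blocks):
    """Count runs of physically consecutive blocks (1 = unfragmented)."""
    extents = 0
    previous = None
    for block in blocks:
        if block == 0:
            previous = None  # unmapped block; the next one starts a new run
            continue
        if previous is None or block != previous + 1:
            extents += 1
        previous = block
    return extents
```

Comparing count_extents(file_blocks(p)) for each file before and after a
reallocation run gives a simple per-file fragmentation figure.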
Subjectively (and objectively by about 6s) relocation and extending
readahead-watch substantially improved Gnome boot and initial responsiveness.
But I still need to measure how much of this came from just extending how much
is read ahead and how much from the reallocation.
The current Ubuntu boot waits for hardware probing, DHCP and other things,
giving the disk readahead a chance to work. I think this reallocation might
help a parallel boot more, as the data will be needed sooner. So I have
changed my mind: I think parallel boot will highlight the reallocate
advantage. Now I just need to test the hypothesis.
Not sure if I would be better off trying initng or waiting for upstart (Ubuntu's
new init) to get scripts that actually parallel boot. The code for upstart
is very clean and it has the backing of a major distro, so I have high hopes.
Much like before, I was able to improve a 16.5s oowriter cold start to 14s
with this reallocate script, against a warm start of 4.8s (OO 2.0.2, was using
2.0.3 before). It is evident to me that readahead-watch is missing
something on Open Office startup. It seems very possible to get OO to cold
start in under 8s with the use of reallocation and readahead right when it
starts.
My current scripts are at
http://www.quinnh.org/reallocate.py (27-line reallocate script, expects
dir /tmp/refrag to exist and takes the readahead-watch log as a parameter)
http://www.quinnh.org/measure.py (uses FIBMAP to estimate the time needed to
load the files in the passed readahead-watch log, uses average seek and
latency for estimate)
http://www.quinnh.org/readahead-watch-time-order.patch (Patch against Ubuntu
readahead-watch to add an order by access time option.)
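As an illustration of the kind of estimate described for measure.py, here is
a deliberately simple model, not the script's actual formula. It charges one
average seek plus one average rotational latency per extent, plus transfer
time for the data. The function name and all default values are assumptions
chosen for the example (4.2 ms is roughly half a rotation at 7200 rpm); use
your own drive's figures.

```python
def estimate_load_seconds(extent_lengths, block_size=4096,
                          avg_seek_s=0.0085, avg_latency_s=0.0042,
                          transfer_bytes_per_s=50e6):
    """Estimate the time to read a set of extents from a rotating disk.

    extent_lengths: number of blocks in each contiguous extent.
    All default parameter values are illustrative assumptions.
    """
    # Each extent costs one head positioning delay (seek + rotation).
    positioning = len(extent_lengths) * (avg_seek_s + avg_latency_s)
    # The data itself is then read at the sustained transfer rate.
    transfer = sum(extent_lengths) * block_size / transfer_bytes_per_s
    return positioning + transfer
```

Under this model the same 200 blocks cost three fewer positioning delays
when stored as one extent instead of four. Setting avg_seek_s=0 approximates
reads that are already sorted by on-disk location, leaving only rotational
latency.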
I will try to write a nice unified script that will profile, reallocate and do
readahead for an application to speed it up, e.g. "# reallocate.py
oowriter". Run it once to profile and reallocate, drop the page cache
(drop_caches), run it again, and oowriter loads faster.
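The drop_caches step can be sketched as below, assuming Linux and root
privileges. This is only an illustration of measuring a cold start, not part
of the planned script; drop_caches() and time_command() are names made up
here, and time_command() only suits commands that exit on their own.

```python
import os
import subprocess
import time


def drop_caches(control_file="/proc/sys/vm/drop_caches"):
    """Write back dirty pages, then drop page cache, dentries and inodes."""
    os.sync()  # dirty pages cannot be dropped, so flush them first
    with open(control_file, "w") as f:
        f.write("3\n")


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=True)
    return time.monotonic() - start
```

Calling drop_caches() before each time_command() run gives cold-start
timings; skipping it gives warm-start timings for comparison. Timing how
long a GUI program such as oowriter takes to show its window needs a
different stopping condition than process exit.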
I think Python will be the best language for this because it's become
relatively universal and it's easy to understand for the uninitiated. This
really isn't black magic so transparency is good. I personally prefer Ruby
though.