From: Daniel Phillips <phillips@phunq.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: abhishekrai@google.com, tytso@mit.edu, adilger@sun.com,
linux-kernel@vger.kernel.org, kenchen@google.com,
mikew@google.com, rohitseth@google.com
Subject: Re: [PATCH] Clustering indirect blocks in Ext3
Date: Fri, 11 Jan 2008 22:05:10 -0800 [thread overview]
Message-ID: <200801112205.11733.phillips@phunq.net> (raw)
In-Reply-To: <20080111160431.905a853f.akpm@linux-foundation.org>
On Friday 11 January 2008 16:04, Andrew Morton wrote:
> It needs to be reviewed. In exhaustive detail. Few people can do
> that and fewer are inclined to do so.
Agreed, there just have to be a few bugs in this many lines of code.
I spent a couple of hours going through it, not really looking at the
algorithms but just the superficial details. I only found minor nits,
and not many of those.
For example, I do not like to see "if (free_blocks == 0)" written as"if
(free_blocks <= 0)" in an attempt to increase robustness. What it
actually does is make the effect of an error more subtle, or
even "corrects" it. Firmly in the niggle category.
I checked the locking of sbi->bginfo and didn't see a flaw, good.
I see a missing KERN_INFO added to a printk, it technically counts as an
unrelated change but oh well.
Stylistically this new code is hard to tell apart from the incumbent
code, except for being more heavily commented. I wish all kernel code
was written this clearly.
At this point I will run away in favor of for-real Ext3 hackers (you
know who you are:-)
> I went to merge it so it could get some testing while we await review
> but the patch has all its tabs replaced with spaces, is seriously
> wordwrapped and has random newlines added to it. Please fix email
> client and resend (offlist is OK if it is unaltered).
Odd, the original post has tabs and the updated one does not, though the
client seems to be kmail in both cases.
> We should have a think about which workloads are most likely to be
> adversely affected by this change.
I was just rolling up my sleeves to construct the nasty sequential case
where the head keeps seeking back to the center of the group after
picking up each 4 MB of doubly indexed data when I realized that even
the most simple minded disk cache makes this case a non-issue. The
drive will most likely suck a full track (roughly .5 MB) or big chunk
thereof into cache the first time it seeks to the index cluster, thus
having a whole group of double index blocks in cache and then will
proceed to chew happily and linearly through the data blocks.
It seems like placing those second level index blocks all together
really helps this case. Hmm, how to break it.
How about having a disk full of 100 MB files and skipping all over the
disk randomly reading one block each time. That will fill the disk
cache, and each random read then requires seeking to two places that
were hopefully close together without index node clustering, and now
will be an average of 32 MB apart. Each of these "extra" seeks costs a
couple of ms worth of head travel plus average rotational latency of 4
ms or so, for a total 6 ms. However, even with a perfect non-clustered
layout, the index mode will still be an average of 2 MB away from the
data block, so the rotational latency is still incurred and only the
head travel is a little less, say 1 ms less. So the "extra" seek time
for clustered is 6 ms vs 5 ms for non-clustered. Add in 8 ms for the
long random seek and we have 14 ms vs 13 ms, or about 8% difference.
Only a small regression there, and I tried hard. Barring mistakes in
my estimates the sequential improvement above is large while the
regression for the nasty random construction is small.
Maybe somebody else will have better luck breaking it.
Regards,
Daniel
next prev parent reply other threads:[~2008-01-12 6:05 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-11-16 5:02 [PATCH] Clustering indirect blocks in Ext3 Abhishek Rai
2007-11-16 7:02 ` Andrew Morton
2007-11-16 7:37 ` Matt Mackall
2007-11-18 15:52 ` Abhishek Rai
2007-11-18 20:47 ` Matt Mackall
2007-11-19 10:34 ` Kyungmin Park
2007-11-20 20:25 ` John Stoffel
2007-11-16 11:28 ` Andreas Dilger
2007-11-16 21:11 ` Theodore Tso
2007-11-17 0:25 ` Abhishek Rai
2007-11-17 2:58 ` Theodore Tso
2007-11-17 8:58 ` Abhishek Rai
2007-12-21 14:15 ` Abhishek Rai
2008-01-10 21:17 ` Abhishek Rai
2008-01-11 17:05 ` Daniel Phillips
2008-01-12 0:04 ` Andrew Morton
2008-01-12 6:05 ` Daniel Phillips [this message]
2008-01-13 5:06 ` Abhishek Rai
2007-11-16 22:27 ` Abhishek Rai
[not found] <9q1CT-82L-3@gated-at.bofh.it>
[not found] ` <9q3v2-2Br-3@gated-at.bofh.it>
[not found] ` <9qgLE-7ds-21@gated-at.bofh.it>
[not found] ` <9qjJx-3wE-9@gated-at.bofh.it>
[not found] ` <9qm4D-70Q-1@gated-at.bofh.it>
[not found] ` <9CQTt-7cr-27@gated-at.bofh.it>
[not found] ` <9KcYS-46E-27@gated-at.bofh.it>
2008-01-11 14:12 ` Bodo Eggert
2008-01-11 14:49 ` Abhishek Rai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200801112205.11733.phillips@phunq.net \
--to=phillips@phunq.net \
--cc=abhishekrai@google.com \
--cc=adilger@sun.com \
--cc=akpm@linux-foundation.org \
--cc=kenchen@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mikew@google.com \
--cc=rohitseth@google.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox