From: phillip@lougher.demon.co.uk
To: "Theodore Tso" <tytso@mit.edu>, hbryan@us.ibm.com
Cc: atraeger@cs.sunysb.edu, linux-fsdevel@vger.kernel.org,
"Ihar `Philips` Filipau" <thephilips@gmail.com>,
phillip.lougher@gmail.com
Subject: Re: Allocation strategy - dynamic zone for small files
Date: Tue, 14 Nov 2006 14:30:41 +0000
Message-ID: <E1GjzJB-0002Sy-LD@pr-webmail-2.demon.net>
In-Reply-To: <20061114010235.GB6454@thunk.org>
tytso@mit.edu wrote:
> 1) It's not just about storage efficiency, but also about transfer
> efficiency. Disk drives generally like to transfer hunks of data in
> 16k to 64k at a time. So getting related pieces of small hunks of
> data read at the same time, we can win big on performance. BUT, it's
> extremely hard to do this at the filesystem level, since the
> application is much more likely to know which micro-file of 16 bytes
> is likely to be needed at the same time as some other micro-file which
> is only 16 bytes long.
Most filesystems (as you'll know) use locality of reference to cluster files.
From the studies I've seen it works quite well.
When I added tail-end packing to SquashFS, I looked into various strategies to
determine which tail-ends (fragments) to pack together. As SquashFS is a
read-only filesystem this can be done using off-line analysis. After
evaluating various strategies (best fit, first fit, same-size etc.) I found the
best compression of these packed tail-ends was achieved by packing small files
together in alphabetical order from the same directory. Such packing also
achieved the highest performance improvements reading from CDROM (SquashFS is
used for LiveCDs, and so changes in file placement can have a dramatic effect
on seeking). This was a result which was interesting from my POV because it
confirmed conventional locality of reference wisdom.
Phillip Lougher
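
As a rough illustration of the packing order described above (this is not the
actual mksquashfs code), the sketch below groups tail-ends by directory, sorts
each group alphabetically, and fills fixed-size fragment blocks in that order.
The function name pack_tail_ends, the (path, size) input format, and the
64 KiB block_size default are all assumptions made for the example. Starting
a fresh block at each directory boundary is also an assumption; the message
does not say whether fragment blocks may span directories.

```python
import posixpath
from collections import defaultdict


def pack_tail_ends(tails, block_size=65536):
    """Group file tail-ends into fragment blocks (illustrative only).

    tails: iterable of (path, tail_size_in_bytes) pairs.
    Returns a list of blocks, each a list of paths in packing order.
    """
    # Bucket tail-ends by the directory that contains the file.
    by_dir = defaultdict(list)
    for path, size in tails:
        by_dir[posixpath.dirname(path)].append((path, size))

    blocks = []
    for directory in sorted(by_dir):
        current, used = [], 0
        # Alphabetical order within one directory.
        for path, size in sorted(by_dir[directory]):
            # Close the current block when the next tail-end will not fit.
            if current and used + size > block_size:
                blocks.append(current)
                current, used = [], 0
            current.append(path)
            used += size
        # Assumption: a block never spans two directories.
        if current:
            blocks.append(current)
    return blocks
```

Called with a handful of small tail-ends and a small block_size, the function
returns neighbouring names from one directory in the same block, which is the
locality effect the message reports for both compression and seek behaviour.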