From: Alex Tomas <alex@clusterfs.com>
To: linux-ext4@vger.kernel.org
Subject: [RFC] delayed allocation, mballoc, etc
Date: Fri, 01 Dec 2006 03:15:06 +0300
Message-ID: <m3irgwv59d.fsf@bzzz.home.net>
Good day,
I'd like to ask the community to discuss and review a few things
I've been working on. We propose a set of patches intended
to improve the performance of ext4:
* locality groups
  To get good performance when writing many small files, we
  need to allocate them close to each other. The simplest way
  would be to place each small file at the next block after
  the previous small file, and this would work well for a
  single job. With multiple jobs (a few untars, for example)
  it would break per-job locality and cause a performance
  penalty on subsequent access. The locality groups idea may
  help here: group all files by some property, pgid for
  example. Every time the kernel asks the filesystem to flush
  dirty pages, we flush the inodes from the 1st group, then
  from the 2nd, and so on. This way we can form large
  contiguous allocations (for a whole group), achieving good
  throughput while preserving quite good locality.
* scalable block reservation
  This is required to protect against -ENOSPC when pages enter
  the pagecache without space allocation (delayed allocation).
  It should also scale well on high-end SMP, as every CPU has
  its own "pool" of blocks. When a pool is empty, the
  filesystem rebalances free blocks between all CPUs.
* mballoc v4
  The multiblock allocator. It is supposed to be able to
  allocate many blocks at once, saving CPU.
  Changes since the previously published v2:
  a) per-inode preallocation
     Every regular inode may have a few preallocated chunks
     assigned to specific logical offsets. It's intended to
     help applications like IOR and p2p.
  b) per-locality-group preallocation
     A locality group may have a few preallocated chunks.
  c) buddy structures aren't stored on disk; instead they
     are regenerated from the on-disk bitmaps on demand
  d) a stride option to align requests (useful for arrays)
* delayed allocation
  Not many changes have been made since the previous
  publication: a few bugfixes and tweaks, and it has been
  adapted to the new mballoc.
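
To illustrate the locality groups item above, here is a minimal userspace sketch of the flush ordering. It is not code from the patches: `struct dirty_inode`, `flush_by_locality_group()` and the sequential "next free block" allocator are hypothetical names and simplifications chosen for the example. Dirty inodes are sorted by group (pgid here), keeping the order in which they were dirtied inside each group, and blocks are then handed out sequentially, so each group ends up in one contiguous range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dirty inode; not a kernel structure. */
struct dirty_inode {
    unsigned long ino;      /* inode number */
    int pgid;               /* grouping property, e.g. process group id */
    unsigned int nblocks;   /* blocks still waiting for allocation */
    unsigned long start;    /* first block assigned by this sketch */
    size_t seq;             /* position in the original dirty list */
};

/* Order by group first, then by the order the inodes were dirtied. */
static int by_group_then_seq(const void *a, const void *b)
{
    const struct dirty_inode *x = a, *y = b;

    if (x->pgid != y->pgid)
        return x->pgid < y->pgid ? -1 : 1;
    if (x->seq != y->seq)
        return x->seq < y->seq ? -1 : 1;
    return 0;
}

/*
 * Flush group by group: sort by (pgid, seq), then assign blocks
 * sequentially starting at next_free. Returns the first unused block.
 */
static unsigned long flush_by_locality_group(struct dirty_inode *v, size_t n,
                                             unsigned long next_free)
{
    assert(v != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        v[i].seq = i;
    qsort(v, n, sizeof(*v), by_group_then_seq);

    for (size_t i = 0; i < n; i++) {
        v[i].start = next_free;
        next_free += v[i].nblocks;
    }
    return next_free;
}
```

With two interleaved jobs, the files of each job come out adjacent on disk instead of alternating, which is the locality the item is after.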
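
The scalable block reservation item above can be sketched in a similar way. This is a single-threaded userspace illustration under stated assumptions, not the patch itself: `NR_POOLS`, `struct reserve_pools` and `reserve_blocks()` are hypothetical, locking is omitted, and the rebalancing policy (satisfy the request from the combined total, then split the remainder evenly) is just one possible choice.

```c
#include <assert.h>
#include <errno.h>

#define NR_POOLS 4  /* hypothetical CPU count for the sketch */

struct reserve_pools {
    unsigned long free[NR_POOLS];  /* unreserved blocks owned by each CPU */
};

/*
 * Reserve n blocks for pages entering the page cache on `cpu`.
 * Fast path: take them from the local pool only.
 * Slow path: pool all free blocks, satisfy the request, and spread
 * what is left evenly across the pools again.
 * Returns 0 on success or -ENOSPC if the filesystem is really full.
 */
static int reserve_blocks(struct reserve_pools *p, int cpu, unsigned long n)
{
    unsigned long total = 0;

    assert(cpu >= 0 && cpu < NR_POOLS);

    if (p->free[cpu] >= n) {
        p->free[cpu] -= n;
        return 0;
    }

    for (int i = 0; i < NR_POOLS; i++)
        total += p->free[i];
    if (total < n)
        return -ENOSPC;  /* pools are left untouched on failure */

    total -= n;
    for (int i = 0; i < NR_POOLS; i++)
        p->free[i] = total / NR_POOLS +
                     ((unsigned long)i < total % NR_POOLS ? 1 : 0);
    return 0;
}
```

The point of the design is that the common case touches only the local pool, so CPUs do not contend on one global free-block counter; the global view is needed only when a pool runs short.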
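
For item c) of the mballoc list above, the following is a rough sketch of how buddy information could be rebuilt from a block bitmap. It is a simplification and not the mballoc code: it only counts free chunks per order instead of building per-order bitmaps, and `block_in_use()`, `count_free_extent()` and `generate_buddy_counts()` are hypothetical names. It assumes a set bit means "block in use" and that bits are numbered from the least significant bit of each byte.

```c
#include <assert.h>
#include <stddef.h>

/* Bit set = block in use; bits are numbered LSB-first within a byte. */
static int block_in_use(const unsigned char *bitmap, size_t blk)
{
    return (bitmap[blk / 8] >> (blk % 8)) & 1;
}

/*
 * Split the free extent [start, start + len) into naturally aligned
 * power-of-two chunks and count one chunk per resulting order.
 */
static void count_free_extent(size_t start, size_t len,
                              unsigned int *counts, int max_order)
{
    while (len > 0) {
        int order = 0;

        /* Grow the chunk while it stays aligned and still fits. */
        while (order < max_order &&
               (start & (((size_t)1 << (order + 1)) - 1)) == 0 &&
               ((size_t)1 << (order + 1)) <= len)
            order++;

        counts[order]++;
        start += (size_t)1 << order;
        len -= (size_t)1 << order;
    }
}

/* Walk the bitmap, find each free extent, and account for it. */
static void generate_buddy_counts(const unsigned char *bitmap, size_t nblocks,
                                  unsigned int *counts, int max_order)
{
    size_t i = 0;

    assert(bitmap != NULL && counts != NULL && max_order >= 0);

    for (int o = 0; o <= max_order; o++)
        counts[o] = 0;

    while (i < nblocks) {
        size_t start;

        while (i < nblocks && block_in_use(bitmap, i))
            i++;
        start = i;
        while (i < nblocks && !block_in_use(bitmap, i))
            i++;
        if (i > start)
            count_free_extent(start, i - start, counts, max_order);
    }
}
```

Since everything here is derived from the bitmap alone, nothing extra has to be kept consistent on disk; the cost is the scan when a group's buddy data is first needed.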
As usual, there are tons of things yet to be done/fixed/tweaked.
I'm trying to keep them up to date in the TODOs.
A few tests have been done. I'm sending the numbers (as well as
the patches) in subsequent mails. Please have a look.
The whole series can be found at
ftp://ftp.clusterfs.com/pub/people/alex/2.6.19-rc6/
To enable the features, ext4 should be mounted with the options:
extents,mballoc,delalloc
Any comments and questions are very welcome.
Thanks, Alex
PS. I'd like to thank CFS for their help, especially
Peter Braam and Andreas Dilger, who feed me ideas.