From: <Solofo.Ramangalahy@bull.net>
To: linux-ext4@vger.kernel.org
Subject: [RFC 0/2] ext4: zero uninitialized inode tables
Date: Fri, 21 Nov 2008 11:23:09 +0100
Message-ID: <20081121102309.182113793@bull.net>

The time to format a filesystem is mostly linear in filesystem size.
The exact time spent formatting depends on hardware and software, but
it is mainly explained by the zeroing of some blocks (inode and block
bitmaps, and inode tables).
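
To give a feel for the amount of zeroing involved, here is a rough
back-of-the-envelope sketch. It assumes one inode per 16384 bytes and
256-byte on-disk inodes; both values are assumptions for illustration
(they are not taken from the measurements in this mail), and the helper
name itable_bytes is hypothetical. It also ignores per-group rounding
of the inode count, so treat the result as an estimate only.

```shell
#!/bin/sh
# Estimate how many bytes of inode table a full (non-lazy) mkfs zeroes.
#   $1: filesystem size in bytes
#   $2: bytes per inode   (assumed default: 16384)
#   $3: inode size, bytes (assumed default: 256)
itable_bytes() {
    fs_bytes=$1
    inode_ratio=${2:-16384}
    inode_size=${3:-256}
    # number of inodes times the on-disk size of one inode
    echo $(( fs_bytes / inode_ratio * inode_size ))
}

itable_bytes $(( 1024 * 1024 * 1024 * 1024 ))   # 1 TiB device
itable_bytes $(( 10 * 1024 * 1024 * 1024 ))     # 10 GiB device
```

Under these assumptions the inode tables take 1/64 of the device, which
is why the zeroing cost scales linearly with filesystem size.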
While the mkfs time can be considered negligible (for example compared
to RAID formatting of disk arrays), it is significant compared
to the formatting time of other filesystems.
This is noticeable when running performance comparison tests, or
tests that format the same device repeatedly.
This may become prohibitive for large disks (arrays).
For some measurements, see:
http://www.bullopensource.org/ext4/20080909-mkfs-speed-lazy_itable_init/
http://www.bullopensource.org/ext4/20080911-mkfs-speed-lazy_itable_init/
http://www.bullopensource.org/ext4/20080912-mkfs-speed-lazy_itable_init/
So far it is under one hour; further measurements would be needed,
for example for 16TB filesystems.
It is possible to skip the initialization of the inode table blocks
with the mkfs option "lazy_itable_init" (mkfs.ext4(8)).
However, this option is not safe with respect to fsck, as there is no
way to distinguish between an uninitialized block filled with old bits
and a corrupted one.
(The use of lazy_itable_init could be considered safe in the case where
the blocks of the disk, in particular those used by the inode tables,
are prefilled with zeros.)
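
For a file-backed test image, the "prefilled with zeros" condition can
be checked before running mkfs. The following is only a sketch: the
helper name is_all_zero is hypothetical, and it reads the whole file,
so it is meant for small test images rather than large devices.

```shell
#!/bin/sh
# Succeed (exit status 0) if the file given as $1 contains only zero bytes.
is_all_zero() {
    # Delete every NUL byte and count what is left over.
    nonzero=$(tr -d '\000' < "$1" | wc -c)
    [ "$nonzero" -eq 0 ]
}

# Example use, before mkfs (path is only an example):
#   is_all_zero /tmp/ext4fs.img && echo "image is prefilled with zeros"
```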
These patches (try to) initialize the inode tables after mount via a
kernel thread launched by module loading. The goal is to find a
tradeoff between speed and safety.
Apart from testing, another use case could be a distribution
installation: since device sizes grow faster than system sizes, the
share of installation time spent formatting will increase. Since the
system will use only a fraction of the full device (say 10GB for
system installation on a 1TB disk), it is not strictly necessary to
initialize all the inode tables before starting the installation, for
example those of the home partition.
So far, I've only been able to initialize some small filesystems with
this code (using 2.6.28-rc4).
For example, like this:
. dd if=/dev/zero of=/tmp/ext4fs.img bs=1M count=1024
. losetup /dev/loop0 /tmp/ext4fs.img
. mkfs.ext4 -O^resize_inode -Elazy_itable_init /dev/loop0
. mount /dev/loop0 /mnt/test-ext4
. [dumpe2fs /dev/loop0]
. modprobe ext4_itable_init
. [dumpe2fs /dev/loop0 # here check the ITABLE_ZEROED]
. umount /mnt/test-ext4
. [dumpe2fs /dev/loop0]
. [fsck /dev/loop0]
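
The ITABLE_ZEROED check in the steps above can be done by eye, or
roughly automated as in the sketch below. It assumes that dumpe2fs
prints one "Group N: ..." line per block group with the group flags
(such as ITABLE_ZEROED) on that same line; the helper name
count_unzeroed_groups is hypothetical. It reads the dumpe2fs output on
stdin, so it can be tried on saved output without touching a device.

```shell
#!/bin/sh
# Print the number of block groups whose "Group N:" line in dumpe2fs
# output does not carry the ITABLE_ZEROED flag.
count_unzeroed_groups() {
    awk '/^Group [0-9]+:/ && !/ITABLE_ZEROED/ { n++ } END { print n + 0 }'
}

# Example use (needs read access to the device):
#   dumpe2fs /dev/loop0 2>/dev/null | count_unzeroed_groups
```

A result of 0 after the module has run would mean every group is
flagged as zeroed; a non-zero count before loading the module is
expected with lazy_itable_init.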
But I also hit several bugs and managed to somehow screw up my
machine. So be _extremely_ careful if you ever try the code!
TODO:
. fix the resize inode case
. fix the observed soft lockup
. decide whether to keep it a module.
If not, decide how/when to run the kernel thread
. initialize some blocks (for example the non-empty ones) at mount
time, or somewhere else.
. non-empty group case
. feature interactions? (for example inode zeroing vs. resize)
. multiple threads (based on cpu/disks)
. other?