From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: tytso@mit.edu, darrick.wong@oracle.com
Cc: linux-ext4@vger.kernel.org
Subject: [PATCH 46/54] contrib: script to create minified ext4 image from a directory
Date: Mon, 26 Jan 2015 23:40:35 -0800 [thread overview]
Message-ID: <20150127074035.13308.79172.stgit@birch.djwong.org> (raw)
In-Reply-To: <20150127073533.13308.44994.stgit@birch.djwong.org>
The dir2fs script converts a directory into a minimized ext4 filesystem.
FS creation parameters are tweaked to reduce as much FS overhead as
possible, and to leave as few unused blocks and inodes as possible.
Given that mke2fs -d lays out files linearly from the beginning of the
FS, using resize2fs -M is not as horrible as it usually is.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
contrib/dir2fs | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
create mode 100755 contrib/dir2fs
diff --git a/contrib/dir2fs b/contrib/dir2fs
new file mode 100755
index 0000000..abcecb3
--- /dev/null
+++ b/contrib/dir2fs
@@ -0,0 +1,66 @@
+#!/bin/sh
+
+dir="$1"
+dev="$2"
+
+if [ "$1" = "--help" ] || [ ! -d "${dir}" ]; then
+ echo "Usage: $0 dir [mke2fs args] dev"
+ exit 1
+fi
+
+shift
+
+# Goal: Put all the files at the beginning (which mke2fs does) and minimize
+# the number of free inodes given the minimum number of blocks required.
+# Hence all this math to get the inode ratio just right.
+
+bytes="$(du -ks "${dir}" | awk '{print $1}')"
+bytes="$((bytes * 1024))"
+inodes="$(find "${dir}" -print0 | xargs -0 stat -c '%i' | sort -g | uniq | wc -l)"
+block_sz=4096
+inode_sz=256
+sb_overhead=4096
+blocks_per_group="$((block_sz * 8))"
+bytes_per_group="$((blocks_per_group * block_sz))"
+inode_bytes="$((inodes * inode_sz))"
+
+# Estimate overhead with the minimum number of groups...
+nr_groups="$(( (bytes + inode_bytes + bytes_per_group - 1) / bytes_per_group))"
+inode_bytes_per_group="$((inode_bytes / nr_groups))"
+inode_blocks_per_group="$(( (inode_bytes_per_group + (block_sz - 1)) / block_sz ))"
+per_grp_overhead="$(( ((3 + inode_blocks_per_group) * block_sz) + 64 ))"
+overhead="$(( sb_overhead + (per_grp_overhead * nr_groups) ))"
+used_bytes="$((bytes + overhead))"
+
+# Then do it again with the real number of groups.
+nr_groups="$(( (used_bytes + (bytes_per_group - 1)) / bytes_per_group))"
+tot_blocks="$((nr_groups * blocks_per_group))"
+tot_bytes="$((tot_blocks * block_sz))"
+
+ratio="$((bytes / inodes))"
+mkfs_blocks="$((tot_blocks * 4 / 3))"
+
+mke2fs -i "${ratio}" -T ext4 -d "${dir}" -O ^resize_inode,sparse_super2,metadata_csum,64bit,^has_journal -E packed_meta_blocks=1,num_backup_sb=0 -b "${block_sz}" -I "${inodesz}" -F "${dev}" "${mkfs_blocks}" || exit
+
+e2fsck -fyD "${dev}"
+
+blocks="$(dumpe2fs -h "${dev}" 2>&1 | grep 'Block count:' | awk '{print $3}')"
+while resize2fs -f -M "${dev}"; do
+ new_blocks="$(dumpe2fs -h "${dev}" 2>&1 | grep 'Block count:' | awk '{print $3}')"
+ if [ "${new_blocks}" -eq "${blocks}" ]; then
+ break;
+ fi
+ blocks="${new_blocks}"
+done
+
+if [ ! -b "${dev}" ]; then
+ truncate -s "$((blocks * block_sz))" "${dev}" || (e2image -ar "${dev}" "${dev}.min"; mv "${dev}.min" "${dev}")
+fi
+
+e2fsck -fy "${dev}"
+
+dir_blocks="$((bytes / block_sz))"
+overhead="$((blocks - dir_blocks))"
+echo "Minimized image overhead: $((100 * overhead / dir_blocks))%"
+
+exit 0
next prev parent reply other threads:[~2015-01-27 7:40 UTC|newest]
Thread overview: 88+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-27 7:35 [PATCH 00/54] e2fsprogs January 2015 patchbomb Darrick J. Wong
2015-01-27 7:35 ` [PATCH 01/54] misc: fix minor testcase problems Darrick J. Wong
2015-01-27 15:55 ` Theodore Ts'o
2015-01-27 7:35 ` [PATCH 02/54] debugfs: document new commands Darrick J. Wong
2015-01-27 15:56 ` Theodore Ts'o
2015-01-27 7:35 ` [PATCH 03/54] debugfs: fix crash in ea_set argument handling Darrick J. Wong
2015-01-27 15:58 ` Theodore Ts'o
2015-01-27 7:35 ` [PATCH 04/54] libext2fs: initialize i_extra_isize when writing EAs Darrick J. Wong
2015-01-27 16:02 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 05/54] libext2fs: avoid pointless EA block allocation Darrick J. Wong
2015-01-27 16:07 ` Theodore Ts'o
2015-01-27 19:26 ` Darrick J. Wong
2015-01-27 7:36 ` [PATCH 06/54] libext2fs: strengthen i_extra_isize checks when reading/writing xattrs Darrick J. Wong
2015-01-27 16:08 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 07/54] libext2fs: fix tdb.c mmap leak Darrick J. Wong
2015-01-27 16:09 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 08/54] resize2fs: fix regression test to not depend on ext4.ko being loaded Darrick J. Wong
2015-01-27 16:10 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 09/54] tune2fs: disable csum verification before resizing inode Darrick J. Wong
2015-01-27 16:11 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 10/54] tune2fs: abort when trying to enable/disable metadata_csum on mounted fs Darrick J. Wong
2015-01-27 16:26 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 11/54] tune2fs: call out to resize2fs for 64bit conversion Darrick J. Wong
2015-01-27 16:31 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 12/54] e2fsck: clear i_block[] when there are too many bad mappings on a special inode Darrick J. Wong
2015-01-27 16:32 ` Theodore Ts'o
2015-01-27 7:36 ` [PATCH 13/54] e2fsck: on read error, don't rewrite blocks past the end of the fs Darrick J. Wong
2015-01-27 17:35 ` Theodore Ts'o
2015-01-28 23:35 ` Darrick J. Wong
2015-01-27 7:37 ` [PATCH 14/54] e2fsck: fix the journal recreation message Darrick J. Wong
2015-01-27 18:02 ` Theodore Ts'o
2015-01-27 19:37 ` Darrick J. Wong
2015-01-27 7:37 ` [PATCH 15/54] e2fsck: handle multiple *ind block collisions with critical metadata Darrick J. Wong
2015-01-28 13:52 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 16/54] e2fsck: decrement bad count _after_ remapping a duplicate block Darrick J. Wong
2015-01-28 13:58 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 17/54] e2fsck: inspect inline dir data as two directory blocks Darrick J. Wong
2015-01-28 15:16 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 18/54] e2fsck: improve the inline directory detector Darrick J. Wong
2015-01-28 16:38 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 19/54] e2fsck: salvage under-sized dirents by removing them Darrick J. Wong
2015-02-16 15:40 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 20/54] e2fsck: add a 'yes to all' response in interactive mode Darrick J. Wong
2015-03-29 2:54 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 21/54] libext2fs: zero blocks via FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
2015-03-29 3:46 ` Theodore Ts'o
2015-01-27 7:37 ` [PATCH 22/54] libext2fs: ext2fs_new_block2() should call alloc_block hook Darrick J. Wong
2015-03-29 3:08 ` Theodore Ts'o
2015-01-27 7:38 ` [PATCH 23/54] libext2fs: Support readonly filesystem images Darrick J. Wong
2015-03-19 21:32 ` [PATCH v2 " Darrick J. Wong
2015-03-29 3:42 ` Theodore Ts'o
2015-01-27 7:38 ` [PATCH 24/54] libext2fs/e2fsck: provide routines to read-ahead metadata Darrick J. Wong
2015-01-27 7:38 ` [PATCH 25/54] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
2015-01-27 7:38 ` [PATCH 26/54] e2fsck: track directories to be rehashed with a bitmap Darrick J. Wong
2015-01-27 7:38 ` [PATCH 27/54] e2fsck: rebuild sparse extent trees/convert non-extent ext3 files Darrick J. Wong
2015-03-19 21:42 ` [PATCH v4 " Darrick J. Wong
2015-01-27 7:38 ` [PATCH 28/54] tests: verify proper rebuilding of sparse extent trees and block map file conversion Darrick J. Wong
2015-01-27 7:38 ` [PATCH 29/54] undo-io: add new calls to and speed up the undo io manager Darrick J. Wong
2015-01-27 7:38 ` [PATCH 30/54] undo-io: be more flexible about setting block size Darrick J. Wong
2015-01-27 7:38 ` [PATCH 31/54] undo-io: use a bitmap to track what we've already written Darrick J. Wong
2015-01-27 7:39 ` [PATCH 32/54] e2undo: fix memory leaks and tweak the error messages somewhat Darrick J. Wong
2015-01-27 7:39 ` [PATCH 33/54] e2undo: ditch tdb file, write everything to a flat file Darrick J. Wong
2015-01-27 7:39 ` [PATCH 34/54] libext2fs: support atexit cleanups Darrick J. Wong
2015-01-27 7:39 ` [PATCH 35/54] e2fsck: optionally create an undo file Darrick J. Wong
2015-01-27 7:39 ` [PATCH 36/54] resize2fs: optionally create " Darrick J. Wong
2015-01-27 7:39 ` [PATCH 37/54] tune2fs: " Darrick J. Wong
2015-01-27 7:39 ` [PATCH 38/54] mke2fs: " Darrick J. Wong
2015-01-27 7:39 ` [PATCH 39/54] debugfs: " Darrick J. Wong
2015-01-27 7:39 ` [PATCH 40/54] tests: test undo file creation in e2fsck/resize2fs/tune2fs/mke2fs Darrick J. Wong
2015-01-27 7:40 ` [PATCH 41/54] tests: test various features of the new e2undo format Darrick J. Wong
2015-01-27 7:40 ` [PATCH 42/54] copy-in: create hardlinks with the correct directory filetype Darrick J. Wong
2015-01-27 7:40 ` [PATCH 43/54] copy-in: for files, only iterate file blocks that are mapped Darrick J. Wong
2015-01-27 7:40 ` [PATCH 44/54] copyin: fix error handling Darrick J. Wong
2015-01-27 7:40 ` [PATCH 45/54] mke2fs: add simple tests and re-alphabetize mke2fs manpage options Darrick J. Wong
2015-01-27 7:40 ` Darrick J. Wong [this message]
2015-01-27 7:40 ` [PATCH 47/54] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
2015-01-27 7:40 ` [PATCH 48/54] libext2fs: find/alloc a range of empty blocks Darrick J. Wong
2015-01-27 7:40 ` [PATCH 49/54] libext2fs: add new hooks to support large allocations Darrick J. Wong
2015-01-27 7:41 ` [PATCH 50/54] libext2fs: implement fallocate Darrick J. Wong
2015-01-27 7:41 ` [PATCH 51/54] libext2fs: use fallocate for creating journals and hugefiles Darrick J. Wong
2015-01-27 7:41 ` [PATCH 52/54] debugfs: implement fallocate Darrick J. Wong
2015-01-27 7:41 ` [PATCH 53/54] tests: test debugfs punch command Darrick J. Wong
2015-03-19 21:44 ` [PATCH 55/54] e2fsck: actually fix inline_data flags problems when user says to do so Darrick J. Wong
2015-03-29 4:05 ` Theodore Ts'o
2015-03-19 21:45 ` [PATCH 56/54] libext2fs: zero hash in ibody extended attributes Darrick J. Wong
2015-03-29 4:13 ` Theodore Ts'o
2015-03-19 21:47 ` [PATCH 57/54] e2fsck: convert block-mapped files to extents on bigalloc fs Darrick J. Wong
2015-03-19 23:54 ` [PATCH 58/54] e2fsck: turn inline data symlink into a fast symlink when possible Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150127074035.13308.79172.stgit@birch.djwong.org \
--to=darrick.wong@oracle.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).