From: Ivan Shmakov <ivan@gray.siamics.net>
To: linux-ext4@vger.kernel.org
Subject: e2dis: a Jigdo-like tool for Ext2+ FS
Date: Sat, 13 Aug 2011 17:21:09 +0700
Message-ID: <86ei0p8ve2.fsf@gray.siamics.net>
A couple of weeks ago I started working on a tool
(tentatively named “Ext2 disassembler”) to walk through an
Ext2+ filesystem (or an image of one) and produce the mapping of
files' (inodes') relative block numbers to the image's (or
“physical”) block numbers.
The version-that-works (apparently) is almost done, pending
upload to a publicly-accessible Git repository.
However, a considerable amount of work remains before the tool
becomes really usable. Therefore, I'd appreciate any help
with it.
TIA.
Why am I interested in this?
Recently, there was a discussion in debian-devel@ on whether the
Debian project should provide images for easy deployment within
“virtual” environments (such as KVM, Xen, etc.)
Such images (which, I assume, will use a filesystem supported by
e2fsprogs) are going to be quite large: hundreds of MiB to a few
GiB (depending on the intended usage) per architecture per
version.
Earlier, to reduce the burden of mirroring ISO 9660 (CD,
DVD, etc.) images, the Jigdo (for Jigsaw Download) tool was
introduced. The tool uses SHA-1 to associate pieces of a
filesystem image with the contents of the files of a specified
set. As a result, the tool produces an association map, which
embeds those parts of the image for which no matching files are
known. (A helper file, which contains the URIs the files may
be downloaded from, is also generated.)
Given such an association map, and the files, the tool is
capable of restoring the image.
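To illustrate the idea (not Jigdo's actual format or search
algorithm), here is a minimal Python sketch. It assumes that every
file sits contiguously at a sector-aligned offset and simply compares
bytes at those offsets. The names build_map and restore, the tuple
layout of the map, and the 2048-byte sector size are all made up for
this example.

```python
import hashlib

SECTOR = 2048  # assumed sector size for this sketch


def sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def build_map(image: bytes, files: dict[str, bytes]) -> list[tuple]:
    """Describe the image as ("file", name, digest, length) references
    plus ("raw", bytes) parts for everything that matched no file."""
    parts = []
    raw_start = 0  # start of the not-yet-recorded unmatched region
    pos = 0
    while pos < len(image):
        match = None
        for name, data in files.items():
            if data and image.startswith(data, pos):
                match = (name, data)
                break
        if match is None:
            pos += SECTOR
            continue
        if raw_start < pos:
            parts.append(("raw", image[raw_start:pos]))
        name, data = match
        parts.append(("file", name, sha1(data), len(data)))
        raw_start = pos + len(data)
        # Skip to the next sector boundary; the padding after the file
        # ends up in the following "raw" part.
        pos += -(-len(data) // SECTOR) * SECTOR
    if raw_start < len(image):
        parts.append(("raw", image[raw_start:]))
    return parts


def restore(parts: list[tuple], files: dict[str, bytes]) -> bytes:
    """Rebuild the image from the map and the files, checking digests."""
    out = bytearray()
    for part in parts:
        if part[0] == "raw":
            out += part[1]
            continue
        _, name, digest, length = part
        data = files[name]
        if len(data) != length or sha1(data) != digest:
            raise ValueError(f"file {name!r} does not match the map")
        out += data
    return bytes(out)
```

The sketch only works because each file occupies one contiguous run
of the image; a file split into several runs would never match.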
The tool is filesystem-agnostic. Unfortunately, it relies on
the fact that the files on an ISO 9660 filesystem are never
fragmented, which doesn't hold for Ext2+.
However, given the knowledge of the filesystem, it's possible to
solve the task of describing the parts of a given image as being
parts of the files specified.
Done
The tool iterates over the inodes and records the
logical-to-physical block correspondence. All the “chunks”
belonging to the same inode are marked as such.
The mapping is written to a SQLite database.
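As a rough illustration of this step (not the tool's actual code or
database schema), the Python sketch below merges runs of consecutive
(logical, physical) block pairs into chunks and stores them with the
standard sqlite3 module. The table name "chunk", its columns, and the
functions coalesce and store are assumptions made for this example;
in the real tool the pairs would presumably come from walking each
inode's blocks with libext2fs.

```python
import sqlite3


def coalesce(pairs):
    """Merge (logical, physical) block pairs into
    (logical, physical, length) chunks of consecutive blocks."""
    chunks = []
    for logical, physical in sorted(pairs):
        if chunks:
            l0, p0, n = chunks[-1]
            # Extend the last chunk if this block continues it both
            # logically and physically.
            if logical == l0 + n and physical == p0 + n:
                chunks[-1] = (l0, p0, n + 1)
                continue
        chunks.append((logical, physical, 1))
    return chunks


def store(db, inode, pairs):
    """Record the chunks of one inode (hypothetical schema)."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunk ("
        " inode INTEGER NOT NULL,"
        " logical INTEGER NOT NULL,"
        " physical INTEGER NOT NULL,"
        " length INTEGER NOT NULL,"
        " PRIMARY KEY (inode, logical))"
    )
    db.executemany(
        "INSERT INTO chunk VALUES (?, ?, ?, ?)",
        [(inode, l, p, n) for l, p, n in coalesce(pairs)],
    )
    db.commit()


db = sqlite3.connect(":memory:")
# Inode 12: blocks 0..2 are contiguous, blocks 3..4 live elsewhere.
store(db, 12, [(0, 100), (1, 101), (2, 102), (3, 200), (4, 201)])
rows = db.execute(
    "SELECT logical, physical, length FROM chunk"
    " WHERE inode = 12 ORDER BY logical"
).fetchall()
print(rows)  # [(0, 100, 3), (3, 200, 2)]
```

Storing one row per chunk rather than one row per block keeps the
database small for files that are mostly contiguous.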
To do
Message digests are to be computed and recorded as well.
Non-payload blocks are to be annotated, too.
A tool to reassemble the image.
Command line interface. (Preferably compliant with the GNU Coding
Standards.)
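A minimal Python sketch of how the digest and reassembly items might
fit together, under several assumptions that are not part of the
existing tool: each chunk is given as (file name, logical block,
physical block, length in blocks, SHA-1 digest), the slack after the
end of a file's last block is zero-filled in the image, and
non-payload blocks are supplied verbatim in a dictionary keyed by
physical block number. All function names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def chunk_data(content: bytes, logical: int, length: int,
               block_size: int) -> bytes:
    """Bytes a chunk occupies in the image, zero-padded to whole blocks."""
    data = content[logical * block_size:(logical + length) * block_size]
    return data.ljust(length * block_size, b"\0")


def reassemble(total_blocks, block_size, chunks, files, other_blocks):
    """Rebuild an image from file contents plus non-payload blocks."""
    image = bytearray(total_blocks * block_size)
    # Non-payload blocks (superblock, bitmaps, inode tables, ...) are
    # copied as-is in this sketch.
    for physical, data in other_blocks.items():
        if len(data) != block_size:
            raise ValueError(f"block {physical} has the wrong size")
        start = physical * block_size
        image[start:start + block_size] = data
    for name, logical, physical, length, expected in chunks:
        data = chunk_data(files[name], logical, length, block_size)
        if digest(data) != expected:
            raise ValueError(
                f"digest mismatch in {name!r} at logical block {logical}")
        start = physical * block_size
        image[start:start + len(data)] = data
    return bytes(image)
```

Checking each chunk's digest before writing it means a stale or
corrupted copy of a file is reported instead of silently producing a
different image.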
--
FSF associate member #7257