From: "Vladimir 'φ-coder/phcoder' Serbinenko" <phcoder@gmail.com>
To: linux-fsdevel@vger.kernel.org
Subject: Description of HFS+ compression
Date: Thu, 24 May 2012 20:38:35 +0200
Message-ID: <4FBE802B.4040305@gmail.com>
Hello, all. I've looked into how Mac OS X compresses files using "transparent compression".
Since I don't plan to use this data now, I thought it would be a good idea to document my
findings; perhaps someone will implement a reader for compressed files. It's very similar to
zisofs, both conceptually and in implementation.
I assume the reader is familiar with TN1150:
http://developer.apple.com/legacy/mac/library/#technotes/tn/tn1150.html
Used compression: zlib
Block size: 64K
Bits missing from TN1150:
Attributes key (big-endian):
uint16_t | unknown | always zero
uint32_t | cnid | file id of parent, most likely, not checked
uint32_t | unknown | always zero
uint16_t | namelen | length of name
uint16_t[] | name | name in UTF-16BE
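
As a rough sketch of the key layout above, the following Python unpacks the fixed fields and the name. The function name parse_attr_key is made up for illustration. It assumes that buf starts at the first field listed above and that namelen counts UTF-16 code units rather than bytes; neither assumption has been checked against a real volume.

```python
import struct

def parse_attr_key(buf: bytes):
    """Parse an attributes key laid out as in the table above (big-endian)."""
    # uint16 zero, uint32 cnid, uint32 zero, uint16 namelen: 12 bytes in total
    _zero16, cnid, _zero32, namelen = struct.unpack_from(">HIIH", buf, 0)
    # Assumption: namelen counts 2-byte UTF-16 code units, not bytes.
    raw_name = buf[12:12 + 2 * namelen]
    if len(raw_name) != 2 * namelen:
        raise ValueError("key is shorter than namelen claims")
    return cnid, raw_name.decode("utf-16-be")
```

A reader would compare the returned name against "com.apple.decmpfs" to find the compression attribute of a file.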
Attributes header (start of the value stored under the attributes key; big-endian):
uint8_t[3] | unknown | always zero
uint8_t | type | only 0x10 (inline) is used for com.apple.decmpfs; the attribute itself follows
uint32_t | unknown | always zero
uint64_t | size | size of attribute
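
A minimal sketch of reading this 16-byte header in Python, assuming that the size field counts the attribute bytes that immediately follow the header. The names parse_attr_value and INLINE are invented for this example.

```python
import struct

INLINE = 0x10  # the only type observed for com.apple.decmpfs, per the table above

def parse_attr_value(buf: bytes) -> bytes:
    """Return the inline attribute data that follows the 16-byte header."""
    # 3 zero bytes (skipped), uint8 type, uint32 zero, uint64 size; big-endian
    rec_type, _unknown, size = struct.unpack_from(">3xBIQ", buf, 0)
    if rec_type != INLINE:
        raise ValueError(f"unexpected attribute type {rec_type:#x}")
    # Assumption: size is the number of data bytes right after the header.
    data = buf[16:16 + size]
    if len(data) != size:
        raise ValueError("attribute data is truncated")
    return data
```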
Compressed attribute header (little-endian):
uint32_t | magic | "fpmc"
uint32_t | unknown | always 3
uint32_t | uncompressed_size | uncompressed size if inline, 8 otherwise
uint32_t | unknown | always 0
If there is only one block and it's small enough, it's stored directly after the header.
Otherwise "# dummy\n" is stored in its place and the compressed data is stored in the resource fork of the file in question.
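
The inline case could be handled roughly as below. This is only a sketch: it assumes that the magic appears on disk as the bytes "fpmc" in that order and that an inline payload is a single plain zlib stream. The function name read_inline is hypothetical.

```python
import struct
import zlib

DUMMY = b"# dummy\n"  # placeholder stored when the data lives in the resource fork

def read_inline(value: bytes):
    """Return the file contents if stored inline, or None if they are in the resource fork."""
    # magic, unknown (always 3), uncompressed_size, unknown (always 0)
    magic, _three, _usize, _zero = struct.unpack_from("<4sIII", value, 0)
    # Assumption: the on-disk byte order of the magic is literally "fpmc".
    if magic != b"fpmc":
        raise ValueError("not a compressed attribute header")
    payload = value[16:]
    if payload == DUMMY:
        return None
    # Assumption: the inline payload is one zlib stream.
    return zlib.decompress(payload)
```

A return value of None tells the caller to go on and read the resource fork instead.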
Mac OS X uses the following headers to make the data masquerade as some kind of resource:
Resource fork header (big-endian):
uint32_t | header_size | always 0x100
uint32_t | size | total_compressed_size + seek_block_size + 4 + 0x100
uint32_t | size | total_compressed_size + seek_block_size + 4
uint32_t | unknown | always 0x32
uint8_t[0xf0] | unknown | zero-filled
uint32_t | size | total_compressed_size + seek_block_size
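
The relations between these fields can be checked with a few lines of Python. In this sketch the size field that follows the 0x100-byte header is assumed to be 4 bytes wide and big-endian, which is what the "+ 4" in the two header sizes suggests. The name parse_rsrc_header is made up for the example.

```python
import struct

def parse_rsrc_header(fork: bytes):
    """Return (seek_block_offset, payload_size, consistent) for a resource fork."""
    header_size, size_a, size_b, unknown = struct.unpack_from(">IIII", fork, 0)
    # Assumption: a 4-byte big-endian size sits right after the header.
    (payload_size,) = struct.unpack_from(">I", fork, header_size)
    # Check the relations listed in the table above.
    consistent = (
        header_size == 0x100
        and size_a == payload_size + 4 + 0x100
        and size_b == payload_size + 4
        and unknown == 0x32
    )
    return header_size + 4, payload_size, consistent
```

Under that assumption the first returned value is 0x104 for every file described here.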
The header is followed by the seek block, which starts with (little-endian):
uint32_t | nentries | number of entries that follow
The entries are (little-endian):
uint32_t | compressed_offset (offset 0 corresponds to the nentries field)
uint32_t | compressed_size
The zlib-compressed blocks follow.
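
Putting the seek block to use might look like the sketch below. It takes the offset of the nentries field as a parameter (seek_start, a name invented here) because compressed_offset is relative to that field, and it assumes every block is an independent zlib stream.

```python
import struct
import zlib

def read_blocks(fork: bytes, seek_start: int) -> bytes:
    """Decompress all blocks listed in the seek block that begins at seek_start."""
    (nentries,) = struct.unpack_from("<I", fork, seek_start)
    out = []
    for i in range(nentries):
        # Each entry is 8 bytes: compressed_offset, compressed_size (little-endian).
        off, size = struct.unpack_from("<II", fork, seek_start + 4 + 8 * i)
        # Offset 0 corresponds to the nentries field itself.
        start = seek_start + off
        # Assumption: each block is a self-contained zlib stream.
        out.append(zlib.decompress(fork[start:start + size]))
    return b"".join(out)
```

With the 64K block size, reading from an arbitrary file position only needs entry number position // 65536 rather than the whole list.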
The trailer is 50 bytes and always has the same contents:
0000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0000010: 0000 0000 0000 0000 001c 0032 0000 636d ...........2..cm
0000020: 7066 0000 000a 0001 ffff 0000 0000 0000 pf..............
0000030: 0000
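
For a reader or writer it may be handy to keep the trailer as a constant. The sketch below copies the bytes from the dump above; the names TRAILER and has_trailer are made up, and the check assumes the trailer is the very last thing in the resource fork. Its length, 50 = 0x32, matches the "always 0x32" field of the resource fork header.

```python
# The 50 trailer bytes, transcribed from the hex dump above.
TRAILER = bytes.fromhex(
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 001c 0032 0000 636d"
    "7066 0000 000a 0001 ffff 0000 0000 0000"
    "0000"
)

def has_trailer(fork: bytes) -> bool:
    """Check whether a resource fork ends with the fixed trailer."""
    # Assumption: nothing follows the trailer in the fork.
    return fork.endswith(TRAILER)
```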
--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko