Subject: Description of HFS+ compression
From: Vladimir 'φ-coder/phcoder' Serbinenko
Date: 2012-05-24 18:38 UTC
To: linux-fsdevel

Hello, all. I've looked into how Mac OS X compresses files using "transparent compression".
Since I don't plan to use this data right now, I thought it would be a good idea to document my
findings in case someone wants to implement a reader for compressed files. Both conceptually
and in implementation it is very similar to zisofs.
I assume that the reader is familiar with TN1150:
http://developer.apple.com/legacy/mac/library/#technotes/tn/tn1150.html
Compression used: zlib
Block size: 64K
The bits missing from TN1150 follow.
Attributes key (big-endian):
uint16_t   | unknown | always zero
uint32_t   | cnid    | file id of parent, most likely, not checked
uint32_t   | unknown | always zero
uint16_t   | namelen | length of name
uint16_t[] | name    | name in UTF-16BE
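
The key layout above can be read with a few lines of Python. This is only a sketch: the function name parse_attr_key is made up for illustration, and it assumes that namelen counts UTF-16 code units (2 bytes each), which the table does not state.

```python
import struct

def parse_attr_key(buf):
    """Parse the attributes key layout listed above (all fields big-endian)."""
    # unknown (u16), cnid (u32), unknown (u32), namelen (u16): 12 bytes
    _zero16, cnid, _zero32, namelen = struct.unpack_from(">HIIH", buf, 0)
    # Assumption: namelen is a count of UTF-16 code units, not of bytes.
    name_bytes = buf[12:12 + 2 * namelen]
    if len(name_bytes) != 2 * namelen:
        raise ValueError("truncated attribute name")
    return cnid, name_bytes.decode("utf-16-be")
```

For a compressed file the decoded name is expected to be com.apple.decmpfs.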

Attributes header (start of the value stored under the attributes key, big-endian):
uint8_t[3] | unknown | always zero
uint8_t    | type    | only 0x10 (inline) is used for com.apple.decmpfs; the attribute itself follows
uint32_t   | unknown | always zero
uint64_t   | size    | size of attribute
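
A sketch of splitting an attribute value into this header and the attribute data. The name parse_attr_header is hypothetical, and the code assumes that size counts only the bytes that follow the 16-byte header.

```python
import struct

ATTR_INLINE = 0x10  # the only type listed above for com.apple.decmpfs

def parse_attr_header(value):
    """Return (type, data) for an attribute value laid out as above."""
    # 3 zero bytes, type (u8), unknown (u32), size (u64): 16 bytes, big-endian
    _zero, rec_type, _unknown, size = struct.unpack_from(">3sBIQ", value, 0)
    if rec_type != ATTR_INLINE:
        raise ValueError("unexpected attribute type 0x%02x" % rec_type)
    # Assumption: size excludes the 16-byte header itself.
    data = value[16:16 + size]
    if len(data) != size:
        raise ValueError("truncated attribute data")
    return rec_type, data
```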

Compressed attribute header (little-endian):
uint32_t   | magic             | "fpmc"
uint32_t   | unknown           | always 3
uint32_t   | uncompressed_size | uncompressed size if inline, 8 otherwise
uint32_t   | unknown           | always 0

If there is only one block and it is small enough, it is stored directly after the header.
Otherwise "# dummy\n" is stored instead and the compressed data is stored in the resource fork of the file in question.
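
The check described above could look roughly like this. It is a sketch under two assumptions: that "fpmc" is the magic exactly as the four bytes appear on disk, and that the "# dummy\n" placeholder is stored verbatim right after the 16-byte header. The name parse_decmpfs is made up for illustration.

```python
import struct

DECMPFS_MAGIC = b"fpmc"        # assumption: bytes as they appear on disk
DUMMY_PAYLOAD = b"# dummy\n"   # placeholder when data is in the resource fork

def parse_decmpfs(attr):
    """Return (uncompressed_size, in_resource_fork, payload)."""
    # magic, unknown (always 3), uncompressed_size, unknown (always 0)
    magic, _three, usize, _zero = struct.unpack_from("<4sIII", attr, 0)
    if magic != DECMPFS_MAGIC:
        raise ValueError("not a compressed attribute")
    payload = attr[16:]
    # Assumption: the placeholder is stored as-is, not zlib-compressed.
    in_resource_fork = payload == DUMMY_PAYLOAD
    return usize, in_resource_fork, payload
```

When in_resource_fork is false, payload is the single inline block; otherwise the reader has to go to the resource fork.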

The headers Mac OS X uses to masquerade as some kind of resource:
Resource fork header (big-endian):
uint32_t      | header_size | always 0x100
uint32_t      | size        | total_compressed_size + seek_block_size + 4 + 0x100
uint32_t      | size        | total_compressed_size + seek_block_size + 4
uint32_t      | unknown     | always 0x32
uint8_t[0xf0] | unknown     | zero-filled
uint32_t      | size        | total_compressed_size + seek_block_size
It is followed by the seek block, which starts with (little-endian):
uint32_t  | nentries          | number of entries that follow
Each entry is (little-endian):
uint32_t  | compressed_offset | offset 0 corresponds to the nentries field
uint32_t  | compressed_size   | size of the compressed block

The zlib-compressed blocks follow.
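
Putting the resource fork layout together, a reader might look like the sketch below. It assumes that the size field right after the 0x100-byte header is 4 bytes wide (which is what the "+ 4" in the header sizes suggests), so that the seek block starts at 0x104, and that every block is a complete zlib stream. The function name is hypothetical and there is no bounds checking.

```python
import struct
import zlib

HEADER_SIZE = 0x100
# Assumption: a 4-byte size field sits between the header and nentries.
SEEK_BLOCK_START = HEADER_SIZE + 4

def decompress_resource_fork(fork):
    """Decompress all blocks listed in the seek block, in order."""
    base = SEEK_BLOCK_START
    (nentries,) = struct.unpack_from("<I", fork, base)
    out = []
    for i in range(nentries):
        # Entries are 8 bytes each and start right after nentries.
        off, size = struct.unpack_from("<II", fork, base + 4 + 8 * i)
        # Offsets are relative to the nentries field.
        block = fork[base + off:base + off + size]
        out.append(zlib.decompress(block))
    return b"".join(out)
```

Since each block holds at most 64K of uncompressed data, a real reader can use the seek block to decompress only the blocks covering the requested range instead of the whole file.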

The trailer is 50 bytes long and always has the same contents:
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000010: 0000 0000 0000 0000 001c 0032 0000 636d  ...........2..cm
0000020: 7066 0000 000a 0001 ffff 0000 0000 0000  pf..............
0000030: 0000 
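
Since the trailer is constant, a reader can use it as a cheap sanity check. A sketch, assuming the trailer is the very last 50 bytes of the resource fork; the helper name is made up:

```python
# The 50 trailer bytes, copied from the hex dump above.
TRAILER = bytes.fromhex(
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 001c 0032 0000 636d"
    "7066 0000 000a 0001 ffff 0000 0000 0000"
    "0000"
)

def has_expected_trailer(fork):
    """True if the resource fork ends with the constant trailer."""
    # Assumption: nothing follows the trailer in the resource fork.
    return fork.endswith(TRAILER)
```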


-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko

