From: Vladimir 'φ-coder/phcoder' Serbinenko
Subject: Description of HFS+ compression
Date: Thu, 24 May 2012 20:38:35 +0200
Message-ID: <4FBE802B.4040305@gmail.com>
To: linux-fsdevel@vger.kernel.org

Hello, all.

I've looked into how Mac OS X compresses files using "transparent
compression". Since I don't plan to use this data now, I thought it
would be a good idea to document my findings; perhaps someone will
implement a reader for compressed files. It is very similar to zisofs,
both conceptually and in implementation. I assume the reader is
familiar with
http://developer.apple.com/legacy/mac/library/#technotes/tn/tn1150.html

Compression used: zlib
Block size: 64K

Bits missing from TN1150.

Attributes key (big-endian):
uint16_t   | unknown | always zero
uint32_t   | cnid    | file id of parent, most likely, not checked
uint32_t   | unknown | always zero
uint16_t   | namelen | length of name
uint16_t[] | name    | name in UTF-16BE

Attributes header (start of the value in attributes key), (big-endian):
uint8_t[3] | unknown | always zero
uint8_t    | type    | only 0x10 = inline is used for com.apple.decmpfs,
                       attribute itself follows
uint32_t   | unknown | always zero
uint64_t   | size    | size of attribute

Compressed attribute header (little-endian):
uint32_t | magic             | "fpmc"
uint32_t | unknown           | always 3
uint32_t | uncompressed_size | uncompressed size if inline, 8 otherwise
uint32_t | unknown           | always 0

If there is only one block and it is small enough, it is stored directly
following the header. Otherwise "# dummy\n" is stored instead and the
compressed data is stored in the resource fork of the file in question.

The headers Mac OS X uses to make the data masquerade as some kind of
resource:

Resource fork header (big-endian):
uint32_t      | header_size | always 0x100
uint32_t      | size        | total_compressed_size + seek_block_size + 4 + 0x100
uint32_t      | size        | total_compressed_size + seek_block_size + 4
uint32_t      | unknown     | always 0x32
uint8_t[0xf0] | unknown     | zero-filled
uint32_t      | size        | total_compressed_size + seek_block_size

It is followed by the seek block, which starts with (little-endian):
uint32_t | nentries | number of entries that follow

Each entry is (little-endian):
uint32_t | compressed_offset | offset 0 corresponds to the nentries field
uint32_t | compressed_size   |

The zlib-compressed blocks follow.

The trailer is 50 bytes, always with the same contents:
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000010: 0000 0000 0000 0000 001c 0032 0000 636d  ...........2..cm
0000020: 7066 0000 000a 0001 ffff 0000 0000 0000  pf..............
0000030: 0000

-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
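
The layout described above can be turned into a small reader. What follows is only a minimal Python sketch under several assumptions that the message does not confirm: that the inline payload is a plain zlib stream starting right after the 16-byte compressed attribute header, that the on-disk magic bytes are literally "fpmc", that the final size field of the resource fork header is 4 bytes wide (which is what the "+ 4" in the other size fields suggests), and that seek_block_size includes the nentries field. The function names (decode_inline, decode_resource_fork, build_resource_fork) are hypothetical, and build_resource_fork exists only to produce test input for a round trip.

```python
import struct
import zlib

BLOCK_SIZE = 64 * 1024        # block size stated in the message
MAGIC = b"fpmc"               # assumed on-disk bytes of the magic field
RSRC_HEADER_SIZE = 0x100      # header_size from the resource fork header
# 50-byte trailer, transcribed from the hex dump in the message.
TRAILER = bytes(24) + bytes.fromhex("001c00320000636d70660000000a0001ffff") + bytes(8)


def decode_inline(attr: bytes) -> bytes:
    """Decode a com.apple.decmpfs value whose data is stored inline."""
    magic, _unknown, size, _zero = struct.unpack_from("<4sIII", attr, 0)
    if magic != MAGIC:
        raise ValueError("bad magic in compressed attribute header")
    # Assumption: a plain zlib stream follows the 16-byte header.
    data = zlib.decompress(attr[16:])
    if len(data) != size:
        raise ValueError("uncompressed_size mismatch")
    return data


def decode_resource_fork(rsrc: bytes) -> bytes:
    """Decode the compressed blocks stored in a resource fork."""
    (header_size,) = struct.unpack_from(">I", rsrc, 0)
    # Assumption: a 4-byte size field sits between the header and the seek
    # block, so the nentries field starts at header_size + 4.
    base = header_size + 4
    (nentries,) = struct.unpack_from("<I", rsrc, base)
    out = []
    for i in range(nentries):
        offset, size = struct.unpack_from("<II", rsrc, base + 4 + 8 * i)
        # Offsets are relative to the nentries field.
        out.append(zlib.decompress(rsrc[base + offset:base + offset + size]))
    return b"".join(out)


def build_resource_fork(data: bytes) -> bytes:
    """Hypothetical helper: build a fork in the described layout for testing."""
    blocks = [zlib.compress(data[i:i + BLOCK_SIZE])
              for i in range(0, len(data), BLOCK_SIZE)]
    seek_size = 4 + 8 * len(blocks)            # nentries plus the entries
    body_size = sum(len(b) for b in blocks) + seek_size
    header = struct.pack(">IIII", RSRC_HEADER_SIZE,
                         body_size + 4 + RSRC_HEADER_SIZE, body_size + 4, 0x32)
    header += bytes(0xF0) + struct.pack(">I", body_size)
    seek = struct.pack("<I", len(blocks))
    offset = seek_size                         # first block follows the seek block
    for block in blocks:
        seek += struct.pack("<II", offset, len(block))
        offset += len(block)
    return header + seek + b"".join(blocks) + TRAILER


if __name__ == "__main__":
    payload = b"hello, hfs+ " * 20000          # spans several 64K blocks
    assert decode_resource_fork(build_resource_fork(payload)) == payload
```

The reader never looks at the trailer or at the size fields in the header; it relies only on header_size and the seek block, so a real implementation would probably want to validate those fields against the fork length before trusting the offsets.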