git.vger.kernel.org archive mirror
* [FYI] pack idx format
@ 2006-02-15  8:39 Junio C Hamano
From: Junio C Hamano @ 2006-02-15  8:39 UTC (permalink / raw)
  To: git

This is still WIP but if anybody is interested...  Once done, it
should become Documentation/technical/pack-format.txt.

The reason I started doing this is to prototype this one:

	<7v4q3453qu.fsf@assigned-by-dhcp.cox.net>

-- >8 --

Idx file:

The idx file maps an object name (SHA1) to the offset of that
object in the corresponding pack file.  It begins with the
'first-level fan-out' table, followed by the main part of the
index, which is a table whose entries are sorted by their object
name SHA1.  The file ends with some trailer information.

The main part is a table of 24-byte entries, and each entry is:

	offset : 4-byte network byte order integer.
	SHA1   : 20-byte object name SHA1.

The data for the named object begins at byte offset "offset" in
the corresponding pack file.
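
To make the entry layout concrete, here is a minimal Python sketch
that decodes one 24-byte entry.  The names ENTRY_SIZE and
parse_entry are made up for this illustration and do not come from
the git sources.

```python
import struct

ENTRY_SIZE = 24  # 4-byte offset followed by a 20-byte SHA1


def parse_entry(buf, pos=0):
    """Decode one main-table entry that starts at byte position pos.

    Returns (offset into the pack file, object name as hex).
    """
    # ">" selects network (big-endian) byte order with no padding.
    offset, sha1 = struct.unpack_from(">I20s", buf, pos)
    return offset, sha1.hex()


# A hand-built entry: pack offset 12, object name bytes 00 01 02 ... 13.
sample = struct.pack(">I20s", 12, bytes(range(20)))
offset, name = parse_entry(sample)
print(offset, name)
```

Entry number K of the main table starts ENTRY_SIZE * K bytes after
the end of the fan-out table, so the same function can be pointed
at any entry by adjusting pos.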

Before this main table, at the beginning of the idx file, there
is a table of 256 4-byte network byte order integers.  This is
called the "first-level fan-out".  The N-th entry of this table
records the offset into the main index of the first object whose
object name SHA1 starts with byte value N+1; equivalently, it is
the number of objects whose name starts with a byte of N or less.
fanout[255] points at the end of the main index.  The offset is
expressed in units of 24 bytes, that is, in main table entries.
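
The fan-out table narrows a lookup to the entries that share the
first byte of the wanted name, and a binary search finishes the
job.  The sketch below is only an illustration of that idea, not
the code git uses; build_idx_prefix and find_offset are
hypothetical helpers, and the trailer is left out.

```python
import struct

FANOUT_SIZE = 256 * 4  # 256 4-byte integers
ENTRY_SIZE = 24        # 4-byte offset + 20-byte SHA1


def build_idx_prefix(entries):
    """Build fan-out table + main table from (offset, sha1 bytes) pairs.

    Demo helper only; a real idx file also carries a trailer.
    """
    entries = sorted(entries, key=lambda e: e[1])
    counts = [0] * 256
    for _, sha1 in entries:
        counts[sha1[0]] += 1
    fanout, total = [], 0
    for c in counts:
        total += c  # fanout[N] = number of names whose first byte <= N
        fanout.append(total)
    out = struct.pack(">256I", *fanout)
    for off, sha1 in entries:
        out += struct.pack(">I20s", off, sha1)
    return out


def find_offset(idx, sha1):
    """Return the pack offset recorded for sha1, or None if absent."""
    first = sha1[0]
    # Entries whose name starts with `first` occupy [lo, hi) in the table.
    lo = 0 if first == 0 else struct.unpack_from(">I", idx, 4 * (first - 1))[0]
    hi = struct.unpack_from(">I", idx, 4 * first)[0]
    while lo < hi:
        mid = (lo + hi) // 2
        off, name = struct.unpack_from(">I20s", idx, FANOUT_SIZE + ENTRY_SIZE * mid)
        if name == sha1:
            return off
        if name < sha1:
            lo = mid + 1
        else:
            hi = mid
    return None


# Two names starting with 00, one with 01, one with ff.
names = [b"\x00" + b"\x11" * 19, b"\x00" + b"\x22" * 19,
         b"\x01" + b"\x33" * 19, b"\xff" + b"\x44" * 19]
idx = build_idx_prefix(zip([12, 150, 300, 450], names))
print(find_offset(idx, names[2]))
```

With this sample data fanout[0] is 2, because two names start with
00 and the first name starting with 01 is therefore entry number 2
of the main table.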

Example:

	idx
	    +--------------------------------+
	    | fanout[0] = 2                  |-.
	    +--------------------------------+ |
	    | fanout[1]                      | |
	    +--------------------------------+ |
	    | fanout[2]                      | |
	    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
	    | fanout[255]                    | |
	    +--------------------------------+ |
main	    | offset                         | |
index	    | object name 00XXXXXXXXXXXXXXXX | |
table	    +--------------------------------+ | 
	    | offset                         | |
	    | object name 00XXXXXXXXXXXXXXXX | |
	    +--------------------------------+ |
	  .-| offset                         |<+
	  | | object name 01XXXXXXXXXXXXXXXX |
	  | +--------------------------------+
	  | | offset                         |
	  | | object name 01XXXXXXXXXXXXXXXX |
	  | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	  | | offset                         |
	  | | object name FFXXXXXXXXXXXXXXXX |
	  | +--------------------------------+
trailer	  | | packfile checksum              |
	  | +--------------------------------+
	  | | idxfile checksum               |
	  | +--------------------------------+
          .-------.      
                  |
Pack file entry: <+

     packed object header:
        1-byte size extension bit (MSB)
               type (next 3-bit)
               size0 (lower 4-bit)
        n-byte sizeN (as long as MSB is set, each 7-bit)
                size0..sizeN form 4+7+7+..+7 bit integer, size0
                is the least significant part, and sizeN is the
                most significant part.
     packed object data:
        If it is not DELTA, then deflated bytes (the size above
                is the size before compression).
        If it is DELTA, then
          20-byte base object name SHA1 (the size above is the
                size of the delta data that follows).
          delta data, deflated.
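
A rough Python sketch of reading one pack file entry follows.  It
assumes that the MSB of the first byte is the size extension bit,
that the next three bits hold the type, and that size0 is the
least significant part of the size, which is how git's unpacker
reads the header.  The numeric value used for DELTA (7) and the
function names are assumptions made for this illustration.

```python
import zlib

OBJ_DELTA = 7  # assumed numeric type code for DELTA


def decode_object_header(buf, pos=0):
    """Return (type, size, position of the first byte after the header)."""
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07  # 3 bits below the extension bit
    size = byte & 0x0F             # size0: lowest 4 bits of the size
    shift = 4
    while byte & 0x80:             # MSB set: another size byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def read_entry(pack, offset):
    """Return (type, base SHA1 or None, inflated data) for one entry."""
    obj_type, size, pos = decode_object_header(pack, offset)
    base = None
    if obj_type == OBJ_DELTA:
        base = pack[pos:pos + 20]  # name of the object the delta applies to
        pos += 20
    # decompressobj stops at the end of the deflate stream, so bytes of
    # the next entry that follow in the buffer are ignored.
    data = zlib.decompressobj().decompress(pack[pos:])
    if len(data) != size:
        raise ValueError("inflated size does not match header")
    return obj_type, base, data


# Type 3 with an 11-byte payload fits in a single header byte.
payload = b"hello pack\n"
entry = bytes([(3 << 4) | len(payload)]) + zlib.compress(payload) + b"next"
print(read_entry(entry, 0))
```

For DELTA entries the data returned here is the inflated delta
itself; applying it to the base object is a separate step that
this sketch does not attempt.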


