From: "Peter Eriksen" <s022018@student.dtu.dk>
To: git@vger.kernel.org
Subject: Understanding version 4 packs
Date: Sat, 24 Mar 2007 21:23:56 +0100
Message-ID: <20070324202356.GA20734@bohr.gbar.dtu.dk>

Hello Shawn (and Nicolas and other interested parties),

I have been reading the commits in the
git://repo.or.cz/git/fastimport.git/ repository (git makes it quite easy
to see what differs from mainline using "git log master..pack4"), and I
think I have understood some of the details.

The easiest part to understand was the file name table, which is placed
at the beginning of the pack (right after the header) in this format:

+------------+-------------------------------+
| NR_ENTRIES |  Compressed file name table   |
+------------+-------------------------------+
   4 bytes

The uncompressed file name table contains NR_ENTRIES entries,
and looks like this:

+------+--------------+------+------------------------+----
| MODE |  Full path 1 | MODE |   Full path 2          | ...
+------+--------------+------+------------------------+----
 2 bytes   n1 bytes    2 bytes     n2 bytes     
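
As a rough sketch of the two layouts above, the table could be read
like this. Several details are assumptions and not taken from the
pack4 code: big-endian integers, zlib as the compression, and a NUL
byte terminating each path (the diagrams do not say how n1, n2, ...
are delimited). The function names are hypothetical.

```python
import struct
import zlib


def parse_name_table(buf):
    """Parse NR_ENTRIES plus the compressed table into (path, mode) pairs.

    `buf` starts at the NR_ENTRIES field, i.e. right after the pack header.
    """
    (nr_entries,) = struct.unpack_from(">I", buf, 0)  # assumed big-endian
    # decompressobj tolerates pack data following the compressed table.
    raw = zlib.decompressobj().decompress(buf[4:])    # assumed zlib
    entries = []
    pos = 0
    for _ in range(nr_entries):
        (mode,) = struct.unpack_from(">H", raw, pos)  # 2-byte MODE
        end = raw.index(b"\0", pos + 2)               # assumed NUL terminator
        entries.append((raw[pos + 2:end], mode))
        pos = end + 1
    return entries


def build_name_table(entries):
    """Inverse of parse_name_table, useful for round-trip experiments."""
    raw = b"".join(
        struct.pack(">H", mode) + path + b"\0"
        for path, mode in sorted(entries)             # path first, then mode
    )
    return struct.pack(">I", len(entries)) + zlib.compress(raw)
```

All the usual git modes (for example 0100644, 040000, 0160000) fit in
the 2-byte MODE field, which is why a 16-bit unpack is enough here.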

The table is sorted by path then mode for easy binary lookup, and so
that pointers into this table can be compared directly instead of
comparing the corresponding paths and modes.
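
To illustrate why the sort order matters, here is a minimal sketch of
the lookup using Python's bisect module on an in-memory list of
(path, mode) pairs. The list representation and the name find_entry
are assumptions for illustration; the real pack would use offsets into
the table, not list indexes, but the ordering argument is the same.

```python
import bisect


def find_entry(table, path, mode):
    """Return the index of (path, mode) in the sorted table, or -1.

    `table` must be sorted by path, then mode, as described above.
    """
    key = (path, mode)
    i = bisect.bisect_left(table, key)
    if i < len(table) and table[i] == key:
        return i
    return -1


table = sorted([
    (b"src/main.c", 0o100644),
    (b"Makefile", 0o100644),
    (b"src", 0o040000),
])
a = find_entry(table, b"Makefile", 0o100644)
b = find_entry(table, b"src/main.c", 0o100644)
# Comparing the positions gives the same answer as comparing the entries.
same_order = (a < b) == (table[a] < table[b])
```

Because the table is sorted, two positions compare the same way as the
(path, mode) pairs they refer to, so no string comparison is needed
once both entries have been located.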

There is a new tree type called OBJ_DICT_TREE, which looks something
like the following:

+-----------------+------------------------------------------------+----
|  Table offset   |  SHA-1 of the blob corresponding to the path.  | ...
+-----------------+------------------------------------------------+----
      6 bytes                     20 bytes

These new tree objects will remain uncompressed in the pack file, but
are sorted with, and deltaed against, other tree objects. All normal
tree objects are converted to OBJ_DICT_TREE when packing, and are
converted back on the fly for callers who need an ordinary OBJ_TREE.
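
The conversion back to an ordinary tree might look roughly like the
sketch below. It assumes the 6-byte table offset is big-endian and
that a mapping from table offset to (mode, name) is already available;
both are assumptions, and parse_dict_tree, to_plain_tree and
names_by_offset are hypothetical names. The name is copied verbatim
from the table; if the table really stores full paths, a real
converter would have to reduce it to the last path component.

```python
ENTRY_SIZE = 6 + 20  # table offset + SHA-1


def parse_dict_tree(data):
    """Split an OBJ_DICT_TREE body into (table_offset, sha1) pairs."""
    if len(data) % ENTRY_SIZE:
        raise ValueError("body is not a multiple of 26 bytes")
    pairs = []
    for pos in range(0, len(data), ENTRY_SIZE):
        offset = int.from_bytes(data[pos:pos + 6], "big")  # assumed order
        pairs.append((offset, data[pos + 6:pos + ENTRY_SIZE]))
    return pairs


def to_plain_tree(data, names_by_offset):
    """Rebuild an ordinary tree body: '<octal mode> <name>' NUL <sha1>."""
    parts = []
    for offset, sha1 in parse_dict_tree(data):
        mode, name = names_by_offset[offset]
        parts.append(("%o" % mode).encode("ascii") + b" " + name + b"\0" + sha1)
    return b"".join(parts)
```

With fixed 26-byte entries the dict tree can be indexed directly, which
an ordinary tree with its variable-length names does not allow.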

The index (.idx) files are extended with a 4-byte pointer holding the
offset of this file name table in the pack file, for easy lookup.

There is something similar with a table of common strings in commit
objects (e.g. author and timezone), and a new object OBJ_DICT_COMMIT,
but I have not understood that quite yet.

Is there anything I have gotten wrong in my understanding?

Regards,

Peter
