From: Robin Rosenberg <robin.rosenberg@dewire.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	srabbelier@gmail.com
Subject: Re: [PATCH] doc: technical details about the index file format
Date: Wed, 1 Sep 2010 20:54:20 +0200
Message-ID: <201009012054.20482.robin.rosenberg@dewire.com>
In-Reply-To: <1283334825-18309-1-git-send-email-pclouds@gmail.com>

On Wednesday, 1 September 2010 at 11:53:45, Nguyễn Thái Ngọc Duy wrote:
> This bases on the original work by Robin Rosenberg:
> 
> http://thread.gmane.org/gmane.comp.version-control.git/73471
No need for this. My name is enough
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Add:
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>

> ---
>  I split index entry out so the overall format is clearer.
> 
>  Other changes:
>  - mention of version 3
>  - added ino and mode
>  - added extended flags (v3)
>  - entry sort order
> 
>  Again I don't realy know REUC extension, so only placeholder
> 
>  Documentation/technical/index-format.txt |  139
> ++++++++++++++++++++++++++++++ 1 files changed, 139 insertions(+), 0
> deletions(-)
>  create mode 100644 Documentation/technical/index-format.txt
> 
> diff --git a/Documentation/technical/index-format.txt
> b/Documentation/technical/index-format.txt new file mode 100644
> index 0000000..3e113ca
> --- /dev/null
> +++ b/Documentation/technical/index-format.txt
> @@ -0,0 +1,139 @@
> +GIT index format
> +================
> +
> += The git index file has the following format
> +
> +  All binary numbers are in network byte order. Version 2 is described
> +  here unless stated otherwise.
> +
> +   - A 12-byte header consisting of
> +
> +     4-byte signature:
> +       The signature is { 'D', 'I', 'R', 'C' }
> +
> +     4-byte version number:
> +       The current supported versions are 2 and 3.
> +
> +     32-bit number of index entries.
> +
> +   - A number of sorted index entries
> +
> +   - Extensions
> +
> +     Extensions are identified by signature. Optional extensions can
> +     be ignored if GIT does not understand them.
> +
> +     GIT currently supports tree cache and resolve undo extensions.
> +
> +     4-byte extension signature. If the first byte is 'A'..'Z' the
> +     extension is optional and can be ignored.
> +
> +     32-bit size of the extension
> +
> +     Extension data
> +
> +   - 160-bit SHA-1 over the content of the index file before this
> +     checksum.
> +
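
A minimal Python sketch of the overall layout described above: it reads
the 12-byte header and checks the trailing SHA-1. The function names
parse_header and checksum_ok are made up for illustration and are not
git functions; the version check only mirrors the "2 and 3" statement in
the text.

```python
import hashlib
import struct


def parse_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of an index file."""
    # ">4sII": network byte order, 4-byte signature, two unsigned 32-bit numbers
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("bad signature %r" % signature)
    if version not in (2, 3):
        raise ValueError("unsupported version %d" % version)
    return version, count


def checksum_ok(data: bytes) -> bool:
    """True if the last 20 bytes are the SHA-1 of everything before them."""
    return hashlib.sha1(data[:-20]).digest() == data[-20:]


# Build a tiny index with no entries and no extensions, purely in memory.
body = struct.pack(">4sII", b"DIRC", 2, 0)
index = body + hashlib.sha1(body).digest()
print(parse_header(index), checksum_ok(index))
```

The example never touches a real .git/index; it only exercises the two
helpers on synthetic bytes.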
> +== Index entry
> +
> +  Index entries are sorted with memcmp() by entry name. Entries with
> +  the same name are sorted by their stage.
Index entries are sorted in ascending order on the name field, interpreted as
a string of unsigned bytes.
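
To illustrate why "unsigned bytes" matters, here is a small Python
sketch of the ordering, assuming entries are represented as hypothetical
(name, stage) pairs. Python compares bytes objects element by element as
values 0..255, which matches the unsigned interpretation.

```python
def entry_sort_key(name: bytes, stage: int):
    # bytes compare as unsigned values, a shorter prefix sorts first,
    # and the stage breaks ties between entries with the same name.
    return (name, stage)


entries = [
    (b"b", 0),
    (b"a/b", 0),
    (b"a.c", 0),
    (b"a", 2),
    (b"a", 1),
    (b"\xc3\xa5", 0),  # a non-ASCII name; the first byte is 0xC3
]
entries.sort(key=lambda e: entry_sort_key(*e))
for name, stage in entries:
    print(name, stage)
```

With a signed comparison the 0xC3 name would sort before "a"; with the
unsigned one it sorts last.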

> +
> +  32-bit ctime seconds, the last time a file's metadata changed
> +    this is stat(2) data
> +
> +  32-bit ctime nanoseconds (modulo 1G)
> +    this is stat(2) data
> +
> +  32-bit mtime seconds, the last time a file's data changed
> +    this is stat(2) data
> +
> +  32-bit mtime nanoseconds (modulo 1G)
> +    this is stat(2) data
> +
> +  32-bit dev
> +    this is stat(2) data
> +
> +  32-bit ino
> +    this is stat(2) data
> +
> +  32-bit mode, split into (high to low bits)
> +
> +    4-bit object type
> +      valid values in binary are 1000 (blob), 1010 (symbolic link)
> +      and 1110 (gitlink)
> +
> +    3-bit unused
> +
> +    9-bit unix permission (only 0755 and 0644 are valid)
> +
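
A hedged Python sketch of splitting the mode word. It assumes the 16
described bits occupy the low half of the 32-bit field, which the text
above does not spell out; split_mode and OBJECT_TYPES are illustrative
names only.

```python
OBJECT_TYPES = {
    0b1000: "blob",
    0b1010: "symbolic link",
    0b1110: "gitlink",
}


def split_mode(mode: int):
    """Return (object type name, 9-bit permission) for an index entry mode."""
    object_type = (mode >> 12) & 0xF  # top 4 of the low 16 bits
    permission = mode & 0o777         # low 9 bits; the 3 bits between are unused
    return OBJECT_TYPES[object_type], permission


kind, perm = split_mode(0o100644)
print(kind, oct(perm))
```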
> +  32-bit uid
> +    this is stat(2) data
> +
> +  32-bit gid
> +    this is stat(2) data
> +
> +  32-bit file size
> +    This is the on-disk size from stat(2)
> +
> +  160-bit SHA-1 for the represented object
> +
> +  A 16-bit field split into (high to low bits)
> +
> +    1-bit assume-valid flag
> +
> +    1-bit extended flag (must be zero in version 2)
> +
> +    2-bit stage (during merge)
> +
> +    12-bit name length if the length is less than 0x0FFF
> +
> +  (Version 3) A 16-bit field, only applicable if the "extended flag"
> +  above is 1, split into (high to low bits).
> +
> +    1-bit reserved for future
> +
> +    1-bit skip-worktree flag (used by sparse checkout)
> +
> +    1-bit intent-to-add flag (used by "git add -N")
> +
> +    13-bit unused, must be zero
> +
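
The two flag fields above can be decoded with plain masks. This is a
sketch under the bit layout as listed (high to low); the function and
key names are invented for the example.

```python
def split_flags(flags: int):
    """Decode the 16-bit flags field present in every entry."""
    return {
        "assume_valid": bool(flags & 0x8000),  # bit 15
        "extended": bool(flags & 0x4000),      # bit 14, must be 0 in version 2
        "stage": (flags >> 12) & 0x3,          # bits 13-12
        "name_length": flags & 0x0FFF,         # low 12 bits
    }


def split_extended_flags(flags: int):
    """Decode the version 3 field that follows when "extended" is set."""
    return {
        "skip_worktree": bool(flags & 0x4000),  # bit 14
        "intent_to_add": bool(flags & 0x2000),  # bit 13
    }


print(split_flags(0x2005))
print(split_extended_flags(0x4000))
```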
> +  Entry path name (variable length) relative to top-level directory
...to the top level...
> +    (without leading slash). '/' is used as path separator. Special
The special...
> +    paths ".", ".." and ".git" (without quotes) are disallowed.
> +    Trailing slash is also disallowed.
Why would anyone even consider adding a trailing slash to a _file_ name?

   The exact encoding is undefined, but the '.' and '/' characters
   are encoded in 7-bit ASCII and the encoding cannot contain a NUL byte.
   In practice it is generally a superset of ASCII.

> +
> +  1-8 nul bytes as necessary to pad the entry to a multiple ot eight bytes
...of eight bytes

A typo of mine.

> +  while keeping the name NUL-terminated.
> +
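
The padding rule is easy to get off by one, so here is a small Python
sketch of the arithmetic. It assumes the fixed part of an entry is the
ten 32-bit fields, the 20-byte SHA-1 and the 16-bit flags listed above
(62 bytes), plus 2 bytes when the extended field is present;
padding_length is a hypothetical helper name.

```python
FIXED_SIZE = 10 * 4 + 20 + 2  # ten 32-bit fields, SHA-1, 16-bit flags


def padding_length(name_length: int, extended: bool = False) -> int:
    """Number of NUL bytes (1 to 8) that follow the path name."""
    unpadded = FIXED_SIZE + (2 if extended else 0) + name_length
    # Always at least one NUL so the name stays terminated, hence 8 and not 0
    # when the unpadded size is already a multiple of eight.
    return 8 - (unpadded % 8)


for length in (1, 2, 5, 10):
    pad = padding_length(length)
    print(length, pad, (FIXED_SIZE + length + pad) % 8)
```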
> +== Extensions
> +
> +=== Tree cache
> +
> +  Tree cache extension contains pre-computes hashes for all trees that
> +  can be derived from the index
> +
> +  - Extension tag { 'T', 'R', 'E', 'E' }
> +
> +  - 32-bit size
> +
> +  - A number of entries
> +
> +     NUL-terminated tree name
> +
> +     Blank-terminated ASCII decimal number of entries in this tree
> +
> +     Newline-terminated position of this tree in the parent tree. 0 for
> +     the root tree
> +
> +     160-bit SHA-1 for this tree and it's children
> +
> +=== Resolve undo
> +
> +  TODO
> +
> +  - Extension tag { 'R', 'E', 'U', 'C' }
> +
> +  - 32-bit size
