git.vger.kernel.org archive mirror
* [PATCH] Technical details about the index file format.
From: Robin Rosenberg @ 2008-02-11  6:28 UTC (permalink / raw)
  To: gitster; +Cc: git, Robin Rosenberg

Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
---
 Documentation/technical/index-format.txt |   91 ++++++++++++++++++++++++++++++
 1 files changed, 91 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/technical/index-format.txt

I believe the main index details are accurate. Is there anything else to
explain? The TREE section probably needs more details.

-- robin

diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
new file mode 100644
index 0000000..c57b382
--- /dev/null
+++ b/Documentation/technical/index-format.txt
@@ -0,0 +1,91 @@
+GIT index format
+================
+
+= The git index file has the following format
+
+  All binary numbers are in network byte order.
+
+   - A twelve byte header consisting of
+
+     4 byte signature:
+	The signature is { 'D', 'I', 'R', 'C' }
+
+     4 byte version number:
+	The current version is 2
+
+     32-bit number of index entries.
+
+   - An entry consists of
+
+     32-bit ctime seconds, the last time a file's metadata changed
+	this is stat(2) data
+
+     32-bit ctime nanoseconds (modulo 1G)
+	this is stat(2) data
+
+     32-bit mtime seconds, the last time a file's data changed
+	this is stat(2) data
+
+     32-bit mtime nanoseconds (modulo 1G)
+	this is stat(2) data
+
+     32-bit dev
+	this is stat(2) data
+
+     32-bit uid
+	this is stat(2) data
+
+     32-bit gid
+	this is stat(2) data
+
+     32-bit file size
+	This is the on-disk size from stat(2)
+
+     160-bit SHA-1 for the represented blob
+
+     A 16-bit field split into (high to low bits)
+
+	1-bit assume-valid flag
+
+	1-bit update-needed flag
+
+	2-bit stage (during merge)
+
+	12-bit name length
+
+     Name (variable length) - encoding is undefined
+
+     1-8 nul bytes as necessary to pad the entry to a multiple ot eight bytes
+     while keeping the name NUL-terminated.
+
+  - Extensions
+
+    The only know index extension today is a tree cache. It contains
+    pre-computes hashes for all trees that can be derived from the index
+
+    4 byte extension signature. If the first byte is 'A'..'Z' the
+    extension is optional and can be ignored.
+
+    32-bit size of the extension
+
+    Extension data
+
+  - 160-bit SHA-1 over the content of the index file before this checksum.
+
+
+== Tree cache
+
+  - Extension tag { 'T', 'R', 'E', 'E' }
+
+  - 32-bit size
+
+  - A number of entries
+
+     NUL-terminated tree name
+
+     Blank-terminated ASCII decimal number of entries in this tree
+
+     Newline-terminated position of this tree in the parent tree. 0 for
+     the root tree
+
+     160-bit SHA-1 for this tree and its children
-- 
1.5.4.rc4.25.g81cc
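
For readers following along, here is a minimal Python sketch (not part of the
patch) of reading the header and one entry. It is an illustration under stated
assumptions: the names parse_header, parse_entry and Entry are made up here,
and it only handles version 2 entries without extended flags. Note that it
reads two more 32-bit stat fields, ino and mode, between dev and uid; the list
in the patch omits them, but Git's on-disk entry has them (62 fixed bytes
before the name).

```python
import struct
from typing import NamedTuple

# All numbers are big-endian ("network byte order").
HEADER = struct.Struct(">4sII")          # signature, version, entry count
# ctime s/ns, mtime s/ns, dev, ino, mode, uid, gid, size, SHA-1, flags
ENTRY_FIXED = struct.Struct(">10I20sH")  # 62 bytes, no alignment padding


class Entry(NamedTuple):
    ctime: tuple
    mtime: tuple
    dev: int
    ino: int
    mode: int
    uid: int
    gid: int
    size: int
    sha1: bytes
    flags: int
    name: bytes
    next_offset: int  # where the following entry starts


def parse_header(data: bytes):
    """Return (version, entry_count) from the twelve byte header."""
    signature, version, count = HEADER.unpack_from(data, 0)
    if signature != b"DIRC":
        raise ValueError("not an index file")
    return version, count


def parse_entry(data: bytes, offset: int) -> Entry:
    """Parse one version 2 entry (no extended flags) starting at offset."""
    (cs, cns, ms, mns, dev, ino, mode, uid, gid, size,
     sha1, flags) = ENTRY_FIXED.unpack_from(data, offset)
    name_start = offset + ENTRY_FIXED.size
    # Scan to the terminating NUL instead of trusting the 12-bit length.
    name_end = data.index(b"\0", name_start)
    name = data[name_start:name_end]
    # 1-8 NUL bytes pad the whole entry to a multiple of eight bytes.
    padded = (ENTRY_FIXED.size + len(name) + 8) & ~7
    return Entry((cs, cns), (ms, mns), dev, ino, mode, uid, gid, size,
                 sha1, flags, name, offset + padded)
```

Calling parse_entry(data, 12) reads the first entry, and each result's
next_offset gives the start of the following one. The padding expression
always leaves at least one NUL after the name, which matches the "1-8 nul
bytes" wording in the patch.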


* Re: [PATCH] Technical details about the index file format.
From: Tim Stoakes @ 2008-02-11 12:00 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: gitster, git

Robin Rosenberg(robin.rosenberg@dewire.com)@110208-07:28:
> diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt

A couple of typos:

> +     Name (variable length) - encoding is undefined
> +
> +     1-8 nul bytes as necessary to pad the entry to a multiple ot eight bytes

"ot" should be "of".

>  +    The only know index extension today is a tree cache. It contains
>  +    pre-computes hashes for all trees that can be derived from the index

"pre-computes" should be "pre-computed", and the sentence needs a full stop.

Tim

-- 
Tim Stoakes


* Re: [PATCH] Technical details about the index file format.
From: Junio C Hamano @ 2008-02-11 19:27 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

Robin Rosenberg <robin.rosenberg@dewire.com> writes:

> Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
> ---
>  Documentation/technical/index-format.txt |   91 ++++++++++++++++++++++++++++++
>  1 files changed, 91 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/technical/index-format.txt
>
> I believe the main index details are accurate. Anything else to explain.

You missed the most important feature.  The entries are sorted
in a particular order.

Also I do not think we store CE_UPDATE.  The tip of 'master'
will soon clarify about this.

Also, when a name is longer than 12 bits can express, we will store
a constant there, and the actual name can be longer (you need
to strlen() it).  The tip of 'master' will soon have this fix
(we used to just overrun).

Other than these, I think the description is fairly accurate.
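
The two points above can be sketched in Python. This is an illustration only,
not code from Git: name_length, sort_key and is_sorted are hypothetical
helpers. It assumes the constant stored for over-long names is 0xFFF (all
twelve bits set), and that the sort order is byte-wise by name, then by stage;
neither detail is spelled out in this mail.

```python
NAME_MASK = 0x0FFF    # low 12 bits: name length
STAGE_MASK = 0x3000   # 2-bit merge stage
STAGE_SHIFT = 12


def name_length(flags: int, name_region: bytes) -> int:
    """Length of the name whose bytes start at name_region[0]."""
    stored = flags & NAME_MASK
    if stored < NAME_MASK:
        return stored
    # All bits set: the real length did not fit, so scan for the NUL,
    # which is what strlen() would do.
    return name_region.index(b"\0")


def sort_key(name: bytes, flags: int):
    """Key for the assumed order: name bytes first, then stage."""
    return (name, (flags & STAGE_MASK) >> STAGE_SHIFT)


def is_sorted(entries) -> bool:
    """entries is a sequence of (name, flags) pairs in file order."""
    keys = [sort_key(name, flags) for name, flags in entries]
    return all(a < b for a, b in zip(keys, keys[1:]))
```

Python compares bytes objects as unsigned bytes, with a shorter name sorting
before any longer name it is a prefix of, so no custom comparison is needed
for this sketch.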


* Re: [PATCH] Technical details about the index file format.
From: Sverre Rabbelier @ 2010-08-31  0:59 UTC (permalink / raw)
  To: git

Heya,

Robin Rosenberg <robin.rosenberg@dewire.com> writes:
>
> Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
> ---
>  Documentation/technical/index-format.txt |   91 ++++++++++++++++++++++++++++++
>  1 files changed, 91 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/technical/index-format.txt

This pretty much got an LGTM from Junio back when it was sent [0]. Robin, can
you be persuaded to resend this? I think it'd be very good if we had some
documentation on the index format.

[0] http://thread.gmane.org/gmane.comp.version-control.git/73471

--
Cheers,

Sverre Rabbelier


* Re: [PATCH] Technical details about the index file format.
From: Ramkumar Ramachandra @ 2010-08-31  7:08 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: git

Hi Sverre,

Sverre Rabbelier writes:
> Robin Rosenberg <robin.rosenberg@dewire.com> writes:
> >
> > Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
> > ---
> >  Documentation/technical/index-format.txt |   91 ++++++++++++++++++++++++++++++
> >  1 files changed, 91 insertions(+), 0 deletions(-)
> >  create mode 100644 Documentation/technical/index-format.txt
> 
> This pretty much got a LGTM from Junio back when it was sent [0], Robin, can you 
> be persuaded to resend this? I think it'd be very good if we had some 
> documentation on the index format.

Don't we already have this in Documentation/technical/pack-format.txt?

-- Ram


* Re: [PATCH] Technical details about the index file format.
From: Jonathan Nieder @ 2010-08-31 14:23 UTC (permalink / raw)
  To: Ramkumar Ramachandra; +Cc: Sverre Rabbelier, git

Ramkumar Ramachandra wrote:

> Don't we already have this in Documentation/technical/pack-format.txt?

No; this is about the .git/index file rather than .git/objects/pack/*.idx.


* Re: [PATCH] Technical details about the index file format.
From: Nguyen Thai Ngoc Duy @ 2010-08-31 22:12 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: gitster, git

On Mon, Feb 11, 2008 at 5:28 PM, Robin Rosenberg
<robin.rosenberg@dewire.com> wrote:
> +     4 byte version number:
> +       The current version is 2

The version could be 3 if extended flags are used.

> +     A 16-bit field split into (high to low bits)
> +
> +       1-bit assume-valid flag
> +
> +       1-bit update-needed flag

I think this bit is CE_EXTENDED, an indication that this entry has
extended flags.

> +
> +       2-bit stage (during merge)
> +
> +       12-bit name length

     A 16-bit field of additional flags (high to low bits), only
     applicable to version 3

       1-bit reserved for future use

       1-bit skip-worktree flag

       1-bit intent-to-add flag (aka "git add -N")
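
As a rough illustration of the two flag words, here is a Python sketch. It is
not code from Git: decode_flags is a hypothetical helper, and the masks simply
follow the high-to-low bit order listed above, with the second bit of the
first word treated as the "extended" bit.

```python
def decode_flags(flags: int, extended: int = 0) -> dict:
    """Split the 16-bit flags word (and the optional extended word).

    The extended word is only meaningful when the "extended" bit is set
    in the first word; otherwise it is ignored here.
    """
    has_extended = bool(flags & 0x4000)
    return {
        "assume_valid": bool(flags & 0x8000),
        "extended": has_extended,
        "stage": (flags >> 12) & 0x3,
        "name_length": flags & 0x0FFF,
        # 0x8000 of the extended word is reserved.
        "skip_worktree": has_extended and bool(extended & 0x4000),
        "intent_to_add": has_extended and bool(extended & 0x2000),
    }
```

A reader would call decode_flags(flags) for a plain entry, and pass the next
16 bits as the second argument when the "extended" bit is set.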

> +  - Extensions
> +
> +    The only know index extension today is a tree cache.

There's also the "REUC" extension in read-cache.c. I personally have
never touched it, so no comments.
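
To see which extensions a given index file carries, a small walker along these
lines could help. This is a sketch under assumptions, not Git's code:
list_extensions and checksum_ok are hypothetical names, and the caller is
assumed to already know the offset where the entries end. It relies only on
the layout from the patch: a 4 byte signature, a 32-bit size, the data, and a
trailing 160-bit SHA-1 over everything before it.

```python
import hashlib
import struct


def checksum_ok(data: bytes) -> bool:
    """True if the trailing 20 bytes are the SHA-1 of everything before."""
    return hashlib.sha1(data[:-20]).digest() == data[-20:]


def list_extensions(data: bytes, offset: int):
    """Walk the extensions that start at offset (just past the entries).

    Returns a list of (signature, optional, payload) tuples.
    """
    end = len(data) - 20          # the final SHA-1 is not an extension
    found = []
    while offset + 8 <= end:
        signature, size = struct.unpack_from(">4sI", data, offset)
        start = offset + 8
        if start + size > end:
            raise ValueError("extension runs past the checksum")
        # A first byte in 'A'..'Z' marks an extension that may be ignored.
        optional = b"A" <= signature[:1] <= b"Z"
        found.append((signature, optional, data[start:start + size]))
        offset = start + size
    return found
```

Because every extension carries its own size, a reader can skip over ones it
does not understand, such as "REUC", without knowing their contents.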
-- 
Duy


* Re: [PATCH] Technical details about the index file format.
From: Sverre Rabbelier @ 2010-09-01  2:28 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Robin Rosenberg, gitster, git

Heya,

On Tue, Aug 31, 2010 at 17:12, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> The version could be 3 if extended flags are used.

I suspect you are the person to have most recently messed around with
the index, and given your corrections above, could you perhaps pick up
the patch?

-- 
Cheers,

Sverre Rabbelier


* Re: [PATCH] Technical details about the index file format.
From: Nguyen Thai Ngoc Duy @ 2010-09-01  2:52 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Robin Rosenberg, gitster, git

On Wed, Sep 1, 2010 at 12:28 PM, Sverre Rabbelier <srabbelier@gmail.com> wrote:
> Heya,
>
> On Tue, Aug 31, 2010 at 17:12, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
>> The version could be 3 if extended flags are used.
>
> I suspect you are the person to have most recently messed around with
> the index, and given your corrections above, could you perhaps pick up
> the patch?

Whoa, I did not notice the patch was from 2008. Yes, I'll fix it up and resend.
-- 
Duy
