* [PATCH] doc: technical details about the index file format
@ 2010-09-01 9:53 Nguyễn Thái Ngọc Duy
2010-09-01 10:36 ` Ramkumar Ramachandra
2010-09-01 18:54 ` Robin Rosenberg
0 siblings, 2 replies; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-09-01 9:53 UTC (permalink / raw)
To: git, Junio C Hamano, robin.rosenberg, srabbelier
Cc: Nguyễn Thái Ngọc Duy
This is based on the original work by Robin Rosenberg:
http://thread.gmane.org/gmane.comp.version-control.git/73471
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
I split index entry out so the overall format is clearer.
Other changes:
- mention of version 3
- added ino and mode
- added extended flags (v3)
- entry sort order
Again, I don't really know the REUC extension, so it is only a placeholder
Documentation/technical/index-format.txt | 139 ++++++++++++++++++++++++++++++
1 files changed, 139 insertions(+), 0 deletions(-)
create mode 100644 Documentation/technical/index-format.txt
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
new file mode 100644
index 0000000..3e113ca
--- /dev/null
+++ b/Documentation/technical/index-format.txt
@@ -0,0 +1,139 @@
+GIT index format
+================
+
+= The git index file has the following format
+
+ All binary numbers are in network byte order. Version 2 is described
+ here unless stated otherwise.
+
+ - A 12-byte header consisting of
+
+ 4-byte signature:
+ The signature is { 'D', 'I', 'R', 'C' }
+
+ 4-byte version number:
+ The current supported versions are 2 and 3.
+
+ 32-bit number of index entries.
+
+ - A number of sorted index entries
+
+ - Extensions
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT currently supports tree cache and resolve undo extensions.
+
+ 4-byte extension signature. If the first byte is 'A'..'Z' the
+ extension is optional and can be ignored.
+
+ 32-bit size of the extension
+
+ Extension data
+
+ - 160-bit SHA-1 over the content of the index file before this
+ checksum.
+
+== Index entry
+
+ Index entries are sorted with memcmp() by entry name. Entries with
+ the same name are sorted by their stage.
+
+ 32-bit ctime seconds, the last time a file's metadata changed
+ this is stat(2) data
+
+ 32-bit ctime nanoseconds (modulo 1G)
+ this is stat(2) data
+
+ 32-bit mtime seconds, the last time a file's data changed
+ this is stat(2) data
+
+ 32-bit mtime nanoseconds (modulo 1G)
+ this is stat(2) data
+
+ 32-bit dev
+ this is stat(2) data
+
+ 32-bit ino
+ this is stat(2) data
+
+ 32-bit mode, split into (high to low bits)
+
+ 4-bit object type
+ valid values in binary are 1000 (blob), 1010 (symbolic link)
+ and 1110 (gitlink)
+
+ 3-bit unused
+
+ 9-bit unix permission (only 0755 and 0644 are valid)
+
+ 32-bit uid
+ this is stat(2) data
+
+ 32-bit gid
+ this is stat(2) data
+
+ 32-bit file size
+ This is the on-disk size from stat(2)
+
+ 160-bit SHA-1 for the represented object
+
+ A 16-bit field split into (high to low bits)
+
+ 1-bit assume-valid flag
+
+ 1-bit extended flag (must be zero in version 2)
+
+ 2-bit stage (during merge)
+
+ 12-bit name length if the length is less than 0x0FFF
+
+ (Version 3) A 16-bit field, only applicable if the "extended flag"
+ above is 1, split into (high to low bits).
+
+ 1-bit reserved for future
+
+ 1-bit skip-worktree flag (used by sparse checkout)
+
+ 1-bit intent-to-add flag (used by "git add -N")
+
+ 13-bit unused, must be zero
+
+ Entry path name (variable length) relative to top-level directory
+ (without leading slash). '/' is used as path separator. Special
+ paths ".", ".." and ".git" (without quotes) are disallowed.
+ Trailing slash is also disallowed.
+
+ 1-8 nul bytes as necessary to pad the entry to a multiple ot eight bytes
+ while keeping the name NUL-terminated.
+
+== Extensions
+
+=== Tree cache
+
+ Tree cache extension contains pre-computed hashes for all trees that
+ can be derived from the index
+
+ - Extension tag { 'T', 'R', 'E', 'E' }
+
+ - 32-bit size
+
+ - A number of entries
+
+ NUL-terminated tree name
+
+ Blank-terminated ASCII decimal number of entries in this tree
+
+ Newline-terminated position of this tree in the parent tree. 0 for
+ the root tree
+
+ 160-bit SHA-1 for this tree and its children
+
+=== Resolve undo
+
+ TODO
+
+ - Extension tag { 'R', 'E', 'U', 'C' }
+
+ - 32-bit size
--
1.7.1.rc1.69.g24c2f7
^ permalink raw reply related [flat|nested] 22+ messages in thread
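As an illustration of the 12-byte header described in the patch above, here is a minimal Python sketch. The function name `parse_index_header` and the sample bytes are invented for the example; only the field layout (4-byte signature, 32-bit version, 32-bit entry count, all in network byte order) comes from the text.

```python
import struct


def parse_index_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of an index file.

    Hypothetical helper: the name and the error handling are illustrative only.
    """
    if len(data) < 12:
        raise ValueError("index header needs 12 bytes")
    # ">4sII" = big-endian (network order): 4-byte signature, two 32-bit unsigned ints.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("bad signature: %r" % signature)
    if version not in (2, 3):
        raise ValueError("unsupported version: %d" % version)
    return version, entry_count


# Hand-built sample header: version 2, five entries.
sample = b"DIRC" + struct.pack(">II", 2, 5)
print(parse_index_header(sample))
```

A real reader would go on to parse the entries, the extensions, and the trailing SHA-1; this sketch stops at the header.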
* Re: [PATCH] doc: technical details about the index file format
2010-09-01 9:53 [PATCH] doc: technical details about the index file format Nguyễn Thái Ngọc Duy
@ 2010-09-01 10:36 ` Ramkumar Ramachandra
2010-09-01 15:20 ` Sverre Rabbelier
2010-09-01 18:54 ` Robin Rosenberg
1 sibling, 1 reply; 22+ messages in thread
From: Ramkumar Ramachandra @ 2010-09-01 10:36 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy
Cc: git, Junio C Hamano, robin.rosenberg, srabbelier
Hi,
Nguyễn Thái Ngọc Duy writes:
> This bases on the original work by Robin Rosenberg:
>
> http://thread.gmane.org/gmane.comp.version-control.git/73471
[...]
It might be more profitable to mention the Message-ID instead.
<1202711335-12026-1-git-send-email-robin.rosenberg@dewire.com>
-- Ram
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] doc: technical details about the index file format
2010-09-01 18:54 ` Robin Rosenberg
@ 2010-09-01 14:39 ` Nguyễn Thái Ngọc Duy
2010-09-02 8:56 ` Alex Riesen
2010-09-01 23:28 ` Nguyen Thai Ngoc Duy
2010-09-02 5:59 ` Robin Rosenberg
2 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-09-01 14:39 UTC (permalink / raw)
To: git, Junio C Hamano, robin.rosenberg, srabbelier
Cc: Nguyễn Thái Ngọc Duy
This is based on the original work by Robin Rosenberg.
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
Fixups after Robin's review
Documentation/technical/index-format.txt | 144 ++++++++++++++++++++++++++++++
1 files changed, 144 insertions(+), 0 deletions(-)
create mode 100644 Documentation/technical/index-format.txt
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
new file mode 100644
index 0000000..0285d88
--- /dev/null
+++ b/Documentation/technical/index-format.txt
@@ -0,0 +1,144 @@
+GIT index format
+================
+
+= The git index file has the following format
+
+ All binary numbers are in network byte order. Version 2 is described
+ here unless stated otherwise.
+
+ - A 12-byte header consisting of
+
+ 4-byte signature:
+ The signature is { 'D', 'I', 'R', 'C' }
+
+ 4-byte version number:
+ The current supported versions are 2 and 3.
+
+ 32-bit number of index entries.
+
+ - A number of sorted index entries
+
+ - Extensions
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT currently supports tree cache and resolve undo extensions.
+
+ 4-byte extension signature. If the first byte is 'A'..'Z' the
+ extension is optional and can be ignored.
+
+ 32-bit size of the extension
+
+ Extension data
+
+ - 160-bit SHA-1 over the content of the index file before this
+ checksum.
+
+== Index entry
+
+ Index entries are sorted in ascending order on the name field,
+ interpreted as a string of unsigned bytes. Entries with the same
+ name are sorted by their stage field.
+
+ 32-bit ctime seconds, the last time a file's metadata changed
+ this is stat(2) data
+
+ 32-bit ctime nanoseconds (modulo 1G)
+ this is stat(2) data
+
+ 32-bit mtime seconds, the last time a file's data changed
+ this is stat(2) data
+
+ 32-bit mtime nanoseconds (modulo 1G)
+ this is stat(2) data
+
+ 32-bit dev
+ this is stat(2) data
+
+ 32-bit ino
+ this is stat(2) data
+
+ 32-bit mode, split into (high to low bits)
+
+ 4-bit object type
+ valid values in binary are 1000 (blob), 1010 (symbolic link)
+ and 1110 (gitlink)
+
+ 3-bit unused
+
+ 9-bit unix permission (only 0755 and 0644 are valid)
+
+ 32-bit uid
+ this is stat(2) data
+
+ 32-bit gid
+ this is stat(2) data
+
+ 32-bit file size
+ This is the on-disk size from stat(2)
+
+ 160-bit SHA-1 for the represented object
+
+ A 16-bit field split into (high to low bits)
+
+ 1-bit assume-valid flag
+
+ 1-bit extended flag (must be zero in version 2)
+
+ 2-bit stage (during merge)
+
+ 12-bit name length if the length is less than 0x0FFF
+
+ (Version 3) A 16-bit field, only applicable if the "extended flag"
+ above is 1, split into (high to low bits).
+
+ 1-bit reserved for future
+
+ 1-bit skip-worktree flag (used by sparse checkout)
+
+ 1-bit intent-to-add flag (used by "git add -N")
+
+ 13-bit unused, must be zero
+
+ Entry path name (variable length) relative to top level directory
+ (without leading slash). '/' is used as path separator. The special
+ paths ".", ".." and ".git" (without quotes) are disallowed.
+ Trailing slash is also disallowed.
+
+ The exact encoding is undefined, but the '.' and '/' characters
+ are encoded in 7-bit ASCII and the encoding cannot contain a nul
+ byte. Generally a superset of ASCII.
+
+ 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
+ while keeping the name NUL-terminated.
+
+== Extensions
+
+=== Tree cache
+
+ Tree cache extension contains pre-computed hashes for all trees that
+ can be derived from the index
+
+ - Extension tag { 'T', 'R', 'E', 'E' }
+
+ - 32-bit size
+
+ - A number of entries
+
+ NUL-terminated tree name
+
+ Blank-terminated ASCII decimal number of entries in this tree
+
+ Newline-terminated position of this tree in the parent tree. 0 for
+ the root tree
+
+ 160-bit SHA-1 for this tree and its children
+
+=== Resolve undo
+
+ TODO
+
+ - Extension tag { 'R', 'E', 'U', 'C' }
+
+ - 32-bit size
--
1.7.1.rc1.69.g24c2f7
^ permalink raw reply related [flat|nested] 22+ messages in thread
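The revised sort rule in this version of the patch (names compared as strings of unsigned bytes, ties broken by stage) can be sketched in Python as follows. The `(name, stage)` tuple representation and the sample entries are assumptions made for the example, not part of the on-disk format.

```python
def index_sort_key(entry):
    """Sort key for a hypothetical (name, stage) pair."""
    name, stage = entry
    # Python compares bytes objects element by element as unsigned
    # integers 0..255, which matches "a string of unsigned bytes".
    return (name, stage)


entries = [
    (b"lib/a.c", 0),
    (b"README", 2),
    (b"lib.c", 0),
    (b"README", 1),
    (b"\xc3\xa9.txt", 0),  # non-ASCII name: bytes >= 0x80 sort after ASCII
]

ordered = sorted(entries, key=index_sort_key)
for name, stage in ordered:
    print(name, stage)
```

Note that "lib.c" sorts before "lib/a.c" because '.' (0x2e) is smaller than '/' (0x2f); the comparison looks only at bytes and knows nothing about directories.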
* Re: [PATCH] doc: technical details about the index file format
2010-09-01 10:36 ` Ramkumar Ramachandra
@ 2010-09-01 15:20 ` Sverre Rabbelier
0 siblings, 0 replies; 22+ messages in thread
From: Sverre Rabbelier @ 2010-09-01 15:20 UTC (permalink / raw)
To: Ramkumar Ramachandra
Cc: Nguyễn Thái Ngọc, git, Junio C Hamano,
robin.rosenberg
Heya,
2010/9/1 Ramkumar Ramachandra <artagnon@gmail.com>:
> It might be more profitable to mention the Message-ID instead.
> <1202711335-12026-1-git-send-email-robin.rosenberg@dewire.com>
If you really want to do that, use this link:
http://mid.gmane.org/1202711335-12026-1-git-send-email-robin.rosenberg@dewire.com
--
Cheers,
Sverre Rabbelier
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] doc: technical details about the index file format
2010-09-01 9:53 [PATCH] doc: technical details about the index file format Nguyễn Thái Ngọc Duy
2010-09-01 10:36 ` Ramkumar Ramachandra
@ 2010-09-01 18:54 ` Robin Rosenberg
2010-09-01 14:39 ` Nguyễn Thái Ngọc Duy
` (2 more replies)
1 sibling, 3 replies; 22+ messages in thread
From: Robin Rosenberg @ 2010-09-01 18:54 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, srabbelier
On Wednesday, 1 September 2010 at 11:53:45, Nguyễn Thái Ngọc Duy wrote:
> This bases on the original work by Robin Rosenberg:
>
> http://thread.gmane.org/gmane.comp.version-control.git/73471
No need for this. My name is enough
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Add
Signed-off-by: Nguyễn Thái Ngọc Duy <robin.rosenberg@dewire.com>
> ---
> I split index entry out so the overall format is clearer.
>
> Other changes:
> - mention of version 3
> - added ino and mode
> - added extended flags (v3)
> - entry sort order
>
> Again I don't realy know REUC extension, so only placeholder
>
> Documentation/technical/index-format.txt | 139
> ++++++++++++++++++++++++++++++ 1 files changed, 139 insertions(+), 0
> deletions(-)
> create mode 100644 Documentation/technical/index-format.txt
>
> diff --git a/Documentation/technical/index-format.txt
> b/Documentation/technical/index-format.txt new file mode 100644
> index 0000000..3e113ca
> --- /dev/null
> +++ b/Documentation/technical/index-format.txt
> @@ -0,0 +1,139 @@
> +GIT index format
> +================
> +
> += The git index file has the following format
> +
> + All binary numbers are in network byte order. Version 2 is described
> + here unless stated otherwise.
> +
> + - A 12-byte header consisting of
> +
> + 4-byte signature:
> + The signature is { 'D', 'I', 'R', 'C' }
> +
> + 4-byte version number:
> + The current supported versions are 2 and 3.
> +
> + 32-bit number of index entries.
> +
> + - A number of sorted index entries
> +
> + - Extensions
> +
> + Extensions are identified by signature. Optional extensions can
> + be ignored if GIT does not understand them.
> +
> + GIT currently supports tree cache and resolve undo extensions.
> +
> + 4-byte extension signature. If the first byte is 'A'..'Z' the
> + extension is optional and can be ignored.
> +
> + 32-bit size of the extension
> +
> + Extension data
> +
> + - 160-bit SHA-1 over the content of the index file before this
> + checksum.
> +
> +== Index entry
> +
> + Index entries are sorted with memcmp() by entry name. Entries with
> + the same name are sorted by their stage.
Index entries are sorted in ascending order on the name field, interpreted as
a string of unsigned bytes.
> +
> + 32-bit ctime seconds, the last time a file's metadata changed
> + this is stat(2) data
> +
> + 32-bit ctime nanoseconds (modulo 1G)
> + this is stat(2) data
> +
> + 32-bit mtime seconds, the last time a file's data changed
> + this is stat(2) data
> +
> + 32-bit mtime nanoseconds (modulo 1G)
> + this is stat(2) data
> +
> + 32-bit dev
> + this is stat(2) data
> +
> + 32-bit ino
> + this is stat(2) data
> +
> + 32-bit mode, split into (high to low bits)
> +
> + 4-bit object type
> + valid values in binary are 1000 (blob), 1010 (symbolic link)
> + and 1110 (gitlink)
> +
> + 3-bit unused
> +
> + 9-bit unix permission (only 0755 and 0644 are valid)
> +
> + 32-bit uid
> + this is stat(2) data
> +
> + 32-bit gid
> + this is stat(2) data
> +
> + 32-bit file size
> + This is the on-disk size from stat(2)
> +
> + 160-bit SHA-1 for the represented object
> +
> + A 16-bit field split into (high to low bits)
> +
> + 1-bit assume-valid flag
> +
> + 1-bit extended flag (must be zero in version 2)
> +
> + 2-bit stage (during merge)
> +
> + 12-bit name length if the length is less than 0x0FFF
> +
> + (Version 3) A 16-bit field, only applicable if the "extended flag"
> + above is 1, split into (high to low bits).
> +
> + 1-bit reserved for future
> +
> + 1-bit skip-worktree flag (used by sparse checkout)
> +
> + 1-bit intent-to-add flag (used by "git add -N")
> +
> + 13-bit unused, must be zero
> +
> + Entry path name (variable length) relative to top-level directory
...to the top level...
> + (without leading slash). '/' is used as path separator. Special
The special...
> + paths ".", ".." and ".git" (without quotes) are disallowed.
> + Trailing slash is also disallowed.
Why would anyone even consider adding a trailing slash to a _file_ name?
The exact encoding is undefined, but the '.', and '/' characters
are encoded in 7-bit ASCII and the encoding cannot contain a nul byte.
Generally a superset of ASCII
> +
> + 1-8 nul bytes as necessary to pad the entry to a multiple ot eight bytes
...of eight bytes
A typo of mine.
> + while keeping the name NUL-terminated.
> +
> +== Extensions
> +
> +=== Tree cache
> +
> + Tree cache extension contains pre-computes hashes for all trees that
> + can be derived from the index
> +
> + - Extension tag { 'T', 'R', 'E', 'E' }
> +
> + - 32-bit size
> +
> + - A number of entries
> +
> + NUL-terminated tree name
> +
> + Blank-terminated ASCII decimal number of entries in this tree
> +
> + Newline-terminated position of this tree in the parent tree. 0 for
> + the root tree
> +
> + 160-bit SHA-1 for this tree and it's children
> +
> +=== Resolve undo
> +
> + TODO
> +
> + - Extension tag { 'R', 'E', 'U', 'C' }
> +
> + - 32-bit size
^ permalink raw reply [flat|nested] 22+ messages in thread
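The path-name rules discussed in this review (no leading or trailing slash, no ".", ".." or ".git", no NUL byte) could be checked roughly as below. This is a sketch under assumptions: the function name `is_valid_entry_path` is invented, the special names are assumed to be forbidden as individual path components, and empty components (a doubled slash) are rejected as well, which the text does not state.

```python
FORBIDDEN_COMPONENTS = {b".", b"..", b".git"}


def is_valid_entry_path(path: bytes) -> bool:
    """Hypothetical validity check for an index entry path."""
    # The encoding is undefined, but a NUL byte can never be part of the name.
    if not path or b"\0" in path:
        return False
    # Paths are relative to the top level directory: no leading slash,
    # and a trailing slash is disallowed.
    if path.startswith(b"/") or path.endswith(b"/"):
        return False
    for component in path.split(b"/"):
        # Assumption: the special names are checked per component.
        if component == b"" or component in FORBIDDEN_COMPONENTS:
            return False
    return True


for candidate in (b"src/main.c", b".gitignore", b".git/config", b"dir/"):
    print(candidate, is_valid_entry_path(candidate))
```

Working on bytes rather than text keeps the check independent of whatever superset of ASCII the names are encoded in.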
* Re: [PATCH] doc: technical details about the index file format
2010-09-01 18:54 ` Robin Rosenberg
2010-09-01 14:39 ` Nguyễn Thái Ngọc Duy
@ 2010-09-01 23:28 ` Nguyen Thai Ngoc Duy
2010-09-02 5:59 ` Robin Rosenberg
2 siblings, 0 replies; 22+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-09-01 23:28 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git, Junio C Hamano, srabbelier
2010/9/2 Robin Rosenberg <robin.rosenberg@dewire.com>:
>> + Entry path name (variable length) relative to top-level directory
> ...to the top level...
>> + (without leading slash). '/' is used as path separator. Special
> The special...
>> + paths ".", ".." and ".git" (without quotes) are disallowed.
>> + Trailing slash is also disallowed.
> Why would anyone even consider adding a trailing slash to a _file_ name?
Well, I was tempted to put directories in index more than once. And
subprojects are actually directories although they are treated as
files in index.
--
Duy
^ permalink raw reply [flat|nested] 22+ messages in thread
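Since subprojects show up in the index as ordinary entries with the gitlink object type, the mode field is what tells them apart. A minimal Python sketch of the mode split described in the patch follows; `decode_mode` is a made-up name, and the sketch assumes the 4-bit type, 3 unused bits and 9 permission bits occupy the low 16 bits of the 32-bit field.

```python
OBJECT_TYPES = {
    0b1000: "blob",
    0b1010: "symbolic link",
    0b1110: "gitlink",
}


def decode_mode(mode: int):
    """Return (object type name, permission bits) for a 32-bit mode value.

    Assumption: only the low 16 bits carry information.
    """
    object_type = (mode >> 12) & 0xF   # 4-bit object type
    permission = mode & 0o777          # 9-bit unix permission
    if object_type not in OBJECT_TYPES:
        raise ValueError("unknown object type: %s" % bin(object_type))
    return OBJECT_TYPES[object_type], permission


print(decode_mode(0o100644))  # regular file
print(decode_mode(0o120000))  # symbolic link
print(decode_mode(0o160000))  # gitlink (subproject)
```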
* Re: [PATCH] doc: technical details about the index file format
2010-09-01 18:54 ` Robin Rosenberg
2010-09-01 14:39 ` Nguyễn Thái Ngọc Duy
2010-09-01 23:28 ` Nguyen Thai Ngoc Duy
@ 2010-09-02 5:59 ` Robin Rosenberg
2 siblings, 0 replies; 22+ messages in thread
From: Robin Rosenberg @ 2010-09-02 5:59 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, srabbelier
On Wednesday, 1 September 2010 at 20:54:20, Robin Rosenberg wrote:
> On Wednesday, 1 September 2010 at 11:53:45, Nguyễn Thái Ngọc Duy wrote:
> > This bases on the original work by Robin Rosenberg:
> >
> > http://thread.gmane.org/gmane.comp.version-control.git/73471
>
> No need for this. My name is enough
>
> > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
Add this rather than the one I sent in the previous mail...
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
-- robin
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] doc: technical details about the index file format
2010-09-01 14:39 ` Nguyễn Thái Ngọc Duy
@ 2010-09-02 8:56 ` Alex Riesen
2010-09-02 9:08 ` Joshua Juran
2010-09-02 14:50 ` Junio C Hamano
0 siblings, 2 replies; 22+ messages in thread
From: Alex Riesen @ 2010-09-02 8:56 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy
Cc: git, Junio C Hamano, robin.rosenberg, srabbelier
2010/9/1 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
> +== Index entry
> +
> + Index entries are sorted in ascending order on the name field,
> + interpreted as a string of unsigned bytes. Entries with the same
> + name are sorted by their stage field.
> +
> + 32-bit ctime seconds, the last time a file's metadata changed
> + this is stat(2) data
> +
> + 32-bit ctime nanoseconds (modulo 1G)
> + this is stat(2) data
Maybe I'm missing something, but I failed to find where "modulo 1G" comes from.
AFAICS (read-cache.c), the stat data are saved almost unmodified
(cast to unsigned int).
(BTW, is 1G the Gravitational Constant or what?)
I'm not sure it is safe to assume that every system Git will be ported to
defines "unsigned int" to be 32 bits. OTOH, I have never met one where it is
something else. Still, using uint32_t (the POSIX types) in ondisk_cache_entry
would be clearer (unlikely alignment issues aside).
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] doc: technical details about the index file format
2010-09-02 8:56 ` Alex Riesen
@ 2010-09-02 9:08 ` Joshua Juran
2010-09-02 14:50 ` Junio C Hamano
1 sibling, 0 replies; 22+ messages in thread
From: Joshua Juran @ 2010-09-02 9:08 UTC (permalink / raw)
To: Alex Riesen
Cc: Nguyễn Thái Ngọc Duy, git, Junio C Hamano,
robin.rosenberg, srabbelier
On Sep 2, 2010, at 1:56 AM, Alex Riesen wrote:
> 2010/9/1 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>> +== Index entry
>> +
>> + Index entries are sorted in ascending order on the name field,
>> + interpreted as a string of unsigned bytes. Entries with the same
>> + name are sorted by their stage field.
>> +
>> + 32-bit ctime seconds, the last time a file's metadata changed
>> + this is stat(2) data
>> +
>> + 32-bit ctime nanoseconds (modulo 1G)
>> + this is stat(2) data
>
> Maybe I'm missing something, but I failed to find where "modulo 1G"
> comes from.
> AFAICS (read-cache.c), the stat data are saved almost unmodified
> (casted to unsigned int).
> (BTW, is 1G the Gravitational Constant or what?)
G stands for "giga-" meaning one billion, so 1G refers to one billion
nanoseconds.
> I'm not sure it is safe to assume that every system Git will be
> ported to defines
> "unsigned int" to be 32 bits. OTOH, never met one where it is
> something else.
DOS and early Mac compilers have used 16-bit ints, but I don't think
anyone cares.
Josh
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] doc: technical details about the index file format
2010-09-02 8:56 ` Alex Riesen
2010-09-02 9:08 ` Joshua Juran
@ 2010-09-02 14:50 ` Junio C Hamano
2010-09-02 15:11 ` Erik Faye-Lund
1 sibling, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2010-09-02 14:50 UTC (permalink / raw)
To: Alex Riesen
Cc: Nguyễn Thái Ngọc Duy, git, robin.rosenberg,
srabbelier
Alex Riesen <raa.lkml@gmail.com> writes:
>> + 32-bit ctime seconds, the last time a file's metadata changed
>> + this is stat(2) data
>> +
>> + 32-bit ctime nanoseconds (modulo 1G)
>> + this is stat(2) data
>
> Maybe I'm missing something, but I failed to find where "modulo 1G" comes from.
I think the above wants to say "seconds and sub-seconds are stored in
separate fields, and the latter is purely sub-seconds, never reaching nor
exceeding a whole second" (giga (== 10^+9) times nano (== 10^-9) is 1).
I personally do not think it is a good idea to say " (modulo 1G)" there;
it is more confusing than without.
Either the reader knows, from seeing "this is stat(2) data", what
seconds/nanoseconds mean, in which case the comment gives redundant
information in cryptic terms, or the reader doesn't, in which case the
concept of storing the timestamp as a (second, subsecond) tuple needs to
be explained a lot better than the above to be understood.
^ permalink raw reply [flat|nested] 22+ messages in thread
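The (seconds, sub-seconds) tuple discussed above can be illustrated with a short Python sketch. The helper name `split_timestamp` is invented for the example, and truncating the seconds to 32 bits is an assumption based on the remark that the stat data are cast to unsigned int.

```python
NANOSECONDS_PER_SECOND = 10**9


def split_timestamp(total_ns: int):
    """Split a nanosecond timestamp into (seconds, nanosecond fraction).

    The fraction is always in the range 0..999999999, i.e. it never
    reaches a whole second.
    """
    seconds, fraction = divmod(total_ns, NANOSECONDS_PER_SECOND)
    # Assumption: the seconds field keeps only the low 32 bits.
    return seconds & 0xFFFFFFFF, fraction


print(split_timestamp(1283334825123456789))
```

In practice the input could come from `os.stat(path).st_mtime_ns` or `st_ctime_ns`, which give the timestamp as a single integer count of nanoseconds.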
* Re: [PATCH] doc: technical details about the index file format
2010-09-02 14:50 ` Junio C Hamano
@ 2010-09-02 15:11 ` Erik Faye-Lund
2010-09-06 10:37 ` Nguyễn Thái Ngọc Duy
0 siblings, 1 reply; 22+ messages in thread
From: Erik Faye-Lund @ 2010-09-02 15:11 UTC (permalink / raw)
To: Junio C Hamano
Cc: Alex Riesen, Nguyễn Thái Ngọc, git,
robin.rosenberg, srabbelier
On Thu, Sep 2, 2010 at 4:50 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Alex Riesen <raa.lkml@gmail.com> writes:
>
>>> + 32-bit ctime seconds, the last time a file's metadata changed
>>> + this is stat(2) data
>>> +
>>> + 32-bit ctime nanoseconds (modulo 1G)
>>> + this is stat(2) data
>>
>> Maybe I'm missing something, but I failed to find where "modulo 1G" comes from.
>
> I think the above wants to say "seconds and sub-seconds are stored in
> separate fields, and latter is purely sub-seconds, never reaching nor
> exceeding a whole second" (gig == 10^-9) times nano (== 10^+9) is 1).
>
> I personally do not think it is a good idea to say " (modulo 1G)" there;
> it is more confusing than without.
>
Perhaps "nanosecond fractions" would be a simple and precise description?
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] doc: technical details about the index file format
2010-09-02 15:11 ` Erik Faye-Lund
@ 2010-09-06 10:37 ` Nguyễn Thái Ngọc Duy
2011-02-19 22:16 ` Sverre Rabbelier
0 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-09-06 10:37 UTC (permalink / raw)
To: git, Junio C Hamano, kusmabite, raa.lkml, jjuran
Cc: Nguyễn Thái Ngọc Duy, Robin Rosenberg
This is based on the original work by Robin Rosenberg.
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
"nanoseconds (modulo 1G)" is changed to "nanosecond fractions"
Documentation/technical/index-format.txt | 144 ++++++++++++++++++++++++++++++
1 files changed, 144 insertions(+), 0 deletions(-)
create mode 100644 Documentation/technical/index-format.txt
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
new file mode 100644
index 0000000..5b1d70d
--- /dev/null
+++ b/Documentation/technical/index-format.txt
@@ -0,0 +1,144 @@
+GIT index format
+================
+
+= The git index file has the following format
+
+ All binary numbers are in network byte order. Version 2 is described
+ here unless stated otherwise.
+
+ - A 12-byte header consisting of
+
+ 4-byte signature:
+ The signature is { 'D', 'I', 'R', 'C' }
+
+ 4-byte version number:
+ The current supported versions are 2 and 3.
+
+ 32-bit number of index entries.
+
+ - A number of sorted index entries
+
+ - Extensions
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT currently supports tree cache and resolve undo extensions.
+
+ 4-byte extension signature. If the first byte is 'A'..'Z' the
+ extension is optional and can be ignored.
+
+ 32-bit size of the extension
+
+ Extension data
+
+ - 160-bit SHA-1 over the content of the index file before this
+ checksum.
+
+== Index entry
+
+ Index entries are sorted in ascending order on the name field,
+ interpreted as a string of unsigned bytes. Entries with the same
+ name are sorted by their stage field.
+
+ 32-bit ctime seconds, the last time a file's metadata changed
+ this is stat(2) data
+
+ 32-bit ctime nanosecond fractions
+ this is stat(2) data
+
+ 32-bit mtime seconds, the last time a file's data changed
+ this is stat(2) data
+
+ 32-bit mtime nanosecond fractions
+ this is stat(2) data
+
+ 32-bit dev
+ this is stat(2) data
+
+ 32-bit ino
+ this is stat(2) data
+
+ 32-bit mode, split into (high to low bits)
+
+ 4-bit object type
+ valid values in binary are 1000 (blob), 1010 (symbolic link)
+ and 1110 (gitlink)
+
+ 3-bit unused
+
+ 9-bit unix permission (only 0755 and 0644 are valid)
+
+ 32-bit uid
+ this is stat(2) data
+
+ 32-bit gid
+ this is stat(2) data
+
+ 32-bit file size
+ This is the on-disk size from stat(2)
+
+ 160-bit SHA-1 for the represented object
+
+ A 16-bit field split into (high to low bits)
+
+ 1-bit assume-valid flag
+
+ 1-bit extended flag (must be zero in version 2)
+
+ 2-bit stage (during merge)
+
+ 12-bit name length if the length is less than 0x0FFF; otherwise 0x0FFF
+
+ (Version 3) A 16-bit field, only applicable if the "extended flag"
+ above is 1, split into (high to low bits).
+
+ 1-bit reserved for future
+
+ 1-bit skip-worktree flag (used by sparse checkout)
+
+ 1-bit intent-to-add flag (used by "git add -N")
+
+ 13-bit unused, must be zero
+
+ Entry path name (variable length) relative to top level directory
+ (without leading slash). '/' is used as path separator. The special
+ paths ".", ".." and ".git" (without quotes) are disallowed.
+ Trailing slash is also disallowed.
+
+ The exact encoding is undefined, but the '.' and '/' characters
+ are encoded in 7-bit ASCII and the encoding cannot contain a nul
+ byte. Generally a superset of ASCII.
+
+ 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
+ while keeping the name NUL-terminated.
+
+== Extensions
+
+=== Tree cache
+
+ Tree cache extension contains pre-computed hashes for all trees that
+ can be derived from the index
+
+ - Extension tag { 'T', 'R', 'E', 'E' }
+
+ - 32-bit size
+
+ - A number of entries
+
+ NUL-terminated tree name
+
+ Blank-terminated ASCII decimal number of entries in this tree
+
+ Newline-terminated position of this tree in the parent tree. 0 for
+ the root tree
+
+ 160-bit SHA-1 for this tree and its children
+
+=== Resolve undo
+
+ TODO
+
+ - Extension tag { 'R', 'E', 'U', 'C' }
+
+ - 32-bit size
--
1.7.1.rc1.69.g24c2f7
^ permalink raw reply related [flat|nested] 22+ messages in thread
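To make the 16-bit flags field and the padding rule from this version of the patch concrete, here is a minimal Python sketch. The names `decode_flags`, `padding_length` and `FIXED_ENTRY_SIZE` are invented; the fixed size of 62 bytes is derived from the field list (ten 32-bit fields, a 160-bit SHA-1, one 16-bit flags field) and applies only to entries without the version 3 extended flags.

```python
# Ten 32-bit stat/mode fields + 20-byte SHA-1 + 16-bit flags.
FIXED_ENTRY_SIZE = 10 * 4 + 20 + 2


def decode_flags(flags: int):
    """Split the 16-bit flags field, high bits first."""
    return {
        "assume_valid": bool(flags & 0x8000),  # 1 bit
        "extended": bool(flags & 0x4000),      # 1 bit, must be zero in version 2
        "stage": (flags >> 12) & 0x3,          # 2 bits
        "name_length": flags & 0x0FFF,         # 12 bits
    }


def padding_length(unpadded_size: int) -> int:
    """Number of NUL bytes (1-8) that pad an entry to a multiple of eight.

    unpadded_size counts the fixed fields plus the name, without any NUL.
    A size that is already a multiple of eight still gets eight NULs,
    so the name stays NUL-terminated.
    """
    return 8 - (unpadded_size % 8)


print(decode_flags(0x2005))
print(padding_length(FIXED_ENTRY_SIZE + len(b"README")))
```

For a six-byte name the unpadded size is 68, so four NUL bytes bring the entry to 72 bytes.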
* Re: [PATCH] doc: technical details about the index file format
2010-09-06 10:37 ` Nguyễn Thái Ngọc Duy
@ 2011-02-19 22:16 ` Sverre Rabbelier
2011-02-20 9:30 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 22+ messages in thread
From: Sverre Rabbelier @ 2011-02-19 22:16 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy, Junio C Hamano
Cc: git, kusmabite, raa.lkml, jjuran, Robin Rosenberg
Heya,
2010/9/6 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
> This bases on the original work by Robin Rosenberg.
Junio, in the "what's cooking" you mention that you might jump in to
improve this? Duy, are you still interested in carrying this forward?
This patch [0] would be helpful to the hgit people as well :).
[0] http://git.kernel.org/?p=git/git.git;a=commit;h=673f3d9d4e019a15c6d3770e9f8d9b07059f16cc
--
Cheers,
Sverre Rabbelier
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] doc: technical details about the index file format
2011-02-19 22:16 ` Sverre Rabbelier
@ 2011-02-20 9:30 ` Nguyen Thai Ngoc Duy
2011-02-26 10:03 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 22+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-20 9:30 UTC (permalink / raw)
To: Sverre Rabbelier
Cc: Junio C Hamano, git, kusmabite, raa.lkml, jjuran, Robin Rosenberg
2011/2/20 Sverre Rabbelier <srabbelier@gmail.com>:
> Heya,
>
> 2010/9/6 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>> This bases on the original work by Robin Rosenberg.
>
> Junio, in the "what's cooking" you mention that you might jump in to
> improve this? Duy, are you still interested in carrying this forward?
> This patch [0] would be helpful to the hgit people as well :).
I can try to study resolve undo extension next week and see if I can
write it down in the document.
--
Duy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] doc: technical details about the index file format
2011-02-20 9:30 ` Nguyen Thai Ngoc Duy
@ 2011-02-26 10:03 ` Nguyen Thai Ngoc Duy
2011-02-26 10:23 ` Junio C Hamano
0 siblings, 1 reply; 22+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-26 10:03 UTC (permalink / raw)
To: Sverre Rabbelier
Cc: Junio C Hamano, git, kusmabite, raa.lkml, jjuran, Robin Rosenberg
On Sun, Feb 20, 2011 at 04:30:07PM +0700, Nguyen Thai Ngoc Duy wrote:
> 2011/2/20 Sverre Rabbelier <srabbelier@gmail.com>:
> > Heya,
> >
> > 2010/9/6 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
> >> This bases on the original work by Robin Rosenberg.
> >
> > Junio, in the "what's cooking" you mention that you might jump in to
> > improve this? Duy, are you still interested in carrying this forward?
> > This patch [0] would be helpful to the hgit people as well :).
>
> I can try to study resolve undo extension next week and see if I can
> write it down in the document.
OK here come the missing bits on top of the previous patch. Looks good?
--8<--
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index 5b1d70d..574eb3b 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -118,7 +118,7 @@ GIT index format
=== Tree cache
Tree cache extension contains pre-computed hashes for all trees that
- can be derived from the index
+ can be derived from the index.
- Extension tag { 'T', 'R', 'E', 'E' }
@@ -137,8 +137,20 @@ GIT index format
=== Resolve undo
- TODO
+ Resolve undo extension records staged entries before they are
+ resolved and removed from index. It can be used to recreate conflicts
+ after the conflict is incorrectly resolved.
- Extension tag { 'R', 'E', 'U', 'C' }
- 32-bit size
+
+ - A number of entries
+
+ NUL-terminated entry name
+
+ Entry mode of the entry in three stages, in increasing order from
+ 1 to 3, in NUL-terminated ASCII octal number.
+
+ 160 bit SHA-1 of the entry in three stages, in increasing
+ order from 1 to 3. A stage with zero mode will be skipped.
-->8--
--
Duy
^ permalink raw reply related [flat|nested] 22+ messages in thread
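The resolve undo layout proposed in the patch above can be read with a short loop. The following Python sketch is only an illustration of that description: `parse_resolve_undo` is a made-up name, the input is assumed to be the extension data that follows the 'REUC' tag and the 32-bit size, and the sample bytes are hand-built rather than taken from a real index.

```python
def parse_resolve_undo(data: bytes):
    """Parse REUC extension data into a list of (name, {stage: (mode, sha1)})."""
    entries = []
    pos = 0
    while pos < len(data):
        # NUL-terminated entry name.
        end = data.index(b"\0", pos)
        name = data[pos:end]
        pos = end + 1
        # Three NUL-terminated ASCII octal modes, for stages 1 to 3.
        modes = []
        for _ in range(3):
            end = data.index(b"\0", pos)
            modes.append(int(data[pos:end].decode("ascii"), 8))
            pos = end + 1
        # One 20-byte SHA-1 per stage; a stage with zero mode is skipped.
        stages = {}
        for stage, mode in enumerate(modes, start=1):
            if mode == 0:
                continue
            sha1 = data[pos:pos + 20]
            if len(sha1) != 20:
                raise ValueError("truncated SHA-1 for %r" % name)
            stages[stage] = (mode, sha1)
            pos += 20
        entries.append((name, stages))
    return entries


# Hand-built sample: one path with stages 1 and 3 present, stage 2 absent.
sample = (b"file.txt\0" + b"100644\0" + b"0\0" + b"100644\0"
          + b"\x11" * 20 + b"\x22" * 20)
print(parse_resolve_undo(sample))
```

Because the entries are simply concatenated until the extension data runs out, no entry count is stored anywhere; the loop ends when the position reaches the end of the data.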
* Re: [PATCH] doc: technical details about the index file format
2011-02-26 10:03 ` Nguyen Thai Ngoc Duy
@ 2011-02-26 10:23 ` Junio C Hamano
2011-02-26 13:36 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2011-02-26 10:23 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy
Cc: Sverre Rabbelier, git, kusmabite, raa.lkml, jjuran,
Robin Rosenberg
Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> OK here come the missing bits on top of the previous patch. Looks good?
Thanks.
> diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
> index 5b1d70d..574eb3b 100644
> --- a/Documentation/technical/index-format.txt
> +++ b/Documentation/technical/index-format.txt
> @@ -118,7 +118,7 @@ GIT index format
> === Tree cache
>
> Tree cache extension contains pre-computes hashes for all trees that
> - can be derived from the index
> + can be derived from the index.
>
> - Extension tag { 'T', 'R', 'E', 'E' }
>
> @@ -137,8 +137,20 @@ GIT index format
>
> === Resolve undo
>
> - TODO
> + Resolve undo extension records staged entries before they are
> + resolved and removed from index. It can be used to recreate conflicts
> + after the conflict is incorrectly resolved.
I lack energy to come up with a succinct description right now, so here is
an undistilled version of what I would want to see the reader of the above
paragraph understand:
A set of entries for a path at higher stages (i.e. the ones that
represent a merge conflict at the path) used to be removed from the
index and replaced with the result of the resolution when the conflict
is resolved (e.g. with "git add path"). This extension saves these
higher stage entries away so that "checkout -m" and other operations
can recreate the conflicted state, in case you botched a conflict
resolution and want to redo it from scratch.
The description of the data contents looked fine, except that "A number of
entries" felt a bit unclear (it would make the reader wonder if we record
how many we have at that location as an integer, which is not the case).
> - Extension tag { 'R', 'E', 'U', 'C' }
>
> - 32-bit size
> +
> + - A number of entries
> +
> + NUL-terminated entry name
> +
> + Entry mode of the entry in three stages, in increasing order from
> + 1 to 3, in NUL-terminated ASCII octal number.
> +
> + 160 bit SHA-1 of the entry in three stages, in increasing
> + order from 1 to 3. A stage with zero mode will be skipped.
* Re: [PATCH] doc: technical details about the index file format
2011-02-26 10:23 ` Junio C Hamano
@ 2011-02-26 13:36 ` Nguyen Thai Ngoc Duy
2011-03-02 1:51 ` Junio C Hamano
0 siblings, 1 reply; 22+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-26 13:36 UTC (permalink / raw)
To: Junio C Hamano
Cc: Sverre Rabbelier, git, kusmabite, raa.lkml, jjuran,
Robin Rosenberg
On Sat, Feb 26, 2011 at 02:23:38AM -0800, Junio C Hamano wrote:
> I lack energy to come up with a succinct description right now, so here is
> an undistilled version of what I would want to see the reader of the above
> paragraph understand:
>
> A set of entries for a path at higher stages (i.e. the ones that
> represent a merge conflict at the path) used to be removed from the
> index and replaced with the result of the resolution when the conflict
> is resolved (e.g. with "git add path"). This extension saves these
> higher stage entries away so that "checkout -m" and other operations
> can recreate the conflicted state, in case you botched a conflict
> resolution and want to redo it from scratch.
>
> The description of the data contents looked fine, except that "A number of
> entries" felt a bit unclear (it would make the reader wonder if we record
> how many we have at that location as an integer, which is not the case).
OK another try. I also add more details to tree cache. If somebody
uses this document to create a git-compatible tool, then such a tool
should behave the way git expects it.
On a related note: because the resolve undo extension stores SHA-1s, fsck
should check these objects for reachability as well. As far as I can see,
fsck currently does this for cache-tree only.
--8<--
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index 5b1d70d..2a3490c 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -117,8 +117,12 @@ GIT index format
=== Tree cache
- Tree cache extension contains pre-computes hashes for all trees that
- can be derived from the index
+ Tree cache extension contains pre-computed hashes for trees that can
+ be derived from the index. It helps speed up tree object generation
+ from index for a new commit.
+
+ When a path is updated in index, the path must be invalidated and
+ removed from tree cache.
- Extension tag { 'T', 'R', 'E', 'E' }
@@ -137,8 +141,25 @@ GIT index format
=== Resolve undo
- TODO
+ A conflict is represented in index as a set of higher stage entries.
+ When a conflict is resolved (e.g. with "git add path"), these higher
+ stage entries will be removed and a stage-0 entry with proper
+ resolution is added.
+
+ Resolve undo extension saves these higher stage entries so that
+ conflicts can be recreated (e.g. with "git checkout -m"), in case
+ users want to redo a conflict resolution from scratch.
- Extension tag { 'R', 'E', 'U', 'C' }
- 32-bit size
+
+ - A number of conflict entries
+
+ NUL-terminated conflict path
+
+ Three NUL-terminated ASCII octal numbers, entry mode of entries in
+ stage 1 to 3.
+
+ At most three 160-bit SHA-1s of the entry in three stages from 1
+ to 3. SHA-1 is not saved for any stage with entry mode zero.
--8<--
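The entry layout described in the patch above can be sketched as a small parser. This is only an illustration in Python, not git's own code: the function name `parse_resolve_undo` and the shape of the returned data are made up for this example, and the input is assumed to be the extension data with the 4-byte signature and the 32-bit size already stripped.

```python
def parse_resolve_undo(data: bytes):
    """Parse resolve undo extension data into (path, {stage: (mode, sha1_hex)}).

    Hypothetical helper: `data` is assumed to hold only the extension payload.
    """
    entries = []
    pos = 0
    while pos < len(data):
        # NUL-terminated conflict path.
        end = data.index(b"\0", pos)
        path = data[pos:end]
        pos = end + 1

        # Three NUL-terminated ASCII octal modes, for stages 1 to 3.
        modes = []
        for _ in range(3):
            end = data.index(b"\0", pos)
            modes.append(int(data[pos:end].decode("ascii"), 8))
            pos = end + 1

        # One 20-byte SHA-1 per stage whose mode is non-zero, in stage order.
        stages = {}
        for stage, mode in enumerate(modes, start=1):
            if mode == 0:
                continue  # nothing is stored for a missing stage
            sha1 = data[pos:pos + 20]
            if len(sha1) != 20:
                raise ValueError("truncated object name")
            pos += 20
            stages[stage] = (mode, sha1.hex())

        entries.append((path, stages))
    return entries
```

Because the object names are fixed-width binary, the parser has to advance by exactly 20 bytes per present stage rather than searching for a terminator; a SHA-1 may itself contain NUL bytes.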
--
Duy
* Re: [PATCH] doc: technical details about the index file format
2011-02-26 13:36 ` Nguyen Thai Ngoc Duy
@ 2011-03-02 1:51 ` Junio C Hamano
2011-03-02 3:34 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2011-03-02 1:51 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy
Cc: Sverre Rabbelier, git, kusmabite, raa.lkml, jjuran,
Robin Rosenberg
Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> OK another try. I also add more details to tree cache. If somebody
> uses this document to create a git-compatible tool, then such a tool
> should behave the way git expects it.
Thanks.
Here is what I scribbled on top of yours (not quite polished).
* Clarify "string of unsigned bytes";
* Blob has two variants (regular file vs symlink), not (blob vs symlink);
* Clarify permission mode bits;
* Clarify ce_namelen() "too long to fit in the length field" case;
* Clarify "." etc are forbidden as path components;
* Match the description with the internal wording "cache-tree";
* All types of extension begin with signature and length as explained in
the first part. Don't repeat the "length" part in the description of
each extension (can be mistaken as if there is a separate 32-bit size
field inside the extension), but state what the signature for each
extension is.
* Don't say "Extension tag", as we have said "Extension signature" in the
first part---be consistent;
* Clarify the invalidation of cache-tree entries;
* Correct description on subtree_nr field in the cache-tree;
* Clarify the order of entries in cache-tree;
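The clarifications about the permission mode bits and the name length field are easier to check with concrete bit arithmetic. The sketch below is in Python and is illustrative only: the function names are made up, and it assumes that the object type, unused bits and permission bits occupy the low 16 bits of the 32-bit mode field, which the text in the patch below does not state explicitly.

```python
def decode_mode(mode: int):
    """Split a mode into (object_type, permission).

    Assumption: the 4-bit type, 3 unused bits and 9-bit permission
    sit in the low 16 bits of the 32-bit field.
    """
    object_type = (mode >> 12) & 0xF   # 0b1000 file, 0b1010 symlink, 0b1110 gitlink
    permission = mode & 0x1FF          # 0o755 or 0o644 for regular files, else 0
    return object_type, permission


def decode_flags(flags: int):
    """Split the 16-bit 'flags' field, high to low bits."""
    return {
        "assume_valid": bool(flags & 0x8000),
        "extended": bool(flags & 0x4000),
        "stage": (flags >> 12) & 0x3,
        "name_length": flags & 0xFFF,
    }


def stored_name_length(name: bytes) -> int:
    """Value written to the 12-bit name length field: capped at 0xFFF."""
    return min(len(name), 0xFFF)
```

A reader that sees 0xFFF in the name length field cannot trust it as the real length and would have to find the terminating NUL of the path name instead.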
Documentation/technical/index-format.txt | 94 ++++++++++++++++++------------
1 files changed, 57 insertions(+), 37 deletions(-)
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index 89e410a..8923f6f 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -9,21 +9,21 @@ GIT index format
- A 12-byte header consisting of
4-byte signature:
- The signature is { 'D', 'I', 'R', 'C' }
+ The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
4-byte version number:
The current supported versions are 2 and 3.
32-bit number of index entries.
- - A number of sorted index entries
+ - A number of sorted index entries (see below).
- Extensions
Extensions are identified by signature. Optional extensions can
be ignored if GIT does not understand them.
- GIT currently supports tree cache and resolve undo extensions.
+ GIT currently supports cached tree and resolve undo extensions.
4-byte extension signature. If the first byte is 'A'..'Z' the
extension is optional and can be ignored.
@@ -38,8 +38,9 @@ GIT index format
== Index entry
Index entries are sorted in ascending order on the name field,
- interpreted as a string of unsigned bytes. Entries with the same
- name are sorted by their stage field.
+ interpreted as a string of unsigned bytes (i.e. memcmp() order, no
+ localization, no special casing of directory separator '/'). Entries
+ with the same name are sorted by their stage field.
32-bit ctime seconds, the last time a file's metadata changed
this is stat(2) data
@@ -62,12 +63,13 @@ GIT index format
32-bit mode, split into (high to low bits)
4-bit object type
- valid values in binary are 1000 (blob), 1010 (symbolic link)
+ valid values in binary are 1000 (regular file), 1010 (symbolic link)
and 1110 (gitlink)
3-bit unused
- 9-bit unix permission (only 0755 and 0644 are valid)
+ 9-bit unix permission. Only 0755 and 0644 are valid for regular files.
+ Symbolic links and gitlinks have value 0 in this field.
32-bit uid
this is stat(2) data
@@ -76,11 +78,11 @@ GIT index format
this is stat(2) data
32-bit file size
- This is the on-disk size from stat(2)
+ This is the on-disk size from stat(2), truncated to 32-bit.
160-bit SHA-1 for the represented object
- A 16-bit field split into (high to low bits)
+ A 16-bit 'flags' field split into (high to low bits)
1-bit assume-valid flag
@@ -88,7 +90,8 @@ GIT index format
2-bit stage (during merge)
- 12-bit name length if the length is less than 0x0FFF
+ 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
+ is stored in this field.
(Version 3) A 16-bit field, only applicable if the "extended flag"
above is 1, split into (high to low bits).
@@ -103,63 +106,80 @@ GIT index format
Entry path name (variable length) relative to top level directory
(without leading slash). '/' is used as path separator. The special
- paths ".", ".." and ".git" (without quotes) are disallowed.
+ path components ".", ".." and ".git" (without quotes) are disallowed.
Trailing slash is also disallowed.
The exact encoding is undefined, but the '.' and '/' characters
- are encoded in 7-bit ASCII and the encoding cannot contain a nul
- byte. Generally a superset of ASCII.
+ are encoded in 7-bit ASCII and the encoding cannot contain a NUL
+ byte (iow, this is a UNIX pathname).
1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes
while keeping the name NUL-terminated.
== Extensions
-=== Tree cache
+=== Cached tree
- Tree cache extension contains pre-computed hashes for trees that can
+ Cached tree extension contains pre-computed hashes for trees that can
be derived from the index. It helps speed up tree object generation
from index for a new commit.
When a path is updated in index, the path must be invalidated and
removed from tree cache.
- - Extension tag { 'T', 'R', 'E', 'E' }
+ The signature for this extension is { 'T', 'R', 'E', 'E' }.
- - 32-bit size
+ A series of entries fill the entire extension; each of which
+ consists of:
- - A number of entries
+ - NUL-terminated path component (relative to its parent directory);
- NUL-terminated tree name
+ - ASCII decimal number of entries in the index that is covered by the
+ tree this entry represents (entry_count);
- Blank-terminated ASCII decimal number of entries in this tree
+ - A space (ASCII 32);
- Newline-terminated position of this tree in the parent tree. 0 for
- the root tree
+ - ASCII decimal number that represents the number of subtrees this
+ tree has;
- 160-bit SHA-1 for this tree and it's children
+ - A newline (ASCII 10); and
+
+ - 160-bit object name for the object that would result from writing
+ this span of index as a tree.
+
+ An entry can be in an invalidated state and is represented by having -1
+ in the entry_count field.
+
+ The entries are written out in the top-down, depth-first order. The
+ first entry represents the root level of the repository, followed by the
+ first subtree---let's call this A---of the root level (with its name
+ relative to the root level), followed by the first subtree of A (with
+ its name relative to A), ...
=== Resolve undo
- A conflict is represented in index as a set of higher stage entries.
+ A conflict is represented in the index as a set of higher stage entries.
When a conflict is resolved (e.g. with "git add path"), these higher
- stage entries will be removed and a stage-0 entry with proper
- resolution is added.
+ stage entries will be removed and a stage-0 entry with proper resolution
+ is added.
- Resolve undo extension saves these higher stage entries so that
- conflicts can be recreated (e.g. with "git checkout -m"), in case
- users want to redo a conflict resolution from scratch.
+ When these higher stage entries are removed, they are saved in the
+ resolve undo extension, so that conflicts can be recreated (e.g. with
+ "git checkout -m"), in case users want to redo a conflict resolution
+ from scratch.
- - Extension tag { 'R', 'E', 'U', 'C' }
+ The signature for this extension is { 'R', 'E', 'U', 'C' }.
- - 32-bit size
+ A series of entries fill the entire extension; each of which
+ consists of:
- - A number of conflict entries
+ - NUL-terminated pathname the entry describes (relative to the root of
+ the repository, i.e. full pathname);
- NUL-terminated conflict path
+ - Three NUL-terminated ASCII octal numbers, entry mode of entries in
+ stage 1 to 3 (a missing stage is represented by "0" in this field);
+ and
- Three NUL-terminated ASCII octal numbers, entry mode of entries in
- stage 1 to 3.
+ - At most three 160-bit object names of the entry in stages from 1 to 3
+ (nothing is written for a missing stage).
- At most three 160-bit SHA-1s of the entry in three stages from 1
- to 3. SHA-1 is not saved for any stage with entry mode zero.
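The cached tree entry layout proposed in this patch can be sketched as a parser. This is a Python illustration with made-up names (`parse_cache_tree` and its tuple layout), operating on the extension data with signature and size already stripped. One point is an assumption that goes beyond the text above: to my knowledge git writes no object name for an invalidated entry (entry_count of -1), so the sketch skips the 20 bytes in that case.

```python
def parse_cache_tree(data: bytes):
    """Parse cached tree data into (name, entry_count, subtree_count, sha1_hex)."""
    entries = []
    pos = 0
    while pos < len(data):
        # NUL-terminated path component, relative to the parent directory.
        end = data.index(b"\0", pos)
        name = data[pos:end]
        pos = end + 1

        # ASCII decimal entry_count, terminated by a space.
        end = data.index(b" ", pos)
        entry_count = int(data[pos:end].decode("ascii"))
        pos = end + 1

        # ASCII decimal number of subtrees, terminated by a newline.
        end = data.index(b"\n", pos)
        subtree_count = int(data[pos:end].decode("ascii"))
        pos = end + 1

        # Assumption: an invalidated entry (-1) carries no object name.
        sha1_hex = None
        if entry_count >= 0:
            sha1 = data[pos:pos + 20]
            if len(sha1) != 20:
                raise ValueError("truncated object name")
            pos += 20
            sha1_hex = sha1.hex()

        entries.append((name, entry_count, subtree_count, sha1_hex))
    return entries
```

The entries come back in file order, which per the description above is top-down and depth-first; rebuilding the hierarchy would mean consuming `subtree_count` children recursively after each entry.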
* Re: [PATCH] doc: technical details about the index file format
2011-03-02 1:51 ` Junio C Hamano
@ 2011-03-02 3:34 ` Nguyen Thai Ngoc Duy
2011-03-02 6:02 ` Junio C Hamano
0 siblings, 1 reply; 22+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-03-02 3:34 UTC (permalink / raw)
To: Junio C Hamano
Cc: Sverre Rabbelier, git, kusmabite, raa.lkml, jjuran,
Robin Rosenberg
On Wed, Mar 2, 2011 at 8:51 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> OK another try. I also add more details to tree cache. If somebody
>> uses this document to create a git-compatible tool, then such a tool
>> should behave the way git expects it.
>
> Thanks.
>
> Here is what I scribbled on top of yours (not quite polished).
>
> ...
Looks good. I don't really like ending a sentence with semicolon, but
that's just my taste.
I wonder if we should also point to relevant source files, so if this
document becomes out of date, the readers can jump in the source and
verify themselves (perhaps coming up with patches to this doc)?
--
Duy
* Re: [PATCH] doc: technical details about the index file format
2011-03-02 3:34 ` Nguyen Thai Ngoc Duy
@ 2011-03-02 6:02 ` Junio C Hamano
2011-03-02 11:43 ` Nguyen Thai Ngoc Duy
2011-03-02 12:53 ` Drew Northup
0 siblings, 2 replies; 22+ messages in thread
From: Junio C Hamano @ 2011-03-02 6:02 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy
Cc: Sverre Rabbelier, git, kusmabite, raa.lkml, jjuran,
Robin Rosenberg
Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> Looks good. I don't really like ending a sentence with semicolon, but
> that's just my taste.
I tend to write enumerated lists like "A; B; and C." Perhaps that is just
personal taste.
> I wonder if we should also point to relevant source files, so if this
> document becomes out of date, the readers can jump in the source and
> verify themselves (perhaps coming up with patches to this doc)?
I suspect that is a sure way to guarantee the document to go stale.
I didn't like the way I explained the cache-tree entry order. Was it
understandable?
I am wondering if an illustration with an example might be in order. I
think anybody halfway intelligent may be able to get a fuzzy idea of what
is going on by looking at the output from test-dump-cache-tree after
"reset --hard && write-tree" and then by comparing it with the output from
test-dump-cache-tree after running ">t/something && git add t/something"
(which invalidates the top-level tree and t/ subtree). But a well written
documentation should be able to help clarifying the idea obtainable that
way. I don't think what I wrote in the previous message is sufficient
even for that (i.e. comparing the two output would give you better
explanation of what is going on than what I wrote--iow, what I wrote may
not be very useful for people who are motivated to learn).
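The invalidation mentioned above (adding t/something invalidates the top-level tree and the t/ subtree) can be modelled in a few lines. This Python sketch is a toy model and not how git stores the cache-tree: the dictionary keyed by directory name, the function names and the entry counts are all invented for illustration.

```python
def trees_covering(path: str) -> list:
    """Return the directories whose cached trees cover `path`, root first.

    The root is written as "" and each subdirectory with a trailing slash.
    """
    parts = path.split("/")[:-1]  # drop the file name itself
    return [""] + ["/".join(parts[: i + 1]) + "/" for i in range(len(parts))]


def invalidate(entry_counts: dict, path: str) -> None:
    """Mark every cached tree that covers `path` as invalid (entry_count = -1)."""
    for tree in trees_covering(path):
        if tree in entry_counts:
            entry_counts[tree] = -1


# Hypothetical entry counts, as if read right after "reset --hard && write-tree".
cache = {"": 12, "Documentation/": 5, "t/": 4}
invalidate(cache, "t/something")
print(cache)  # {'': -1, 'Documentation/': 5, 't/': -1}
```

Sibling directories such as Documentation/ keep their cached hashes, which is what makes the extension useful: only the trees on the path from the root to the changed file need to be recomputed.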
* Re: [PATCH] doc: technical details about the index file format
2011-03-02 6:02 ` Junio C Hamano
@ 2011-03-02 11:43 ` Nguyen Thai Ngoc Duy
2011-03-02 12:53 ` Drew Northup
1 sibling, 0 replies; 22+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-03-02 11:43 UTC (permalink / raw)
To: Junio C Hamano
Cc: Sverre Rabbelier, git, kusmabite, raa.lkml, jjuran,
Robin Rosenberg
On Wed, Mar 2, 2011 at 1:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> I wonder if we should also point to relevant source files, so if this
>> document becomes out of date, the readers can jump in the source and
>> verify themselves (perhaps coming up with patches to this doc)?
>
> I suspect that is a sure way to guarantee the document to go stale.
No, it does not. The point is to make it easier for readers to help
themselves when they suspect the document is not entirely correct.
> I didn't like the way I explained the cache-tree entry order. Was it
> understandable?
It is, although I'm wondering if it's just memcmp() order with the
parent component cut out.
> I am wondering if an illustration with an example might be in order. I
> think anybody halfway intelligent may be able to get a fuzzy idea of what
> is going on by looking at the output from test-dump-cache-tree after
> "reset --hard && write-tree" and then by comparing it with the output from
> test-dump-cache-tree after running ">t/something && git add t/something"
> (which invalidates the top-level tree and t/ subtree).
A short example would be great, though test-dump-cache-tree output might
not be the best basis for it. The last time I read its output, I wasn't
sure I understood it, maybe because I ran it on git.git and did not
compare two outputs.
> But a well written
> documentation should be able to help clarifying the idea obtainable that
> way. I don't think what I wrote in the previous message is sufficient
> even for that (i.e. comparing the two output would give you better
> explanation of what is going on than what I wrote--iow, what I wrote may
> not be very useful for people who are motivated to learn).
--
Duy
* Re: [PATCH] doc: technical details about the index file format
2011-03-02 6:02 ` Junio C Hamano
2011-03-02 11:43 ` Nguyen Thai Ngoc Duy
@ 2011-03-02 12:53 ` Drew Northup
1 sibling, 0 replies; 22+ messages in thread
From: Drew Northup @ 2011-03-02 12:53 UTC (permalink / raw)
To: Junio C Hamano
Cc: Nguyen Thai Ngoc Duy, Sverre Rabbelier, git, kusmabite, raa.lkml,
jjuran, Robin Rosenberg
On Tue, 2011-03-01 at 22:02 -0800, Junio C Hamano wrote:
> I didn't like the way I explained the cache-tree entry order. Was it
> understandable?
>
> I am wondering if an illustration with an example might be in order. I
> think anybody halfway intelligent may be able to get a fuzzy idea of what
> is going on by looking at the output from test-dump-cache-tree after
> "reset --hard && write-tree" and then by comparing it with the output from
> test-dump-cache-tree after running ">t/something && git add t/something"
> (which invalidates the top-level tree and t/ subtree). But a well written
> documentation should be able to help clarifying the idea obtainable that
> way. I don't think what I wrote in the previous message is sufficient
> even for that (i.e. comparing the two output would give you better
> explanation of what is going on than what I wrote--iow, what I wrote may
> not be very useful for people who are motivated to learn).
Perhaps I'll be able to put some time into reading the work you guys are
doing.... I can definitely put the "newbie goggles" on if I do.
--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59
Thread overview: 22+ messages
2010-09-01 9:53 [PATCH] doc: technical details about the index file format Nguyễn Thái Ngọc Duy
2010-09-01 10:36 ` Ramkumar Ramachandra
2010-09-01 15:20 ` Sverre Rabbelier
2010-09-01 18:54 ` Robin Rosenberg
2010-09-01 14:39 ` Nguyễn Thái Ngọc Duy
2010-09-02 8:56 ` Alex Riesen
2010-09-02 9:08 ` Joshua Juran
2010-09-02 14:50 ` Junio C Hamano
2010-09-02 15:11 ` Erik Faye-Lund
2010-09-06 10:37 ` Nguyễn Thái Ngọc Duy
2011-02-19 22:16 ` Sverre Rabbelier
2011-02-20 9:30 ` Nguyen Thai Ngoc Duy
2011-02-26 10:03 ` Nguyen Thai Ngoc Duy
2011-02-26 10:23 ` Junio C Hamano
2011-02-26 13:36 ` Nguyen Thai Ngoc Duy
2011-03-02 1:51 ` Junio C Hamano
2011-03-02 3:34 ` Nguyen Thai Ngoc Duy
2011-03-02 6:02 ` Junio C Hamano
2011-03-02 11:43 ` Nguyen Thai Ngoc Duy
2011-03-02 12:53 ` Drew Northup
2010-09-01 23:28 ` Nguyen Thai Ngoc Duy
2010-09-02 5:59 ` Robin Rosenberg