From: Jakub Narebski <jnareb@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, git@jeffhostetler.com, peff@peff.net,
jonathantanmy@google.com, szeder.dev@gmail.com,
sbeller@google.com, gitster@pobox.com,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v4 01/13] commit-graph: add format document
Date: Mon, 02 Apr 2018 16:09:11 +0200
Message-ID: <866059em48.fsf@gmail.com>
In-Reply-To: <433a2523-d04a-08bb-d128-6c8e578916fa@gmail.com> (Derrick Stolee's message of "Mon, 2 Apr 2018 09:09:29 -0400")
Derrick Stolee <stolee@gmail.com> writes:
> On 3/30/2018 9:25 AM, Jakub Narebski wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>> +== graph-*.graph files have the following format:
>> What is this '*' here?
>
> No longer necessary. It used to be a placeholder for a hash value, but
> now the graph is stored in objects/info/commit-graph.
All right.
Excuse me for replying to v4 instead of v6 of the patch series, where
this would already have been answered, or rather made moot.
>>
>> [...]
>>> + The remaining data in the body is described one chunk at a time, and
>>> + these chunks may be given in any order. Chunks are required unless
>>> + otherwise specified.
>> Does Git need to understand all chunks, or could there be optional
>> chunks that can be safely ignored (like in PNG format)? Though this may
>> be overkill, and could be left for later revision of the format if
>> deemed necessary.
>
> In v6, the format and design documents are edited to make clear the
> use of optional chunks, specifically for future extension without
> increasing the version number.
That's good.
>>> +CHUNK DATA:
>>> +
>>> + OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
>>> + The ith entry, F[i], stores the number of OIDs with first
>>> + byte at most i. Thus F[255] stores the total
>>> + number of commits (N).
>> All right, it is small enough that can be required even for a very small
>> number of commits.
>>
>>> +
>>> + OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
>>> + The OIDs for all commits in the graph, sorted in ascending order.
>>> +
>>> + Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
>> Do commits need to be put here in the ascending order of OIDs?
>
> Yes.
>
>> If so, this would mean that it is not possible to add information about
>> new commits by only appending data and maybe overwriting some fields, I
>> think. You would need to do full rewrite to insert new commit in
>> appropriate place.
>
> That is the idea. This file is not updated with every new commit, but
> instead will be updated on some scheduled cleanup events. The
> commit-graph file is designed in a way to be non-critical, and not
> tied to the packfile layout. This allows flexibility for when to do
> the write.
>
> For example, in GVFS, we will write a new commit-graph when there are
> new daily prefetch packs.
>
> This could also integrate with 'gc' and 'repack' so whenever they are
> triggered the commit-graph is written as well.
I wonder if it would be possible to use existing hooks...
> Commits that do not exist in the commit-graph file will load from the
> object database as normal (after a failed lookup in the commit-graph
> file).
Ah. I wrongly thought that it would (or at least could) be something
that is kept up to date and extended whenever a new commit is added.
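
To check my understanding of the lookup path, here is a minimal Python
sketch (not Git's actual C code). The names build_fanout and lookup are
mine, and the OIDs are modeled as bytes objects in a sorted in-memory
list rather than read from the file. A result of None stands for the
"failed lookup" case, after which the commit would be loaded from the
object database as normal.

```python
from bisect import bisect_left


def build_fanout(sorted_oids):
    """Return F where F[i] is the number of OIDs whose first byte is <= i."""
    counts = [0] * 256
    for oid in sorted_oids:
        counts[oid[0]] += 1
    fanout = []
    total = 0
    for count in counts:
        total += count
        fanout.append(total)
    return fanout


def lookup(oid, fanout, sorted_oids):
    """Return the position of oid in the sorted OID list, or None if absent.

    The fanout table narrows the binary search to the OIDs that share
    the first byte of the one we are looking for.
    """
    first = oid[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    pos = bisect_left(sorted_oids, oid, lo, hi)
    if pos < hi and sorted_oids[pos] == oid:
        return pos
    return None  # hypothetical signal: fall back to the object database
```

Since a miss is cheap and harmless here, the file can lag behind the
object database without affecting correctness, which matches the
"non-critical" design described above.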
>>> + * The first H bytes are for the OID of the root tree.
>>> + * The next 8 bytes are for the int-ids of the first two parents
>>> + of the ith commit. Stores value 0xffffffff if no parent in that
>>> + position. If there are more than two parents, the second value
>>> + has its most-significant bit on and the other bits store an array
>>> + position into the Large Edge List chunk.
>>> + * The next 8 bytes store the generation number of the commit and
>>> + the commit time in seconds since EPOCH. The generation number
>>> + uses the higher 30 bits of the first 4 bytes, while the commit
>>> + time uses the 32 bits of the second 4 bytes, along with the lowest
>>> + 2 bits of the lowest byte, storing the 33rd and 34th bit of the
>>> + commit time.
>>> +
>>> + Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
>>> + This list of 4-byte values store the second through nth parents for
>>> + all octopus merges. The second parent value in the commit data stores
>>> + an array position within this list along with the most-significant bit
>>> + on. Starting at that array position, iterate through this list of int-ids
>>> + for the parents until reaching a value with the most-significant bit on.
>>> + The other bits correspond to the int-id of the last parent.
>>
>> All right, that is one chunk that cannot use fixed-length records; this
>> shouldn't matter much, as we iterate only up to the number of parents
>> less two.
>
> Less one: the second "parent" column of the commit data chunk is used
> to point into this list, so (P-1) parents are in this chunk for a
> commit with P parents.
Right.
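
For reference, the Commit Data record quoted above could be decoded
along these lines. This is only a sketch under stated assumptions: it
assumes network (big-endian) byte order and H = 20, it follows the v4
text quoted here (0xffffffff meaning "no parent"), and the name
decode_commit_record and the returned field names are made up for
illustration.

```python
import struct

PARENT_NONE = 0xFFFFFFFF   # "no parent in that position", per the quoted v4 text
MSB = 0x80000000           # most-significant bit of a 4-byte value


def decode_commit_record(record, hash_len=20):
    """Decode one (H + 16)-byte Commit Data record into a dict."""
    tree_oid = record[:hash_len]
    p1, p2, gen_word, time_word = struct.unpack(
        ">IIII", record[hash_len:hash_len + 16]
    )

    first_parent = None if p1 == PARENT_NONE else p1

    second_parent = None
    edge_list_pos = None
    if p2 != PARENT_NONE:
        if p2 & MSB:
            # More than two parents: the low 31 bits are an array
            # position in the Large Edge List chunk.
            edge_list_pos = p2 & 0x7FFFFFFF
        else:
            second_parent = p2

    # The upper 30 bits of the first word hold the generation number;
    # its lowest 2 bits are bits 33 and 34 of the commit time.
    generation = gen_word >> 2
    commit_time = ((gen_word & 0x3) << 32) | time_word

    return {
        "tree_oid": tree_oid,
        "first_parent": first_parent,
        "second_parent": second_parent,
        "edge_list_pos": edge_list_pos,
        "generation": generation,
        "commit_time": commit_time,
    }
```

Note that 0xffffffff also has its most-significant bit set, so the
"no parent" check has to come before the octopus-merge check.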
>> A question: what happens to the last list of parents? Is there a
>> guardian value of 0xffffffff at last place?
>
> The termination condition is in the position of the last parent, since
> the most-significant bit is on. The other 31 bits contain the int-id
> of the parent.
Ah. I had misunderstood the format: I thought that the first entry is
marked with the most-significant bit set to 1 and all the rest with 0,
while in fact it is the last entry (the last parent) that has the
most-significant bit set, and all the others (if any) do not. So there
is no need for a guardian value.
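
Spelled out as code, my corrected reading of the Large Edge List is
roughly the following. It is a sketch, not the actual implementation:
octopus_parents is a hypothetical helper, and the chunk is modeled as a
list of already-decoded 4-byte integers.

```python
MSB = 0x80000000  # most-significant bit of a 4-byte value


def octopus_parents(edge_list, start):
    """Return the int-ids of the second through nth parents.

    Reads entries from position `start` up to and including the first
    entry that has its most-significant bit set. That entry is the last
    parent, so no separate guardian value is needed.
    """
    parents = []
    pos = start
    while True:
        value = edge_list[pos]
        parents.append(value & 0x7FFFFFFF)  # low 31 bits: parent int-id
        if value & MSB:
            break
        pos += 1
    return parents


# Two octopus merges share one chunk: the first starts at position 0,
# the second at position 3.
edge_list = [5, 7, MSB | 9, 2, MSB | 3]
print(octopus_parents(edge_list, 0))
print(octopus_parents(edge_list, 3))
```

For a commit with P parents this reads P-1 entries, since the first
parent lives in the Commit Data record itself.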
Best regards,
--
Jakub Narębski