git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Grant Welch <gwelch925@gmail.com>, git@vger.kernel.org
Subject: Re: Inconsistencies in commit-graph technical docs.
Date: Thu, 28 Jun 2018 08:49:01 -0400	[thread overview]
Message-ID: <c48629f1-7b41-d516-84f5-c07ef7881e03@gmail.com> (raw)
In-Reply-To: <CAMYKhtUi3nCmTjZ8pXbNCxaKa6h9+1OSD7YitTeip+807w7dJA@mail.gmail.com>

On 6/28/2018 1:11 AM, Grant Welch wrote:
> I recently read the "Supercharging the Git Commit Graph blog by
> Derrick Stolee. I found the article interesting and wanted to verify
> the performance numbers for myself. Then that led me to want to know
> more about the implementation, so I read the technical design notes in
> commit-graph.txt, and then I jumped into the format documentation in
> commit-graph-format.txt.
>
> Along the way, I noticed a few issues. They might just be errors in
> the documentation, but I figured it was worth documenting my entire
> process just to be sure.
>
> "Supercharging the Git Commit Graph", by Derrick Stolee:
>    https://blogs.msdn.microsoft.com/devops/2018/06/25/supercharging-the-git-commit-graph/
>
> # TL;DR
>
> I found a few discrepencies between the documentation in
> commit-graph-format.txt and the results that I observed for myself.
>
> 1. The "Commit Data" chunk signature is documented to be 'CGET', but
> it should be 'CDAT'.
>
> commit-graph.c:18
>    #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */

Thanks for catching this! Thankfully, this is an easy fix, as we only 
need to update the documentation.

> 2. The "no parent" value is documented to be 0xffffffff, but is
> actually 0x70000000.
>
> commit-graph.c:34
>    #define GRAPH_PARENT_NONE 0x70000000

This is a more important mistake, as it affects the data that was 
written in the commit-graph file.

Part of the problem is that leading hex digit of 0x7 which in binary is 
0b0111. We already designed a limit of at most 2^{31}-1 (~2.1 billion) 
commits in the commit-graph because of the way we track octopus edges, 
but this mistake has cost us more: we cannot store more than ~1.8 
billion commits.

I'm sorry for this mixup, mostly because it is aesthetically unpleasant. 
Those extra 300 million commits mean less to me than having a clean file 
format.

> 3. The "generation" field isn't set on any of the commits. (I don't
> know what to make of this.)

This is a difference between 2.18 and current 'master', which merged 
ds/generation-numbers. Commit-graphs written with Git 2.18 have all 
generation numbers listed as GENERATION_NUMBER_ZERO (0), which lets 
future versions know that the generation number is not computed yet, so 
the next commit-graph write will compute the correct generation number.

I'll send a patch soon fixing these doc issues.

Thanks,
-Stolee

  reply	other threads:[~2018-06-28 12:49 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-28  5:11 Inconsistencies in commit-graph technical docs Grant Welch
2018-06-28 12:49 ` Derrick Stolee [this message]
2018-06-28 12:52   ` [PATCH] commit-graph: fix documentation inconsistencies Derrick Stolee
2018-06-28 16:54   ` Inconsistencies in commit-graph technical docs Grant Welch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c48629f1-7b41-d516-84f5-c07ef7881e03@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gwelch925@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).