From: Bryan Turner <bturner@atlassian.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Jonathan Tan <jonathantanmy@google.com>, Git Users <git@vger.kernel.org>
Subject: Re: Commit graph chains with no corresponding files?
Date: Fri, 26 Feb 2021 18:49:33 -0800
Message-ID: <CAGyf7-EzPrX8D5pnZKOGtiUHb6AZJs7Un4u39tOrcHkFi6gi-w@mail.gmail.com>
In-Reply-To: <fc8a2c0f-24b7-5884-b669-bb9700f3ba84@gmail.com>

On Thu, Feb 25, 2021 at 6:20 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 2/24/2021 11:54 PM, Bryan Turner wrote:
> > On Mon, Jun 29, 2020 at 6:51 PM Derrick Stolee <stolee@gmail.com> wrote:
> >>
> >> On 6/29/2020 6:07 PM, Jonathan Tan wrote:
> >>> At $DAYJOB, a few people have reported "warning: unable to find all
> >>> commit-graph files" warnings. Their commit-graph-chain files have a few
> >>> lines, but they only have one commit graph file with very few commits. I
> >>> suspected something happening during fetch, because (as far as I know) a
> >>> fetch may cause an incremental commit graph to be written, but I ran a
> >>> fetch on a large repository myself and didn't run into this problem.
> >>>
> >>> Has anyone run into this problem before, and does anyone know how to
> >>> reproduce it?
> >
> > I don't have any specific reproduction steps, but we've just run into
> > our first case of this on Git 2.29. I ended up kicking off a full `git
> > commit-graph write` to fix it. That displayed the same warning, but
> > commands run after it no longer do. Prior to writing the new graph, I
> > had this:
> > $ ls
> > commit-graph-chain graph-88f5fe6e0c659e3742e556982263813d528ead81.graph
>
> The contents of the 'commit-graph-chain' file are critical to diagnosing
> the problems here. Likely it had multiple lines.
>
> > Afterward, the `objects/info/commit-graphs` directory still exists
> > but is empty, and there is now an `objects/info/commit-graph` that didn't
> > exist before. `git commit-graph verify` seems happy with the state of
> > things.
>
> Yes, a full rewrite without "--split" will get you to this state.
>
> >> The incremental commit-graph code deletes any commit-graph files
> >> that do not appear in the chain. I believe this is done by comparing
> >> the contents of the ".git/objects/info/commit-graphs/" directory to
> >> the contents of the chain file.
> >>
> >> These appear to be case-sensitive, full-path comparisons.
> >>
> >> It is _possible_ that something like a case switch or a symlink
> >> could be causing a problem here. That's where I would look on
> >> the affected systems.
> >
> > Are commit graphs potentially problematic in repositories that are
> > borrowing objects from other repositories via alternates?
>
> This was definitely part of the design, with the intention of
> working with a common base in the alternate. However, if the
> alternate collapses layers, then the repo that is borrowing
> from that alternate may have a broken chain.

Thanks for the analysis, Derrick. This seems like a likely culprit for
how the repository got into this state, because it is a fork (of a
fork) and does use a series of alternates.

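A minimal sketch for diagnosing a broken chain like the one discussed
above: it reads each hash from `commit-graph-chain` and reports whether a
matching `graph-<hash>.graph` file exists locally or in a directly listed
alternate. The function name `check_chain` is hypothetical, and the sketch
assumes that chain layers may live in an alternate's `info/commit-graphs`
directory, as the design discussion suggests. It does not follow nested
alternates, so a fork of a fork would need it run at each level.

```shell
# check_chain OBJECTS_DIR   (hypothetical helper, not part of Git)
# Prints "ok <hash> (local|alternate)" or "missing <hash>" for every
# line of the chain file; returns 1 if any layer is missing.
check_chain() {
    objdir=$1
    chain="$objdir/info/commit-graphs/commit-graph-chain"
    if [ ! -f "$chain" ]; then
        echo "no chain file"
        return 0
    fi
    status=0
    while IFS= read -r hash || [ -n "$hash" ]; do
        [ -n "$hash" ] || continue
        name="info/commit-graphs/graph-$hash.graph"
        found=no
        if [ -f "$objdir/$name" ]; then
            found=local
        elif [ -f "$objdir/info/alternates" ]; then
            # One object directory per line; relative paths are
            # resolved against the local object directory.
            while IFS= read -r alt || [ -n "$alt" ]; do
                case $alt in
                    ''|'#'*) continue ;;
                    /*) ;;
                    *) alt="$objdir/$alt" ;;
                esac
                if [ -f "$alt/$name" ]; then
                    found=alternate
                    break
                fi
            done < "$objdir/info/alternates"
        fi
        if [ "$found" = no ]; then
            echo "missing $hash"
            status=1
        else
            echo "ok $hash ($found)"
        fi
    done < "$chain"
    return $status
}
```

Run from inside an affected repository, something like
`check_chain "$(git rev-parse --git-path objects)"` would list which
layers of the chain have no file behind them.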
>
> It is likely a better setup to have the alternate keep a
> commit-graph file and leave the dependent repos clear of a
> commit-graph. _Or_ the dependent repos should use a full
> commit-graph instead of a chain.

Skipping writing the commit graph in forks seems like a reasonable
place for us to start, given the way it currently works, but always
writing full graphs may be another option. If the fork is able to
borrow the commit graph from its origin across the alternate, though,
then that implies there may not be a lot of value in writing commit
graphs in the forks (since they're likely to share the majority of
their refs with their origin).

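The two options above could be sketched roughly as follows. This is only
an illustration: `graph_plan` and `write_graph` are hypothetical names, and
treating a non-empty `objects/info/alternates` file as the marker of a
fork is an assumption about our setup, not something Git defines. The Git
commands themselves (`commit-graph write --reachable`, with or without
`--split`) are the standard ones.

```shell
# graph_plan OBJECTS_DIR [FORK_POLICY]   (hypothetical helper)
# Prints the action to take: "split" for a repository with no
# alternates, otherwise FORK_POLICY ("skip" by default, or "full").
graph_plan() {
    objdir=$1
    policy=${2:-skip}
    if [ -s "$objdir/info/alternates" ]; then
        echo "$policy"
    else
        echo split
    fi
}

# write_graph GIT_DIR [FORK_POLICY]   (hypothetical wrapper)
write_graph() {
    gitdir=$1
    case $(graph_plan "$gitdir/objects" "${2:-skip}") in
        skip)
            echo "skipping commit-graph write for $gitdir"
            ;;
        full)
            # A full write, without --split, replaces any chain
            # with a single commit-graph file.
            git --git-dir="$gitdir" commit-graph write --reachable
            ;;
        split)
            git --git-dir="$gitdir" commit-graph write --reachable --split
            ;;
    esac
}
```

With the default policy, `write_graph /path/to/fork.git` would leave forks
alone, while `write_graph /path/to/fork.git full` would write a single
full graph that does not depend on any layer in the origin.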
>
> If you have a better idea for how to make this work, then there
> is room for improvement.
>
> For example, if we ensure during the commit-graph write that
> all layers of the chain are within our local repo, then these
> dependency issues go away without breaking any old Git versions
> that are reading the data.

Naively, this was the way I assumed it already worked--which is why I
was writing commit graphs in forks in the first place.

>
> > Have there
> > been important changes to commit graphs since 2.29?
>
> Not in the area of commit-graph chains.

Thanks again!
-b

>
> Thanks,
> -Stolee