All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>
Subject: Re: [PATCH v3 4/8] commit-graph: don't early exit(1) on e.g. "git status"
Date: Mon, 29 Apr 2019 16:30:38 +0200	[thread overview]
Message-ID: <87wojcrii9.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <3518ad3e-bc4a-c2c3-d4bd-c87f9e828b1c@gmail.com>


On Mon, Apr 29 2019, Derrick Stolee wrote:

> On 4/27/2019 9:06 AM, Ævar Arnfjörð Bjarmason wrote:
>>
>> There's still cases left where we'll exit early, e.g. if you do:
>>
>>     $ git diff -U1
>>     diff --git a/commit-graph.c b/commit-graph.c
>>     index 66865acbd7..63773764ce 100644
>>     --- a/commit-graph.c
>>     +++ b/commit-graph.c
>>     @@ -1074,3 +1074,3 @@ void write_commit_graph(const char *obj_dir,
>>             chunk_offsets[1] = chunk_offsets[0] + GRAPH_FANOUT_SIZE;
>>     -       chunk_offsets[2] = chunk_offsets[1] + hashsz * commits.nr;
>>     +       chunk_offsets[2] = chunk_offsets[0] + hashsz * commits.nr;
>>             chunk_offsets[3] = chunk_offsets[2] + (hashsz + 16) * commits.nr;
>>
>> Which is obviously bad, but something I encounterd while hacking up [1]
>> we'll still hard die as before this patch on:
>>
>>     $ git status
>>     fatal: invalid parent position 1734910766
>>     $
>
> I really appreciate you digging in deep into these kinds of issues. You
> seem to be hitting corrupted commit-graph files more often than we are
> (in VFS for Git world).

FWIW I've never encountered any of these in the wild. I just started
poking at this in 2ac138d568 ("commit-graph: fix segfault on e.g. "git
status"", 2019-03-25) because I was looking at the commit graph, running
its tests with -d, and we'd segfault previously on e.g. "git status" in
our own graph corruption tests.

> However, we should be _very careful_ when turning some of these errors
> to warnings. At the very least, we should do some high-level planning
> for how to handle this case.

Indeed. I should have been explicit, I don't think it's sane to do
anything except return NULL up the stack and say "the graph is screwed,
we can't use it" when initially parsing it/headers, but reading on...

> The biggest issue is that we have some logic that is run after a call to
> generation_numbers_enabled(), such as the `git rev-list --topo-order`
> logic, that relies on the commit-graph for correctness. If we output a
> warning and then stop using the commit-graph, then we will start having
> commits with finite generation pointing to commits with infinite generation.
>
> Perhaps, with some care, we can alert the algorithm to change the "minimum
> generation" that limits how far we dequeue the priority-queue. Changing it
> to zero will cause the algorithm to behave like the old algorithm.
>
> But, having an algorithm that usually takes 0.1 seconds suddenly take 10+
> seconds also violates some expectations.
>
> Q: How should we handle a detectably-invalid commit-graph?

...I don't think we need to do any paranoid algorithm fallback in
general. As you point out that's going to be a PITA as we read the
actual data in the graph in some cases.

> I think most of your patches have done a good job so far of detecting
> an invalid header, and responding by ignoring the commit-graph. This case
> of a detectable error in the chunk data itself is not something we can
> check on the first load without serious performance issues.
>
> I hope we can decide on a good solution.

...OK, so this is one of those PITA cases. I think it's fine to just
leave it.

Although maybe we'd still want to be more paranoid with O(n) cases like
"--contains" or "--topo-order" and cases that are surely just "look up
my immediate commit data", like "status" dying on this particular error.

But I think it's fine to just decide to nothing about this. I mainly
just wanted to make a note to myself & CC the list in case there was
interest...

  reply	other threads:[~2019-04-29 14:30 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-21 22:37 [PATCH 0/8] commit-graph: segfault & other fixes for broken graphs Ævar Arnfjörð Bjarmason
2019-02-21 22:37 ` [PATCH 1/8] commit-graph tests: split up corrupt_graph_and_verify() Ævar Arnfjörð Bjarmason
2019-02-21 22:37 ` [PATCH 2/8] commit-graph tests: test a graph that's too small Ævar Arnfjörð Bjarmason
2019-02-21 22:37 ` [PATCH 3/8] commit-graph: fix segfault on e.g. "git status" Ævar Arnfjörð Bjarmason
2019-02-21 22:37 ` [PATCH 4/8] commit-graph: don't early exit(1) " Ævar Arnfjörð Bjarmason
2019-02-21 22:37 ` [PATCH 5/8] commit-graph: don't pass filename to load_commit_graph_one_fd_st() Ævar Arnfjörð Bjarmason
2019-02-21 22:37 ` [PATCH 6/8] commit-graph verify: detect inability to read the graph Ævar Arnfjörð Bjarmason
2019-02-21 23:15   ` SZEDER Gábor
2019-02-21 22:37 ` [PATCH 7/8] commit-graph write: don't die if the existing graph is corrupt Ævar Arnfjörð Bjarmason
2019-02-22 23:13   ` Junio C Hamano
2019-02-23  9:29     ` Eric Sunshine
2019-02-21 22:37 ` [PATCH 8/8] commit-graph: improve & i18n error messages Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 0/8] commit-graph: segfault & other fixes for broken graphs Ævar Arnfjörð Bjarmason
2019-03-15  7:47   ` Eric Sunshine
2019-03-15 10:39     ` Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 " Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 1/8] commit-graph tests: split up corrupt_graph_and_verify() Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 2/8] commit-graph tests: test a graph that's too small Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 3/8] commit-graph: fix segfault on e.g. "git status" Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 4/8] commit-graph: don't early exit(1) " Ævar Arnfjörð Bjarmason
2019-04-27 13:06     ` Ævar Arnfjörð Bjarmason
2019-04-29 12:48       ` Derrick Stolee
2019-04-29 14:30         ` Ævar Arnfjörð Bjarmason [this message]
2019-05-01 18:31         ` Jeff King
2019-03-25 12:08   ` [PATCH v3 5/8] commit-graph: don't pass filename to load_commit_graph_one_fd_st() Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 6/8] commit-graph verify: detect inability to read the graph Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 7/8] commit-graph write: don't die if the existing graph is corrupt Ævar Arnfjörð Bjarmason
2019-03-25 12:08   ` [PATCH v3 8/8] commit-graph: improve & i18n error messages Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 1/8] commit-graph tests: split up corrupt_graph_and_verify() Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 2/8] commit-graph tests: test a graph that's too small Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 3/8] commit-graph: fix segfault on e.g. "git status" Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 4/8] commit-graph: don't early exit(1) " Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 5/8] commit-graph: don't pass filename to load_commit_graph_one_fd_st() Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 6/8] commit-graph verify: detect inability to read the graph Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 7/8] commit-graph write: don't die if the existing graph is corrupt Ævar Arnfjörð Bjarmason
2019-03-14 21:47 ` [PATCH v2 8/8] commit-graph: improve & i18n error messages Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wojcrii9.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ramsay@ramsayjones.plus.com \
    --cc=stolee@gmail.com \
    --cc=sunshine@sunshineco.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.