git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] commit-graph: detect commits missing in ODB
@ 2023-10-23 11:27 Patrick Steinhardt
  2023-10-23 11:27 ` [PATCH v2 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Patrick Steinhardt @ 2023-10-23 11:27 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Taylor Blau, Jeff King

[-- Attachment #1: Type: text/plain, Size: 6482 bytes --]

Hi,

this is version 2 of my patch series to more readily detect commits
parsed from the commit graph which are missing in the object database.
I've split this out into a separate thread as version 1 was sent in
reply to a different series to extend git-rev-list(1)'s `--missing`
option so that I don't continue to hijack this thread.

Changes compared to v1:

    - I've added a preparatory patch that introduced a new
      GIT_COMMIT_GRAPH_PARANOIA environment variable as suggested by
      Peff. This envvar is retrofitted to the preexisting check in
      `lookup_commit_in_graph()` so that the behaviour for this sanity
      check is consistent.

    - `repo_parse_commit_internal()` now also honors this new envvar.

    - I've amended the commit message in v2 to include the benchmark
      that demonstrates the performance regression.

    - We now un-parse the commit when parsing it via the commit graph
      succeeded, but it doesn't exist in the ODB.

Thanks for all the feedback and discussion around this.

Patrick

[1]: <b0bf576c51a706367a758b8e30eca37edb9c2734.1697200576.git.ps@pks.im>

Patrick Steinhardt (2):
  commit-graph: introduce envvar to disable commit existence checks
  commit: detect commits that exist in commit-graph but not in the ODB

 Documentation/git.txt   |  9 ++++++++
 commit-graph.c          |  6 +++++-
 commit-graph.h          |  6 ++++++
 commit.c                | 16 +++++++++++++-
 t/t5318-commit-graph.sh | 48 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 83 insertions(+), 2 deletions(-)

Range-diff against v1:
-:  ---------- > 1:  a89c435528 commit-graph: introduce envvar to disable commit existence checks
1:  6ec1e340f8 ! 2:  0476d48555 commit: detect commits that exist in commit-graph but not in the ODB
    @@ Commit message
         behaviour by checking for object existence via the object database, as
         well.
     
    +    This check of course comes with a performance penalty. The following
    +    benchmarks have been executed in a clone of linux.git with stable tags
    +    added:
    +
    +        Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
    +          Time (mean ± σ):      2.913 s ±  0.018 s    [User: 2.363 s, System: 0.548 s]
    +          Range (min … max):    2.894 s …  2.950 s    10 runs
    +
    +        Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
    +          Time (mean ± σ):      3.834 s ±  0.052 s    [User: 3.276 s, System: 0.556 s]
    +          Range (min … max):    3.780 s …  3.961 s    10 runs
    +
    +        Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
    +          Time (mean ± σ):     13.841 s ±  0.084 s    [User: 13.152 s, System: 0.687 s]
    +          Range (min … max):   13.714 s … 13.995 s    10 runs
    +
    +        Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
    +          Time (mean ± σ):     13.762 s ±  0.116 s    [User: 13.094 s, System: 0.667 s]
    +          Range (min … max):   13.645 s … 14.038 s    10 runs
    +
    +        Summary
    +          git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
    +            1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
    +            4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
    +            4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)
    +
    +    We look at a ~30% regression in general, but in general we're still a
    +    whole lot faster than without the commit graph. To counteract this, the
    +    new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## commit.c ##
    +@@
    + #include "shallow.h"
    + #include "tree.h"
    + #include "hook.h"
    ++#include "parse.h"
    + 
    + static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
    + 
     @@ commit.c: int repo_parse_commit_internal(struct repository *r,
      		return -1;
      	if (item->object.parsed)
      		return 0;
     -	if (use_commit_graph && parse_commit_in_graph(r, item))
     +	if (use_commit_graph && parse_commit_in_graph(r, item)) {
    -+		if (!has_object(r, &item->object.oid, 0))
    ++		static int object_paranoia = -1;
    ++
    ++		if (object_paranoia == -1)
    ++			object_paranoia = git_env_bool(GIT_COMMIT_GRAPH_PARANOIA, 1);
    ++
    ++		if (object_paranoia && !has_object(r, &item->object.oid, 0)) {
    ++			unparse_commit(r, &item->object.oid);
     +			return quiet_on_missing ? -1 :
     +				error(_("commit %s exists in commit-graph but not in the object database"),
     +				      oid_to_hex(&item->object.oid));
    ++		}
    ++
      		return 0;
     +	}
      
    @@ commit.c: int repo_parse_commit_internal(struct repository *r,
      		return quiet_on_missing ? -1 :
     
      ## t/t5318-commit-graph.sh ##
    -@@ t/t5318-commit-graph.sh: test_expect_success 'overflow during generation version upgrade' '
    +@@ t/t5318-commit-graph.sh: test_expect_success 'stale commit cannot be parsed when given directly' '
      	)
      '
      
    -+test_expect_success 'commit exists in commit-graph but not in object database' '
    ++test_expect_success 'stale commit cannot be parsed when traversing graph' '
     +	test_when_finished "rm -rf repo" &&
     +	git init repo &&
     +	(
    @@ t/t5318-commit-graph.sh: test_expect_success 'overflow during generation version
     +		oid=$(git rev-parse B) &&
     +		rm .git/objects/"$(test_oid_to_path "$oid")" &&
     +
    ++		# Again, we should be able to parse the commit when not
    ++		# being paranoid about commit graph staleness...
    ++		GIT_COMMIT_GRAPH_PARANOIA=false git rev-parse HEAD~2 &&
    ++		# ... but fail when we are paranoid.
     +		test_must_fail git rev-parse HEAD~2 2>error &&
     +		grep "error: commit $oid exists in commit-graph but not in the object database" error
     +	)
-- 
2.42.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-11-01  5:01 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-23 11:27 [PATCH v2 0/2] commit-graph: detect commits missing in ODB Patrick Steinhardt
2023-10-23 11:27 ` [PATCH v2 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
2023-10-30 21:29   ` Taylor Blau
2023-10-31  6:19     ` Patrick Steinhardt
2023-10-23 11:27 ` [PATCH v2 2/2] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-24 19:10   ` Junio C Hamano
2023-10-30 21:32     ` Taylor Blau
2023-10-31  0:21       ` Junio C Hamano
2023-10-30 21:31   ` Taylor Blau
2023-10-31  7:16 ` [PATCH v3 0/2] commit-graph: detect commits missing in ODB Patrick Steinhardt
2023-10-31  7:16   ` [PATCH v3 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
2023-10-31  7:16   ` [PATCH v3 2/2] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-31 19:16   ` [PATCH v3 0/2] commit-graph: detect commits missing in ODB Taylor Blau
2023-10-31 23:57     ` Junio C Hamano
2023-11-01  2:06       ` Junio C Hamano
2023-11-01  4:55         ` Patrick Steinhardt
2023-11-01  5:01           ` Junio C Hamano
2023-10-31 23:55   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).