From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Karthik Nayak <karthik.188@gmail.com>,
Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
Jeff King <peff@peff.net>
Subject: [PATCH v2 0/2] commit-graph: detect commits missing in ODB
Date: Mon, 23 Oct 2023 13:27:11 +0200 [thread overview]
Message-ID: <cover.1698060036.git.ps@pks.im> (raw)
[-- Attachment #1: Type: text/plain, Size: 6482 bytes --]
Hi,
this is version 2 of my patch series to more readily detect commits
parsed from the commit graph which are missing in the object database.
I've split this out into a separate thread as version 1 was sent in
reply to a different series to extend git-rev-list(1)'s `--missing`
option so that I don't continue to hijack this thread.
Changes compared to v1:
- I've added a preparatory patch that introduced a new
GIT_COMMIT_GRAPH_PARANOIA environment variable as suggested by
Peff. This envvar is retrofitted to the preexisting check in
`lookup_commit_in_graph()` so that the behaviour for this sanity
check is consistent.
- `repo_parse_commit_internal()` now also honors this new envvar.
- I've amended the commit message in v2 to include the benchmark
that demonstrates the performance regression.
- We now un-parse the commit when parsing it via the commit graph
succeeded, but it doesn't exist in the ODB.
Thanks for all the feedback and discussion around this.
Patrick
[1]: <b0bf576c51a706367a758b8e30eca37edb9c2734.1697200576.git.ps@pks.im>
Patrick Steinhardt (2):
commit-graph: introduce envvar to disable commit existence checks
commit: detect commits that exist in commit-graph but not in the ODB
Documentation/git.txt | 9 ++++++++
commit-graph.c | 6 +++++-
commit-graph.h | 6 ++++++
commit.c | 16 +++++++++++++-
t/t5318-commit-graph.sh | 48 +++++++++++++++++++++++++++++++++++++++++
5 files changed, 83 insertions(+), 2 deletions(-)
Range-diff against v1:
-: ---------- > 1: a89c435528 commit-graph: introduce envvar to disable commit existence checks
1: 6ec1e340f8 ! 2: 0476d48555 commit: detect commits that exist in commit-graph but not in the ODB
@@ Commit message
behaviour by checking for object existence via the object database, as
well.
+ This check of course comes with a performance penalty. The following
+ benchmarks have been executed in a clone of linux.git with stable tags
+ added:
+
+ Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
+ Time (mean ± σ): 2.913 s ± 0.018 s [User: 2.363 s, System: 0.548 s]
+ Range (min … max): 2.894 s … 2.950 s 10 runs
+
+ Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ Time (mean ± σ): 3.834 s ± 0.052 s [User: 3.276 s, System: 0.556 s]
+ Range (min … max): 3.780 s … 3.961 s 10 runs
+
+ Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
+ Time (mean ± σ): 13.841 s ± 0.084 s [User: 13.152 s, System: 0.687 s]
+ Range (min … max): 13.714 s … 13.995 s 10 runs
+
+ Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ Time (mean ± σ): 13.762 s ± 0.116 s [User: 13.094 s, System: 0.667 s]
+ Range (min … max): 13.645 s … 14.038 s 10 runs
+
+ Summary
+ git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
+ 1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ 4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ 4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)
+
+ We look at a ~30% regression in general, but in general we're still a
+ whole lot faster than without the commit graph. To counteract this, the
+ new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.
+
Signed-off-by: Patrick Steinhardt <ps@pks.im>
## commit.c ##
+@@
+ #include "shallow.h"
+ #include "tree.h"
+ #include "hook.h"
++#include "parse.h"
+
+ static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
+
@@ commit.c: int repo_parse_commit_internal(struct repository *r,
return -1;
if (item->object.parsed)
return 0;
- if (use_commit_graph && parse_commit_in_graph(r, item))
+ if (use_commit_graph && parse_commit_in_graph(r, item)) {
-+ if (!has_object(r, &item->object.oid, 0))
++ static int object_paranoia = -1;
++
++ if (object_paranoia == -1)
++ object_paranoia = git_env_bool(GIT_COMMIT_GRAPH_PARANOIA, 1);
++
++ if (object_paranoia && !has_object(r, &item->object.oid, 0)) {
++ unparse_commit(r, &item->object.oid);
+ return quiet_on_missing ? -1 :
+ error(_("commit %s exists in commit-graph but not in the object database"),
+ oid_to_hex(&item->object.oid));
++ }
++
return 0;
+ }
@@ commit.c: int repo_parse_commit_internal(struct repository *r,
return quiet_on_missing ? -1 :
## t/t5318-commit-graph.sh ##
-@@ t/t5318-commit-graph.sh: test_expect_success 'overflow during generation version upgrade' '
+@@ t/t5318-commit-graph.sh: test_expect_success 'stale commit cannot be parsed when given directly' '
)
'
-+test_expect_success 'commit exists in commit-graph but not in object database' '
++test_expect_success 'stale commit cannot be parsed when traversing graph' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
@@ t/t5318-commit-graph.sh: test_expect_success 'overflow during generation version
+ oid=$(git rev-parse B) &&
+ rm .git/objects/"$(test_oid_to_path "$oid")" &&
+
++ # Again, we should be able to parse the commit when not
++ # being paranoid about commit graph staleness...
++ GIT_COMMIT_GRAPH_PARANOIA=false git rev-parse HEAD~2 &&
++ # ... but fail when we are paranoid.
+ test_must_fail git rev-parse HEAD~2 2>error &&
+ grep "error: commit $oid exists in commit-graph but not in the object database" error
+ )
--
2.42.0
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next reply other threads:[~2023-10-23 11:27 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-23 11:27 Patrick Steinhardt [this message]
2023-10-23 11:27 ` [PATCH v2 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
2023-10-30 21:29 ` Taylor Blau
2023-10-31 6:19 ` Patrick Steinhardt
2023-10-23 11:27 ` [PATCH v2 2/2] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-24 19:10 ` Junio C Hamano
2023-10-30 21:32 ` Taylor Blau
2023-10-31 0:21 ` Junio C Hamano
2023-10-30 21:31 ` Taylor Blau
2023-10-31 7:16 ` [PATCH v3 0/2] commit-graph: detect commits missing in ODB Patrick Steinhardt
2023-10-31 7:16 ` [PATCH v3 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
2023-10-31 7:16 ` [PATCH v3 2/2] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-31 19:16 ` [PATCH v3 0/2] commit-graph: detect commits missing in ODB Taylor Blau
2023-10-31 23:57 ` Junio C Hamano
2023-11-01 2:06 ` Junio C Hamano
2023-11-01 4:55 ` Patrick Steinhardt
2023-11-01 5:01 ` Junio C Hamano
2023-10-31 23:55 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1698060036.git.ps@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).