From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Karthik Nayak <karthik.188@gmail.com>,
Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
Jeff King <peff@peff.net>
Subject: [PATCH v2 0/2] commit-graph: detect commits missing in ODB
Date: Mon, 23 Oct 2023 13:27:11 +0200 [thread overview]
Message-ID: <cover.1698060036.git.ps@pks.im> (raw)
[-- Attachment #1: Type: text/plain, Size: 6482 bytes --]
Hi,
this is version 2 of my patch series to more readily detect commits
parsed from the commit graph which are missing in the object database.
I've split this out into a separate thread as version 1 was sent in
reply to a different series to extend git-rev-list(1)'s `--missing`
option so that I don't continue to hijack this thread.
Changes compared to v1:
- I've added a preparatory patch that introduced a new
GIT_COMMIT_GRAPH_PARANOIA environment variable as suggested by
Peff. This envvar is retrofitted to the preexisting check in
`lookup_commit_in_graph()` so that the behaviour for this sanity
check is consistent.
- `repo_parse_commit_internal()` now also honors this new envvar.
- I've amended the commit message in v2 to include the benchmark
that demonstrates the performance regression.
- We now un-parse the commit when parsing it via the commit graph
succeeded, but it doesn't exist in the ODB.
Thanks for all the feedback and discussion around this.
Patrick
[1]: <b0bf576c51a706367a758b8e30eca37edb9c2734.1697200576.git.ps@pks.im>
Patrick Steinhardt (2):
commit-graph: introduce envvar to disable commit existence checks
commit: detect commits that exist in commit-graph but not in the ODB
Documentation/git.txt | 9 ++++++++
commit-graph.c | 6 +++++-
commit-graph.h | 6 ++++++
commit.c | 16 +++++++++++++-
t/t5318-commit-graph.sh | 48 +++++++++++++++++++++++++++++++++++++++++
5 files changed, 83 insertions(+), 2 deletions(-)
Range-diff against v1:
-: ---------- > 1: a89c435528 commit-graph: introduce envvar to disable commit existence checks
1: 6ec1e340f8 ! 2: 0476d48555 commit: detect commits that exist in commit-graph but not in the ODB
@@ Commit message
behaviour by checking for object existence via the object database, as
well.
+ This check of course comes with a performance penalty. The following
+ benchmarks have been executed in a clone of linux.git with stable tags
+ added:
+
+ Benchmark 1: git -c core.commitGraph=true rev-list --topo-order --all (git = master)
+ Time (mean ± σ): 2.913 s ± 0.018 s [User: 2.363 s, System: 0.548 s]
+ Range (min … max): 2.894 s … 2.950 s 10 runs
+
+ Benchmark 2: git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ Time (mean ± σ): 3.834 s ± 0.052 s [User: 3.276 s, System: 0.556 s]
+ Range (min … max): 3.780 s … 3.961 s 10 runs
+
+ Benchmark 3: git -c core.commitGraph=false rev-list --topo-order --all (git = master)
+ Time (mean ± σ): 13.841 s ± 0.084 s [User: 13.152 s, System: 0.687 s]
+ Range (min … max): 13.714 s … 13.995 s 10 runs
+
+ Benchmark 4: git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ Time (mean ± σ): 13.762 s ± 0.116 s [User: 13.094 s, System: 0.667 s]
+ Range (min … max): 13.645 s … 14.038 s 10 runs
+
+ Summary
+ git -c core.commitGraph=true rev-list --topo-order --all (git = master) ran
+ 1.32 ± 0.02 times faster than git -c core.commitGraph=true rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ 4.72 ± 0.05 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = pks-commit-graph-inconsistency)
+ 4.75 ± 0.04 times faster than git -c core.commitGraph=false rev-list --topo-order --all (git = master)
+
+ We look at a ~30% regression in general, but in general we're still a
+ whole lot faster than without the commit graph. To counteract this, the
+ new check can be turned off with the `GIT_COMMIT_GRAPH_PARANOIA` envvar.
+
Signed-off-by: Patrick Steinhardt <ps@pks.im>
## commit.c ##
+@@
+ #include "shallow.h"
+ #include "tree.h"
+ #include "hook.h"
++#include "parse.h"
+
+ static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
+
@@ commit.c: int repo_parse_commit_internal(struct repository *r,
return -1;
if (item->object.parsed)
return 0;
- if (use_commit_graph && parse_commit_in_graph(r, item))
+ if (use_commit_graph && parse_commit_in_graph(r, item)) {
-+ if (!has_object(r, &item->object.oid, 0))
++ static int object_paranoia = -1;
++
++ if (object_paranoia == -1)
++ object_paranoia = git_env_bool(GIT_COMMIT_GRAPH_PARANOIA, 1);
++
++ if (object_paranoia && !has_object(r, &item->object.oid, 0)) {
++ unparse_commit(r, &item->object.oid);
+ return quiet_on_missing ? -1 :
+ error(_("commit %s exists in commit-graph but not in the object database"),
+ oid_to_hex(&item->object.oid));
++ }
++
return 0;
+ }
@@ commit.c: int repo_parse_commit_internal(struct repository *r,
return quiet_on_missing ? -1 :
## t/t5318-commit-graph.sh ##
-@@ t/t5318-commit-graph.sh: test_expect_success 'overflow during generation version upgrade' '
+@@ t/t5318-commit-graph.sh: test_expect_success 'stale commit cannot be parsed when given directly' '
)
'
-+test_expect_success 'commit exists in commit-graph but not in object database' '
++test_expect_success 'stale commit cannot be parsed when traversing graph' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
@@ t/t5318-commit-graph.sh: test_expect_success 'overflow during generation version
+ oid=$(git rev-parse B) &&
+ rm .git/objects/"$(test_oid_to_path "$oid")" &&
+
++ # Again, we should be able to parse the commit when not
++ # being paranoid about commit graph staleness...
++ GIT_COMMIT_GRAPH_PARANOIA=false git rev-parse HEAD~2 &&
++ # ... but fail when we are paranoid.
+ test_must_fail git rev-parse HEAD~2 2>error &&
+ grep "error: commit $oid exists in commit-graph but not in the object database" error
+ )
--
2.42.0
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next reply other threads:[~2023-10-23 11:27 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-23 11:27 Patrick Steinhardt [this message]
2023-10-23 11:27 ` [PATCH v2 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
2023-10-30 21:29 ` Taylor Blau
2023-10-31 6:19 ` Patrick Steinhardt
2023-10-23 11:27 ` [PATCH v2 2/2] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-24 19:10 ` Junio C Hamano
2023-10-30 21:32 ` Taylor Blau
2023-10-31 0:21 ` Junio C Hamano
2023-10-30 21:31 ` Taylor Blau
2023-10-31 7:16 ` [PATCH v3 0/2] commit-graph: detect commits missing in ODB Patrick Steinhardt
2023-10-31 7:16 ` [PATCH v3 1/2] commit-graph: introduce envvar to disable commit existence checks Patrick Steinhardt
2023-10-31 7:16 ` [PATCH v3 2/2] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-31 19:16 ` [PATCH v3 0/2] commit-graph: detect commits missing in ODB Taylor Blau
2023-10-31 23:57 ` Junio C Hamano
2023-11-01 2:06 ` Junio C Hamano
2023-11-01 4:55 ` Patrick Steinhardt
2023-11-01 5:01 ` Junio C Hamano
2023-10-31 23:55 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1698060036.git.ps@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.