From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: me@ttaylorr.com, peff@peff.net, gitster@pobox.com,
abhishekkumar8222@gmail.com, Derrick Stolee <stolee@gmail.com>,
Taylor Blau <ttaylorr@github.com>,
Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 0/6] Generation Number v2: Fix a tricky split graph bug
Date: Tue, 02 Feb 2021 03:01:17 +0000 [thread overview]
Message-ID: <pull.850.v2.git.1612234883.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.850.git.1612199707.gitgitgadget@gmail.com>
Here is a bugfix for the recently-landed-in-next generation v2 topic
(ak/corrected-commit-date).
This was occasionally hitting us when computing new commit-graphs on
existing repositories with the new bits. It was very hard to reproduce, and
it turns out to be due to not parsing commits before accessing generation
number data. Doing so in the right place demonstrates the bug of recomputing
the corrected commit date even for commits in lower layers with computed
values.
The fix is split into these steps:
1. Parse commits more often before accessing their data. (This allows the
bug to be demonstrated in the test suite.)
2. Check the full commit-graph chain for generation data chunks.
3. Don't compute corrected commit dates if the lower layers do not support
them.
4. Parse the commit-graph file more often.
Thanks, -Stolee
Updates in v2
=============
* Fixed some typos or other clarifications in commit messages.
* The loop assigning read_generation_data is skipped if they already all
agree with value 1.
* I split compute_generation_numbers into two methods. This essentially
splits the previous patch 4 into patches 4 & 5 here. The new patch 4 just
splits the logic as-is, then the new patch 5 does the re-initialization
of generation values when in the upgrade scenario.
Derrick Stolee (6):
commit-graph: use repo_parse_commit
commit-graph: always parse before commit_graph_data_at()
commit-graph: validate layers for generation data
commit-graph: compute generations separately
commit-graph: be extra careful about mixed generations
commit-graph: prepare commit graph
commit-graph.c | 138 +++++++++++++++++++++++++++++-----------
commit.h | 5 +-
t/t5318-commit-graph.sh | 21 ++++++
3 files changed, 125 insertions(+), 39 deletions(-)
base-commit: 5a3b130cad0d5c770f766e3af6d32b41766374c0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-850%2Fderrickstolee%2Fgen-v2-upgrade-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-850/derrickstolee/gen-v2-upgrade-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/850
Range-diff vs v1:
1: 9c605c99f66 = 1: 9c605c99f66 commit-graph: use repo_parse_commit
2: 82afa811dff ! 2: 454b183b9ba commit-graph: always parse before commit_graph_data_at()
@@ Commit message
problems when writing a commit-graph with corrected commit dates based
on a commit-graph without them.
- It has been difficult to identify the issue here becaus it was so hard
+ It has been difficult to identify the issue here because it was so hard
to reproduce. It relies on this uninitialized data having a non-zero
value, but also on specifically in a way that overwrites the existing
data.
3: d554fa30660 ! 3: 3d223fa2156 commit-graph: validate layers for generation data
@@ commit-graph.c: static struct commit_graph *load_commit_graph_chain(struct repos
- if (!g)
- return;
--
-- read_generation_data = !!g->chunk_generation_data;
+ while (read_generation_data && p) {
+ read_generation_data = p->read_generation_data;
+ p = p->base_graph;
+ }
+- read_generation_data = !!g->chunk_generation_data;
++ if (read_generation_data)
++ return 1;
+
while (g) {
- g->read_generation_data = read_generation_data;
+- g->read_generation_data = read_generation_data;
++ g->read_generation_data = 0;
g = g->base_graph;
}
+
-+ return read_generation_data;
++ return 0;
}
struct commit_graph *read_commit_graph_one(struct repository *r,
-: ----------- > 4: 05248ff222f commit-graph: compute generations separately
4: b267a9653a7 ! 5: 9bccee8fb63 commit-graph: be extra careful about mixed generations
@@ Commit message
existing commit-graph data has no corrected commit dates.
While this improves our situation somewhat, we have not completely
- solved the issue for correctly computing generation numbers for mixes
+ solved the issue for correctly computing generation numbers for mixed
layers. That follows in the next change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph
_("Computing commit graph generation numbers"),
ctx->commits.nr);
+
-+ if (ctx->write_generation_data && !ctx->trust_generation_numbers) {
++ if (!ctx->trust_generation_numbers) {
+ for (i = 0; i < ctx->commits.nr; i++) {
+ struct commit *c = ctx->commits.list[i];
+ repo_parse_commit(ctx->r, c);
@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph
+
for (i = 0; i < ctx->commits.nr; i++) {
struct commit *c = ctx->commits.list[i];
- uint32_t level;
-@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph_context *ctx)
- corrected_commit_date = commit_graph_data_at(parent->item)->generation;
-
- if (level == GENERATION_NUMBER_ZERO ||
-- corrected_commit_date == GENERATION_NUMBER_ZERO) {
-+ (ctx->write_generation_data &&
-+ corrected_commit_date == GENERATION_NUMBER_ZERO)) {
- all_parents_computed = 0;
- commit_list_insert(parent->item, &list);
- break;
-@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph_context *ctx)
- max_level = GENERATION_NUMBER_V1_MAX - 1;
- *topo_level_slab_at(ctx->topo_levels, current) = max_level + 1;
-
-- if (current->date && current->date > max_corrected_commit_date)
-- max_corrected_commit_date = current->date - 1;
-- commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
--
-- if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
-- ctx->num_generation_data_overflows++;
-+ if (ctx->write_generation_data) {
-+ timestamp_t cur_g;
-+ if (current->date && current->date > max_corrected_commit_date)
-+ max_corrected_commit_date = current->date - 1;
-+ cur_g = commit_graph_data_at(current)->generation
-+ = max_corrected_commit_date + 1;
-+ if (cur_g - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
-+ ctx->num_generation_data_overflows++;
-+ }
- }
- }
- }
+ timestamp_t corrected_commit_date;
@@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
} else
ctx->num_commit_graphs_after = 1;
@@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
- validate_mixed_generation_chain(ctx->r->objects->commit_graph);
+ ctx->trust_generation_numbers = validate_mixed_generation_chain(ctx->r->objects->commit_graph);
- compute_generation_numbers(ctx);
-
+ compute_topological_levels(ctx);
+ if (ctx->write_generation_data)
5: dddeec30ebf ! 6: 38086c85b52 commit-graph: prepare commit graph
@@ Commit message
commit-graph: prepare commit graph
Before checking if the repository has a commit-graph loaded, be sure
- to run prepare_commit_graph(). This is necessary because without this
- instance we would not initialize the topo_levels slab for each of the
- struct commit_graphs in the chain before we start to parse the
- commits. This leads to possibly recomputing the topological levels for
- commits in lower layers even when we are adding a small number of
- commits on top.
+ to run prepare_commit_graph(). This is necessary because otherwise
+ the topo_levels slab is not initialized. As we compute topo_levels for
+ the new commits, we iterate further into the lower layers since the
+ first visit to each commit looks as though the topo_level is not
+ populated.
By properly initializing the topo_slab, we fix the previously broken
case of a split commit graph where a base layer has the
--
gitgitgadget
next prev parent reply other threads:[~2021-02-02 3:02 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-01 17:15 [PATCH 0/5] Generation Number v2: Fix a tricky split graph bug Derrick Stolee via GitGitGadget
2021-02-01 17:15 ` [PATCH 1/5] commit-graph: use repo_parse_commit Derrick Stolee via GitGitGadget
2021-02-01 17:32 ` Taylor Blau
2021-02-01 17:15 ` [PATCH 2/5] commit-graph: always parse before commit_graph_data_at() Derrick Stolee via GitGitGadget
2021-02-01 18:44 ` Junio C Hamano
2021-02-01 17:15 ` [PATCH 3/5] commit-graph: validate layers for generation data Derrick Stolee via GitGitGadget
2021-02-01 17:39 ` Taylor Blau
2021-02-01 18:10 ` Derrick Stolee
2021-02-01 17:15 ` [PATCH 4/5] commit-graph: be extra careful about mixed generations Derrick Stolee via GitGitGadget
2021-02-01 18:04 ` Taylor Blau
2021-02-01 18:13 ` Derrick Stolee
2021-02-01 18:55 ` Junio C Hamano
2021-02-01 17:15 ` [PATCH 5/5] commit-graph: prepare commit graph Derrick Stolee via GitGitGadget
2021-02-01 18:25 ` Taylor Blau
2021-02-02 3:01 ` Derrick Stolee via GitGitGadget [this message]
2021-02-02 3:01 ` [PATCH v2 1/6] commit-graph: use repo_parse_commit Derrick Stolee via GitGitGadget
2021-02-02 3:01 ` [PATCH v2 2/6] commit-graph: always parse before commit_graph_data_at() Derrick Stolee via GitGitGadget
2021-02-03 1:08 ` Jonathan Nieder
2021-02-03 1:35 ` Derrick Stolee
2021-02-03 1:48 ` Jonathan Nieder
2021-02-03 3:07 ` Derrick Stolee
2021-02-03 15:34 ` Taylor Blau
2021-02-03 17:37 ` Eric Sunshine
2021-02-03 18:41 ` Junio C Hamano
2021-02-03 21:08 ` Taylor Blau
2021-02-03 2:06 ` Junio C Hamano
2021-02-03 3:09 ` Derrick Stolee
2021-02-07 19:04 ` SZEDER Gábor
2021-02-07 20:12 ` Junio C Hamano
2021-02-08 2:01 ` Derrick Stolee
2021-02-08 5:55 ` Junio C Hamano
2021-02-02 3:01 ` [PATCH v2 3/6] commit-graph: validate layers for generation data Derrick Stolee via GitGitGadget
2021-02-02 3:01 ` [PATCH v2 4/6] commit-graph: compute generations separately Derrick Stolee via GitGitGadget
2021-02-02 3:01 ` [PATCH v2 5/6] commit-graph: be extra careful about mixed generations Derrick Stolee via GitGitGadget
2021-02-02 3:01 ` [PATCH v2 6/6] commit-graph: prepare commit graph Derrick Stolee via GitGitGadget
2021-02-02 3:08 ` [PATCH v2 0/6] Generation Number v2: Fix a tricky split graph bug Taylor Blau
2021-02-11 4:44 ` Abhishek Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.850.v2.git.1612234883.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=abhishekkumar8222@gmail.com \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
--cc=ttaylorr@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).