* [PATCH] commit-graph: use timestamp_t for max parent generation accumulator
@ 2026-06-14 6:57 Elijah Newren via GitGitGadget
2026-06-15 8:11 ` Patrick Steinhardt
0 siblings, 1 reply; 3+ messages in thread
From: Elijah Newren via GitGitGadget @ 2026-06-14 6:57 UTC
To: git; +Cc: Elijah Newren, Elijah Newren
From: Elijah Newren <newren@gmail.com>
compute_reachable_generation_numbers() computes each commit's
generation (for v2) as

    max(c->date, max(parent.generation) + 1)

by walking its parents and accumulating their generations into a
local

    uint32_t max_gen = 0;

while info->get_generation() returns timestamp_t and
compute_generation_from_max() already takes its max_gen parameter
as timestamp_t. For v1 (topological levels) the narrowing is
harmless because GENERATION_NUMBER_V1_MAX is less than 2^30, but
for v2 (corrected committer dates) it silently truncates any
parent generation that does not fit in 32 bits, i.e. any parent
whose committer timestamp is at or beyond 2106-02-07 UTC
(>= 2^32).

The truncated max then causes child commits to end up with a
corrected committer date that matches the parent's instead of being
at least 1 higher. The bad value gets written into the commit-graph
and causes problems later, and can be noticed by running
`git commit-graph verify`.

Widen the accumulator to timestamp_t.

This is solely an in-memory arithmetic fix with no on-disk format
change: the on-disk format already encodes timestamp_t values and
existing readers handle them unchanged. This merely allows the code to
compute the correct value to write to disk.

The narrowing was introduced in 80c928d947c2 (commit-graph:
simplify compute_generation_numbers(), 2023-03-20), which rewired
v2 to use the shared compute_reachable_generation_numbers()
helper; the helper's local accumulator had been declared uint32_t
in the immediately preceding 368d19b0b7fa (commit-graph: refactor
compute_topological_levels(), 2023-03-20) when only v1 was using
it, where it was harmless.

Add a new test with a future-dated parent and a present-day child;
without the above fix, `git commit-graph verify` reports the
descendant's stored generation as below parent + 1.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
commit-graph: use timestamp_t for max parent generation accumulator
We found a few repositories in the wild with commits whose authors were
apparently on a computer in the year 2120 when they recorded their
commits. Apparently, in a century from now, some folks are going to have
a really weird timezone as well (-13068837), though the timezone doesn't
factor into this patch at all.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2148%2Fnewren%2Fcommit-graph-fix-ccd-uint32-truncation-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2148/newren/commit-graph-fix-ccd-uint32-truncation-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2148
commit-graph.c | 2 +-
t/t5328-commit-graph-64bit-time.sh | 9 +++++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/commit-graph.c b/commit-graph.c
index 9abe62bd5a..4b7156fd76 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
 			struct commit *current = list->item;
 			struct commit_list *parent;
 			int all_parents_computed = 1;
-			uint32_t max_gen = 0;
+			timestamp_t max_gen = 0;
 
 			for (parent = current->parents; parent; parent = parent->next) {
 				repo_parse_commit(info->r, parent->item);
diff --git a/t/t5328-commit-graph-64bit-time.sh b/t/t5328-commit-graph-64bit-time.sh
index d8891e6a92..bc651b69de 100755
--- a/t/t5328-commit-graph-64bit-time.sh
+++ b/t/t5328-commit-graph-64bit-time.sh
@@ -74,6 +74,15 @@ test_expect_success 'single commit with generation data exceeding UINT32_MAX' '
 	git -C repo-uint32-max commit-graph verify
 '
 
+test_expect_success 'descendant of commit with date exceeding UINT32_MAX' '
+	git init repo-uint32-max-descendant &&
+	test_commit -C repo-uint32-max-descendant \
+		--date "@4294967300 +0000" future-parent &&
+	test_commit -C repo-uint32-max-descendant present-day-child &&
+	git -C repo-uint32-max-descendant commit-graph write --reachable &&
+	git -C repo-uint32-max-descendant commit-graph verify
+'
+
 test_expect_success PERL_TEST_HELPERS 'reader notices out-of-bounds generation overflow' '
 	graph=.git/objects/info/commit-graph &&
 	test_when_finished "rm -rf $graph" &&
base-commit: 600fe743028cbfb640855f659e9851522214bc0b
--
gitgitgadget
* Re: [PATCH] commit-graph: use timestamp_t for max parent generation accumulator
From: Patrick Steinhardt @ 2026-06-15 8:11 UTC
To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren
On Sun, Jun 14, 2026 at 06:57:50AM +0000, Elijah Newren via GitGitGadget wrote:
> commit-graph: use timestamp_t for max parent generation accumulator
>
> We found a few repositories in the wild with commits whose authors were
> apparently on a computer in the year 2120 when they recorded their
> commits. Apparently, in a century from now, some folks are going to have
> a really weird timezone as well (-13068837), though the timezone doesn't
> factor into this patch at all.
I'd really be curious which other parts of Git will start to break once
we cross that threshold. Would it make sense if we maybe expanded our
linux-TEST-VARS job to create commits with a date beyond UINT32_MAX?
Something like the patch at the end of this mail. And yes, many tests
break with the patch applied. From all I've seen though many of those
failures are benign, even though I'd bet that there might even be some
"proper" failures in there.
Anyway, this is of course outside the scope of this patch series.
> diff --git a/commit-graph.c b/commit-graph.c
> index 9abe62bd5a..4b7156fd76 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
> struct commit *current = list->item;
> struct commit_list *parent;
> int all_parents_computed = 1;
> - uint32_t max_gen = 0;
> + timestamp_t max_gen = 0;
>
> for (parent = current->parents; parent; parent = parent->next) {
> repo_parse_commit(info->r, parent->item);
This looks obviously correct.
> diff --git a/t/t5328-commit-graph-64bit-time.sh b/t/t5328-commit-graph-64bit-time.sh
> index d8891e6a92..bc651b69de 100755
> --- a/t/t5328-commit-graph-64bit-time.sh
> +++ b/t/t5328-commit-graph-64bit-time.sh
> @@ -74,6 +74,15 @@ test_expect_success 'single commit with generation data exceeding UINT32_MAX' '
> git -C repo-uint32-max commit-graph verify
> '
>
> +test_expect_success 'descendant of commit with date exceeding UINT32_MAX' '
> + git init repo-uint32-max-descendant &&
> + test_commit -C repo-uint32-max-descendant \
> + --date "@4294967300 +0000" future-parent &&
> + test_commit -C repo-uint32-max-descendant present-day-child &&
> + git -C repo-uint32-max-descendant commit-graph write --reachable &&
> + git -C repo-uint32-max-descendant commit-graph verify
> +'
Makes sense. Thanks!
Patrick
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 809c662124..e78902b671 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -136,12 +136,19 @@ sane_unset () {
 test_tick () {
 	if test -z "${test_tick+set}"
 	then
-		test_tick=1112911993
+		if test_bool_env GIT_TEST_FUTURE false
+		then
+			test_tick=4294697600
+			test_tick_prefix=@
+		else
+			test_tick=1112911993
+			test_tick_prefix=
+		fi
 	else
 		test_tick=$(($test_tick + 60))
 	fi
-	GIT_COMMITTER_DATE="$test_tick -0700"
-	GIT_AUTHOR_DATE="$test_tick -0700"
+	GIT_COMMITTER_DATE="$test_tick_prefix$test_tick -0700"
+	GIT_AUTHOR_DATE="$test_tick_prefix$test_tick -0700"
 	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
 }
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4a7357b547..54798fb3f1 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -558,12 +558,26 @@ TEST_AUTHOR_LOCALNAME=author
 TEST_AUTHOR_DOMAIN=example.com
 GIT_AUTHOR_EMAIL=${TEST_AUTHOR_LOCALNAME}@${TEST_AUTHOR_DOMAIN}
 GIT_AUTHOR_NAME='A U Thor'
-GIT_AUTHOR_DATE='1112354055 +0200'
 TEST_COMMITTER_LOCALNAME=committer
 TEST_COMMITTER_DOMAIN=example.com
 GIT_COMMITTER_EMAIL=${TEST_COMMITTER_LOCALNAME}@${TEST_COMMITTER_DOMAIN}
 GIT_COMMITTER_NAME='C O Mitter'
-GIT_COMMITTER_DATE='1112354055 +0200'
+
+case "${GIT_TEST_FUTURE:-false}" in
+1|on|true|yes)
+	GIT_AUTHOR_DATE="${GIT_TEST_DATE:-@4294697300 +0200}"
+	GIT_COMMITTER_DATE="${GIT_TEST_DATE:-@4294697300 +0200}"
+	;;
+0|off|false|no)
+	GIT_AUTHOR_DATE="${GIT_TEST_DATE:-1112354055 +0200}"
+	GIT_COMMITTER_DATE="${GIT_TEST_DATE:-1112354055 +0200}"
+	;;
+*)
+	echo "GIT_TEST_FUTURE requires a boolean" >&2
+	exit 1
+	;;
+esac
+
 GIT_MERGE_VERBOSITY=5
 GIT_MERGE_AUTOEDIT=no
 export GIT_MERGE_VERBOSITY GIT_MERGE_AUTOEDIT
* Re: [PATCH] commit-graph: use timestamp_t for max parent generation accumulator
From: Derrick Stolee @ 2026-06-15 11:44 UTC
To: Patrick Steinhardt, Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren
On 6/15/26 4:11 AM, Patrick Steinhardt wrote:
> On Sun, Jun 14, 2026 at 06:57:50AM +0000, Elijah Newren via GitGitGadget wrote:
>> commit-graph: use timestamp_t for max parent generation accumulator
>>
>> We found a few repositories in the wild with commits whose authors were
>> apparently on a computer in the year 2120 when they recorded their
>> commits. Apparently, in a century from now, some folks are going to have
>> a really weird timezone as well (-13068837), though the timezone doesn't
>> factor into this patch at all.
>> @@ -1669,7 +1669,7 @@ static void compute_reachable_generation_numbers(
>> struct commit *current = list->item;
>> struct commit_list *parent;
>> int all_parents_computed = 1;
>> - uint32_t max_gen = 0;
>> + timestamp_t max_gen = 0;
>>
>> for (parent = current->parents; parent; parent = parent->next) {
>> repo_parse_commit(info->r, parent->item);
>
> This looks obviously correct.
I agree. I was surprised this was the only necessary change, but
your message clearly describes how the timing of the patch that
delivered this change contributed to the mismatch.
Thanks,
-Stolee