* [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters Kristofer Karlsson via GitGitGadget
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Move ahead_behind() off the shared nonstale_queue abstraction to use
a plain prio_queue with a local max_nonstale pointer. The nonstale
tracking is inlined into insert_no_dup().
This prepares for replacing nonstale_queue with a paint_queue struct
that tracks per-side commit counts, which ahead_behind() does not
need. No behavior change.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/commit-reach.c b/commit-reach.c
index 5df471a313..377a5cc42a 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1089,12 +1089,18 @@ struct commit_list *get_reachable_subset(struct commit **from, size_t nr_from,
define_commit_slab(bit_arrays, struct bitmap *);
static struct bit_arrays bit_arrays;
-static void insert_no_dup(struct nonstale_queue *queue, struct commit *c)
+static void insert_no_dup(struct prio_queue *queue,
+ struct commit **max_nonstale,
+ struct commit *c)
{
if (c->object.flags & PARENT2)
return;
- nonstale_queue_put(queue, c);
c->object.flags |= PARENT2;
+ prio_queue_put(queue, c);
+ if (!(c->object.flags & STALE) &&
+ (!*max_nonstale ||
+ queue->compare(*max_nonstale, c, queue->cb_data) <= 0))
+ *max_nonstale = c;
}
static struct bitmap *get_bit_array(struct commit *c, int width)
@@ -1118,9 +1124,10 @@ void ahead_behind(struct repository *r,
struct commit **commits, size_t commits_nr,
struct ahead_behind_count *counts, size_t counts_nr)
{
- struct nonstale_queue queue = {
- { .compare = compare_commits_by_gen_then_commit_date }
+ struct prio_queue queue = {
+ .compare = compare_commits_by_gen_then_commit_date
};
+ struct commit *max_nonstale = NULL;
size_t width = DIV_ROUND_UP(commits_nr, BITS_IN_EWORD);
if (!commits_nr || !counts_nr)
@@ -1140,14 +1147,17 @@ void ahead_behind(struct repository *r,
struct bitmap *bitmap = get_bit_array(c, width);
bitmap_set(bitmap, i);
- insert_no_dup(&queue, c);
+ insert_no_dup(&queue, &max_nonstale, c);
}
- while (queue.max_nonstale) {
- struct commit *c = nonstale_queue_get(&queue);
+ while (max_nonstale) {
+ struct commit *c = prio_queue_get(&queue);
struct commit_list *p;
struct bitmap *bitmap_c = get_bit_array(c, width);
+ if (c == max_nonstale)
+ max_nonstale = NULL;
+
for (size_t i = 0; i < counts_nr; i++) {
int reach_from_tip = !!bitmap_get(bitmap_c, counts[i].tip_index);
int reach_from_base = !!bitmap_get(bitmap_c, counts[i].base_index);
@@ -1178,7 +1188,7 @@ void ahead_behind(struct repository *r,
if (bitmap_popcount(bitmap_p) == commits_nr)
p->item->object.flags |= STALE;
- insert_no_dup(&queue, p->item);
+ insert_no_dup(&queue, &max_nonstale, p->item);
}
free_bit_array(c);
@@ -1186,10 +1196,10 @@ void ahead_behind(struct repository *r,
/* STALE is used here, PARENT2 is used by insert_no_dup(). */
repo_clear_commit_marks(r, PARENT2 | STALE);
- for (size_t i = 0; i < queue.pq.nr; i++)
- free_bit_array(queue.pq.array[i].data);
+ for (size_t i = 0; i < queue.nr; i++)
+ free_bit_array(queue.array[i].data);
clear_bit_arrays(&bit_arrays);
- clear_nonstale_queue(&queue);
+ clear_prio_queue(&queue);
}
struct commit_and_index {
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted Kristofer Karlsson via GitGitGadget
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Replace the nonstale_queue abstraction in paint_down_to_common() with
a new paint_queue struct that tracks per-side commit counts. Each
non-stale queued commit occupies exactly one counter bucket based on
its paint flags: PARENT1-only, PARENT2-only, or both sides (a pending
merge-base candidate).
The counters are maintained by paint_count_transition() which handles
all flag changes as bucket transfers: remove from the old bucket, add
to the new one. Either step is a no-op when the respective state has
no bucket (stale or zero).
The loop now drains the queue via paint_queue_get() and breaks when
all counters reach zero, replacing the old pointer-based termination
(max_nonstale). This is equivalent behavior.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach.c | 114 ++++++++++++++++++++++++++++---------------------
1 file changed, 66 insertions(+), 48 deletions(-)
diff --git a/commit-reach.c b/commit-reach.c
index 377a5cc42a..ba1e896f0f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -41,58 +41,75 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
}
/*
- * A prio_queue with O(1) termination check. 'max_nonstale' tracks
- * the lowest-priority non-stale commit enqueued so far; once it is
- * popped, every remaining entry is known to be STALE.
+ * Priority queue with per-side commit counters for paint_down_to_common().
+ * Each non-stale queued commit occupies exactly one bucket: PARENT1-only,
+ * PARENT2-only, or both (a pending merge-base candidate).
*/
-struct nonstale_queue {
+struct paint_queue {
struct prio_queue pq;
- struct commit *max_nonstale;
+ int p1_count;
+ int p2_count;
+ int pending_merge_bases;
};
-static void nonstale_queue_put(struct nonstale_queue *queue,
- struct commit *c)
+/*
+ * Adjust per-side counters for a paint-state transition. Non-stale
+ * commits are counted in one of three counters: PARENT1-only,
+ * PARENT2-only, or both. Zero means "not in the queue" (used on
+ * enqueue/dequeue); stale commits are not counted at all.
+ */
+static void paint_count_transition(struct paint_queue *queue,
+ unsigned old_flags, unsigned new_flags)
{
- struct commit *old = queue->max_nonstale;
+ unsigned old_paint = old_flags & (PARENT1 | PARENT2 | STALE);
+ unsigned new_paint = new_flags & (PARENT1 | PARENT2 | STALE);
- prio_queue_put(&queue->pq, c);
- if (c->object.flags & STALE)
+ if (old_paint == new_paint)
return;
- if (!old || queue->pq.compare(old, c, queue->pq.cb_data) <= 0)
- queue->max_nonstale = c;
-}
-
-static struct commit *nonstale_queue_get(struct nonstale_queue *queue)
-{
- struct commit *commit = prio_queue_get(&queue->pq);
- if (commit == queue->max_nonstale)
- queue->max_nonstale = NULL;
-
- return commit;
+ if (!(old_paint & STALE)) {
+ switch (old_paint & (PARENT1 | PARENT2)) {
+ case 0: break;
+ case PARENT1: queue->p1_count--; break;
+ case PARENT2: queue->p2_count--; break;
+ case PARENT1 | PARENT2: queue->pending_merge_bases--; break;
+ default: BUG("unexpected paint state");
+ }
+ }
+ if (!(new_paint & STALE)) {
+ switch (new_paint & (PARENT1 | PARENT2)) {
+ case 0: break;
+ case PARENT1: queue->p1_count++; break;
+ case PARENT2: queue->p2_count++; break;
+ case PARENT1 | PARENT2: queue->pending_merge_bases++; break;
+ default: BUG("unexpected paint state");
+ }
+ }
}
-static void clear_nonstale_queue(struct nonstale_queue *queue)
+static void paint_queue_put(struct paint_queue *queue,
+ struct commit *c, unsigned add_flags)
{
- clear_prio_queue(&queue->pq);
- queue->max_nonstale = NULL;
-}
+ unsigned old_flags = c->object.flags;
+ c->object.flags |= add_flags;
-static void nonstale_queue_put_dedup(struct nonstale_queue *queue,
- struct commit *c)
-{
- if (c->object.flags & ENQUEUED)
- return;
- c->object.flags |= ENQUEUED;
- nonstale_queue_put(queue, c);
+ if (old_flags & ENQUEUED) {
+ paint_count_transition(queue, old_flags, c->object.flags);
+ } else {
+ c->object.flags |= ENQUEUED;
+ prio_queue_put(&queue->pq, c);
+ paint_count_transition(queue, 0, c->object.flags);
+ }
}
-static struct commit *nonstale_queue_get_dedup(struct nonstale_queue *queue)
+static struct commit *paint_queue_get(struct paint_queue *queue)
{
- struct commit *commit = nonstale_queue_get(queue);
+ struct commit *commit = prio_queue_get(&queue->pq);
- if (commit)
+ if (commit) {
commit->object.flags &= ~ENQUEUED;
+ paint_count_transition(queue, commit->object.flags, 0);
+ }
return commit;
}
@@ -104,9 +121,10 @@ static int paint_down_to_common(struct repository *r,
enum merge_base_flags mb_flags,
struct commit_list **result)
{
- struct nonstale_queue queue = {
- { compare_commits_by_gen_then_commit_date }
+ struct paint_queue queue = {
+ .pq = { compare_commits_by_gen_then_commit_date }
};
+ struct commit *commit;
int i;
timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
struct commit_list **tail = result;
@@ -119,15 +137,12 @@ static int paint_down_to_common(struct repository *r,
commit_list_append(one, result);
return 0;
}
- nonstale_queue_put_dedup(&queue, one);
+ paint_queue_put(&queue, one, 0);
- for (i = 0; i < n; i++) {
- twos[i]->object.flags |= PARENT2;
- nonstale_queue_put_dedup(&queue, twos[i]);
- }
+ for (i = 0; i < n; i++)
+ paint_queue_put(&queue, twos[i], PARENT2);
- while (queue.max_nonstale) {
- struct commit *commit = nonstale_queue_get_dedup(&queue);
+ while ((commit = paint_queue_get(&queue))) {
struct commit_list *parents;
int flags;
timestamp_t generation = commit_graph_generation(commit);
@@ -165,7 +180,7 @@ static int paint_down_to_common(struct repository *r,
if ((p->object.flags & flags) == flags)
continue;
if (repo_parse_commit(r, p)) {
- clear_nonstale_queue(&queue);
+ clear_prio_queue(&queue.pq);
commit_list_free(*result);
*result = NULL;
/*
@@ -180,12 +195,15 @@ static int paint_down_to_common(struct repository *r,
return error(_("could not parse commit %s"),
oid_to_hex(&p->object.oid));
}
- p->object.flags |= flags;
- nonstale_queue_put_dedup(&queue, p);
+ paint_queue_put(&queue, p, flags);
}
+
+ if (queue.p1_count + queue.p2_count +
+ queue.pending_merge_bases == 0)
+ break;
}
- clear_nonstale_queue(&queue);
+ clear_prio_queue(&queue.pq);
commit_list_sort_by_date(result);
return 0;
}
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases Elijah Newren via GitGitGadget
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Add an early termination check to paint_down_to_common() using the
per-side counters introduced in the previous commit. Once the walk
enters the finite-generation region, terminate early when one side's
exclusive count drops to zero -- no new merge-base can form without
both paint sides meeting.
The check also waits for pending_merge_bases to reach zero, ensuring
all merge-base candidates have been popped and recorded before
exiting.
The INFINITY gate ensures correctness: commits without a commit-graph
entry have GENERATION_NUMBER_INFINITY and are ordered by commit date,
which is not topologically reliable. The optimization only fires
once the walk enters the finite-generation region where ordering
guarantees hold.
On large repositories with commit-graph, this yields 100-1000x
speedups for merge-base queries where one side (e.g. a PR branch) is
much smaller than the other.
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/commit-reach.c b/commit-reach.c
index ba1e896f0f..fcd1ad0167 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -201,6 +201,19 @@ static int paint_down_to_common(struct repository *r,
if (queue.p1_count + queue.p2_count +
queue.pending_merge_bases == 0)
break;
+
+ /*
+ * Side exhaustion: a new merge-base can only form
+ * when both PARENT1-only and PARENT2-only commits
+ * remain in the queue. In the finite-generation
+ * region the queue is ordered topologically, so
+ * no future step can add paint to visited commits
+ * and an exhausted side cannot reappear.
+ */
+ if (generation < GENERATION_NUMBER_INFINITY &&
+ queue.pending_merge_bases == 0 &&
+ (queue.p1_count == 0 || queue.p2_count == 0))
+ break;
}
clear_prio_queue(&queue.pq);
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
` (2 preceding siblings ...)
2026-06-20 10:36 ` [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Elijah Newren via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc Kristofer Karlsson via GitGitGadget
5 siblings, 0 replies; 7+ messages in thread
From: Elijah Newren via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson, Elijah Newren
From: Elijah Newren <newren@gmail.com>
Add test cases to t6600-test-reach.sh that exercise edge cases in the
side-exhaustion optimization for paint_down_to_common():
- in_merge_bases_many:self: commit is both A and one of the X inputs
- get_merge_bases_many:duplicate-twos: duplicate entries in X list
- get_merge_bases_many:pending-stale: STALE transition on an
already-painted commit (ps-* diamond topology)
- get_merge_bases_many:infinity-both-sides: both tips outside the
commit-graph with non-monotonic dates (pi-* topology)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
t/t6600-test-reach.sh | 111 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 111 insertions(+)
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index b5b314e570..775c077c87 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -49,6 +49,62 @@ test_expect_success 'setup' '
git tag -a -m "$x-$i" tag-$x-$i commit-$x-$i || return 1
done
done &&
+
+ # Build a small side topology to exercise the (PARENT1|PARENT2) ->
+ # (PARENT1|PARENT2|STALE) transition in paint_down_to_common(); the
+ # 10x10 grid above does not exercise it because no merge-base candidate
+ # there is a descendant of another, so STALE never reaches a
+ # still-pending candidate.
+ #
+ # ps-X
+ # /|\
+ # / | \
+ # ps-Z ps-B ps-W
+ # | / \ |
+ # | / \ |
+ # |/ \|
+ # ps-T1 ps-T2
+ #
+ # where ps-T1=merge(ps-Z,ps-B), ps-T2=merge(ps-W,ps-B), so
+ # merge-base(ps-T1,ps-T2) = ps-B. During the walk, ps-X transitions
+ # to (PARENT1|PARENT2) via ps-Z and ps-W before ps-B is dequeued;
+ # then the STALE-walk from ps-B transitions ps-X to
+ # (PARENT1|PARENT2|STALE).
+ git checkout --orphan ps-orphan &&
+ test_commit ps-X &&
+ git checkout -b ps-B-br ps-X && test_commit ps-B &&
+ git checkout -b ps-Z-br ps-X && test_commit ps-Z &&
+ git checkout -b ps-W-br ps-X && test_commit ps-W &&
+ git checkout -b ps-T1 ps-Z &&
+ git merge --no-ff -m ps-T1 ps-B &&
+ git checkout -b ps-T2 ps-W &&
+ git merge --no-ff -m ps-T2 ps-B &&
+
+ # Build a side topology that lives entirely outside the half
+ # commit-graph and has non-monotonic commit dates, to exercise the
+ # INFINITY-gate in paint_down_to_common. With both tips outside
+ # the graph, generation is INFINITY and the queue falls back to
+ # commit-date order, which here is non-monotonic.
+ #
+ # pi-X (date 500, PARENT1 tip) --> pi-P, pi-D
+ # pi-D (date 480) --> pi-C
+ # pi-C (date 200) --> pi-B
+ # pi-B (date 100, PARENT2 tip) --> pi-P
+ # pi-P (date 450, root)
+ #
+ # merge-base(pi-X, pi-B) = pi-B (it is an ancestor of pi-X and is
+ # itself one of the queried tips).
+ git checkout --orphan pi-orphan &&
+ test_commit --date "@450 +0000" pi-P &&
+ test_commit --date "@100 +0000" pi-B &&
+ test_commit --date "@200 +0000" pi-C &&
+ test_commit --date "@480 +0000" pi-D &&
+ GIT_AUTHOR_DATE="@500 +0000" GIT_COMMITTER_DATE="@500 +0000" \
+ git commit-tree -p pi-D -p pi-P -m pi-X pi-D^{tree} >pi-X-oid &&
+ pi_x="$(cat pi-X-oid)" &&
+ git branch -f pi-X-br "$pi_x" &&
+ git tag pi-X "$pi_x" &&
+
git commit-graph write --reachable &&
mv .git/objects/info/commit-graph commit-graph-full &&
chmod u+w commit-graph-full &&
@@ -146,6 +202,16 @@ test_expect_success 'in_merge_bases_many:miss-heuristic' '
test_all_modes in_merge_bases_many
'
+test_expect_success 'in_merge_bases_many:self' '
+ cat >input <<-\EOF &&
+ A:commit-6-8
+ X:commit-5-9
+ X:commit-6-8
+ EOF
+ echo "in_merge_bases_many(A,X):1" >expect &&
+ test_all_modes in_merge_bases_many
+'
+
test_expect_success 'is_descendant_of:hit' '
cat >input <<-\EOF &&
A:commit-5-7
@@ -183,6 +249,51 @@ test_expect_success 'get_merge_bases_many' '
test_all_modes get_merge_bases_many
'
+test_expect_success 'get_merge_bases_many:duplicate-twos' '
+ cat >input <<-\EOF &&
+ A:commit-5-7
+ X:commit-4-8
+ X:commit-4-8
+ X:commit-6-6
+ X:commit-6-6
+ X:commit-8-3
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse commit-5-6 \
+ commit-4-7 | sort
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
+test_expect_success 'get_merge_bases_many:pending-stale' '
+ # Exercises the (PARENT1|PARENT2) -> (...|STALE) transition path in
+ # paint_down_to_common(). See the topology comment in the setup test.
+ cat >input <<-\EOF &&
+ A:ps-T1
+ X:ps-T2
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse ps-B
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
+test_expect_success 'get_merge_bases_many:infinity-both-sides' '
+ # Exercises the push-time INFINITY-gate in paint_down_to_common(). See
+ # the pi-* topology comment in the setup test.
+ cat >input <<-\EOF &&
+ A:pi-X
+ X:pi-B
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse pi-B
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
test_expect_success 'reduce_heads' '
cat >input <<-\EOF &&
X:commit-1-10
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
` (3 preceding siblings ...)
2026-06-20 10:36 ` [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases Elijah Newren via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc Kristofer Karlsson via GitGitGadget
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Add t6099 to test the case where multiple merge-base candidates exist
and one is an ancestor of another. This exercises the side-exhaustion
optimization in paint_down_to_common together with the
remove_redundant safety net in get_merge_bases_many_0.
Add a mixed finite/INFINITY test to t6600 where one tip is outside
the commit-graph (INFINITY generation) and the other is inside.
This exercises the region transition: the walk starts in the
INFINITY region where side-exhaustion is disabled, then crosses
into the finite region where it can fire.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
t/meson.build | 1 +
t/t6099-merge-base-side-exhaustion.sh | 82 +++++++++++++++++++++++++++
t/t6600-test-reach.sh | 25 ++++++++
3 files changed, 108 insertions(+)
create mode 100755 t/t6099-merge-base-side-exhaustion.sh
diff --git a/t/meson.build b/t/meson.build
index 3219264fe7..ee6ebdffb9 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -786,6 +786,7 @@ integration_tests = [
't6041-bisect-submodule.sh',
't6050-replace.sh',
't6060-merge-index.sh',
+ 't6099-merge-base-side-exhaustion.sh',
't6100-rev-list-in-order.sh',
't6101-rev-parse-parents.sh',
't6102-rev-list-unexpected-objects.sh',
diff --git a/t/t6099-merge-base-side-exhaustion.sh b/t/t6099-merge-base-side-exhaustion.sh
new file mode 100755
index 0000000000..bae3ea7f83
--- /dev/null
+++ b/t/t6099-merge-base-side-exhaustion.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+test_description='merge-base with ancestor among merge-base candidates
+
+Test that merge-base --all correctly handles cases where
+multiple merge-base candidates exist and one is an ancestor
+of another. The side-exhaustion optimization in
+paint_down_to_common may exit before STALE propagation
+removes the ancestor, but remove_redundant catches it.
+
+Graph shape (parents are below children):
+
+ A ----------- X
+ |\ /|
+ | B---------/ |
+ | | |
+ e2 \ f2
+ | | |
+ e1 d1 f1
+ \ | /
+ \ | /
+ \| /
+ C
+
+A and X are the two tips.
+B and C are both reachable from A and X.
+B reaches C through d1.
+Only B should appear in merge-base --all output.
+'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'setup ancestor merge-base candidate' '
+ test_commit C &&
+
+ git checkout -b d-chain HEAD &&
+ test_commit d1 &&
+ test_commit B &&
+
+ git checkout -b e-path C &&
+ test_commit e1 &&
+ test_commit e2 &&
+
+ git checkout -b f-path C &&
+ test_commit f1 &&
+ test_commit f2 &&
+
+ git checkout -b branch-A e-path &&
+ test_merge A B &&
+
+ git checkout -b branch-X f-path &&
+ test_merge X B &&
+
+ git commit-graph write --reachable
+'
+
+test_expect_success 'merge-base --all excludes ancestor candidate' '
+ git rev-parse B >expected &&
+ git merge-base --all A X >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'merge-base (single) finds shallowest' '
+ git rev-parse B >expected &&
+ git merge-base A X >actual &&
+ test_cmp expected actual
+'
+
+# Without commit-graph: generation numbers are INFINITY,
+# side-exhaustion optimization does not fire.
+test_expect_success 'merge-base --all without commit-graph' '
+ rm -f .git/objects/info/commit-graph &&
+ git rev-parse B >expected &&
+ git merge-base --all A X >actual &&
+ test_cmp expected actual
+'
+
+test_done
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 775c077c87..f5560b0c1c 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -294,6 +294,31 @@ test_expect_success 'get_merge_bases_many:infinity-both-sides' '
test_all_modes get_merge_bases_many
'
+test_expect_success 'setup mixed finite/INFINITY topology' '
+ # Create a commit outside all saved commit-graph files so it always
+ # has INFINITY generation, while its parent (ps-X) is in the graph
+ # with a finite generation. Use the ps-* orphan topology so we do
+ # not pollute the grid-based rev-list tests.
+ git checkout ps-X &&
+ test_env GIT_TEST_COMMIT_GRAPH= test_commit pm-INF
+'
+
+test_expect_success 'get_merge_bases_many:mixed-finite-infinity' '
+ # One tip (pm-INF) is outside the commit-graph with INFINITY
+ # generation; the other (ps-B) is in the graph with finite
+ # generation. The walk starts in the INFINITY region and crosses
+ # into the finite region where side-exhaustion can fire.
+ cat >input <<-\EOF &&
+ A:pm-INF
+ X:ps-B
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse ps-X
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
test_expect_success 'reduce_heads' '
cat >input <<-\EOF &&
X:commit-1-10
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
` (4 preceding siblings ...)
2026-06-20 10:36 ` [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Add a technical document describing the paint_down_to_common()
algorithm used for merge-base computation.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
Documentation/Makefile | 1 +
Documentation/technical/meson.build | 1 +
.../technical/paint-down-to-common.adoc | 130 ++++++++++++++++++
3 files changed, 132 insertions(+)
create mode 100644 Documentation/technical/paint-down-to-common.adoc
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2699f0b24a..f8dea4b395 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -129,6 +129,7 @@ TECH_DOCS += technical/long-running-process-protocol
TECH_DOCS += technical/multi-pack-index
TECH_DOCS += technical/packfile-uri
TECH_DOCS += technical/pack-heuristics
+TECH_DOCS += technical/paint-down-to-common
TECH_DOCS += technical/parallel-checkout
TECH_DOCS += technical/partial-clone
TECH_DOCS += technical/platform-support
diff --git a/Documentation/technical/meson.build b/Documentation/technical/meson.build
index ec07088c57..9ce11d5e48 100644
--- a/Documentation/technical/meson.build
+++ b/Documentation/technical/meson.build
@@ -18,6 +18,7 @@ articles = [
'multi-pack-index.adoc',
'packfile-uri.adoc',
'pack-heuristics.adoc',
+ 'paint-down-to-common.adoc',
'parallel-checkout.adoc',
'partial-clone.adoc',
'platform-support.adoc',
diff --git a/Documentation/technical/paint-down-to-common.adoc b/Documentation/technical/paint-down-to-common.adoc
new file mode 100644
index 0000000000..e677cce84d
--- /dev/null
+++ b/Documentation/technical/paint-down-to-common.adoc
@@ -0,0 +1,130 @@
+Merge-Base Computation and paint_down_to_common()
+==================================================
+
+The function `paint_down_to_common()` in `commit-reach.c` computes merge
+bases by walking the commit graph backwards from two sets of tips and
+finding where their ancestry meets.
+
+Use cases
+---------
+
+Computing merge bases is used in two different ways:
+
+ 1. *Finding all merge bases* (`merge-base --all`, `merge-tree`,
+ `merge`, `rebase`). A merge base is a common ancestor that is
+ not itself an ancestor of another common ancestor.
+
+ 2. *Ancestry checks* (`in_merge_bases`, used by `merge-base
+ --is-ancestor`, `branch -d`, `fetch`). These ask: "is commit A
+ an ancestor of commit B?" If a common ancestor equals one of the
+ inputs, that input is necessarily the only merge base -- no other
+ common ancestor can be both as recent and not an ancestor of it.
+
+Both use cases share the same algorithm and implementation.
+
+Algorithm
+---------
+
+Given a commit `one` and a set of commits `twos[]`, the walk paints
+commits with two colors:
+
+ - PARENT1: reachable from `one`
+ - PARENT2: reachable from any commit in `twos[]`
+
+The walk uses a priority queue ordered by generation number (falling
+back to commit date when generation numbers are unavailable). Each
+step dequeues the highest-priority commit (this is when we say a
+commit is "visited") and propagates its paint flags to its parents,
+enqueuing them if they gained new flags. When a commit receives
+both PARENT1 and PARENT2, it is a merge-base candidate. A candidate
+gains the STALE flag so its ancestors propagate staleness -- any
+deeper common ancestor is necessarily redundant.
+
+INFINITY and finite generation regions
+--------------------------------------
+
+The commit-graph stores a generation number for each commit. Commits
+not in the commit-graph have generation `GENERATION_NUMBER_INFINITY`. The
+graph is closed under reachability: if a commit is in the graph, all
+its ancestors are too. This partitions the commit graph into two regions:
+
+....
+ +---------------------------------------+
+ | INFINITY region |
+ | generation = INFINITY |
+ | queue order: heuristic (commit date) |
+ +---------------------------------------+
+ |
+ v
+ +---------------------------------------+
+ | Finite region |
+ | generation = finite |
+ | queue order: topological |
+ +---------------------------------------+
+....
+
+When the commit-graph is enabled, the INFINITY region is typically
+very small -- it only contains commits added since the last
+commit-graph refresh.
+
+All reachable INFINITY-generation commits are visited before any
+finite-generation commit, because INFINITY is larger than any finite
+value. Once the walk crosses into the finite region, it stays there.
+
+In the finite region, generation ordering guarantees topological
+traversal: children are always visited before their parents. This
+means that paint on already-visited commits is final -- no future
+traversal step can add paint to them.
+
+In the INFINITY region, commit-date ordering can violate this: a
+parent with a later date can be visited before a child with an earlier
+date. Paint flags are therefore NOT final at visit time, and a
+commit visited with only one side's paint may later gain the other.
+
+Paint flags are only added, never removed. Since each flag can be set
+at most once per commit, the number of times a commit can be
+re-enqueued is bounded by the number of flag transitions.
+
+Termination
+-----------
+
+Termination happens when we can prove that no extra progress is
+possible. We are done with the main loop when one of the following
+conditions holds:
+
+ 1. The queue is empty.
+ 2. The queue only contains STALE entries.
+ 3. Side-exhaustion: the walk has reached the finite region and one
+ of the sides is fully exhausted.
+
+The loop waits for all pending merge-base candidates to be popped
+and recorded before any early exit fires, so no separate drain phase
+is needed after termination.
+
+Stale entry condition
+~~~~~~~~~~~~~~~~~~~~~
+If all entries are stale we cannot find any new merge bases since
+that requires at least one enqueued side node meeting the other side.
+However, we could still invalidate merge bases (if there are more
+than one). This is unnecessary since `remove_redundant()` will clean
+that up as a post-process step.
+
+Side-exhaustion
+~~~~~~~~~~~~~~~
+A commit is *exclusive* to one side if it carries that side's paint
+but not the other (e.g. PARENT1 without PARENT2).
+
+If we have reached the finite region of the graph, no future
+traversal step can add paint to an already-visited commit. Thus if
+there are no exclusive PARENT2 commits in the queue, no additional
+PARENT2 paint can be introduced into the walk. Even if exclusive
+PARENT1 commits remain, no new merge-base candidates can be
+discovered. The same holds symmetrically for PARENT1.
+
+This invariant is only valid in the finite region of the graph.
+
+Related documentation
+---------------------
+
+ - `Documentation/technical/commit-graph.adoc` -- generation numbers
+ and the reachability closure property.
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread