* [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted
@ 2026-06-20 10:36 Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue Kristofer Karlsson via GitGitGadget
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson
Hi,
This follows up on my RFC [1] with a concrete proposal. I expect the design
to still be scrutinized, but that may be easier with actual code to look at.
I tried to make this easier to review by splitting into atomic patches. The
first two patches are the meatiest parts, though they are pure refactoring.
The behavior change is in patch 3 and is in itself quite small. The last
patch adds technical documentation to support future development.
----------------------------------------------------------------------------
Optimize paint_down_to_common() for merge-base queries that hit large
one-sided histories.
When the walk from one side reaches a commit with a very low generation
number that the other side never paints, the walk is forced to drain most of
the graph. A common trigger is a repository import that grafts a separate
history with its own root, but any merge that introduces a low-generation
commit never painted by the other side has the same effect.
A new merge-base candidate can only be discovered when exclusive PARENT1 and
PARENT2 paint meet. This series teaches paint_down_to_common() to stop as
soon as one side has no exclusive commits left in the queue; once one side
is exhausted, no further candidates can appear.
origin/HEAD o o PR HEAD
| |
(import) o :
/ \ /
| o merge-base
| |
: : (~2.5M commits)
| |
import root main root
In the RFC thread [1], Derrick Stolee provided a criss-cross counterexample
that sharpened the halt condition, and Elijah Newren independently
discovered the same optimization and shared an implementation in PR #2150
[2]. Patch 4 incorporates test cases from Elijah's branch.
This series implements the optimization only after the walk enters the
finite-generation region, where generation ordering guarantees that paint on
visited commits is final.
Patch layout:
1/6 commit-reach: decouple ahead_behind from nonstale_queue 2/6
commit-reach: introduce paint_queue and per-side counters 3/6 commit-reach:
stop the walk when one side is exhausted 4/6 t6600: add side-exhaustion
edge-case tests 5/6 t6099, t6600: add side-exhaustion regression tests 6/6
Documentation/technical: document paint_down_to_common()
Benchmarks
Measured on a 2.6M-commit monorepo with commit-graph (baseline v2.55-rc1):
merge-base --all (across import) 4.293s -> 8ms (537x)
merge-tree (across import) 5.345s -> 13ms (411x)
merge-base --all (1000 commits apart) 5.404s -> 7ms (772x)
No regression on linux.git (1.4M commits, commit-graph):
merge-base HEAD HEAD~1000 38ms -> 40ms
merge-base --all HEAD HEAD~1000 87ms -> 36ms
merge-base --is-ancestor HEAD~1000 HEAD 11ms -> 11ms
merge-base --all HEAD HEAD~10000 626ms -> 428ms
[1]
https://lore.kernel.org/git/CAL71e4Ps-2_0+uuZu43N9pFnXBemoAohPs_eyRJf8taXHJPAXQ@mail.gmail.com/T/#u
[2] https://github.com/gitgitgadget/git/pull/2150
Elijah Newren (1):
t6600: add test cases for side-exhaustion edge cases
Kristofer Karlsson (5):
commit-reach: decouple ahead_behind from nonstale_queue
commit-reach: introduce struct paint_queue with per-side counters
commit-reach: terminate merge-base walk when one paint side is
exhausted
t6099, t6600: add side-exhaustion regression tests
Documentation/technical: add paint-down-to-common doc
Documentation/Makefile | 1 +
Documentation/technical/meson.build | 1 +
.../technical/paint-down-to-common.adoc | 130 ++++++++++++++
commit-reach.c | 159 +++++++++++-------
t/meson.build | 1 +
t/t6099-merge-base-side-exhaustion.sh | 82 +++++++++
t/t6600-test-reach.sh | 136 +++++++++++++++
7 files changed, 451 insertions(+), 59 deletions(-)
create mode 100644 Documentation/technical/paint-down-to-common.adoc
create mode 100755 t/t6099-merge-base-side-exhaustion.sh
base-commit: 8d96f09e9245ddf80c1981476fcbac8c4bb4125f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2149%2Fspkrka%2Fside-exhaust-pr-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2149/spkrka/side-exhaust-pr-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2149
--
gitgitgadget
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters Kristofer Karlsson via GitGitGadget
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Move ahead_behind() off the shared nonstale_queue abstraction to use
a plain prio_queue with a local max_nonstale pointer. The nonstale
tracking is inlined into insert_no_dup().
This prepares for replacing nonstale_queue with a paint_queue struct
that tracks per-side commit counts, which ahead_behind() does not
need. No behavior change.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/commit-reach.c b/commit-reach.c
index 5df471a313..377a5cc42a 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1089,12 +1089,18 @@ struct commit_list *get_reachable_subset(struct commit **from, size_t nr_from,
define_commit_slab(bit_arrays, struct bitmap *);
static struct bit_arrays bit_arrays;
-static void insert_no_dup(struct nonstale_queue *queue, struct commit *c)
+static void insert_no_dup(struct prio_queue *queue,
+ struct commit **max_nonstale,
+ struct commit *c)
{
if (c->object.flags & PARENT2)
return;
- nonstale_queue_put(queue, c);
c->object.flags |= PARENT2;
+ prio_queue_put(queue, c);
+ if (!(c->object.flags & STALE) &&
+ (!*max_nonstale ||
+ queue->compare(*max_nonstale, c, queue->cb_data) <= 0))
+ *max_nonstale = c;
}
static struct bitmap *get_bit_array(struct commit *c, int width)
@@ -1118,9 +1124,10 @@ void ahead_behind(struct repository *r,
struct commit **commits, size_t commits_nr,
struct ahead_behind_count *counts, size_t counts_nr)
{
- struct nonstale_queue queue = {
- { .compare = compare_commits_by_gen_then_commit_date }
+ struct prio_queue queue = {
+ .compare = compare_commits_by_gen_then_commit_date
};
+ struct commit *max_nonstale = NULL;
size_t width = DIV_ROUND_UP(commits_nr, BITS_IN_EWORD);
if (!commits_nr || !counts_nr)
@@ -1140,14 +1147,17 @@ void ahead_behind(struct repository *r,
struct bitmap *bitmap = get_bit_array(c, width);
bitmap_set(bitmap, i);
- insert_no_dup(&queue, c);
+ insert_no_dup(&queue, &max_nonstale, c);
}
- while (queue.max_nonstale) {
- struct commit *c = nonstale_queue_get(&queue);
+ while (max_nonstale) {
+ struct commit *c = prio_queue_get(&queue);
struct commit_list *p;
struct bitmap *bitmap_c = get_bit_array(c, width);
+ if (c == max_nonstale)
+ max_nonstale = NULL;
+
for (size_t i = 0; i < counts_nr; i++) {
int reach_from_tip = !!bitmap_get(bitmap_c, counts[i].tip_index);
int reach_from_base = !!bitmap_get(bitmap_c, counts[i].base_index);
@@ -1178,7 +1188,7 @@ void ahead_behind(struct repository *r,
if (bitmap_popcount(bitmap_p) == commits_nr)
p->item->object.flags |= STALE;
- insert_no_dup(&queue, p->item);
+ insert_no_dup(&queue, &max_nonstale, p->item);
}
free_bit_array(c);
@@ -1186,10 +1196,10 @@ void ahead_behind(struct repository *r,
/* STALE is used here, PARENT2 is used by insert_no_dup(). */
repo_clear_commit_marks(r, PARENT2 | STALE);
- for (size_t i = 0; i < queue.pq.nr; i++)
- free_bit_array(queue.pq.array[i].data);
+ for (size_t i = 0; i < queue.nr; i++)
+ free_bit_array(queue.array[i].data);
clear_bit_arrays(&bit_arrays);
- clear_nonstale_queue(&queue);
+ clear_prio_queue(&queue);
}
struct commit_and_index {
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted Kristofer Karlsson via GitGitGadget
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Replace the nonstale_queue abstraction in paint_down_to_common() with
a new paint_queue struct that tracks per-side commit counts. Each
non-stale queued commit occupies exactly one counter bucket based on
its paint flags: PARENT1-only, PARENT2-only, or both sides (a pending
merge-base candidate).
The counters are maintained by paint_count_transition() which handles
all flag changes as bucket transfers: remove from the old bucket, add
to the new one. Either step is a no-op when the respective state has
no bucket (stale or zero).
The loop now drains the queue via paint_queue_get() and breaks when
all counters reach zero, replacing the old pointer-based termination
(max_nonstale). This is equivalent behavior.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach.c | 114 ++++++++++++++++++++++++++++---------------------
1 file changed, 66 insertions(+), 48 deletions(-)
diff --git a/commit-reach.c b/commit-reach.c
index 377a5cc42a..ba1e896f0f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -41,58 +41,75 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
}
/*
- * A prio_queue with O(1) termination check. 'max_nonstale' tracks
- * the lowest-priority non-stale commit enqueued so far; once it is
- * popped, every remaining entry is known to be STALE.
+ * Priority queue with per-side commit counters for paint_down_to_common().
+ * Each non-stale queued commit occupies exactly one bucket: PARENT1-only,
+ * PARENT2-only, or both (a pending merge-base candidate).
*/
-struct nonstale_queue {
+struct paint_queue {
struct prio_queue pq;
- struct commit *max_nonstale;
+ int p1_count;
+ int p2_count;
+ int pending_merge_bases;
};
-static void nonstale_queue_put(struct nonstale_queue *queue,
- struct commit *c)
+/*
+ * Adjust per-side counters for a paint-state transition. Non-stale
+ * commits are counted in one of three counters: PARENT1-only,
+ * PARENT2-only, or both. Zero means "not in the queue" (used on
+ * enqueue/dequeue); stale commits are not counted at all.
+ */
+static void paint_count_transition(struct paint_queue *queue,
+ unsigned old_flags, unsigned new_flags)
{
- struct commit *old = queue->max_nonstale;
+ unsigned old_paint = old_flags & (PARENT1 | PARENT2 | STALE);
+ unsigned new_paint = new_flags & (PARENT1 | PARENT2 | STALE);
- prio_queue_put(&queue->pq, c);
- if (c->object.flags & STALE)
+ if (old_paint == new_paint)
return;
- if (!old || queue->pq.compare(old, c, queue->pq.cb_data) <= 0)
- queue->max_nonstale = c;
-}
-
-static struct commit *nonstale_queue_get(struct nonstale_queue *queue)
-{
- struct commit *commit = prio_queue_get(&queue->pq);
- if (commit == queue->max_nonstale)
- queue->max_nonstale = NULL;
-
- return commit;
+ if (!(old_paint & STALE)) {
+ switch (old_paint & (PARENT1 | PARENT2)) {
+ case 0: break;
+ case PARENT1: queue->p1_count--; break;
+ case PARENT2: queue->p2_count--; break;
+ case PARENT1 | PARENT2: queue->pending_merge_bases--; break;
+ default: BUG("unexpected paint state");
+ }
+ }
+ if (!(new_paint & STALE)) {
+ switch (new_paint & (PARENT1 | PARENT2)) {
+ case 0: break;
+ case PARENT1: queue->p1_count++; break;
+ case PARENT2: queue->p2_count++; break;
+ case PARENT1 | PARENT2: queue->pending_merge_bases++; break;
+ default: BUG("unexpected paint state");
+ }
+ }
}
-static void clear_nonstale_queue(struct nonstale_queue *queue)
+static void paint_queue_put(struct paint_queue *queue,
+ struct commit *c, unsigned add_flags)
{
- clear_prio_queue(&queue->pq);
- queue->max_nonstale = NULL;
-}
+ unsigned old_flags = c->object.flags;
+ c->object.flags |= add_flags;
-static void nonstale_queue_put_dedup(struct nonstale_queue *queue,
- struct commit *c)
-{
- if (c->object.flags & ENQUEUED)
- return;
- c->object.flags |= ENQUEUED;
- nonstale_queue_put(queue, c);
+ if (old_flags & ENQUEUED) {
+ paint_count_transition(queue, old_flags, c->object.flags);
+ } else {
+ c->object.flags |= ENQUEUED;
+ prio_queue_put(&queue->pq, c);
+ paint_count_transition(queue, 0, c->object.flags);
+ }
}
-static struct commit *nonstale_queue_get_dedup(struct nonstale_queue *queue)
+static struct commit *paint_queue_get(struct paint_queue *queue)
{
- struct commit *commit = nonstale_queue_get(queue);
+ struct commit *commit = prio_queue_get(&queue->pq);
- if (commit)
+ if (commit) {
commit->object.flags &= ~ENQUEUED;
+ paint_count_transition(queue, commit->object.flags, 0);
+ }
return commit;
}
@@ -104,9 +121,10 @@ static int paint_down_to_common(struct repository *r,
enum merge_base_flags mb_flags,
struct commit_list **result)
{
- struct nonstale_queue queue = {
- { compare_commits_by_gen_then_commit_date }
+ struct paint_queue queue = {
+ .pq = { compare_commits_by_gen_then_commit_date }
};
+ struct commit *commit;
int i;
timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
struct commit_list **tail = result;
@@ -119,15 +137,12 @@ static int paint_down_to_common(struct repository *r,
commit_list_append(one, result);
return 0;
}
- nonstale_queue_put_dedup(&queue, one);
+ paint_queue_put(&queue, one, 0);
- for (i = 0; i < n; i++) {
- twos[i]->object.flags |= PARENT2;
- nonstale_queue_put_dedup(&queue, twos[i]);
- }
+ for (i = 0; i < n; i++)
+ paint_queue_put(&queue, twos[i], PARENT2);
- while (queue.max_nonstale) {
- struct commit *commit = nonstale_queue_get_dedup(&queue);
+ while ((commit = paint_queue_get(&queue))) {
struct commit_list *parents;
int flags;
timestamp_t generation = commit_graph_generation(commit);
@@ -165,7 +180,7 @@ static int paint_down_to_common(struct repository *r,
if ((p->object.flags & flags) == flags)
continue;
if (repo_parse_commit(r, p)) {
- clear_nonstale_queue(&queue);
+ clear_prio_queue(&queue.pq);
commit_list_free(*result);
*result = NULL;
/*
@@ -180,12 +195,15 @@ static int paint_down_to_common(struct repository *r,
return error(_("could not parse commit %s"),
oid_to_hex(&p->object.oid));
}
- p->object.flags |= flags;
- nonstale_queue_put_dedup(&queue, p);
+ paint_queue_put(&queue, p, flags);
}
+
+ if (queue.p1_count + queue.p2_count +
+ queue.pending_merge_bases == 0)
+ break;
}
- clear_nonstale_queue(&queue);
+ clear_prio_queue(&queue.pq);
commit_list_sort_by_date(result);
return 0;
}
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases Elijah Newren via GitGitGadget
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Add an early termination check to paint_down_to_common() using the
per-side counters introduced in the previous commit. Once the walk
enters the finite-generation region, terminate early when one side's
exclusive count drops to zero -- no new merge-base can form without
both paint sides meeting.
The check also waits for pending_merge_bases to reach zero, ensuring
all merge-base candidates have been popped and recorded before
exiting.
The INFINITY gate ensures correctness: commits without a commit-graph
entry have GENERATION_NUMBER_INFINITY and are ordered by commit date,
which is not topologically reliable. The optimization only fires
once the walk enters the finite-generation region where ordering
guarantees hold.
On large repositories with commit-graph, this yields 100-1000x
speedups for merge-base queries where one side (e.g. a PR branch) is
much smaller than the other.
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/commit-reach.c b/commit-reach.c
index ba1e896f0f..fcd1ad0167 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -201,6 +201,19 @@ static int paint_down_to_common(struct repository *r,
if (queue.p1_count + queue.p2_count +
queue.pending_merge_bases == 0)
break;
+
+ /*
+ * Side exhaustion: a new merge-base can only form
+ * when both PARENT1-only and PARENT2-only commits
+ * remain in the queue. In the finite-generation
+ * region the queue is ordered topologically, so
+ * no future step can add paint to visited commits
+ * and an exhausted side cannot reappear.
+ */
+ if (generation < GENERATION_NUMBER_INFINITY &&
+ queue.pending_merge_bases == 0 &&
+ (queue.p1_count == 0 || queue.p2_count == 0))
+ break;
}
clear_prio_queue(&queue.pq);
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
` (2 preceding siblings ...)
2026-06-20 10:36 ` [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Elijah Newren via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc Kristofer Karlsson via GitGitGadget
5 siblings, 0 replies; 7+ messages in thread
From: Elijah Newren via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson, Elijah Newren
From: Elijah Newren <newren@gmail.com>
Add test cases to t6600-test-reach.sh that exercise edge cases in the
side-exhaustion optimization for paint_down_to_common():
- in_merge_bases_many:self: commit is both A and one of the X inputs
- get_merge_bases_many:duplicate-twos: duplicate entries in X list
- get_merge_bases_many:pending-stale: STALE transition on an
already-painted commit (ps-* diamond topology)
- get_merge_bases_many:infinity-both-sides: both tips outside the
commit-graph with non-monotonic dates (pi-* topology)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
t/t6600-test-reach.sh | 111 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 111 insertions(+)
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index b5b314e570..775c077c87 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -49,6 +49,62 @@ test_expect_success 'setup' '
git tag -a -m "$x-$i" tag-$x-$i commit-$x-$i || return 1
done
done &&
+
+ # Build a small side topology to exercise the (PARENT1|PARENT2) ->
+ # (PARENT1|PARENT2|STALE) transition in paint_down_to_common(); the
+ # 10x10 grid above does not exercise it because no merge-base candidate
+ # there is a descendant of another, so STALE never reaches a
+ # still-pending candidate.
+ #
+ # ps-X
+ # /|\
+ # / | \
+ # ps-Z ps-B ps-W
+ # | / \ |
+ # | / \ |
+ # |/ \|
+ # ps-T1 ps-T2
+ #
+ # where ps-T1=merge(ps-Z,ps-B), ps-T2=merge(ps-W,ps-B), so
+ # merge-base(ps-T1,ps-T2) = ps-B. During the walk, ps-X transitions
+ # to (PARENT1|PARENT2) via ps-Z and ps-W before ps-B is dequeued;
+ # then the STALE-walk from ps-B transitions ps-X to
+ # (PARENT1|PARENT2|STALE).
+ git checkout --orphan ps-orphan &&
+ test_commit ps-X &&
+ git checkout -b ps-B-br ps-X && test_commit ps-B &&
+ git checkout -b ps-Z-br ps-X && test_commit ps-Z &&
+ git checkout -b ps-W-br ps-X && test_commit ps-W &&
+ git checkout -b ps-T1 ps-Z &&
+ git merge --no-ff -m ps-T1 ps-B &&
+ git checkout -b ps-T2 ps-W &&
+ git merge --no-ff -m ps-T2 ps-B &&
+
+ # Build a side topology that lives entirely outside the half
+ # commit-graph and has non-monotonic commit dates, to exercise the
+ # INFINITY-gate in paint_down_to_common. With both tips outside
+ # the graph, generation is INFINITY and the queue falls back to
+ # commit-date order, which here is non-monotonic.
+ #
+ # pi-X (date 500, PARENT1 tip) --> pi-P, pi-D
+ # pi-D (date 480) --> pi-C
+ # pi-C (date 200) --> pi-B
+ # pi-B (date 100, PARENT2 tip) --> pi-P
+ # pi-P (date 450, root)
+ #
+ # merge-base(pi-X, pi-B) = pi-B (it is an ancestor of pi-X and is
+ # itself one of the queried tips).
+ git checkout --orphan pi-orphan &&
+ test_commit --date "@450 +0000" pi-P &&
+ test_commit --date "@100 +0000" pi-B &&
+ test_commit --date "@200 +0000" pi-C &&
+ test_commit --date "@480 +0000" pi-D &&
+ GIT_AUTHOR_DATE="@500 +0000" GIT_COMMITTER_DATE="@500 +0000" \
+ git commit-tree -p pi-D -p pi-P -m pi-X pi-D^{tree} >pi-X-oid &&
+ pi_x="$(cat pi-X-oid)" &&
+ git branch -f pi-X-br "$pi_x" &&
+ git tag pi-X "$pi_x" &&
+
git commit-graph write --reachable &&
mv .git/objects/info/commit-graph commit-graph-full &&
chmod u+w commit-graph-full &&
@@ -146,6 +202,16 @@ test_expect_success 'in_merge_bases_many:miss-heuristic' '
test_all_modes in_merge_bases_many
'
+test_expect_success 'in_merge_bases_many:self' '
+ cat >input <<-\EOF &&
+ A:commit-6-8
+ X:commit-5-9
+ X:commit-6-8
+ EOF
+ echo "in_merge_bases_many(A,X):1" >expect &&
+ test_all_modes in_merge_bases_many
+'
+
test_expect_success 'is_descendant_of:hit' '
cat >input <<-\EOF &&
A:commit-5-7
@@ -183,6 +249,51 @@ test_expect_success 'get_merge_bases_many' '
test_all_modes get_merge_bases_many
'
+test_expect_success 'get_merge_bases_many:duplicate-twos' '
+ cat >input <<-\EOF &&
+ A:commit-5-7
+ X:commit-4-8
+ X:commit-4-8
+ X:commit-6-6
+ X:commit-6-6
+ X:commit-8-3
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse commit-5-6 \
+ commit-4-7 | sort
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
+test_expect_success 'get_merge_bases_many:pending-stale' '
+ # Exercises the (PARENT1|PARENT2) -> (...|STALE) transition path in
+ # paint_down_to_common(). See the topology comment in the setup test.
+ cat >input <<-\EOF &&
+ A:ps-T1
+ X:ps-T2
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse ps-B
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
+test_expect_success 'get_merge_bases_many:infinity-both-sides' '
+ # Exercises the push-time INFINITY-gate in paint_down_to_common(). See
+ # the pi-* topology comment in the setup test.
+ cat >input <<-\EOF &&
+ A:pi-X
+ X:pi-B
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse pi-B
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
test_expect_success 'reduce_heads' '
cat >input <<-\EOF &&
X:commit-1-10
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
` (3 preceding siblings ...)
2026-06-20 10:36 ` [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases Elijah Newren via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc Kristofer Karlsson via GitGitGadget
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Add t6099 to test the case where multiple merge-base candidates exist
and one is an ancestor of another. This exercises the side-exhaustion
optimization in paint_down_to_common together with the
remove_redundant safety net in get_merge_bases_many_0.
Add a mixed finite/INFINITY test to t6600 where one tip is outside
the commit-graph (INFINITY generation) and the other is inside.
This exercises the region transition: the walk starts in the
INFINITY region where side-exhaustion is disabled, then crosses
into the finite region where it can fire.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
t/meson.build | 1 +
t/t6099-merge-base-side-exhaustion.sh | 82 +++++++++++++++++++++++++++
t/t6600-test-reach.sh | 25 ++++++++
3 files changed, 108 insertions(+)
create mode 100755 t/t6099-merge-base-side-exhaustion.sh
diff --git a/t/meson.build b/t/meson.build
index 3219264fe7..ee6ebdffb9 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -786,6 +786,7 @@ integration_tests = [
't6041-bisect-submodule.sh',
't6050-replace.sh',
't6060-merge-index.sh',
+ 't6099-merge-base-side-exhaustion.sh',
't6100-rev-list-in-order.sh',
't6101-rev-parse-parents.sh',
't6102-rev-list-unexpected-objects.sh',
diff --git a/t/t6099-merge-base-side-exhaustion.sh b/t/t6099-merge-base-side-exhaustion.sh
new file mode 100755
index 0000000000..bae3ea7f83
--- /dev/null
+++ b/t/t6099-merge-base-side-exhaustion.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+test_description='merge-base with ancestor among merge-base candidates
+
+Test that merge-base --all correctly handles cases where
+multiple merge-base candidates exist and one is an ancestor
+of another. The side-exhaustion optimization in
+paint_down_to_common may exit before STALE propagation
+removes the ancestor, but remove_redundant catches it.
+
+Graph shape (parents are below children):
+
+ A ----------- X
+ |\ /|
+ | B---------/ |
+ | | |
+ e2 \ f2
+ | | |
+ e1 d1 f1
+ \ | /
+ \ | /
+ \| /
+ C
+
+A and X are the two tips.
+B and C are both reachable from A and X.
+B reaches C through d1.
+Only B should appear in merge-base --all output.
+'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'setup ancestor merge-base candidate' '
+ test_commit C &&
+
+ git checkout -b d-chain HEAD &&
+ test_commit d1 &&
+ test_commit B &&
+
+ git checkout -b e-path C &&
+ test_commit e1 &&
+ test_commit e2 &&
+
+ git checkout -b f-path C &&
+ test_commit f1 &&
+ test_commit f2 &&
+
+ git checkout -b branch-A e-path &&
+ test_merge A B &&
+
+ git checkout -b branch-X f-path &&
+ test_merge X B &&
+
+ git commit-graph write --reachable
+'
+
+test_expect_success 'merge-base --all excludes ancestor candidate' '
+ git rev-parse B >expected &&
+ git merge-base --all A X >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success 'merge-base (single) finds shallowest' '
+ git rev-parse B >expected &&
+ git merge-base A X >actual &&
+ test_cmp expected actual
+'
+
+# Without commit-graph: generation numbers are INFINITY,
+# side-exhaustion optimization does not fire.
+test_expect_success 'merge-base --all without commit-graph' '
+ rm -f .git/objects/info/commit-graph &&
+ git rev-parse B >expected &&
+ git merge-base --all A X >actual &&
+ test_cmp expected actual
+'
+
+test_done
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 775c077c87..f5560b0c1c 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -294,6 +294,31 @@ test_expect_success 'get_merge_bases_many:infinity-both-sides' '
test_all_modes get_merge_bases_many
'
+test_expect_success 'setup mixed finite/INFINITY topology' '
+ # Create a commit outside all saved commit-graph files so it always
+ # has INFINITY generation, while its parent (ps-X) is in the graph
+ # with a finite generation. Use the ps-* orphan topology so we do
+ # not pollute the grid-based rev-list tests.
+ git checkout ps-X &&
+ test_env GIT_TEST_COMMIT_GRAPH= test_commit pm-INF
+'
+
+test_expect_success 'get_merge_bases_many:mixed-finite-infinity' '
+ # One tip (pm-INF) is outside the commit-graph with INFINITY
+ # generation; the other (ps-B) is in the graph with finite
+ # generation. The walk starts in the INFINITY region and crosses
+ # into the finite region where side-exhaustion can fire.
+ cat >input <<-\EOF &&
+ A:pm-INF
+ X:ps-B
+ EOF
+ {
+ echo "get_merge_bases_many(A,X):" &&
+ git rev-parse ps-X
+ } >expect &&
+ test_all_modes get_merge_bases_many
+'
+
test_expect_success 'reduce_heads' '
cat >input <<-\EOF &&
X:commit-1-10
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
` (4 preceding siblings ...)
2026-06-20 10:36 ` [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests Kristofer Karlsson via GitGitGadget
@ 2026-06-20 10:36 ` Kristofer Karlsson via GitGitGadget
5 siblings, 0 replies; 7+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-06-20 10:36 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Elijah Newren, Kristofer Karlsson,
Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Add a technical document describing the paint_down_to_common()
algorithm used for merge-base computation.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
Documentation/Makefile | 1 +
Documentation/technical/meson.build | 1 +
.../technical/paint-down-to-common.adoc | 130 ++++++++++++++++++
3 files changed, 132 insertions(+)
create mode 100644 Documentation/technical/paint-down-to-common.adoc
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2699f0b24a..f8dea4b395 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -129,6 +129,7 @@ TECH_DOCS += technical/long-running-process-protocol
TECH_DOCS += technical/multi-pack-index
TECH_DOCS += technical/packfile-uri
TECH_DOCS += technical/pack-heuristics
+TECH_DOCS += technical/paint-down-to-common
TECH_DOCS += technical/parallel-checkout
TECH_DOCS += technical/partial-clone
TECH_DOCS += technical/platform-support
diff --git a/Documentation/technical/meson.build b/Documentation/technical/meson.build
index ec07088c57..9ce11d5e48 100644
--- a/Documentation/technical/meson.build
+++ b/Documentation/technical/meson.build
@@ -18,6 +18,7 @@ articles = [
'multi-pack-index.adoc',
'packfile-uri.adoc',
'pack-heuristics.adoc',
+ 'paint-down-to-common.adoc',
'parallel-checkout.adoc',
'partial-clone.adoc',
'platform-support.adoc',
diff --git a/Documentation/technical/paint-down-to-common.adoc b/Documentation/technical/paint-down-to-common.adoc
new file mode 100644
index 0000000000..e677cce84d
--- /dev/null
+++ b/Documentation/technical/paint-down-to-common.adoc
@@ -0,0 +1,130 @@
+Merge-Base Computation and paint_down_to_common()
+==================================================
+
+The function `paint_down_to_common()` in `commit-reach.c` computes merge
+bases by walking the commit graph backwards from two sets of tips and
+finding where their ancestry meets.
+
+Use cases
+---------
+
+Computing merge bases is used in two different ways:
+
+ 1. *Finding all merge bases* (`merge-base --all`, `merge-tree`,
+ `merge`, `rebase`). A merge base is a common ancestor that is
+ not itself an ancestor of another common ancestor.
+
+ 2. *Ancestry checks* (`in_merge_bases`, used by `merge-base
+ --is-ancestor`, `branch -d`, `fetch`). These ask: "is commit A
+ an ancestor of commit B?" If a common ancestor equals one of the
+ inputs, that input is necessarily the only merge base -- no other
+ common ancestor can be both as recent and not an ancestor of it.
+
+Both use cases share the same algorithm and implementation.
+
+Algorithm
+---------
+
+Given a commit `one` and a set of commits `twos[]`, the walk paints
+commits with two colors:
+
+ - PARENT1: reachable from `one`
+ - PARENT2: reachable from any commit in `twos[]`
+
+The walk uses a priority queue ordered by generation number (falling
+back to commit date when generation numbers are unavailable). Each
+step dequeues the highest-priority commit (this is when we say a
+commit is "visited") and propagates its paint flags to its parents,
+enqueuing them if they gained new flags. When a commit receives
+both PARENT1 and PARENT2, it is a merge-base candidate. A candidate
+gains the STALE flag so its ancestors propagate staleness -- any
+deeper common ancestor is necessarily redundant.
+
+INFINITY and finite generation regions
+--------------------------------------
+
+The commit-graph stores a generation number for each commit. Commits
+not in the commit-graph have generation `GENERATION_NUMBER_INFINITY`. The
+graph is closed under reachability: if a commit is in the graph, all
+its ancestors are too. This partitions the commit graph into two regions:
+
+....
+ +---------------------------------------+
+ | INFINITY region |
+ | generation = INFINITY |
+ | queue order: heuristic (commit date) |
+ +---------------------------------------+
+ |
+ v
+ +---------------------------------------+
+ | Finite region |
+ | generation = finite |
+ | queue order: topological |
+ +---------------------------------------+
+....
+
+When the commit-graph is enabled, the INFINITY region is typically
+very small -- it only contains commits added since the last
+commit-graph refresh.
+
+All reachable INFINITY-generation commits are visited before any
+finite-generation commit, because INFINITY is larger than any finite
+value. Once the walk crosses into the finite region, it stays there.
+
+In the finite region, generation ordering guarantees topological
+traversal: children are always visited before their parents. This
+means that paint on already-visited commits is final -- no future
+traversal step can add paint to them.
+
+In the INFINITY region, commit-date ordering can violate this: a
+parent with a later date can be visited before a child with an earlier
+date. Paint flags are therefore NOT final at visit time, and a
+commit visited with only one side's paint may later gain the other.
+
+Paint flags are only added, never removed. Since each flag can be set
+at most once per commit, the number of times a commit can be
+re-enqueued is bounded by the number of flag transitions.
+
+Termination
+-----------
+
+Termination happens when we can prove that no extra progress is
+possible. We are done with the main loop when one of the following
+conditions holds:
+
+ 1. The queue is empty.
+ 2. The queue only contains STALE entries.
+ 3. Side-exhaustion: the walk has reached the finite region and one
+ of the sides is fully exhausted.
+
+The loop waits for all pending merge-base candidates to be popped
+and recorded before any early exit fires, so no separate drain phase
+is needed after termination.
+
+Stale entry condition
+~~~~~~~~~~~~~~~~~~~~~
+If all entries are stale we cannot find any new merge bases since
+that requires at least one enqueued side node meeting the other side.
+However, we could still invalidate merge bases (if there are more
+than one). This is unnecessary since `remove_redundant()` will clean
+that up as a post-process step.
+
+Side-exhaustion
+~~~~~~~~~~~~~~~
+A commit is *exclusive* to one side if it carries that side's paint
+but not the other (e.g. PARENT1 without PARENT2).
+
+If we have reached the finite region of the graph, no future
+traversal step can add paint to an already-visited commit. Thus if
+there are no exclusive PARENT2 commits in the queue, no additional
+PARENT2 paint can be introduced into the walk. Even if exclusive
+PARENT1 commits remain, no new merge-base candidates can be
+discovered. The same holds symmetrically for PARENT1.
+
+This invariant is only valid in the finite region of the graph.
+
+Related documentation
+---------------------
+
+ - `Documentation/technical/commit-graph.adoc` -- generation numbers
+ and the reachability closure property.
--
gitgitgadget
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-06-20 10:37 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-20 10:36 [PATCH/RFC 0/6] commit-reach: terminate merge-base walk when one side is exhausted Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases Elijah Newren via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests Kristofer Karlsson via GitGitGadget
2026-06-20 10:36 ` [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc Kristofer Karlsson via GitGitGadget
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.