* [PATCH] commit-reach: early exit paint_down_to_common for single merge-base
@ 2026-05-08 15:07 Kristofer Karlsson via GitGitGadget
2026-05-11 2:08 ` Junio C Hamano
2026-05-11 6:19 ` [PATCH v2] " Kristofer Karlsson via GitGitGadget
0 siblings, 2 replies; 11+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-08 15:07 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Kristofer Karlsson, Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
When find_all is false and generation numbers are available, the
priority queue pops in non-increasing generation order. The first
doubly-painted commit is a valid best merge-base; no later commit
can dominate it. Skip the expensive STALE drain in this case.
The early exit is guarded by three conditions: find_all must be
false, the commit-graph must provide generation numbers, and the
merge-base commit itself must have a finite generation (not
GENERATION_NUMBER_INFINITY from being outside the commit-graph).
Add find_all parameter to repo_get_merge_bases_many_dirty() and
thread it through to paint_down_to_common(). git merge-base
(without --all) passes show_all=0, triggering the early exit.
On a 2.2M-commit merge-heavy monorepo with commit-graph:
HEAD vs ~500: 5,229ms -> 24ms
HEAD vs ~1000: 4,214ms -> 39ms
HEAD vs ~5000: 3,799ms -> 46ms
HEAD vs ~10000: 3,827ms -> 61ms
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
[RFC] commit-reach: skip STALE drain when only one merge-base needed
Context for what this is all about.
I am working with a very large git monorepo and have been investigating
performance issue. After some digging I ended up looking more deeply
into git merge-base. I saw it had an --all parameter but the default is
to only return a single merge-base. Looking through the code and adding
debug timing, I realized that although the total time to compute the
merge-base was high, a very small amount of time was spent finding the
initial merge-base value that was later returned.
The optimization is actually quite dramatic in a large repo - runtime
went down from 5000ms to 50ms, so it's roughly a 100x optimization. This
comes from an exploding frontier of STALE commits to drain.
Thus, my idea is simply to return early from the function once we know
what will be returned. This only works if we find a candidate that we
know will not be pruned later - but fortunately if we have a commit
graph with generations we will visit commits in order such that it will
actually not be pruned.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2109%2Fspkrka%2Fmerge-base-early-exit-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2109/spkrka/merge-base-early-exit-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2109
builtin/merge-base.c | 3 +-
commit-reach.c | 26 ++++++---
commit-reach.h | 5 +-
t/t6010-merge-base.sh | 119 ++++++++++++++++++++++++++++++++++++++++++
t/t6600-test-reach.sh | 40 ++++++++++++++
5 files changed, 183 insertions(+), 10 deletions(-)
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index c7ee97fa6a..6b9d42f596 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -14,7 +14,8 @@ static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
- rev_nr - 1, rev + 1, &result) < 0) {
+ rev_nr - 1, rev + 1,
+ show_all, &result) < 0) {
commit_list_free(result);
return -1;
}
diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..c9d2d594de 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -55,14 +55,16 @@ static int paint_down_to_common(struct repository *r,
struct commit **twos,
timestamp_t min_generation,
int ignore_missing_commits,
+ int find_all,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
int i;
+ int has_gens = min_generation || corrected_commit_dates_enabled(r);
timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
struct commit_list **tail = result;
- if (!min_generation && !corrected_commit_dates_enabled(r))
+ if (!has_gens)
queue.compare = compare_commits_by_commit_date;
one->object.flags |= PARENT1;
@@ -97,6 +99,11 @@ static int paint_down_to_common(struct repository *r,
if (!(commit->object.flags & RESULT)) {
commit->object.flags |= RESULT;
tail = commit_list_append(commit, tail);
+ /* Generation-ordered queue: no later
+ * commit can dominate this one. */
+ if (!find_all && has_gens &&
+ generation < GENERATION_NUMBER_INFINITY)
+ break;
}
/* Mark parents of a found merge stale */
flags |= STALE;
@@ -136,6 +143,7 @@ static int paint_down_to_common(struct repository *r,
static int merge_bases_many(struct repository *r,
struct commit *one, int n,
struct commit **twos,
+ int find_all,
struct commit_list **result)
{
struct commit_list *list = NULL, **tail = result;
@@ -165,7 +173,7 @@ static int merge_bases_many(struct repository *r,
oid_to_hex(&twos[i]->object.oid));
}
- if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
+ if (paint_down_to_common(r, one, n, twos, 0, 0, find_all, &list)) {
commit_list_free(list);
return -1;
}
@@ -246,7 +254,7 @@ static int remove_redundant_no_gen(struct repository *r,
min_generation = curr_generation;
}
if (paint_down_to_common(r, array[i], filled,
- work, min_generation, 0, &common)) {
+ work, min_generation, 0, 1, &common)) {
clear_commit_marks(array[i], all_flags);
clear_commit_marks_many(filled, work, all_flags);
commit_list_free(common);
@@ -425,6 +433,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t n,
struct commit **twos,
int cleanup,
+ int find_all,
struct commit_list **result)
{
struct commit_list *list, **tail = result;
@@ -432,7 +441,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t cnt, i;
int ret;
- if (merge_bases_many(r, one, n, twos, result) < 0)
+ if (merge_bases_many(r, one, n, twos, find_all, result) < 0)
return -1;
for (i = 0; i < n; i++) {
if (one == twos[i])
@@ -475,16 +484,17 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, result);
+ return get_merge_bases_many_0(r, one, n, twos, 1, 1, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one,
size_t n,
struct commit **twos,
+ int find_all,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 0, result);
+ return get_merge_bases_many_0(r, one, n, twos, 0, find_all, result);
}
int repo_get_merge_bases(struct repository *r,
@@ -492,7 +502,7 @@ int repo_get_merge_bases(struct repository *r,
struct commit *two,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, result);
+ return get_merge_bases_many_0(r, one, 1, &two, 1, 1, result);
}
/*
@@ -555,7 +565,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
if (paint_down_to_common(r, commit,
nr_reference, reference,
- generation, ignore_missing_commits, &bases))
+ generation, ignore_missing_commits, 1, &bases))
ret = -1;
else if (commit->object.flags & PARENT2)
ret = 1;
diff --git a/commit-reach.h b/commit-reach.h
index 6012402dfc..908b9539c5 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -17,10 +17,13 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
struct commit_list **result);
-/* To be used only when object flags after this call no longer matter */
+/* To be used only when object flags after this call no longer matter.
+ * When find_all is false and generation numbers are available, returns
+ * after finding the first merge-base, skipping the STALE drain. */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
+ int find_all,
struct commit_list **result);
int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 44c726ea39..f6c85d4f53 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -305,4 +305,123 @@ test_expect_success 'merge-base --octopus --all for complex tree' '
test_cmp expected actual
'
+# The following tests verify that "git merge-base" (without --all)
+# returns the same result with and without a commit-graph.
+# This exercises the early-exit optimisation in paint_down_to_common
+# that skips the STALE drain when generation numbers are available.
+
+test_expect_success 'setup for commit-graph tests' '
+ git init graph-repo &&
+ (
+ cd graph-repo &&
+
+ # Build a forked DAG:
+ #
+ # L1---L2 (left)
+ # /
+ # S
+ # \
+ # R1---R2 (right)
+ #
+ test_commit GS &&
+ git checkout -b left &&
+ test_commit L1 &&
+ test_commit L2 &&
+ git checkout GS &&
+ git checkout -b right &&
+ test_commit GR1 &&
+ test_commit GR2
+ )
+'
+
+test_expect_success 'merge-base without commit-graph' '
+ (
+ cd graph-repo &&
+ rm -f .git/objects/info/commit-graph &&
+ git merge-base left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base with commit-graph' '
+ (
+ cd graph-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base --all with commit-graph' '
+ (
+ cd graph-repo &&
+ git merge-base --all left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base agrees with --all for single result' '
+ (
+ cd graph-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.single &&
+ git merge-base --all left right >actual.all &&
+ test_cmp actual.all actual.single
+ )
+'
+
+test_expect_success 'setup for deep chain commit-graph test' '
+ git init deep-repo &&
+ (
+ cd deep-repo &&
+
+ # Build a deep forked DAG:
+ #
+ # L1--L2--...--L20 (left)
+ # /
+ # S
+ # \
+ # R1--R2--...--R20 (right)
+ #
+ test_commit DS &&
+ git checkout -b left &&
+ for i in $(test_seq 1 20)
+ do
+ test_commit DL$i || return 1
+ done &&
+ git checkout DS &&
+ git checkout -b right &&
+ for i in $(test_seq 1 20)
+ do
+ test_commit DR$i || return 1
+ done
+ )
+'
+
+test_expect_success 'deep chain: merge-base matches with and without commit-graph' '
+ (
+ cd deep-repo &&
+ rm -f .git/objects/info/commit-graph &&
+ git merge-base left right >actual.no-graph &&
+ git rev-parse DS >expected &&
+ test_cmp expected actual.no-graph &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.graph &&
+ test_cmp expected actual.graph
+ )
+'
+
+test_expect_success 'deep chain: --all and non---all agree with commit-graph' '
+ (
+ cd deep-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.single &&
+ git merge-base --all left right >actual.all &&
+ test_cmp actual.all actual.single
+ )
+'
+
test_done
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index dc0421ed2f..51c23b7683 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -882,4 +882,44 @@ test_expect_success 'rev-list --maximal-only matches merge-base --independent' '
test_cmp expect.sorted actual.sorted
'
+# The following tests verify the early-exit optimisation in
+# paint_down_to_common when merge-base is invoked without --all.
+# Each test checks all four commit-graph configurations.
+
+merge_base_all_modes () {
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-no-gdat .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual
+}
+
+test_expect_success 'merge-base without --all (unique base)' '
+ git rev-parse commit-5-3 >expect &&
+ merge_base_all_modes commit-5-7 commit-8-3
+'
+
+test_expect_success 'merge-base without --all is one of --all results' '
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all &&
+
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all
+'
+
test_done
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-08 15:07 [PATCH] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
@ 2026-05-11 2:08 ` Junio C Hamano
2026-05-11 6:19 ` [PATCH v2] " Kristofer Karlsson via GitGitGadget
1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2026-05-11 2:08 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget
Cc: git, Derrick Stolee, Kristofer Karlsson
"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> diff --git a/commit-reach.c b/commit-reach.c
> index d3a9b3ed6f..c9d2d594de 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -55,14 +55,16 @@ static int paint_down_to_common(struct repository *r,
> struct commit **twos,
> timestamp_t min_generation,
> int ignore_missing_commits,
> + int find_all,
> struct commit_list **result)
> {
> struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
> int i;
> + int has_gens = min_generation || corrected_commit_dates_enabled(r);
Giving a name that identifies what the commonly-used logical
expression means is a very good idea, but don't some caller pass
min_generation==infinity (i.e., not zero) when there is no
generation available? E.g., remove_redundant_no_gen() passes
the result of calling commit_graph_generation(), and without commit
graph, we would get infinity here, right?
Given that the second user of this variable is guarded by !find_all,
the variable being true even when we shouldn't be using generation
numbers may not matter for the purpose of the early break, but then
it means that the patch squanders the perfect opportunity to clarify
what the variable means, which is not very lovely.
> timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
> struct commit_list **tail = result;
>
> - if (!min_generation && !corrected_commit_dates_enabled(r))
> + if (!has_gens)
> queue.compare = compare_commits_by_commit_date;
>
> one->object.flags |= PARENT1;
> @@ -97,6 +99,11 @@ static int paint_down_to_common(struct repository *r,
> if (!(commit->object.flags & RESULT)) {
> commit->object.flags |= RESULT;
> tail = commit_list_append(commit, tail);
> + /* Generation-ordered queue: no later
> + * commit can dominate this one. */
> + if (!find_all && has_gens &&
> + generation < GENERATION_NUMBER_INFINITY)
> + break;
Three comments
* See Documentation/CodingGuidelines for our preferred style for
multi-line comments.
* I do not think we often use "dominate" to describe relationship
between commits, and I am not sure what you want the word to mean
in this context. Can you clarify?
* The code is getting way too deeply indented. Perhaps using a
helper function and
if (!(commit->object.flags & RESULT)) {
if (mark_result(r, &tail, commit,
find_all, min_generation))
break;
}
move the logic to mark the newly discovered commit as one of the
possible result, and to tell the loop to terminate, to it, along
with the comment on the meaning of its return value, may make the
code easier to follow?
> +/* To be used only when object flags after this call no longer matter.
> + * When find_all is false and generation numbers are available, returns
> + * after finding the first merge-base, skipping the STALE drain. */
Ditto on the style.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-08 15:07 [PATCH] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
2026-05-11 2:08 ` Junio C Hamano
@ 2026-05-11 6:19 ` Kristofer Karlsson via GitGitGadget
2026-05-11 7:22 ` Patrick Steinhardt
2026-05-11 11:22 ` [PATCH v3] " Kristofer Karlsson via GitGitGadget
1 sibling, 2 replies; 11+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-11 6:19 UTC (permalink / raw)
To: git; +Cc: Kristofer Karlsson, Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
sort to the top of the priority queue. After those, commits with
finite generation numbers are popped in non-increasing order.
When find_all is false the first doubly-painted commit with a
finite generation is therefore a best merge-base: no commit still
in the queue can be a descendant of it. Skip the expensive STALE
drain in this case.
Add find_all parameter to repo_get_merge_bases_many_dirty() and
thread it through to paint_down_to_common(). git merge-base
(without --all) passes show_all=0, triggering the early exit.
On a 2.2M-commit merge-heavy monorepo with commit-graph:
HEAD vs ~500: 5,229ms -> 24ms
HEAD vs ~1000: 4,214ms -> 39ms
HEAD vs ~5000: 3,799ms -> 46ms
HEAD vs ~10000: 3,827ms -> 61ms
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
[RFC] commit-reach: skip STALE drain when only one merge-base needed
Context for what this is all about.
I am working with a very large git monorepo and have been investigating
performance issues. After some digging I ended up looking more deeply
into git merge-base. I saw it had an --all parameter but the default is
to only return a single merge-base. Looking through the code and adding
debug timing, I realized that although the total time to compute the
merge-base was high, a very small amount of time was spent finding the
initial merge-base value that was later returned.
The optimization is actually quite dramatic in a large repo - runtime
went down from 5000ms to 50ms, so it's roughly a 100x optimization. This
comes from an exploding frontier of STALE commits to drain.
Thus, my idea is simply to return early from the function once we know
what will be returned. This only works if we find a candidate that we
know will not be pruned later - but fortunately if we have a commit
graph with generations we will visit commits in order such that it will
actually not be pruned.
CC: Derrick Stolee stolee@gmail.com
Changes since v1 (thanks Junio for the review):
* Dropped the has_gens variable entirely. If a commit has a finite
generation then it is in the commit-graph, and so are all its
ancestors — no additional check is needed to know the queue ordering
is sound. Without a commit-graph every commit gets INFINITY and the
guard never fires. This also avoids the misleading interaction with
callers that pass non-zero min_generation without having generation
data.
* Simplified the early exit guard from three conditions to two:
!find_all && generation < GENERATION_NUMBER_INFINITY.
* Fixed multi-line comment style per CodingGuidelines.
* Replaced "dominate" with concrete reasoning about queue ordering.
* Did not extract a helper function: after the simplifications above
the inner block is four lines and reads naturally inline. The right
boundary for a helper is not obvious (it could absorb just the result
marking, or also the RESULT flag check, or also the PARENT1|PARENT2
test) and each level requires more local state passed by pointer.
Happy to extract one if preferred.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2109%2Fspkrka%2Fmerge-base-early-exit-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2109/spkrka/merge-base-early-exit-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2109
Range-diff vs v1:
1: 54cdf9bfd9 ! 1: f7b5c267f3 commit-reach: early exit paint_down_to_common for single merge-base
@@ Metadata
## Commit message ##
commit-reach: early exit paint_down_to_common for single merge-base
- When find_all is false and generation numbers are available, the
- priority queue pops in non-increasing generation order. The first
- doubly-painted commit is a valid best merge-base; no later commit
- can dominate it. Skip the expensive STALE drain in this case.
-
- The early exit is guarded by three conditions: find_all must be
- false, the commit-graph must provide generation numbers, and the
- merge-base commit itself must have a finite generation (not
- GENERATION_NUMBER_INFINITY from being outside the commit-graph).
+ Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
+ sort to the top of the priority queue. After those, commits with
+ finite generation numbers are popped in non-increasing order.
+ When find_all is false the first doubly-painted commit with a
+ finite generation is therefore a best merge-base: no commit still
+ in the queue can be a descendant of it. Skip the expensive STALE
+ drain in this case.
Add find_all parameter to repo_get_merge_bases_many_dirty() and
thread it through to paint_down_to_common(). git merge-base
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
- int i;
-+ int has_gens = min_generation || corrected_commit_dates_enabled(r);
- timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
- struct commit_list **tail = result;
-
-- if (!min_generation && !corrected_commit_dates_enabled(r))
-+ if (!has_gens)
- queue.compare = compare_commits_by_commit_date;
-
- one->object.flags |= PARENT1;
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
if (!(commit->object.flags & RESULT)) {
commit->object.flags |= RESULT;
tail = commit_list_append(commit, tail);
-+ /* Generation-ordered queue: no later
-+ * commit can dominate this one. */
-+ if (!find_all && has_gens &&
++ /*
++ * The queue is generation-ordered; no
++ * remaining common ancestor can be a
++ * descendant of this one.
++ */
++ if (!find_all &&
+ generation < GENERATION_NUMBER_INFINITY)
+ break;
}
@@ commit-reach.h: int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result);
-/* To be used only when object flags after this call no longer matter */
-+/* To be used only when object flags after this call no longer matter.
++/*
++ * To be used only when object flags after this call no longer matter.
+ * When find_all is false and generation numbers are available, returns
-+ * after finding the first merge-base, skipping the STALE drain. */
++ * after finding the first merge-base, skipping the STALE drain.
++ */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
builtin/merge-base.c | 3 +-
commit-reach.c | 26 ++++++---
commit-reach.h | 7 ++-
t/t6010-merge-base.sh | 119 ++++++++++++++++++++++++++++++++++++++++++
t/t6600-test-reach.sh | 40 ++++++++++++++
5 files changed, 186 insertions(+), 9 deletions(-)
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index c7ee97fa6a..6b9d42f596 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -14,7 +14,8 @@ static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
- rev_nr - 1, rev + 1, &result) < 0) {
+ rev_nr - 1, rev + 1,
+ show_all, &result) < 0) {
commit_list_free(result);
return -1;
}
diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..b4ca00bb7e 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -55,6 +55,7 @@ static int paint_down_to_common(struct repository *r,
struct commit **twos,
timestamp_t min_generation,
int ignore_missing_commits,
+ int find_all,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
@@ -97,6 +98,14 @@ static int paint_down_to_common(struct repository *r,
if (!(commit->object.flags & RESULT)) {
commit->object.flags |= RESULT;
tail = commit_list_append(commit, tail);
+ /*
+ * The queue is generation-ordered; no
+ * remaining common ancestor can be a
+ * descendant of this one.
+ */
+ if (!find_all &&
+ generation < GENERATION_NUMBER_INFINITY)
+ break;
}
/* Mark parents of a found merge stale */
flags |= STALE;
@@ -136,6 +145,7 @@ static int paint_down_to_common(struct repository *r,
static int merge_bases_many(struct repository *r,
struct commit *one, int n,
struct commit **twos,
+ int find_all,
struct commit_list **result)
{
struct commit_list *list = NULL, **tail = result;
@@ -165,7 +175,7 @@ static int merge_bases_many(struct repository *r,
oid_to_hex(&twos[i]->object.oid));
}
- if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
+ if (paint_down_to_common(r, one, n, twos, 0, 0, find_all, &list)) {
commit_list_free(list);
return -1;
}
@@ -246,7 +256,7 @@ static int remove_redundant_no_gen(struct repository *r,
min_generation = curr_generation;
}
if (paint_down_to_common(r, array[i], filled,
- work, min_generation, 0, &common)) {
+ work, min_generation, 0, 1, &common)) {
clear_commit_marks(array[i], all_flags);
clear_commit_marks_many(filled, work, all_flags);
commit_list_free(common);
@@ -425,6 +435,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t n,
struct commit **twos,
int cleanup,
+ int find_all,
struct commit_list **result)
{
struct commit_list *list, **tail = result;
@@ -432,7 +443,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t cnt, i;
int ret;
- if (merge_bases_many(r, one, n, twos, result) < 0)
+ if (merge_bases_many(r, one, n, twos, find_all, result) < 0)
return -1;
for (i = 0; i < n; i++) {
if (one == twos[i])
@@ -475,16 +486,17 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, result);
+ return get_merge_bases_many_0(r, one, n, twos, 1, 1, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one,
size_t n,
struct commit **twos,
+ int find_all,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 0, result);
+ return get_merge_bases_many_0(r, one, n, twos, 0, find_all, result);
}
int repo_get_merge_bases(struct repository *r,
@@ -492,7 +504,7 @@ int repo_get_merge_bases(struct repository *r,
struct commit *two,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, result);
+ return get_merge_bases_many_0(r, one, 1, &two, 1, 1, result);
}
/*
@@ -555,7 +567,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
if (paint_down_to_common(r, commit,
nr_reference, reference,
- generation, ignore_missing_commits, &bases))
+ generation, ignore_missing_commits, 1, &bases))
ret = -1;
else if (commit->object.flags & PARENT2)
ret = 1;
diff --git a/commit-reach.h b/commit-reach.h
index 6012402dfc..c3b570a5cc 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -17,10 +17,15 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
struct commit_list **result);
-/* To be used only when object flags after this call no longer matter */
+/*
+ * To be used only when object flags after this call no longer matter.
+ * When find_all is false and generation numbers are available, returns
+ * after finding the first merge-base, skipping the STALE drain.
+ */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
+ int find_all,
struct commit_list **result);
int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 44c726ea39..f6c85d4f53 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -305,4 +305,123 @@ test_expect_success 'merge-base --octopus --all for complex tree' '
test_cmp expected actual
'
+# The following tests verify that "git merge-base" (without --all)
+# returns the same result with and without a commit-graph.
+# This exercises the early-exit optimisation in paint_down_to_common
+# that skips the STALE drain when generation numbers are available.
+
+test_expect_success 'setup for commit-graph tests' '
+ git init graph-repo &&
+ (
+ cd graph-repo &&
+
+ # Build a forked DAG:
+ #
+ # L1---L2 (left)
+ # /
+ # S
+ # \
+ # R1---R2 (right)
+ #
+ test_commit GS &&
+ git checkout -b left &&
+ test_commit L1 &&
+ test_commit L2 &&
+ git checkout GS &&
+ git checkout -b right &&
+ test_commit GR1 &&
+ test_commit GR2
+ )
+'
+
+test_expect_success 'merge-base without commit-graph' '
+ (
+ cd graph-repo &&
+ rm -f .git/objects/info/commit-graph &&
+ git merge-base left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base with commit-graph' '
+ (
+ cd graph-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base --all with commit-graph' '
+ (
+ cd graph-repo &&
+ git merge-base --all left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base agrees with --all for single result' '
+ (
+ cd graph-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.single &&
+ git merge-base --all left right >actual.all &&
+ test_cmp actual.all actual.single
+ )
+'
+
+test_expect_success 'setup for deep chain commit-graph test' '
+ git init deep-repo &&
+ (
+ cd deep-repo &&
+
+ # Build a deep forked DAG:
+ #
+ # L1--L2--...--L20 (left)
+ # /
+ # S
+ # \
+ # R1--R2--...--R20 (right)
+ #
+ test_commit DS &&
+ git checkout -b left &&
+ for i in $(test_seq 1 20)
+ do
+ test_commit DL$i || return 1
+ done &&
+ git checkout DS &&
+ git checkout -b right &&
+ for i in $(test_seq 1 20)
+ do
+ test_commit DR$i || return 1
+ done
+ )
+'
+
+test_expect_success 'deep chain: merge-base matches with and without commit-graph' '
+ (
+ cd deep-repo &&
+ rm -f .git/objects/info/commit-graph &&
+ git merge-base left right >actual.no-graph &&
+ git rev-parse DS >expected &&
+ test_cmp expected actual.no-graph &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.graph &&
+ test_cmp expected actual.graph
+ )
+'
+
+test_expect_success 'deep chain: --all and non---all agree with commit-graph' '
+ (
+ cd deep-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.single &&
+ git merge-base --all left right >actual.all &&
+ test_cmp actual.all actual.single
+ )
+'
+
test_done
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index dc0421ed2f..51c23b7683 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -882,4 +882,44 @@ test_expect_success 'rev-list --maximal-only matches merge-base --independent' '
test_cmp expect.sorted actual.sorted
'
+# The following tests verify the early-exit optimisation in
+# paint_down_to_common when merge-base is invoked without --all.
+# Each test checks all four commit-graph configurations.
+
+merge_base_all_modes () {
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-no-gdat .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual
+}
+
+test_expect_success 'merge-base without --all (unique base)' '
+ git rev-parse commit-5-3 >expect &&
+ merge_base_all_modes commit-5-7 commit-8-3
+'
+
+test_expect_success 'merge-base without --all is one of --all results' '
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all &&
+
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all
+'
+
test_done
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-11 6:19 ` [PATCH v2] " Kristofer Karlsson via GitGitGadget
@ 2026-05-11 7:22 ` Patrick Steinhardt
2026-05-11 11:22 ` [PATCH v3] " Kristofer Karlsson via GitGitGadget
1 sibling, 0 replies; 11+ messages in thread
From: Patrick Steinhardt @ 2026-05-11 7:22 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget; +Cc: git, Kristofer Karlsson
On Mon, May 11, 2026 at 06:19:08AM +0000, Kristofer Karlsson via GitGitGadget wrote:
> diff --git a/commit-reach.c b/commit-reach.c
> index d3a9b3ed6f..b4ca00bb7e 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -165,7 +175,7 @@ static int merge_bases_many(struct repository *r,
> oid_to_hex(&twos[i]->object.oid));
> }
>
> - if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
> + if (paint_down_to_common(r, one, n, twos, 0, 0, find_all, &list)) {
> commit_list_free(list);
> return -1;
> }
Callsites like this are quite hard to read now with these boolean flags.
Would it be preferable to instead use flags?
enum paint_down_to_common_flags {
PAINT_DOWN_TO_COMMON_IGNORE_MISSING_COMMITS = (1 << 0),
PAINT_DOWN_TO_COMMON_FIND_ALL = (1 << 1),
};
It's more verbose of course, but that's kind of the point.
Only weirdness is that we don't only accept these flags in
`paint_down_to_common()`, but also in other functions that pass those
flags down.
Patrick
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-11 6:19 ` [PATCH v2] " Kristofer Karlsson via GitGitGadget
2026-05-11 7:22 ` Patrick Steinhardt
@ 2026-05-11 11:22 ` Kristofer Karlsson via GitGitGadget
2026-05-11 12:04 ` Patrick Steinhardt
2026-05-11 12:59 ` [PATCH v4 0/2] [RFC] commit-reach: skip STALE drain when only one merge-base needed Kristofer Karlsson via GitGitGadget
1 sibling, 2 replies; 11+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-11 11:22 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Kristofer Karlsson, Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
sort to the top of the priority queue. After those, commits with
finite generation numbers are popped in non-increasing order.
When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
with a finite generation is therefore a best merge-base: no commit
still in the queue can be a descendant of it. Skip the expensive
STALE drain in this case.
Introduce enum merge_base_flags with MERGE_BASE_FIND_ALL and
MERGE_BASE_IGNORE_MISSING_COMMITS, replacing the two boolean
parameters in paint_down_to_common(). Thread the flags through
merge_bases_many(), get_merge_bases_many_0(), and the public
repo_get_merge_bases_many_dirty() API. git merge-base (without
--all) passes 0, triggering the early exit.
On a 2.2M-commit merge-heavy monorepo with commit-graph:
HEAD vs ~500: 5,229ms -> 24ms
HEAD vs ~1000: 4,214ms -> 39ms
HEAD vs ~5000: 3,799ms -> 46ms
HEAD vs ~10000: 3,827ms -> 61ms
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
[RFC] commit-reach: skip STALE drain when only one merge-base needed
Context for what this is all about.
I am working with a very large git monorepo and have been investigating
performance issues. After some digging I ended up looking more deeply
into git merge-base. I saw it had an --all parameter but the default is
to only return a single merge-base. Looking through the code and adding
debug timing, I realized that although the total time to compute the
merge-base was high, a very small amount of time was spent finding the
initial merge-base value that was later returned.
The optimization is actually quite dramatic in a large repo - runtime
went down from 5000ms to 50ms, so it's roughly a 100x optimization. This
comes from an exploding frontier of STALE commits to drain.
Thus, my idea is simply to return early from the function once we know
what will be returned. This only works if we find a candidate that we
know will not be pruned later - but fortunately if we have a commit
graph with generations we will visit commits in order such that it will
actually not be pruned.
CC: Derrick Stolee stolee@gmail.com
Changes since v1 (thanks Junio for the review):
* Dropped the has_gens variable entirely. If a commit has a finite
generation then it is in the commit-graph, and so are all its
ancestors — no additional check is needed to know the queue ordering
is sound. Without a commit-graph every commit gets INFINITY and the
guard never fires. This also avoids the misleading interaction with
callers that pass non-zero min_generation without having generation
data.
* Simplified the early exit guard from three conditions to two:
!find_all && generation < GENERATION_NUMBER_INFINITY.
* Fixed multi-line comment style per CodingGuidelines.
* Replaced "dominate" with concrete reasoning about queue ordering.
* Did not extract a helper function: after the simplifications above
the inner block is four lines and reads naturally inline. The right
boundary for a helper is not obvious (it could absorb just the result
marking, or also the RESULT flag check, or also the PARENT1|PARENT2
test) and each level requires more local state passed by pointer.
Happy to extract one if preferred.
Changes since v2 (thanks Patrick for the suggestion):
* Replaced the boolean find_all and ignore_missing_commits parameters
in paint_down_to_common() with a single enum merge_base_flags
mb_flags, reducing the function from 8 to 7 parameters. The enum is
defined in commit-reach.h with MERGE_BASE_FIND_ALL and
MERGE_BASE_IGNORE_MISSING_COMMITS.
* Named the enum merge_base_flags rather than
paint_down_to_common_flags since the flags express caller intent and
are threaded through multiple layers including the public
repo_get_merge_bases_many_dirty() API.
* Used mb_flags as the parameter name to avoid shadowing the existing
local int flags (commit object flags) inside paint_down_to_common().
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2109%2Fspkrka%2Fmerge-base-early-exit-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2109/spkrka/merge-base-early-exit-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/2109
Range-diff vs v2:
1: f7b5c267f3 ! 1: e4dada892f commit-reach: early exit paint_down_to_common for single merge-base
@@ Commit message
Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
sort to the top of the priority queue. After those, commits with
finite generation numbers are popped in non-increasing order.
- When find_all is false the first doubly-painted commit with a
- finite generation is therefore a best merge-base: no commit still
- in the queue can be a descendant of it. Skip the expensive STALE
- drain in this case.
+ When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
+ with a finite generation is therefore a best merge-base: no commit
+ still in the queue can be a descendant of it. Skip the expensive
+ STALE drain in this case.
- Add find_all parameter to repo_get_merge_bases_many_dirty() and
- thread it through to paint_down_to_common(). git merge-base
- (without --all) passes show_all=0, triggering the early exit.
+ Introduce enum merge_base_flags with MERGE_BASE_FIND_ALL and
+ MERGE_BASE_IGNORE_MISSING_COMMITS, replacing the two boolean
+ parameters in paint_down_to_common(). Thread the flags through
+ merge_bases_many(), get_merge_bases_many_0(), and the public
+ repo_get_merge_bases_many_dirty() API. git merge-base (without
+ --all) passes 0, triggering the early exit.
On a 2.2M-commit merge-heavy monorepo with commit-graph:
@@ Commit message
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
## builtin/merge-base.c ##
-@@ builtin/merge-base.c: static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
+@@
+
+ static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
+ {
++ enum merge_base_flags flags = show_all ? MERGE_BASE_FIND_ALL : 0;
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
- rev_nr - 1, rev + 1, &result) < 0) {
+ rev_nr - 1, rev + 1,
-+ show_all, &result) < 0) {
++ flags, &result) < 0) {
commit_list_free(result);
return -1;
}
## commit-reach.c ##
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
+ struct commit *one, int n,
struct commit **twos,
timestamp_t min_generation,
- int ignore_missing_commits,
-+ int find_all,
+- int ignore_missing_commits,
++ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
+ * remaining common ancestor can be a
+ * descendant of this one.
+ */
-+ if (!find_all &&
++ if (!(mb_flags & MERGE_BASE_FIND_ALL) &&
+ generation < GENERATION_NUMBER_INFINITY)
+ break;
}
/* Mark parents of a found merge stale */
flags |= STALE;
+@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
+ * corrupt commits would already have been
+ * dispatched with a `die()`.
+ */
+- if (ignore_missing_commits)
++ if (mb_flags & MERGE_BASE_IGNORE_MISSING_COMMITS)
+ return 0;
+ return error(_("could not parse commit %s"),
+ oid_to_hex(&p->object.oid));
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
static int merge_bases_many(struct repository *r,
struct commit *one, int n,
struct commit **twos,
-+ int find_all,
++ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct commit_list *list = NULL, **tail = result;
@@ commit-reach.c: static int merge_bases_many(struct repository *r,
}
- if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
-+ if (paint_down_to_common(r, one, n, twos, 0, 0, find_all, &list)) {
++ if (paint_down_to_common(r, one, n, twos, 0, mb_flags, &list)) {
commit_list_free(list);
return -1;
}
@@ commit-reach.c: static int remove_redundant_no_gen(struct repository *r,
}
if (paint_down_to_common(r, array[i], filled,
- work, min_generation, 0, &common)) {
-+ work, min_generation, 0, 1, &common)) {
++ work, min_generation,
++ MERGE_BASE_FIND_ALL, &common)) {
clear_commit_marks(array[i], all_flags);
clear_commit_marks_many(filled, work, all_flags);
commit_list_free(common);
@@ commit-reach.c: static int get_merge_bases_many_0(struct repository *r,
size_t n,
struct commit **twos,
int cleanup,
-+ int find_all,
++ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct commit_list *list, **tail = result;
@@ commit-reach.c: static int get_merge_bases_many_0(struct repository *r,
int ret;
- if (merge_bases_many(r, one, n, twos, result) < 0)
-+ if (merge_bases_many(r, one, n, twos, find_all, result) < 0)
++ if (merge_bases_many(r, one, n, twos, mb_flags, result) < 0)
return -1;
for (i = 0; i < n; i++) {
if (one == twos[i])
@@ commit-reach.c: int repo_get_merge_bases_many(struct repository *r,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, result);
-+ return get_merge_bases_many_0(r, one, n, twos, 1, 1, result);
++ return get_merge_bases_many_0(r, one, n, twos, 1,
++ MERGE_BASE_FIND_ALL, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one,
size_t n,
struct commit **twos,
-+ int find_all,
++ enum merge_base_flags mb_flags,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 0, result);
-+ return get_merge_bases_many_0(r, one, n, twos, 0, find_all, result);
++ return get_merge_bases_many_0(r, one, n, twos, 0, mb_flags, result);
}
int repo_get_merge_bases(struct repository *r,
@@ commit-reach.c: int repo_get_merge_bases(struct repository *r,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, result);
-+ return get_merge_bases_many_0(r, one, 1, &two, 1, 1, result);
++ return get_merge_bases_many_0(r, one, 1, &two, 1,
++ MERGE_BASE_FIND_ALL, result);
}
/*
@@ commit-reach.c: int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
+ struct commit_list *bases = NULL;
+ int ret = 0, i;
+ timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
++ enum merge_base_flags mb_flags = MERGE_BASE_FIND_ALL;
++
++ if (ignore_missing_commits)
++ mb_flags |= MERGE_BASE_IGNORE_MISSING_COMMITS;
+
+ if (repo_parse_commit(r, commit))
+ return ignore_missing_commits ? 0 : -1;
+@@ commit-reach.c: int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
if (paint_down_to_common(r, commit,
nr_reference, reference,
- generation, ignore_missing_commits, &bases))
-+ generation, ignore_missing_commits, 1, &bases))
++ generation, mb_flags, &bases))
ret = -1;
else if (commit->object.flags & PARENT2)
ret = 1;
@@ commit-reach.h: int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result);
-/* To be used only when object flags after this call no longer matter */
++enum merge_base_flags {
++ MERGE_BASE_FIND_ALL = (1 << 0),
++ MERGE_BASE_IGNORE_MISSING_COMMITS = (1 << 1),
++};
++
+/*
+ * To be used only when object flags after this call no longer matter.
-+ * When find_all is false and generation numbers are available, returns
-+ * after finding the first merge-base, skipping the STALE drain.
++ * Without MERGE_BASE_FIND_ALL and with generation numbers available,
++ * returns after finding the first merge-base, skipping the STALE drain.
+ */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
-+ int find_all,
++ enum merge_base_flags mb_flags,
struct commit_list **result);
int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
builtin/merge-base.c | 4 +-
commit-reach.c | 36 +++++++++----
commit-reach.h | 12 ++++-
t/t6010-merge-base.sh | 119 ++++++++++++++++++++++++++++++++++++++++++
t/t6600-test-reach.sh | 40 ++++++++++++++
5 files changed, 200 insertions(+), 11 deletions(-)
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index c7ee97fa6a..a87011c6cd 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -11,10 +11,12 @@
static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
{
+ enum merge_base_flags flags = show_all ? MERGE_BASE_FIND_ALL : 0;
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
- rev_nr - 1, rev + 1, &result) < 0) {
+ rev_nr - 1, rev + 1,
+ flags, &result) < 0) {
commit_list_free(result);
return -1;
}
diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..5a52be90a6 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -54,7 +54,7 @@ static int paint_down_to_common(struct repository *r,
struct commit *one, int n,
struct commit **twos,
timestamp_t min_generation,
- int ignore_missing_commits,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
@@ -97,6 +97,14 @@ static int paint_down_to_common(struct repository *r,
if (!(commit->object.flags & RESULT)) {
commit->object.flags |= RESULT;
tail = commit_list_append(commit, tail);
+ /*
+ * The queue is generation-ordered; no
+ * remaining common ancestor can be a
+ * descendant of this one.
+ */
+ if (!(mb_flags & MERGE_BASE_FIND_ALL) &&
+ generation < GENERATION_NUMBER_INFINITY)
+ break;
}
/* Mark parents of a found merge stale */
flags |= STALE;
@@ -118,7 +126,7 @@ static int paint_down_to_common(struct repository *r,
* corrupt commits would already have been
* dispatched with a `die()`.
*/
- if (ignore_missing_commits)
+ if (mb_flags & MERGE_BASE_IGNORE_MISSING_COMMITS)
return 0;
return error(_("could not parse commit %s"),
oid_to_hex(&p->object.oid));
@@ -136,6 +144,7 @@ static int paint_down_to_common(struct repository *r,
static int merge_bases_many(struct repository *r,
struct commit *one, int n,
struct commit **twos,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct commit_list *list = NULL, **tail = result;
@@ -165,7 +174,7 @@ static int merge_bases_many(struct repository *r,
oid_to_hex(&twos[i]->object.oid));
}
- if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
+ if (paint_down_to_common(r, one, n, twos, 0, mb_flags, &list)) {
commit_list_free(list);
return -1;
}
@@ -246,7 +255,8 @@ static int remove_redundant_no_gen(struct repository *r,
min_generation = curr_generation;
}
if (paint_down_to_common(r, array[i], filled,
- work, min_generation, 0, &common)) {
+ work, min_generation,
+ MERGE_BASE_FIND_ALL, &common)) {
clear_commit_marks(array[i], all_flags);
clear_commit_marks_many(filled, work, all_flags);
commit_list_free(common);
@@ -425,6 +435,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t n,
struct commit **twos,
int cleanup,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct commit_list *list, **tail = result;
@@ -432,7 +443,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t cnt, i;
int ret;
- if (merge_bases_many(r, one, n, twos, result) < 0)
+ if (merge_bases_many(r, one, n, twos, mb_flags, result) < 0)
return -1;
for (i = 0; i < n; i++) {
if (one == twos[i])
@@ -475,16 +486,18 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, result);
+ return get_merge_bases_many_0(r, one, n, twos, 1,
+ MERGE_BASE_FIND_ALL, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one,
size_t n,
struct commit **twos,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 0, result);
+ return get_merge_bases_many_0(r, one, n, twos, 0, mb_flags, result);
}
int repo_get_merge_bases(struct repository *r,
@@ -492,7 +505,8 @@ int repo_get_merge_bases(struct repository *r,
struct commit *two,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, result);
+ return get_merge_bases_many_0(r, one, 1, &two, 1,
+ MERGE_BASE_FIND_ALL, result);
}
/*
@@ -537,6 +551,10 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
struct commit_list *bases = NULL;
int ret = 0, i;
timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
+ enum merge_base_flags mb_flags = MERGE_BASE_FIND_ALL;
+
+ if (ignore_missing_commits)
+ mb_flags |= MERGE_BASE_IGNORE_MISSING_COMMITS;
if (repo_parse_commit(r, commit))
return ignore_missing_commits ? 0 : -1;
@@ -555,7 +573,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
if (paint_down_to_common(r, commit,
nr_reference, reference,
- generation, ignore_missing_commits, &bases))
+ generation, mb_flags, &bases))
ret = -1;
else if (commit->object.flags & PARENT2)
ret = 1;
diff --git a/commit-reach.h b/commit-reach.h
index 6012402dfc..41607d8952 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -17,10 +17,20 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
struct commit_list **result);
-/* To be used only when object flags after this call no longer matter */
+enum merge_base_flags {
+ MERGE_BASE_FIND_ALL = (1 << 0),
+ MERGE_BASE_IGNORE_MISSING_COMMITS = (1 << 1),
+};
+
+/*
+ * To be used only when object flags after this call no longer matter.
+ * Without MERGE_BASE_FIND_ALL and with generation numbers available,
+ * returns after finding the first merge-base, skipping the STALE drain.
+ */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
+ enum merge_base_flags mb_flags,
struct commit_list **result);
int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 44c726ea39..f6c85d4f53 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -305,4 +305,123 @@ test_expect_success 'merge-base --octopus --all for complex tree' '
test_cmp expected actual
'
+# The following tests verify that "git merge-base" (without --all)
+# returns the same result with and without a commit-graph.
+# This exercises the early-exit optimisation in paint_down_to_common
+# that skips the STALE drain when generation numbers are available.
+
+test_expect_success 'setup for commit-graph tests' '
+ git init graph-repo &&
+ (
+ cd graph-repo &&
+
+ # Build a forked DAG:
+ #
+ # L1---L2 (left)
+ # /
+ # S
+ # \
+ # R1---R2 (right)
+ #
+ test_commit GS &&
+ git checkout -b left &&
+ test_commit L1 &&
+ test_commit L2 &&
+ git checkout GS &&
+ git checkout -b right &&
+ test_commit GR1 &&
+ test_commit GR2
+ )
+'
+
+test_expect_success 'merge-base without commit-graph' '
+ (
+ cd graph-repo &&
+ rm -f .git/objects/info/commit-graph &&
+ git merge-base left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base with commit-graph' '
+ (
+ cd graph-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base --all with commit-graph' '
+ (
+ cd graph-repo &&
+ git merge-base --all left right >actual &&
+ git rev-parse GS >expected &&
+ test_cmp expected actual
+ )
+'
+
+test_expect_success 'merge-base agrees with --all for single result' '
+ (
+ cd graph-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.single &&
+ git merge-base --all left right >actual.all &&
+ test_cmp actual.all actual.single
+ )
+'
+
+test_expect_success 'setup for deep chain commit-graph test' '
+ git init deep-repo &&
+ (
+ cd deep-repo &&
+
+ # Build a deep forked DAG:
+ #
+ # L1--L2--...--L20 (left)
+ # /
+ # S
+ # \
+ # R1--R2--...--R20 (right)
+ #
+ test_commit DS &&
+ git checkout -b left &&
+ for i in $(test_seq 1 20)
+ do
+ test_commit DL$i || return 1
+ done &&
+ git checkout DS &&
+ git checkout -b right &&
+ for i in $(test_seq 1 20)
+ do
+ test_commit DR$i || return 1
+ done
+ )
+'
+
+test_expect_success 'deep chain: merge-base matches with and without commit-graph' '
+ (
+ cd deep-repo &&
+ rm -f .git/objects/info/commit-graph &&
+ git merge-base left right >actual.no-graph &&
+ git rev-parse DS >expected &&
+ test_cmp expected actual.no-graph &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.graph &&
+ test_cmp expected actual.graph
+ )
+'
+
+test_expect_success 'deep chain: --all and non---all agree with commit-graph' '
+ (
+ cd deep-repo &&
+ git commit-graph write --reachable &&
+ git merge-base left right >actual.single &&
+ git merge-base --all left right >actual.all &&
+ test_cmp actual.all actual.single
+ )
+'
+
test_done
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index dc0421ed2f..51c23b7683 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -882,4 +882,44 @@ test_expect_success 'rev-list --maximal-only matches merge-base --independent' '
test_cmp expect.sorted actual.sorted
'
+# The following tests verify the early-exit optimisation in
+# paint_down_to_common when merge-base is invoked without --all.
+# Each test checks all four commit-graph configurations.
+
+merge_base_all_modes () {
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-no-gdat .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual
+}
+
+test_expect_success 'merge-base without --all (unique base)' '
+ git rev-parse commit-5-3 >expect &&
+ merge_base_all_modes commit-5-7 commit-8-3
+'
+
+test_expect_success 'merge-base without --all is one of --all results' '
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all &&
+
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all
+'
+
test_done
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-11 11:22 ` [PATCH v3] " Kristofer Karlsson via GitGitGadget
@ 2026-05-11 12:04 ` Patrick Steinhardt
2026-05-11 12:59 ` [PATCH v4 0/2] [RFC] commit-reach: skip STALE drain when only one merge-base needed Kristofer Karlsson via GitGitGadget
1 sibling, 0 replies; 11+ messages in thread
From: Patrick Steinhardt @ 2026-05-11 12:04 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget; +Cc: git, Kristofer Karlsson
On Mon, May 11, 2026 at 11:22:12AM +0000, Kristofer Karlsson via GitGitGadget wrote:
> Changes since v2 (thanks Patrick for the suggestion):
>
> * Replaced the boolean find_all and ignore_missing_commits parameters
> in paint_down_to_common() with a single enum merge_base_flags
> mb_flags, reducing the function from 8 to 7 parameters. The enum is
> defined in commit-reach.h with MERGE_BASE_FIND_ALL and
> MERGE_BASE_IGNORE_MISSING_COMMITS.
>
> * Named the enum merge_base_flags rather than
> paint_down_to_common_flags since the flags express caller intent and
> are threaded through multiple layers including the public
> repo_get_merge_bases_many_dirty() API.
>
> * Used mb_flags as the parameter name to avoid shadowing the existing
> local int flags (commit object flags) inside paint_down_to_common().
Thanks for making these changes. I think it would make sense to split
this up into two commits though, where the first commit only introduces
the new enum (without the new flag) and the second commit then adds the
new flag and the performance optimization.
That'd help quite a bit with the review as it splits up the changes into
a refactoring-only change without any intended semantic change, and
another commit that then does result in a user-visible change.
Patrick
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v4 0/2] [RFC] commit-reach: skip STALE drain when only one merge-base needed
2026-05-11 11:22 ` [PATCH v3] " Kristofer Karlsson via GitGitGadget
2026-05-11 12:04 ` Patrick Steinhardt
@ 2026-05-11 12:59 ` Kristofer Karlsson via GitGitGadget
2026-05-11 12:59 ` [PATCH v4 1/2] commit-reach: introduce merge_base_flags enum Kristofer Karlsson via GitGitGadget
2026-05-11 12:59 ` [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
1 sibling, 2 replies; 11+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-11 12:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Kristofer Karlsson
Context for what this is all about.
I am working with a very large git monorepo and have been investigating
performance issues. After some digging I ended up looking more deeply into
git merge-base. I saw it had an --all parameter but the default is to only
return a single merge-base. Looking through the code and adding debug
timing, I realized that although the total time to compute the merge-base
was high, a very small amount of time was spent finding the initial
merge-base value that was later returned.
The optimization is actually quite dramatic in a large repo - runtime went
down from 5000ms to 50ms, so it's roughly a 100x optimization. This comes
from an exploding frontier of STALE commits to drain.
Thus, my idea is simply to return early from the function once we know what
will be returned. This only works if we find a candidate that we know will
not be pruned later - but fortunately if we have a commit graph with
generations we will visit commits in order such that it will actually not be
pruned.
CC: Derrick Stolee stolee@gmail.com
Changes since v1 (thanks Junio for the review):
* Dropped the has_gens variable entirely. If a commit has a finite
generation then it is in the commit-graph, and so are all its ancestors —
no additional check is needed to know the queue ordering is sound.
Without a commit-graph every commit gets INFINITY and the guard never
fires. This also avoids the misleading interaction with callers that pass
non-zero min_generation without having generation data.
* Simplified the early exit guard from three conditions to two: !find_all
&& generation < GENERATION_NUMBER_INFINITY.
* Fixed multi-line comment style per CodingGuidelines.
* Replaced "dominate" with concrete reasoning about queue ordering.
* Did not extract a helper function: after the simplifications above the
inner block is four lines and reads naturally inline. The right boundary
for a helper is not obvious (it could absorb just the result marking, or
also the RESULT flag check, or also the PARENT1|PARENT2 test) and each
level requires more local state passed by pointer. Happy to extract one
if preferred.
Changes since v2 (thanks Patrick for the suggestion):
* Split into two commits: the first is a pure refactoring that introduces
enum merge_base_flags and replaces the boolean ignore_missing_commits
parameter, the second adds the new MERGE_BASE_FIND_ALL flag and the early
exit optimization.
* Replaced the boolean find_all and ignore_missing_commits parameters in
paint_down_to_common() with a single enum merge_base_flags mb_flags,
reducing the function from 8 to 7 parameters. The enum is threaded
through merge_bases_many(), get_merge_bases_many_0(), and the public
repo_get_merge_bases_many_dirty() API.
* Named the enum merge_base_flags rather than paint_down_to_common_flags
since the flags express caller intent and are threaded through multiple
layers including the public API.
* Used mb_flags as the parameter name to avoid shadowing the existing local
int flags (commit object flags) inside paint_down_to_common().
Kristofer Karlsson (2):
commit-reach: introduce merge_base_flags enum
commit-reach: early exit paint_down_to_common for single merge-base
builtin/merge-base.c | 4 +++-
commit-reach.c | 36 +++++++++++++++++++++++++++---------
commit-reach.h | 12 +++++++++++-
t/t6600-test-reach.sh | 40 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 81 insertions(+), 11 deletions(-)
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2109%2Fspkrka%2Fmerge-base-early-exit-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2109/spkrka/merge-base-early-exit-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/2109
Range-diff vs v3:
1: e4dada892f ! 1: 12d9e1c85f commit-reach: early exit paint_down_to_common for single merge-base
@@ Metadata
Author: Kristofer Karlsson <krka@spotify.com>
## Commit message ##
- commit-reach: early exit paint_down_to_common for single merge-base
+ commit-reach: introduce merge_base_flags enum
- Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
- sort to the top of the priority queue. After those, commits with
- finite generation numbers are popped in non-increasing order.
- When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
- with a finite generation is therefore a best merge-base: no commit
- still in the queue can be a descendant of it. Skip the expensive
- STALE drain in this case.
+ Replace the boolean ignore_missing_commits parameter in
+ paint_down_to_common() with an enum merge_base_flags, and thread
+ the flags through merge_bases_many(), get_merge_bases_many_0(),
+ and the public repo_get_merge_bases_many_dirty() API.
- Introduce enum merge_base_flags with MERGE_BASE_FIND_ALL and
- MERGE_BASE_IGNORE_MISSING_COMMITS, replacing the two boolean
- parameters in paint_down_to_common(). Thread the flags through
- merge_bases_many(), get_merge_bases_many_0(), and the public
- repo_get_merge_bases_many_dirty() API. git merge-base (without
- --all) passes 0, triggering the early exit.
+ This makes callsites with boolean parameters easier to read and
+ prepares the function for additional flags in a subsequent commit.
- On a 2.2M-commit merge-heavy monorepo with commit-graph:
-
- HEAD vs ~500: 5,229ms -> 24ms
- HEAD vs ~1000: 4,214ms -> 39ms
- HEAD vs ~5000: 3,799ms -> 46ms
- HEAD vs ~10000: 3,827ms -> 61ms
+ No functional change: the single caller that used
+ ignore_missing_commits (repo_in_merge_bases_many) now sets
+ MERGE_BASE_IGNORE_MISSING_COMMITS in the flags word, and all
+ other callers pass 0.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
## builtin/merge-base.c ##
-@@
-
- static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
- {
-+ enum merge_base_flags flags = show_all ? MERGE_BASE_FIND_ALL : 0;
+@@ builtin/merge-base.c: static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
- rev_nr - 1, rev + 1, &result) < 0) {
+ rev_nr - 1, rev + 1,
-+ flags, &result) < 0) {
++ 0, &result) < 0) {
commit_list_free(result);
return -1;
}
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
-@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
- if (!(commit->object.flags & RESULT)) {
- commit->object.flags |= RESULT;
- tail = commit_list_append(commit, tail);
-+ /*
-+ * The queue is generation-ordered; no
-+ * remaining common ancestor can be a
-+ * descendant of this one.
-+ */
-+ if (!(mb_flags & MERGE_BASE_FIND_ALL) &&
-+ generation < GENERATION_NUMBER_INFINITY)
-+ break;
- }
- /* Mark parents of a found merge stale */
- flags |= STALE;
@@ commit-reach.c: static int paint_down_to_common(struct repository *r,
* corrupt commits would already have been
* dispatched with a `die()`.
@@ commit-reach.c: static int merge_bases_many(struct repository *r,
commit_list_free(list);
return -1;
}
-@@ commit-reach.c: static int remove_redundant_no_gen(struct repository *r,
- min_generation = curr_generation;
- }
- if (paint_down_to_common(r, array[i], filled,
-- work, min_generation, 0, &common)) {
-+ work, min_generation,
-+ MERGE_BASE_FIND_ALL, &common)) {
- clear_commit_marks(array[i], all_flags);
- clear_commit_marks_many(filled, work, all_flags);
- commit_list_free(common);
@@ commit-reach.c: static int get_merge_bases_many_0(struct repository *r,
size_t n,
struct commit **twos,
@@ commit-reach.c: int repo_get_merge_bases_many(struct repository *r,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, result);
-+ return get_merge_bases_many_0(r, one, n, twos, 1,
-+ MERGE_BASE_FIND_ALL, result);
++ return get_merge_bases_many_0(r, one, n, twos, 1, 0, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
@@ commit-reach.c: int repo_get_merge_bases(struct repository *r,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, result);
-+ return get_merge_bases_many_0(r, one, 1, &two, 1,
-+ MERGE_BASE_FIND_ALL, result);
++ return get_merge_bases_many_0(r, one, 1, &two, 1, 0, result);
}
/*
@@ commit-reach.c: int repo_in_merge_bases_many(struct repository *r, struct commit
struct commit_list *bases = NULL;
int ret = 0, i;
timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
-+ enum merge_base_flags mb_flags = MERGE_BASE_FIND_ALL;
++ enum merge_base_flags mb_flags = 0;
+
+ if (ignore_missing_commits)
+ mb_flags |= MERGE_BASE_IGNORE_MISSING_COMMITS;
@@ commit-reach.h: int repo_get_merge_bases_many(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
struct commit_list **result);
--/* To be used only when object flags after this call no longer matter */
+enum merge_base_flags {
-+ MERGE_BASE_FIND_ALL = (1 << 0),
-+ MERGE_BASE_IGNORE_MISSING_COMMITS = (1 << 1),
++ MERGE_BASE_IGNORE_MISSING_COMMITS = (1 << 0),
+};
+
-+/*
-+ * To be used only when object flags after this call no longer matter.
-+ * Without MERGE_BASE_FIND_ALL and with generation numbers available,
-+ * returns after finding the first merge-base, skipping the STALE drain.
-+ */
+ /* To be used only when object flags after this call no longer matter */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
@@ commit-reach.h: int repo_get_merge_bases_many(struct repository *r,
struct commit_list **result);
int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
-
- ## t/t6010-merge-base.sh ##
-@@ t/t6010-merge-base.sh: test_expect_success 'merge-base --octopus --all for complex tree' '
- test_cmp expected actual
- '
-
-+# The following tests verify that "git merge-base" (without --all)
-+# returns the same result with and without a commit-graph.
-+# This exercises the early-exit optimisation in paint_down_to_common
-+# that skips the STALE drain when generation numbers are available.
-+
-+test_expect_success 'setup for commit-graph tests' '
-+ git init graph-repo &&
-+ (
-+ cd graph-repo &&
-+
-+ # Build a forked DAG:
-+ #
-+ # L1---L2 (left)
-+ # /
-+ # S
-+ # \
-+ # R1---R2 (right)
-+ #
-+ test_commit GS &&
-+ git checkout -b left &&
-+ test_commit L1 &&
-+ test_commit L2 &&
-+ git checkout GS &&
-+ git checkout -b right &&
-+ test_commit GR1 &&
-+ test_commit GR2
-+ )
-+'
-+
-+test_expect_success 'merge-base without commit-graph' '
-+ (
-+ cd graph-repo &&
-+ rm -f .git/objects/info/commit-graph &&
-+ git merge-base left right >actual &&
-+ git rev-parse GS >expected &&
-+ test_cmp expected actual
-+ )
-+'
-+
-+test_expect_success 'merge-base with commit-graph' '
-+ (
-+ cd graph-repo &&
-+ git commit-graph write --reachable &&
-+ git merge-base left right >actual &&
-+ git rev-parse GS >expected &&
-+ test_cmp expected actual
-+ )
-+'
-+
-+test_expect_success 'merge-base --all with commit-graph' '
-+ (
-+ cd graph-repo &&
-+ git merge-base --all left right >actual &&
-+ git rev-parse GS >expected &&
-+ test_cmp expected actual
-+ )
-+'
-+
-+test_expect_success 'merge-base agrees with --all for single result' '
-+ (
-+ cd graph-repo &&
-+ git commit-graph write --reachable &&
-+ git merge-base left right >actual.single &&
-+ git merge-base --all left right >actual.all &&
-+ test_cmp actual.all actual.single
-+ )
-+'
-+
-+test_expect_success 'setup for deep chain commit-graph test' '
-+ git init deep-repo &&
-+ (
-+ cd deep-repo &&
-+
-+ # Build a deep forked DAG:
-+ #
-+ # L1--L2--...--L20 (left)
-+ # /
-+ # S
-+ # \
-+ # R1--R2--...--R20 (right)
-+ #
-+ test_commit DS &&
-+ git checkout -b left &&
-+ for i in $(test_seq 1 20)
-+ do
-+ test_commit DL$i || return 1
-+ done &&
-+ git checkout DS &&
-+ git checkout -b right &&
-+ for i in $(test_seq 1 20)
-+ do
-+ test_commit DR$i || return 1
-+ done
-+ )
-+'
-+
-+test_expect_success 'deep chain: merge-base matches with and without commit-graph' '
-+ (
-+ cd deep-repo &&
-+ rm -f .git/objects/info/commit-graph &&
-+ git merge-base left right >actual.no-graph &&
-+ git rev-parse DS >expected &&
-+ test_cmp expected actual.no-graph &&
-+ git commit-graph write --reachable &&
-+ git merge-base left right >actual.graph &&
-+ test_cmp expected actual.graph
-+ )
-+'
-+
-+test_expect_success 'deep chain: --all and non---all agree with commit-graph' '
-+ (
-+ cd deep-repo &&
-+ git commit-graph write --reachable &&
-+ git merge-base left right >actual.single &&
-+ git merge-base --all left right >actual.all &&
-+ test_cmp actual.all actual.single
-+ )
-+'
-+
- test_done
-
- ## t/t6600-test-reach.sh ##
-@@ t/t6600-test-reach.sh: test_expect_success 'rev-list --maximal-only matches merge-base --independent' '
- test_cmp expect.sorted actual.sorted
- '
-
-+# The following tests verify the early-exit optimisation in
-+# paint_down_to_common when merge-base is invoked without --all.
-+# Each test checks all four commit-graph configurations.
-+
-+merge_base_all_modes () {
-+ test_when_finished rm -rf .git/objects/info/commit-graph &&
-+ git merge-base "$@" >actual &&
-+ test_cmp expect actual &&
-+ cp commit-graph-full .git/objects/info/commit-graph &&
-+ git merge-base "$@" >actual &&
-+ test_cmp expect actual &&
-+ cp commit-graph-half .git/objects/info/commit-graph &&
-+ git merge-base "$@" >actual &&
-+ test_cmp expect actual &&
-+ cp commit-graph-no-gdat .git/objects/info/commit-graph &&
-+ git merge-base "$@" >actual &&
-+ test_cmp expect actual
-+}
-+
-+test_expect_success 'merge-base without --all (unique base)' '
-+ git rev-parse commit-5-3 >expect &&
-+ merge_base_all_modes commit-5-7 commit-8-3
-+'
-+
-+test_expect_success 'merge-base without --all is one of --all results' '
-+ test_when_finished rm -rf .git/objects/info/commit-graph &&
-+
-+ cp commit-graph-full .git/objects/info/commit-graph &&
-+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
-+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
-+ test_line_count = 1 single &&
-+ grep -F -f single all &&
-+
-+ cp commit-graph-half .git/objects/info/commit-graph &&
-+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
-+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
-+ test_line_count = 1 single &&
-+ grep -F -f single all
-+'
-+
- test_done
-: ---------- > 2: 19f1605067 commit-reach: early exit paint_down_to_common for single merge-base
--
gitgitgadget
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v4 1/2] commit-reach: introduce merge_base_flags enum
2026-05-11 12:59 ` [PATCH v4 0/2] [RFC] commit-reach: skip STALE drain when only one merge-base needed Kristofer Karlsson via GitGitGadget
@ 2026-05-11 12:59 ` Kristofer Karlsson via GitGitGadget
2026-05-11 12:59 ` [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
1 sibling, 0 replies; 11+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-11 12:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Kristofer Karlsson, Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Replace the boolean ignore_missing_commits parameter in
paint_down_to_common() with an enum merge_base_flags, and thread
the flags through merge_bases_many(), get_merge_bases_many_0(),
and the public repo_get_merge_bases_many_dirty() API.
This makes callsites with boolean parameters easier to read and
prepares the function for additional flags in a subsequent commit.
No functional change: the single caller that used
ignore_missing_commits (repo_in_merge_bases_many) now sets
MERGE_BASE_IGNORE_MISSING_COMMITS in the flags word, and all
other callers pass 0.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
builtin/merge-base.c | 3 ++-
commit-reach.c | 23 +++++++++++++++--------
commit-reach.h | 5 +++++
3 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index c7ee97fa6a..9b50b4660e 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -14,7 +14,8 @@ static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
- rev_nr - 1, rev + 1, &result) < 0) {
+ rev_nr - 1, rev + 1,
+ 0, &result) < 0) {
commit_list_free(result);
return -1;
}
diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..766ba1156a 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -54,7 +54,7 @@ static int paint_down_to_common(struct repository *r,
struct commit *one, int n,
struct commit **twos,
timestamp_t min_generation,
- int ignore_missing_commits,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
@@ -118,7 +118,7 @@ static int paint_down_to_common(struct repository *r,
* corrupt commits would already have been
* dispatched with a `die()`.
*/
- if (ignore_missing_commits)
+ if (mb_flags & MERGE_BASE_IGNORE_MISSING_COMMITS)
return 0;
return error(_("could not parse commit %s"),
oid_to_hex(&p->object.oid));
@@ -136,6 +136,7 @@ static int paint_down_to_common(struct repository *r,
static int merge_bases_many(struct repository *r,
struct commit *one, int n,
struct commit **twos,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct commit_list *list = NULL, **tail = result;
@@ -165,7 +166,7 @@ static int merge_bases_many(struct repository *r,
oid_to_hex(&twos[i]->object.oid));
}
- if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
+ if (paint_down_to_common(r, one, n, twos, 0, mb_flags, &list)) {
commit_list_free(list);
return -1;
}
@@ -425,6 +426,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t n,
struct commit **twos,
int cleanup,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
struct commit_list *list, **tail = result;
@@ -432,7 +434,7 @@ static int get_merge_bases_many_0(struct repository *r,
size_t cnt, i;
int ret;
- if (merge_bases_many(r, one, n, twos, result) < 0)
+ if (merge_bases_many(r, one, n, twos, mb_flags, result) < 0)
return -1;
for (i = 0; i < n; i++) {
if (one == twos[i])
@@ -475,16 +477,17 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, result);
+ return get_merge_bases_many_0(r, one, n, twos, 1, 0, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one,
size_t n,
struct commit **twos,
+ enum merge_base_flags mb_flags,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 0, result);
+ return get_merge_bases_many_0(r, one, n, twos, 0, mb_flags, result);
}
int repo_get_merge_bases(struct repository *r,
@@ -492,7 +495,7 @@ int repo_get_merge_bases(struct repository *r,
struct commit *two,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, result);
+ return get_merge_bases_many_0(r, one, 1, &two, 1, 0, result);
}
/*
@@ -537,6 +540,10 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
struct commit_list *bases = NULL;
int ret = 0, i;
timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
+ enum merge_base_flags mb_flags = 0;
+
+ if (ignore_missing_commits)
+ mb_flags |= MERGE_BASE_IGNORE_MISSING_COMMITS;
if (repo_parse_commit(r, commit))
return ignore_missing_commits ? 0 : -1;
@@ -555,7 +562,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
if (paint_down_to_common(r, commit,
nr_reference, reference,
- generation, ignore_missing_commits, &bases))
+ generation, mb_flags, &bases))
ret = -1;
else if (commit->object.flags & PARENT2)
ret = 1;
diff --git a/commit-reach.h b/commit-reach.h
index 6012402dfc..a3f2cd80eb 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -17,10 +17,15 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
struct commit_list **result);
+enum merge_base_flags {
+ MERGE_BASE_IGNORE_MISSING_COMMITS = (1 << 0),
+};
+
/* To be used only when object flags after this call no longer matter */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
+ enum merge_base_flags mb_flags,
struct commit_list **result);
int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
--
gitgitgadget
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-11 12:59 ` [PATCH v4 0/2] [RFC] commit-reach: skip STALE drain when only one merge-base needed Kristofer Karlsson via GitGitGadget
2026-05-11 12:59 ` [PATCH v4 1/2] commit-reach: introduce merge_base_flags enum Kristofer Karlsson via GitGitGadget
@ 2026-05-11 12:59 ` Kristofer Karlsson via GitGitGadget
2026-05-12 0:40 ` Junio C Hamano
1 sibling, 1 reply; 11+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-11 12:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Kristofer Karlsson, Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify.com>
Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
sort to the top of the priority queue. After those, commits with
finite generation numbers are popped in non-increasing order.
When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
with a finite generation is therefore a best merge-base: no commit
still in the queue can be a descendant of it. Skip the expensive
STALE drain in this case.
Add MERGE_BASE_FIND_ALL to the merge_base_flags enum. Callers that
need every merge-base (repo_get_merge_bases_many, repo_get_merge_bases,
repo_in_merge_bases_many, remove_redundant_no_gen) pass the flag to
preserve existing behavior. git merge-base (without --all) passes 0,
triggering the early exit.
On a 2.2M-commit merge-heavy monorepo with commit-graph:
HEAD vs ~500: 5,229ms -> 24ms
HEAD vs ~1000: 4,214ms -> 39ms
HEAD vs ~5000: 3,799ms -> 46ms
HEAD vs ~10000: 3,827ms -> 61ms
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
builtin/merge-base.c | 3 ++-
commit-reach.c | 19 +++++++++++++++----
commit-reach.h | 7 ++++++-
t/t6600-test-reach.sh | 40 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 63 insertions(+), 6 deletions(-)
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 9b50b4660e..a87011c6cd 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -11,11 +11,12 @@
static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
{
+ enum merge_base_flags flags = show_all ? MERGE_BASE_FIND_ALL : 0;
struct commit_list *result = NULL, *r;
if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
rev_nr - 1, rev + 1,
- 0, &result) < 0) {
+ flags, &result) < 0) {
commit_list_free(result);
return -1;
}
diff --git a/commit-reach.c b/commit-reach.c
index 766ba1156a..5a52be90a6 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -97,6 +97,14 @@ static int paint_down_to_common(struct repository *r,
if (!(commit->object.flags & RESULT)) {
commit->object.flags |= RESULT;
tail = commit_list_append(commit, tail);
+ /*
+ * The queue is generation-ordered; no
+ * remaining common ancestor can be a
+ * descendant of this one.
+ */
+ if (!(mb_flags & MERGE_BASE_FIND_ALL) &&
+ generation < GENERATION_NUMBER_INFINITY)
+ break;
}
/* Mark parents of a found merge stale */
flags |= STALE;
@@ -247,7 +255,8 @@ static int remove_redundant_no_gen(struct repository *r,
min_generation = curr_generation;
}
if (paint_down_to_common(r, array[i], filled,
- work, min_generation, 0, &common)) {
+ work, min_generation,
+ MERGE_BASE_FIND_ALL, &common)) {
clear_commit_marks(array[i], all_flags);
clear_commit_marks_many(filled, work, all_flags);
commit_list_free(common);
@@ -477,7 +486,8 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit **twos,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, n, twos, 1, 0, result);
+ return get_merge_bases_many_0(r, one, n, twos, 1,
+ MERGE_BASE_FIND_ALL, result);
}
int repo_get_merge_bases_many_dirty(struct repository *r,
@@ -495,7 +505,8 @@ int repo_get_merge_bases(struct repository *r,
struct commit *two,
struct commit_list **result)
{
- return get_merge_bases_many_0(r, one, 1, &two, 1, 0, result);
+ return get_merge_bases_many_0(r, one, 1, &two, 1,
+ MERGE_BASE_FIND_ALL, result);
}
/*
@@ -540,7 +551,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
struct commit_list *bases = NULL;
int ret = 0, i;
timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
- enum merge_base_flags mb_flags = 0;
+ enum merge_base_flags mb_flags = MERGE_BASE_FIND_ALL;
if (ignore_missing_commits)
mb_flags |= MERGE_BASE_IGNORE_MISSING_COMMITS;
diff --git a/commit-reach.h b/commit-reach.h
index a3f2cd80eb..3f3a563d8a 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -19,9 +19,14 @@ int repo_get_merge_bases_many(struct repository *r,
struct commit_list **result);
enum merge_base_flags {
MERGE_BASE_IGNORE_MISSING_COMMITS = (1 << 0),
+ MERGE_BASE_FIND_ALL = (1 << 1),
};
-/* To be used only when object flags after this call no longer matter */
+/*
+ * To be used only when object flags after this call no longer matter.
+ * Without MERGE_BASE_FIND_ALL and with generation numbers available,
+ * returns after finding the first merge-base, skipping the STALE drain.
+ */
int repo_get_merge_bases_many_dirty(struct repository *r,
struct commit *one, size_t n,
struct commit **twos,
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index dc0421ed2f..51c23b7683 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -882,4 +882,44 @@ test_expect_success 'rev-list --maximal-only matches merge-base --independent' '
test_cmp expect.sorted actual.sorted
'
+# The following tests verify the early-exit optimisation in
+# paint_down_to_common when merge-base is invoked without --all.
+# Each test checks all four commit-graph configurations.
+
+merge_base_all_modes () {
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual &&
+ cp commit-graph-no-gdat .git/objects/info/commit-graph &&
+ git merge-base "$@" >actual &&
+ test_cmp expect actual
+}
+
+test_expect_success 'merge-base without --all (unique base)' '
+ git rev-parse commit-5-3 >expect &&
+ merge_base_all_modes commit-5-7 commit-8-3
+'
+
+test_expect_success 'merge-base without --all is one of --all results' '
+ test_when_finished rm -rf .git/objects/info/commit-graph &&
+
+ cp commit-graph-full .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all &&
+
+ cp commit-graph-half .git/objects/info/commit-graph &&
+ git merge-base --all commit-5-7 commit-4-8 commit-6-6 commit-8-3 >all &&
+ git merge-base commit-5-7 commit-4-8 commit-6-6 commit-8-3 >single &&
+ test_line_count = 1 single &&
+ grep -F -f single all
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-11 12:59 ` [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
@ 2026-05-12 0:40 ` Junio C Hamano
2026-05-12 5:16 ` Kristofer Karlsson
0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2026-05-12 0:40 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget
Cc: git, Patrick Steinhardt, Kristofer Karlsson
"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Kristofer Karlsson <krka@spotify.com>
>
> Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
> sort to the top of the priority queue. After those, commits with
> finite generation numbers are popped in non-increasing order.
> When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
> with a finite generation is therefore a best merge-base: no commit
> still in the queue can be a descendant of it. Skip the expensive
> STALE drain in this case.
>
> Add MERGE_BASE_FIND_ALL to the merge_base_flags enum. Callers that
> need every merge-base (repo_get_merge_bases_many, repo_get_merge_bases,
> repo_in_merge_bases_many, remove_redundant_no_gen) pass the flag to
> preserve existing behavior. git merge-base (without --all) passes 0,
> triggering the early exit.
>
> On a 2.2M-commit merge-heavy monorepo with commit-graph:
>
> HEAD vs ~500: 5,229ms -> 24ms
> HEAD vs ~1000: 4,214ms -> 39ms
> HEAD vs ~5000: 3,799ms -> 46ms
> HEAD vs ~10000: 3,827ms -> 61ms
>
> Signed-off-by: Kristofer Karlsson <krka@spotify.com>
> ---
> builtin/merge-base.c | 3 ++-
> commit-reach.c | 19 +++++++++++++++----
> commit-reach.h | 7 ++++++-
> t/t6600-test-reach.sh | 40 ++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 63 insertions(+), 6 deletions(-)
Very nicely done and well described.
> diff --git a/builtin/merge-base.c b/builtin/merge-base.c
> index 9b50b4660e..a87011c6cd 100644
> --- a/builtin/merge-base.c
> +++ b/builtin/merge-base.c
> @@ -11,11 +11,12 @@
>
> static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
> {
> + enum merge_base_flags flags = show_all ? MERGE_BASE_FIND_ALL : 0;
Curious that only this variable, among 6 that this two-patch series
introduces for the type, is called "flags" while all others are
called "mb_flags". No need to change it; the comment is mostly to
show I did read the two patches with reasonable attention to the
detail ;-).
Will queue. Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base
2026-05-12 0:40 ` Junio C Hamano
@ 2026-05-12 5:16 ` Kristofer Karlsson
0 siblings, 0 replies; 11+ messages in thread
From: Kristofer Karlsson @ 2026-05-12 5:16 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristofer Karlsson via GitGitGadget, git, Patrick Steinhardt
Thank you!
As for the difference in variable name, I will attribute it to a mix
of oversight and personal preference to keep variable names short if
their scope is very small (and longer names for things like fields or
larger scope).
In fact, I might have preferred flags instead of mb_flags within
paint_down_to_common but it conflicted with the existing flags for the
commit so I had to differentiate them.
That said, I think it would also be fair to rename it to mb_flags everywhere.
On Tue, 12 May 2026 at 02:40, Junio C Hamano <gitster@pobox.com> wrote:
>
> "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Kristofer Karlsson <krka@spotify.com>
> >
> > Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
> > sort to the top of the priority queue. After those, commits with
> > finite generation numbers are popped in non-increasing order.
> > When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
> > with a finite generation is therefore a best merge-base: no commit
> > still in the queue can be a descendant of it. Skip the expensive
> > STALE drain in this case.
> >
> > Add MERGE_BASE_FIND_ALL to the merge_base_flags enum. Callers that
> > need every merge-base (repo_get_merge_bases_many, repo_get_merge_bases,
> > repo_in_merge_bases_many, remove_redundant_no_gen) pass the flag to
> > preserve existing behavior. git merge-base (without --all) passes 0,
> > triggering the early exit.
> >
> > On a 2.2M-commit merge-heavy monorepo with commit-graph:
> >
> > HEAD vs ~500: 5,229ms -> 24ms
> > HEAD vs ~1000: 4,214ms -> 39ms
> > HEAD vs ~5000: 3,799ms -> 46ms
> > HEAD vs ~10000: 3,827ms -> 61ms
> >
> > Signed-off-by: Kristofer Karlsson <krka@spotify.com>
> > ---
> > builtin/merge-base.c | 3 ++-
> > commit-reach.c | 19 +++++++++++++++----
> > commit-reach.h | 7 ++++++-
> > t/t6600-test-reach.sh | 40 ++++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 63 insertions(+), 6 deletions(-)
>
> Very nicely done and well described.
>
> > diff --git a/builtin/merge-base.c b/builtin/merge-base.c
> > index 9b50b4660e..a87011c6cd 100644
> > --- a/builtin/merge-base.c
> > +++ b/builtin/merge-base.c
> > @@ -11,11 +11,12 @@
> >
> > static int show_merge_base(struct commit **rev, size_t rev_nr, int show_all)
> > {
> > + enum merge_base_flags flags = show_all ? MERGE_BASE_FIND_ALL : 0;
>
> Curious that only this variable, among 6 that this two-patch series
> introduces for the type, is called "flags" while all others are
> called "mb_flags". No need to change it; the comment is mostly to
> show I did read the two patches with reasonable attention to the
> detail ;-).
>
> Will queue. Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-05-12 5:16 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-08 15:07 [PATCH] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
2026-05-11 2:08 ` Junio C Hamano
2026-05-11 6:19 ` [PATCH v2] " Kristofer Karlsson via GitGitGadget
2026-05-11 7:22 ` Patrick Steinhardt
2026-05-11 11:22 ` [PATCH v3] " Kristofer Karlsson via GitGitGadget
2026-05-11 12:04 ` Patrick Steinhardt
2026-05-11 12:59 ` [PATCH v4 0/2] [RFC] commit-reach: skip STALE drain when only one merge-base needed Kristofer Karlsson via GitGitGadget
2026-05-11 12:59 ` [PATCH v4 1/2] commit-reach: introduce merge_base_flags enum Kristofer Karlsson via GitGitGadget
2026-05-11 12:59 ` [PATCH v4 2/2] commit-reach: early exit paint_down_to_common for single merge-base Kristofer Karlsson via GitGitGadget
2026-05-12 0:40 ` Junio C Hamano
2026-05-12 5:16 ` Kristofer Karlsson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox