* [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible
@ 2026-04-06 13:27 Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 1/3] t6600: test --maximal-only and --independent Derrick Stolee via GitGitGadget
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-04-06 13:27 UTC (permalink / raw)
To: git; +Cc: gitster, j6t, Derrick Stolee
The --maximal-only option was added to git rev-list in b4e8f60a3c (revision:
add --maximal-only option, 2026-01-22) and the discussion [1] included talks
of how 'git rev-list --maximal-only <refs>' acts the same as 'git merge-base
--independent <refs>' assuming that no other walk modifiers are provided to
the revision walk. And with those assumptions, the merge-base algorithm can
be faster if the refs have most of their history shared.
[1]
https://lore.kernel.org/git/pull.2032.v2.git.1769097958549.gitgitgadget@gmail.com/
This series updates the revision walk to use the merge-base algorithm when
possible. This checks the rev_info struct for options that cause the walk to
be different and also looks for negative references. If none of these
appear, then the merge-base algorithm is used instead.
The series is broken into three patches that could theoretically be squashed
into a single patch.
1. The first demonstrates the equivalence of these two commands via some
tests.
2. The second creates a performance test and documents the current
behavior.
3. The third updates the implementation and demonstrates the improvement in
the case of no walk modifiers.
Thanks, -Stolee
Derrick Stolee (3):
t6600: test --maximal-only and --independent
p6011: add perf test for rev-list --maximal-only
rev-list: use reduce_heads() for --maximal-only
builtin/rev-list.c | 59 ++++++++++++++++++++++++++++++++
t/perf/p6011-rev-list-maximal.sh | 29 ++++++++++++++++
t/t6000-rev-list-misc.sh | 31 +++++++++++++++++
t/t6600-test-reach.sh | 45 ++++++++++++++++++++++++
4 files changed, 164 insertions(+)
create mode 100755 t/perf/p6011-rev-list-maximal.sh
base-commit: ca1db8a0f7dc0dbea892e99f5b37c5fe5861be71
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2082%2Fderrickstolee%2Fmaximal-faster-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2082/derrickstolee/maximal-faster-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2082
--
gitgitgadget
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/3] t6600: test --maximal-only and --independent
2026-04-06 13:27 [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible Derrick Stolee via GitGitGadget
@ 2026-04-06 13:27 ` Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 2/3] p6011: add perf test for rev-list --maximal-only Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 3/3] rev-list: use reduce_heads() for --maximal-only Derrick Stolee via GitGitGadget
2 siblings, 0 replies; 4+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-04-06 13:27 UTC (permalink / raw)
To: git; +Cc: gitster, j6t, Derrick Stolee, Derrick Stolee
From: Derrick Stolee <stolee@gmail.com>
Add a test that verifies the 'git rev-list --maximal-only' option
produces the same set of commits as 'git merge-base --independent'. This
equivalence was noted when the feature was first created, but we are
about to update the implementation to use a common algorithm in this
case where the user intention is identical.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
t/t6600-test-reach.sh | 45 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 2613075894..dc0421ed2f 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -837,4 +837,49 @@ test_expect_success 'rev-list --maximal-only (range)' '
--first-parent --exclude-first-parent-only
'
+test_expect_success 'rev-list --maximal-only matches merge-base --independent' '
+ # Mix of independent and dependent
+ git merge-base --independent \
+ refs/heads/commit-5-2 \
+ refs/heads/commit-3-2 \
+ refs/heads/commit-2-5 >expect &&
+ sort expect >expect.sorted &&
+ git rev-list --maximal-only \
+ refs/heads/commit-5-2 \
+ refs/heads/commit-3-2 \
+ refs/heads/commit-2-5 >actual &&
+ sort actual >actual.sorted &&
+ test_cmp expect.sorted actual.sorted &&
+
+ # All independent commits.
+ git merge-base --independent \
+ refs/heads/commit-5-2 \
+ refs/heads/commit-4-3 \
+ refs/heads/commit-3-4 \
+ refs/heads/commit-2-5 >expect &&
+ sort expect >expect.sorted &&
+ git rev-list --maximal-only \
+ refs/heads/commit-5-2 \
+ refs/heads/commit-4-3 \
+ refs/heads/commit-3-4 \
+ refs/heads/commit-2-5 >actual &&
+ sort actual >actual.sorted &&
+ test_cmp expect.sorted actual.sorted &&
+
+ # Only one independent.
+ git merge-base --independent \
+ refs/heads/commit-1-1 \
+ refs/heads/commit-4-2 \
+ refs/heads/commit-4-4 \
+ refs/heads/commit-8-4 >expect &&
+ sort expect >expect.sorted &&
+ git rev-list --maximal-only \
+ refs/heads/commit-1-1 \
+ refs/heads/commit-4-2 \
+ refs/heads/commit-4-4 \
+ refs/heads/commit-8-4 >actual &&
+ sort actual >actual.sorted &&
+ test_cmp expect.sorted actual.sorted
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/3] p6011: add perf test for rev-list --maximal-only
2026-04-06 13:27 [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 1/3] t6600: test --maximal-only and --independent Derrick Stolee via GitGitGadget
@ 2026-04-06 13:27 ` Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 3/3] rev-list: use reduce_heads() for --maximal-only Derrick Stolee via GitGitGadget
2 siblings, 0 replies; 4+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-04-06 13:27 UTC (permalink / raw)
To: git; +Cc: gitster, j6t, Derrick Stolee, Derrick Stolee
From: Derrick Stolee <stolee@gmail.com>
Add a performance test that compares 'git rev-list --maximal-only'
against 'git merge-base --independent'. These two commands are asking
essentially the same thing, but the rev-list implementation is more
generic and hence slower. These performance tests will demonstrate that
in the current state and also be used to show the equivalence in the
future.
We also add a case with '--since' to force the generic walk logic for
rev-list even when we make that future change to use the merge-base
algorithm on a simple walk.
When run on my copy of git.git, I see these results:
Test HEAD
----------------------------------------------
6011.2: merge-base --independent 0.03
6011.3: rev-list --maximal-only 0.06
6011.4: rev-list --maximal-only --since 0.06
These numbers are low, but the --independent calculation is interesting
due to having a lot of local branches that are actually independent.
Running the same test on a fresh clone of the Linux kernel repository
shows a larger difference between the algorithms, especially because the
--independent algorithm is extremely fast when there are no independent
references selected:
Test HEAD
----------------------------------------------
6011.2: merge-base --independent 0.00
6011.3: rev-list --maximal-only 0.70
6011.4: rev-list --maximal-only --since 0.70
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
t/perf/p6011-rev-list-maximal.sh | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
create mode 100755 t/perf/p6011-rev-list-maximal.sh
diff --git a/t/perf/p6011-rev-list-maximal.sh b/t/perf/p6011-rev-list-maximal.sh
new file mode 100755
index 0000000000..e868e83ff8
--- /dev/null
+++ b/t/perf/p6011-rev-list-maximal.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+test_description='Test --maximal-only and --independent options'
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+test_expect_success 'setup' '
+ git for-each-ref --format="%(*objecttype) %(objecttype) %(objectname)" \
+ "refs/heads/*" "refs/tags/*" |
+ sed -n -e "s/^commit commit //p" -e "s/^ commit //p" |
+ head -n 50 >commits &&
+ git commit-graph write --reachable
+'
+
+test_perf 'merge-base --independent' '
+ git merge-base --independent $(cat commits) >/dev/null
+'
+
+test_perf 'rev-list --maximal-only' '
+ git rev-list --maximal-only $(cat commits) >/dev/null
+'
+
+test_perf 'rev-list --maximal-only --since' '
+ git rev-list --maximal-only --since=2000-01-01 $(cat commits) >/dev/null
+'
+
+test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 3/3] rev-list: use reduce_heads() for --maximal-only
2026-04-06 13:27 [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 1/3] t6600: test --maximal-only and --independent Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 2/3] p6011: add perf test for rev-list --maximal-only Derrick Stolee via GitGitGadget
@ 2026-04-06 13:27 ` Derrick Stolee via GitGitGadget
2 siblings, 0 replies; 4+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-04-06 13:27 UTC (permalink / raw)
To: git; +Cc: gitster, j6t, Derrick Stolee, Derrick Stolee
From: Derrick Stolee <stolee@gmail.com>
The 'git rev-list --maximal-only' option filters the output to only
independent commits. A commit is independent if it is not reachable from
other listed commits. Currently this is implemented by doing a full
revision walk and marking parents with CHILD_VISITED to skip non-maximal
commits.
The 'git merge-base --independent' command computes the same result
using reduce_heads(), which uses the more efficient remove_redundant()
algorithm. This is significantly faster because it avoids walking the
entire commit graph.
Add a fast path in rev-list that detects when --maximal-only is the only
interesting option and all input commits are positive (no revision
ranges). In this case, use reduce_heads() directly instead of doing a
full revision walk.
In order to preserve the rest of the output filtering, this computation
is done opportunistically in a new prepare_maximal_independent() method
when possible. If successful, it populates revs->commits with the list
of independent commits and set revs->no_walk to prevent any other walk
from occurring. This allows us to have any custom output be handled
using the existing output code hidden inside
traverse_commit_list_filtered(). A new test is added to demonstrate that
this output is preserved.
The fast path is only used when no other flags complicate the walk or
output format: no UNINTERESTING commits, no limiting options (max-count,
age filters, path filters, grep filters), no output formatting beyond
plain OIDs, and no object listing flags.
Running the p6011 performance test for my copy of git.git, I see the
following improvement with this change:
Test HEAD~1 HEAD
------------------------------------------------------------
6011.2: merge-base --independent 0.03 0.03 +0.0%
6011.3: rev-list --maximal-only 0.06 0.03 -50.0%
6011.4: rev-list --maximal-only --since 0.06 0.06 +0.0%
And for a fresh clone of the Linux kernel repository, I see:
Test HEAD~1 HEAD
------------------------------------------------------------
6011.2: merge-base --independent 0.00 0.00 =
6011.3: rev-list --maximal-only 0.70 0.00 -100.0%
6011.4: rev-list --maximal-only --since 0.70 0.70 +0.0%
In both cases, the performance is indeed matching the behavior of 'git
merge-base --independent', as expected.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
builtin/rev-list.c | 59 ++++++++++++++++++++++++++++++++++++++++
t/t6000-rev-list-misc.sh | 31 +++++++++++++++++++++
2 files changed, 90 insertions(+)
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 854d82ece3..8f63003709 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -25,6 +25,7 @@
#include "oidset.h"
#include "oidmap.h"
#include "packfile.h"
+#include "commit-reach.h"
#include "quote.h"
#include "strbuf.h"
@@ -633,6 +634,61 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
return 0;
}
+/*
+ * If revs->maximal_only is set and no other walk modifiers are provided,
+ * run a faster computation to filter the independent commits and prepare
+ * them for output. Set revs->no_walk to prevent later walking.
+ *
+ * If this algorithm doesn't apply, then no changes are made to revs.
+ */
+static void prepare_maximal_independent(struct rev_info *revs)
+{
+ struct commit_list *c;
+
+ if (!revs->maximal_only)
+ return;
+
+ for (c = revs->commits; c; c = c->next) {
+ if (c->item->object.flags & UNINTERESTING)
+ return;
+ }
+
+ if (revs->limited ||
+ revs->topo_order ||
+ revs->first_parent_only ||
+ revs->reverse ||
+ revs->max_count >= 0 ||
+ revs->skip_count >= 0 ||
+ revs->min_age != (timestamp_t)-1 ||
+ revs->max_age != (timestamp_t)-1 ||
+ revs->min_parents > 0 ||
+ revs->max_parents >= 0 ||
+ revs->prune_data.nr ||
+ revs->count ||
+ revs->left_right ||
+ revs->boundary ||
+ revs->tag_objects ||
+ revs->tree_objects ||
+ revs->blob_objects ||
+ revs->filter.choice ||
+ revs->reflog_info ||
+ revs->diff ||
+ revs->grep_filter.pattern_list ||
+ revs->grep_filter.header_list ||
+ revs->verbose_header ||
+ revs->print_parents ||
+ revs->edge_hint ||
+ revs->unpacked ||
+ revs->no_kept_objects ||
+ revs->line_level_traverse)
+ return;
+
+ reduce_heads_replace(&revs->commits);
+
+ /* Modify 'revs' to only output this commit list. */
+ revs->no_walk = 1;
+}
+
int cmd_rev_list(int argc,
const char **argv,
const char *prefix,
@@ -875,6 +931,9 @@ int cmd_rev_list(int argc,
if (prepare_revision_walk(&revs))
die("revision walk setup failed");
+
+ prepare_maximal_independent(&revs);
+
if (revs.tree_objects)
mark_edges_uninteresting(&revs, show_edge, 0);
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index d0a2a86610..a95ba576fa 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -263,4 +263,35 @@ test_expect_success 'rev-list --boundary incompatible with --maximal-only' '
test_grep "cannot be used together" err
'
+test_expect_success 'rev-list --maximal-only and --pretty' '
+ test_when_finished rm -rf repo &&
+
+ git init repo &&
+ test_commit -C repo 1 &&
+ oid1=$(git -C repo rev-parse HEAD) &&
+ test_commit -C repo 2 &&
+ oid2=$(git -C repo rev-parse HEAD) &&
+ git -C repo checkout --detach HEAD~1 &&
+ test_commit -C repo 3 &&
+ oid3=$(git -C repo rev-parse HEAD) &&
+
+ cat >expect <<-EOF &&
+ commit $oid3
+ $oid3
+ commit $oid2
+ $oid2
+ EOF
+
+ git -C repo rev-list --pretty="%H" --maximal-only $oid1 $oid2 $oid3 >out &&
+ test_cmp expect out &&
+
+ cat >expect <<-EOF &&
+ $oid3
+ $oid2
+ EOF
+
+ git -C repo log --pretty="%H" --maximal-only $oid1 $oid2 $oid3 >out &&
+ test_cmp expect out
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-04-06 13:27 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-06 13:27 [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 1/3] t6600: test --maximal-only and --independent Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 2/3] p6011: add perf test for rev-list --maximal-only Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 3/3] rev-list: use reduce_heads() for --maximal-only Derrick Stolee via GitGitGadget
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox