Git development
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, j6t@kdbg.org,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: [PATCH 3/3] rev-list: use reduce_heads() for --maximal-only
Date: Mon, 06 Apr 2026 13:27:28 +0000	[thread overview]
Message-ID: <0bdb88c85eabfa88b97c83a8f20f76cb8ed0489d.1775482048.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2082.git.1775482048.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

The 'git rev-list --maximal-only' option filters the output to only
independent commits. A commit is independent if it is not reachable from
other listed commits. Currently this is implemented by doing a full
revision walk and marking parents with CHILD_VISITED to skip non-maximal
commits.

The 'git merge-base --independent' command computes the same result
using reduce_heads(), which uses the more efficient remove_redundant()
algorithm. This is significantly faster because it avoids walking the
entire commit graph.

Add a fast path in rev-list that detects when --maximal-only is the only
interesting option and all input commits are positive (no revision
ranges). In this case, use reduce_heads() directly instead of doing a
full revision walk.

In order to preserve the rest of the output filtering, this computation
is done opportunistically in a new prepare_maximal_independent() method
when possible. If successful, it populates revs->commits with the list
of independent commits and set revs->no_walk to prevent any other walk
from occurring. This allows us to have any custom output be handled
using the existing output code hidden inside
traverse_commit_list_filtered(). A new test is added to demonstrate that
this output is preserved.

The fast path is only used when no other flags complicate the walk or
output format: no UNINTERESTING commits, no limiting options (max-count,
age filters, path filters, grep filters), no output formatting beyond
plain OIDs, and no object listing flags.

Running the p6011 performance test for my copy of git.git, I see the
following improvement with this change:

  Test                                     HEAD~1  HEAD
  ------------------------------------------------------------
  6011.2: merge-base --independent          0.03   0.03 +0.0%
  6011.3: rev-list --maximal-only           0.06   0.03 -50.0%
  6011.4: rev-list --maximal-only --since   0.06   0.06 +0.0%

And for a fresh clone of the Linux kernel repository, I see:

  Test                                     HEAD~1  HEAD
  ------------------------------------------------------------
  6011.2: merge-base --independent          0.00   0.00 =
  6011.3: rev-list --maximal-only           0.70   0.00 -100.0%
  6011.4: rev-list --maximal-only --since   0.70   0.70 +0.0%

In both cases, the performance is indeed matching the behavior of 'git
merge-base --independent', as expected.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 builtin/rev-list.c       | 59 ++++++++++++++++++++++++++++++++++++++++
 t/t6000-rev-list-misc.sh | 31 +++++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 854d82ece3..8f63003709 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -25,6 +25,7 @@
 #include "oidset.h"
 #include "oidmap.h"
 #include "packfile.h"
+#include "commit-reach.h"
 #include "quote.h"
 #include "strbuf.h"
 
@@ -633,6 +634,61 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
 	return 0;
 }
 
+/*
+ * If revs->maximal_only is set and no other walk modifiers are provided,
+ * run a faster computation to filter the independent commits and prepare
+ * them for output. Set revs->no_walk to prevent later walking.
+ *
+ * If this algorithm doesn't apply, then no changes are made to revs.
+ */
+static void prepare_maximal_independent(struct rev_info *revs)
+{
+	struct commit_list *c;
+
+	if (!revs->maximal_only)
+		return;
+
+	for (c = revs->commits; c; c = c->next) {
+		if (c->item->object.flags & UNINTERESTING)
+			return;
+	}
+
+	if (revs->limited ||
+	    revs->topo_order ||
+	    revs->first_parent_only ||
+	    revs->reverse ||
+	    revs->max_count >= 0 ||
+	    revs->skip_count >= 0 ||
+	    revs->min_age != (timestamp_t)-1 ||
+	    revs->max_age != (timestamp_t)-1 ||
+	    revs->min_parents > 0 ||
+	    revs->max_parents >= 0 ||
+	    revs->prune_data.nr ||
+	    revs->count ||
+	    revs->left_right ||
+	    revs->boundary ||
+	    revs->tag_objects ||
+	    revs->tree_objects ||
+	    revs->blob_objects ||
+	    revs->filter.choice ||
+	    revs->reflog_info ||
+	    revs->diff ||
+	    revs->grep_filter.pattern_list ||
+	    revs->grep_filter.header_list ||
+	    revs->verbose_header ||
+	    revs->print_parents ||
+	    revs->edge_hint ||
+	    revs->unpacked ||
+	    revs->no_kept_objects ||
+	    revs->line_level_traverse)
+		return;
+
+	reduce_heads_replace(&revs->commits);
+
+	/* Modify 'revs' to only output this commit list. */
+	revs->no_walk = 1;
+}
+
 int cmd_rev_list(int argc,
 		 const char **argv,
 		 const char *prefix,
@@ -875,6 +931,9 @@ int cmd_rev_list(int argc,
 
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
+
+	prepare_maximal_independent(&revs);
+
 	if (revs.tree_objects)
 		mark_edges_uninteresting(&revs, show_edge, 0);
 
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index d0a2a86610..a95ba576fa 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -263,4 +263,35 @@ test_expect_success 'rev-list --boundary incompatible with --maximal-only' '
 	test_grep "cannot be used together" err
 '
 
+test_expect_success 'rev-list --maximal-only and --pretty' '
+	test_when_finished rm -rf repo &&
+
+	git init repo &&
+	test_commit -C repo 1 &&
+	oid1=$(git -C repo rev-parse HEAD) &&
+	test_commit -C repo 2 &&
+	oid2=$(git -C repo rev-parse HEAD) &&
+	git -C repo checkout --detach HEAD~1 &&
+	test_commit -C repo 3 &&
+	oid3=$(git -C repo rev-parse HEAD) &&
+
+	cat >expect <<-EOF &&
+	commit $oid3
+	$oid3
+	commit $oid2
+	$oid2
+	EOF
+
+	git -C repo rev-list --pretty="%H" --maximal-only $oid1 $oid2 $oid3 >out &&
+	test_cmp expect out &&
+
+	cat >expect <<-EOF &&
+	$oid3
+	$oid2
+	EOF
+
+	git -C repo log --pretty="%H" --maximal-only $oid1 $oid2 $oid3 >out &&
+	test_cmp expect out
+'
+
 test_done
-- 
gitgitgadget

      parent reply	other threads:[~2026-04-06 13:27 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-06 13:27 [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 1/3] t6600: test --maximal-only and --independent Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 2/3] p6011: add perf test for rev-list --maximal-only Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` Derrick Stolee via GitGitGadget [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0bdb88c85eabfa88b97c83a8f20f76cb8ed0489d.1775482048.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox