All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, j6t@kdbg.org,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: [PATCH 3/3] rev-list: use reduce_heads() for --maximal-only
Date: Mon, 06 Apr 2026 13:27:28 +0000	[thread overview]
Message-ID: <0bdb88c85eabfa88b97c83a8f20f76cb8ed0489d.1775482048.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2082.git.1775482048.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

The 'git rev-list --maximal-only' option filters the output to only
independent commits. A commit is independent if it is not reachable from
other listed commits. Currently this is implemented by doing a full
revision walk and marking parents with CHILD_VISITED to skip non-maximal
commits.

The 'git merge-base --independent' command computes the same result
using reduce_heads(), which uses the more efficient remove_redundant()
algorithm. This is significantly faster because it avoids walking the
entire commit graph.

Add a fast path in rev-list that detects when --maximal-only is the only
interesting option and all input commits are positive (no revision
ranges). In this case, use reduce_heads() directly instead of doing a
full revision walk.

In order to preserve the rest of the output filtering, this computation
is done opportunistically in a new prepare_maximal_independent() method
when possible. If successful, it populates revs->commits with the list
of independent commits and set revs->no_walk to prevent any other walk
from occurring. This allows us to have any custom output be handled
using the existing output code hidden inside
traverse_commit_list_filtered(). A new test is added to demonstrate that
this output is preserved.

The fast path is only used when no other flags complicate the walk or
output format: no UNINTERESTING commits, no limiting options (max-count,
age filters, path filters, grep filters), no output formatting beyond
plain OIDs, and no object listing flags.

Running the p6011 performance test for my copy of git.git, I see the
following improvement with this change:

  Test                                     HEAD~1  HEAD
  ------------------------------------------------------------
  6011.2: merge-base --independent          0.03   0.03 +0.0%
  6011.3: rev-list --maximal-only           0.06   0.03 -50.0%
  6011.4: rev-list --maximal-only --since   0.06   0.06 +0.0%

And for a fresh clone of the Linux kernel repository, I see:

  Test                                     HEAD~1  HEAD
  ------------------------------------------------------------
  6011.2: merge-base --independent          0.00   0.00 =
  6011.3: rev-list --maximal-only           0.70   0.00 -100.0%
  6011.4: rev-list --maximal-only --since   0.70   0.70 +0.0%

In both cases, the performance is indeed matching the behavior of 'git
merge-base --independent', as expected.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 builtin/rev-list.c       | 59 ++++++++++++++++++++++++++++++++++++++++
 t/t6000-rev-list-misc.sh | 31 +++++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 854d82ece3..8f63003709 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -25,6 +25,7 @@
 #include "oidset.h"
 #include "oidmap.h"
 #include "packfile.h"
+#include "commit-reach.h"
 #include "quote.h"
 #include "strbuf.h"
 
@@ -633,6 +634,61 @@ static int try_bitmap_disk_usage(struct rev_info *revs,
 	return 0;
 }
 
+/*
+ * If revs->maximal_only is set and no other walk modifiers are provided,
+ * run a faster computation to filter the independent commits and prepare
+ * them for output. Set revs->no_walk to prevent later walking.
+ *
+ * If this algorithm doesn't apply, then no changes are made to revs.
+ */
+static void prepare_maximal_independent(struct rev_info *revs)
+{
+	struct commit_list *c;
+
+	if (!revs->maximal_only)
+		return;
+
+	for (c = revs->commits; c; c = c->next) {
+		if (c->item->object.flags & UNINTERESTING)
+			return;
+	}
+
+	if (revs->limited ||
+	    revs->topo_order ||
+	    revs->first_parent_only ||
+	    revs->reverse ||
+	    revs->max_count >= 0 ||
+	    revs->skip_count >= 0 ||
+	    revs->min_age != (timestamp_t)-1 ||
+	    revs->max_age != (timestamp_t)-1 ||
+	    revs->min_parents > 0 ||
+	    revs->max_parents >= 0 ||
+	    revs->prune_data.nr ||
+	    revs->count ||
+	    revs->left_right ||
+	    revs->boundary ||
+	    revs->tag_objects ||
+	    revs->tree_objects ||
+	    revs->blob_objects ||
+	    revs->filter.choice ||
+	    revs->reflog_info ||
+	    revs->diff ||
+	    revs->grep_filter.pattern_list ||
+	    revs->grep_filter.header_list ||
+	    revs->verbose_header ||
+	    revs->print_parents ||
+	    revs->edge_hint ||
+	    revs->unpacked ||
+	    revs->no_kept_objects ||
+	    revs->line_level_traverse)
+		return;
+
+	reduce_heads_replace(&revs->commits);
+
+	/* Modify 'revs' to only output this commit list. */
+	revs->no_walk = 1;
+}
+
 int cmd_rev_list(int argc,
 		 const char **argv,
 		 const char *prefix,
@@ -875,6 +931,9 @@ int cmd_rev_list(int argc,
 
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
+
+	prepare_maximal_independent(&revs);
+
 	if (revs.tree_objects)
 		mark_edges_uninteresting(&revs, show_edge, 0);
 
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index d0a2a86610..a95ba576fa 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -263,4 +263,35 @@ test_expect_success 'rev-list --boundary incompatible with --maximal-only' '
 	test_grep "cannot be used together" err
 '
 
+test_expect_success 'rev-list --maximal-only and --pretty' '
+	test_when_finished rm -rf repo &&
+
+	git init repo &&
+	test_commit -C repo 1 &&
+	oid1=$(git -C repo rev-parse HEAD) &&
+	test_commit -C repo 2 &&
+	oid2=$(git -C repo rev-parse HEAD) &&
+	git -C repo checkout --detach HEAD~1 &&
+	test_commit -C repo 3 &&
+	oid3=$(git -C repo rev-parse HEAD) &&
+
+	cat >expect <<-EOF &&
+	commit $oid3
+	$oid3
+	commit $oid2
+	$oid2
+	EOF
+
+	git -C repo rev-list --pretty="%H" --maximal-only $oid1 $oid2 $oid3 >out &&
+	test_cmp expect out &&
+
+	cat >expect <<-EOF &&
+	$oid3
+	$oid2
+	EOF
+
+	git -C repo log --pretty="%H" --maximal-only $oid1 $oid2 $oid3 >out &&
+	test_cmp expect out
+'
+
 test_done
-- 
gitgitgadget

      parent reply	other threads:[~2026-04-06 13:27 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-06 13:27 [PATCH 0/3] rev-list: use merge-base --independent algorithm when possible Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 1/3] t6600: test --maximal-only and --independent Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` [PATCH 2/3] p6011: add perf test for rev-list --maximal-only Derrick Stolee via GitGitGadget
2026-04-06 13:27 ` Derrick Stolee via GitGitGadget [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0bdb88c85eabfa88b97c83a8f20f76cb8ed0489d.1775482048.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.