Git development
 help / color / mirror / Atom feed
From: "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Kristofer Karlsson <krka@spotify.com>,
	Derrick Stolee <stolee@gmail.com>,
	Kristofer Karlsson <krka@spotify.com>,
	Kristofer Karlsson <krka@spotify.com>
Subject: [PATCH v3 1/2] commit-reach: use object flags for tips_reachable_from_bases()
Date: Sat, 16 May 2026 15:59:40 +0000	[thread overview]
Message-ID: <7399a12518e2021bc8f7cf3fc1f2996099f787b6.1778947182.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2116.v3.git.1778947182.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

tips_reachable_from_bases() walks the commit graph from a set of base
commits to find which tip commits are reachable.  The inner loop does
a linear scan over the tips array to check whether each visited commit
is a tip, making the overall cost O(C * T) where C is commits walked
and T is the number of tips.

Use the RESULT object flag to mark tip commits, replacing the linear
scan with a single flag test per visited commit.  This reduces the
per-commit tip check from O(T) to O(1) and the overall cost from
O(C * T) to O(C + T).

When multiple refs point to the same commit, the shared object gets
the flag once, so all duplicates are handled automatically.  The
early-termination advancement loop checks the flag on the sorted
commits array directly, which naturally handles duplicates since the
flag is on the shared commit object.

This also removes the index field from struct commit_and_index, since
the indirection through the original tips array is no longer needed.

This function is called by `git for-each-ref --merged` and
`git branch/tag --contains/--no-contains` via reach_filter() in
ref-filter.c.

Benchmark on a merge-heavy monorepo (2.3M commits, 10,000 refs):

  Command                           Before    After   Speedup
  for-each-ref --merged HEAD        6.57s     1.59s     4.1x
  for-each-ref --no-merged HEAD     6.67s     1.66s     4.0x
  branch --merged HEAD              0.68s     0.61s      10%
  branch --no-merged HEAD           0.65s     0.61s       8%
  tag --merged HEAD                 0.12s     0.12s       -

On linux.git with 10,000 synthetic branches at the root commit (worst
case for the DFS walk):

  Command                           Before    After   Speedup
  for-each-ref --merged HEAD        1.35s     0.35s     3.9x
  for-each-ref --no-merged HEAD     1.82s     0.31s     5.9x

The large speedup for for-each-ref is because it checks all 10,000
refs as tips, making the O(T) inner loop expensive.  The branch
subcommand only checks local branches (fewer tips), so the improvement
is smaller.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 commit-reach.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..82614d2409 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1125,7 +1125,6 @@ void ahead_behind(struct repository *r,
 
 struct commit_and_index {
 	struct commit *commit;
-	unsigned int index;
 	timestamp_t generation;
 };
 
@@ -1165,7 +1164,6 @@ void tips_reachable_from_bases(struct repository *r,
 
 	for (size_t i = 0; i < tips_nr; i++) {
 		commits[i].commit = tips[i];
-		commits[i].index = i;
 		commits[i].generation = commit_graph_generation(tips[i]);
 	}
 
@@ -1173,6 +1171,9 @@ void tips_reachable_from_bases(struct repository *r,
 	QSORT(commits, tips_nr, compare_commit_and_index_by_generation);
 	min_generation = commits[0].generation;
 
+	for (size_t i = 0; i < tips_nr; i++)
+		commits[i].commit->object.flags |= RESULT;
+
 	while (bases) {
 		repo_parse_commit(r, bases->item);
 		commit_list_insert(bases->item, &stack);
@@ -1183,20 +1184,16 @@ void tips_reachable_from_bases(struct repository *r,
 		int explored_all_parents = 1;
 		struct commit_list *p;
 		struct commit *c = stack->item;
-		timestamp_t c_gen = commit_graph_generation(c);
 
 		/* Does it match any of our tips? */
-		for (size_t j = min_generation_index; j < tips_nr; j++) {
-			if (c_gen < commits[j].generation)
-				break;
-
-			if (commits[j].commit == c) {
-				tips[commits[j].index]->object.flags |= mark;
+		{
+			if (c->object.flags & RESULT) {
+				c->object.flags |= mark;
 
-				if (j == min_generation_index) {
-					unsigned int k = j + 1;
+				if (commits[min_generation_index].commit->object.flags & mark) {
+					unsigned int k = min_generation_index + 1;
 					while (k < tips_nr &&
-					       (tips[commits[k].index]->object.flags & mark))
+					       (commits[k].commit->object.flags & mark))
 						k++;
 
 					/* Terminate early if all found. */
@@ -1232,6 +1229,8 @@ void tips_reachable_from_bases(struct repository *r,
 	}
 
 done:
+	for (size_t i = 0; i < tips_nr; i++)
+		commits[i].commit->object.flags &= ~RESULT;
 	free(commits);
 	repo_clear_commit_marks(r, SEEN);
 	commit_list_free(stack);
-- 
gitgitgadget


  reply	other threads:[~2026-05-16 15:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 18:07 [PATCH] commit-reach: use the decoration hash for tips_reachable_from_bases() Kristofer Karlsson via GitGitGadget
2026-05-15 21:14 ` Jeff King
2026-05-16  8:23   ` Kristofer Karlsson
2026-05-16 13:46     ` Derrick Stolee
2026-05-16  9:16 ` [PATCH v2] commit-reach: use object flags " Kristofer Karlsson via GitGitGadget
2026-05-16 15:59   ` [PATCH v3 0/2] " Kristofer Karlsson via GitGitGadget
2026-05-16 15:59     ` Kristofer Karlsson via GitGitGadget [this message]
2026-05-16 15:59     ` [PATCH v3 2/2] t6600: add tests for duplicate tips in tips_reachable_from_bases() Kristofer Karlsson via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7399a12518e2021bc8f7cf3fc1f2996099f787b6.1778947182.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=krka@spotify.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox