Git development
 help / color / mirror / Atom feed
From: "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Kristofer Karlsson <krka@spotify.com>,
	Kristofer Karlsson <krka@spotify.com>
Subject: [PATCH] commit-reach: use the decoration hash for tips_reachable_from_bases()
Date: Fri, 15 May 2026 18:07:43 +0000	[thread overview]
Message-ID: <pull.2116.git.1778868463992.gitgitgadget@gmail.com> (raw)

From: Kristofer Karlsson <krka@spotify.com>

tips_reachable_from_bases() walks the commit graph from a set of base
commits to find which tip commits are reachable.  The inner loop does
a linear scan over the tips array to check whether each visited commit
is a tip, making the overall cost O(C * T) where C is commits walked
and T is the number of tips.

Replace the linear scan with the decoration hash for lookups, reducing
the per-commit tip check from O(T) to O(1) and the overall cost from
O(C * T) to O(C + T).

This function is called by `git for-each-ref --merged` and
`git branch/tag --contains/--no-contains` via reach_filter() in
ref-filter.c.

Benchmark on a merge-heavy monorepo (2.3M commits, 10,000 refs):

  Command                           Before    After   Speedup
  for-each-ref --merged HEAD        6.64s     1.66s     4.0x
  for-each-ref --no-merged HEAD     6.75s     1.74s     3.9x
  branch --merged HEAD              0.68s     0.61s      10%
  branch --no-merged HEAD           0.65s     0.61s       8%
  tag --merged HEAD                 0.12s     0.12s       -

The large speedup for for-each-ref is because it checks all 10,000
refs as tips, making the O(T) inner loop expensive.  The branch
subcommand only checks local branches (fewer tips), so the improvement
is smaller.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
    commit-reach: use the decoration hash for tips_reachable_from_bases()
    
    This is a single small commit that replaces an O(C*T) linear scan in
    tips_reachable_from_bases() with an O(1) lookup using the decoration
    hash.
    
    The function is called by git for-each-ref --merged and git branch/tag
    --contains/--no-contains via reach_filter() in ref-filter.c. On a
    merge-heavy monorepo with 2.3M commits and 10,000 refs, git for-each-ref
    --merged HEAD goes from 6.6s to 1.7s (4x).
    
    The diff is intentionally minimal (+9/-6) to make the idea easy to
    discuss before polishing. Things I'm not fully happy about:
    
     * Extra block scope { } just to preserve indentation of the inner body
     * Hacking the array index into the decoration value as (void *)(i + 1)
       instead of storing a proper pointer
     * Relying on unsigned wraparound (- 1 on a size_t 0) to check for
       not-found via j < tips_nr
    
    Happy to clean all of these up in a follow-up commit if the approach
    makes sense.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2116%2Fspkrka%2Ftips-reachable-minimal-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2116/spkrka/tips-reachable-minimal-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2116

 commit-reach.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..70b056eae0 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1150,6 +1150,7 @@ void tips_reachable_from_bases(struct repository *r,
 	size_t min_generation_index = 0;
 	timestamp_t min_generation;
 	struct commit_list *stack = NULL;
+	struct decoration tip_index = { "tip_index" };
 
 	if (!bases || !tips || !tips_nr)
 		return;
@@ -1173,6 +1174,10 @@ void tips_reachable_from_bases(struct repository *r,
 	QSORT(commits, tips_nr, compare_commit_and_index_by_generation);
 	min_generation = commits[0].generation;
 
+	for (size_t i = 0; i < tips_nr; i++)
+		add_decoration(&tip_index, &commits[i].commit->object,
+			       (void *)(i + 1));
+
 	while (bases) {
 		repo_parse_commit(r, bases->item);
 		commit_list_insert(bases->item, &stack);
@@ -1183,14 +1188,11 @@ void tips_reachable_from_bases(struct repository *r,
 		int explored_all_parents = 1;
 		struct commit_list *p;
 		struct commit *c = stack->item;
-		timestamp_t c_gen = commit_graph_generation(c);
 
 		/* Does it match any of our tips? */
-		for (size_t j = min_generation_index; j < tips_nr; j++) {
-			if (c_gen < commits[j].generation)
-				break;
-
-			if (commits[j].commit == c) {
+		{
+			size_t j = (size_t)lookup_decoration(&tip_index, &c->object) - 1;
+			if (j < tips_nr) {
 				tips[commits[j].index]->object.flags |= mark;
 
 				if (j == min_generation_index) {
@@ -1232,6 +1234,7 @@ void tips_reachable_from_bases(struct repository *r,
 	}
 
 done:
+	clear_decoration(&tip_index, NULL);
 	free(commits);
 	repo_clear_commit_marks(r, SEEN);
 	commit_list_free(stack);

base-commit: 59ff4886a579f4bc91e976fe18590b9ae02c7a08
-- 
gitgitgadget

             reply	other threads:[~2026-05-15 18:07 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 18:07 Kristofer Karlsson via GitGitGadget [this message]
2026-05-15 21:14 ` [PATCH] commit-reach: use the decoration hash for tips_reachable_from_bases() Jeff King
2026-05-16  8:23   ` Kristofer Karlsson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.2116.git.1778868463992.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=krka@spotify.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox