From: "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Kristofer Karlsson <krka@spotify.com>,
Kristofer Karlsson <krka@spotify.com>,
Kristofer Karlsson <krka@spotify.com>
Subject: [PATCH v2] commit-reach: use object flags for tips_reachable_from_bases()
Date: Sat, 16 May 2026 09:16:32 +0000 [thread overview]
Message-ID: <pull.2116.v2.git.1778922993480.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2116.git.1778868463992.gitgitgadget@gmail.com>
From: Kristofer Karlsson <krka@spotify.com>
tips_reachable_from_bases() walks the commit graph from a set of base
commits to find which tip commits are reachable. The inner loop does
a linear scan over the tips array to check whether each visited commit
is a tip, making the overall cost O(C * T) where C is commits walked
and T is the number of tips.
Use the RESULT object flag to mark tip commits, replacing the linear
scan with a single flag test per visited commit. This reduces the
per-commit tip check from O(T) to O(1) and the overall cost from
O(C * T) to O(C + T).
When multiple refs point to the same commit, the shared object gets
the flag once, so all duplicates are handled automatically. The
early-termination advancement loop checks the flag on the sorted
commits array directly, which naturally handles duplicates since the
flag is on the shared commit object.
This also removes the index field from struct commit_and_index, since
the indirection through the original tips array is no longer needed.
This function is called by `git for-each-ref --merged` and
`git branch/tag --contains/--no-contains` via reach_filter() in
ref-filter.c.
Benchmark on a merge-heavy monorepo (2.3M commits, 10,000 refs):
Command Before After Speedup
for-each-ref --merged HEAD 6.57s 1.59s 4.1x
for-each-ref --no-merged HEAD 6.67s 1.66s 4.0x
branch --merged HEAD 0.68s 0.61s 10%
branch --no-merged HEAD 0.65s 0.61s 8%
tag --merged HEAD 0.12s 0.12s -
On linux.git with 10,000 synthetic branches at the root commit (worst
case for the DFS walk):
Command Before After Speedup
for-each-ref --merged HEAD 1.35s 0.35s 3.9x
for-each-ref --no-merged HEAD 1.82s 0.31s 5.9x
The large speedup for for-each-ref is because it checks all 10,000
refs as tips, making the O(T) inner loop expensive. The branch
subcommand only checks local branches (fewer tips), so the improvement
is smaller.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
commit-reach: use object flags for tips_reachable_from_bases()
This replaces the O(C*T) linear scan in tips_reachable_from_bases() with
an O(1) flag check using the RESULT object flag.
The function is called by git for-each-ref --merged and git branch/tag
--contains/--no-contains via reach_filter() in ref-filter.c.
Benchmarks on a merge-heavy monorepo (2.3M commits, 10,000 refs):
* for-each-ref --merged HEAD: 6.6s → 1.6s (4.1x)
* for-each-ref --no-merged HEAD: 6.7s → 1.7s (4.0x)
On linux.git with 10,000 synthetic branches at the root commit:
* for-each-ref --merged HEAD: 1.35s → 0.35s (3.9x)
* for-each-ref --no-merged HEAD: 1.82s → 0.31s (5.9x)
v2 of this patch, addressing Jeff King's feedback:
* Replaced the decoration hash with the RESULT object flag (simpler, no
extra data structure, handles duplicate tips naturally)
* Fixed early-termination bug when multiple refs point to the same
commit (the decoration API overwrites on duplicate keys)
* Removed the now-unused index field from struct commit_and_index
* Diff is +11/-12 lines
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2116%2Fspkrka%2Ftips-reachable-minimal-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2116/spkrka/tips-reachable-minimal-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2116
Range-diff vs v1:
1: 992c0aff0e ! 1: 7399a12518 commit-reach: use the decoration hash for tips_reachable_from_bases()
@@ Metadata
Author: Kristofer Karlsson <krka@spotify.com>
## Commit message ##
- commit-reach: use the decoration hash for tips_reachable_from_bases()
+ commit-reach: use object flags for tips_reachable_from_bases()
tips_reachable_from_bases() walks the commit graph from a set of base
commits to find which tip commits are reachable. The inner loop does
@@ Commit message
is a tip, making the overall cost O(C * T) where C is commits walked
and T is the number of tips.
- Replace the linear scan with the decoration hash for lookups, reducing
- the per-commit tip check from O(T) to O(1) and the overall cost from
+ Use the RESULT object flag to mark tip commits, replacing the linear
+ scan with a single flag test per visited commit. This reduces the
+ per-commit tip check from O(T) to O(1) and the overall cost from
O(C * T) to O(C + T).
+ When multiple refs point to the same commit, the shared object gets
+ the flag once, so all duplicates are handled automatically. The
+ early-termination advancement loop checks the flag on the sorted
+ commits array directly, which naturally handles duplicates since the
+ flag is on the shared commit object.
+
+ This also removes the index field from struct commit_and_index, since
+ the indirection through the original tips array is no longer needed.
+
This function is called by `git for-each-ref --merged` and
`git branch/tag --contains/--no-contains` via reach_filter() in
ref-filter.c.
@@ Commit message
Benchmark on a merge-heavy monorepo (2.3M commits, 10,000 refs):
Command Before After Speedup
- for-each-ref --merged HEAD 6.64s 1.66s 4.0x
- for-each-ref --no-merged HEAD 6.75s 1.74s 3.9x
+ for-each-ref --merged HEAD 6.57s 1.59s 4.1x
+ for-each-ref --no-merged HEAD 6.67s 1.66s 4.0x
branch --merged HEAD 0.68s 0.61s 10%
branch --no-merged HEAD 0.65s 0.61s 8%
tag --merged HEAD 0.12s 0.12s -
+ On linux.git with 10,000 synthetic branches at the root commit (worst
+ case for the DFS walk):
+
+ Command Before After Speedup
+ for-each-ref --merged HEAD 1.35s 0.35s 3.9x
+ for-each-ref --no-merged HEAD 1.82s 0.31s 5.9x
+
The large speedup for for-each-ref is because it checks all 10,000
refs as tips, making the O(T) inner loop expensive. The branch
subcommand only checks local branches (fewer tips), so the improvement
@@ Commit message
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
## commit-reach.c ##
+@@ commit-reach.c: void ahead_behind(struct repository *r,
+
+ struct commit_and_index {
+ struct commit *commit;
+- unsigned int index;
+ timestamp_t generation;
+ };
+
@@ commit-reach.c: void tips_reachable_from_bases(struct repository *r,
- size_t min_generation_index = 0;
- timestamp_t min_generation;
- struct commit_list *stack = NULL;
-+ struct decoration tip_index = { "tip_index" };
- if (!bases || !tips || !tips_nr)
- return;
+ for (size_t i = 0; i < tips_nr; i++) {
+ commits[i].commit = tips[i];
+- commits[i].index = i;
+ commits[i].generation = commit_graph_generation(tips[i]);
+ }
+
@@ commit-reach.c: void tips_reachable_from_bases(struct repository *r,
QSORT(commits, tips_nr, compare_commit_and_index_by_generation);
min_generation = commits[0].generation;
+ for (size_t i = 0; i < tips_nr; i++)
-+ add_decoration(&tip_index, &commits[i].commit->object,
-+ (void *)(i + 1));
++ commits[i].commit->object.flags |= RESULT;
+
while (bases) {
repo_parse_commit(r, bases->item);
@@ commit-reach.c: void tips_reachable_from_bases(struct repository *r,
- break;
-
- if (commits[j].commit == c) {
+- tips[commits[j].index]->object.flags |= mark;
+ {
-+ size_t j = (size_t)lookup_decoration(&tip_index, &c->object) - 1;
-+ if (j < tips_nr) {
- tips[commits[j].index]->object.flags |= mark;
++ if (c->object.flags & RESULT) {
++ c->object.flags |= mark;
- if (j == min_generation_index) {
+- if (j == min_generation_index) {
+- unsigned int k = j + 1;
++ if (commits[min_generation_index].commit->object.flags & mark) {
++ unsigned int k = min_generation_index + 1;
+ while (k < tips_nr &&
+- (tips[commits[k].index]->object.flags & mark))
++ (commits[k].commit->object.flags & mark))
+ k++;
+
+ /* Terminate early if all found. */
@@ commit-reach.c: void tips_reachable_from_bases(struct repository *r,
}
done:
-+ clear_decoration(&tip_index, NULL);
++ for (size_t i = 0; i < tips_nr; i++)
++ commits[i].commit->object.flags &= ~RESULT;
free(commits);
repo_clear_commit_marks(r, SEEN);
commit_list_free(stack);
commit-reach.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..82614d2409 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1125,7 +1125,6 @@ void ahead_behind(struct repository *r,
struct commit_and_index {
struct commit *commit;
- unsigned int index;
timestamp_t generation;
};
@@ -1165,7 +1164,6 @@ void tips_reachable_from_bases(struct repository *r,
for (size_t i = 0; i < tips_nr; i++) {
commits[i].commit = tips[i];
- commits[i].index = i;
commits[i].generation = commit_graph_generation(tips[i]);
}
@@ -1173,6 +1171,9 @@ void tips_reachable_from_bases(struct repository *r,
QSORT(commits, tips_nr, compare_commit_and_index_by_generation);
min_generation = commits[0].generation;
+ for (size_t i = 0; i < tips_nr; i++)
+ commits[i].commit->object.flags |= RESULT;
+
while (bases) {
repo_parse_commit(r, bases->item);
commit_list_insert(bases->item, &stack);
@@ -1183,20 +1184,16 @@ void tips_reachable_from_bases(struct repository *r,
int explored_all_parents = 1;
struct commit_list *p;
struct commit *c = stack->item;
- timestamp_t c_gen = commit_graph_generation(c);
/* Does it match any of our tips? */
- for (size_t j = min_generation_index; j < tips_nr; j++) {
- if (c_gen < commits[j].generation)
- break;
-
- if (commits[j].commit == c) {
- tips[commits[j].index]->object.flags |= mark;
+ {
+ if (c->object.flags & RESULT) {
+ c->object.flags |= mark;
- if (j == min_generation_index) {
- unsigned int k = j + 1;
+ if (commits[min_generation_index].commit->object.flags & mark) {
+ unsigned int k = min_generation_index + 1;
while (k < tips_nr &&
- (tips[commits[k].index]->object.flags & mark))
+ (commits[k].commit->object.flags & mark))
k++;
/* Terminate early if all found. */
@@ -1232,6 +1229,8 @@ void tips_reachable_from_bases(struct repository *r,
}
done:
+ for (size_t i = 0; i < tips_nr; i++)
+ commits[i].commit->object.flags &= ~RESULT;
free(commits);
repo_clear_commit_marks(r, SEEN);
commit_list_free(stack);
base-commit: 59ff4886a579f4bc91e976fe18590b9ae02c7a08
--
gitgitgadget
next prev parent reply other threads:[~2026-05-16 9:16 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-15 18:07 [PATCH] commit-reach: use the decoration hash for tips_reachable_from_bases() Kristofer Karlsson via GitGitGadget
2026-05-15 21:14 ` Jeff King
2026-05-16 8:23 ` Kristofer Karlsson
2026-05-16 13:46 ` Derrick Stolee
2026-05-16 9:16 ` Kristofer Karlsson via GitGitGadget [this message]
2026-05-16 15:59 ` [PATCH v3 0/2] commit-reach: use object flags " Kristofer Karlsson via GitGitGadget
2026-05-16 15:59 ` [PATCH v3 1/2] " Kristofer Karlsson via GitGitGadget
2026-05-16 15:59 ` [PATCH v3 2/2] t6600: add tests for duplicate tips in tips_reachable_from_bases() Kristofer Karlsson via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.2116.v2.git.1778922993480.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=krka@spotify.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox