Git development
 help / color / mirror / Atom feed
* [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter
@ 2026-05-24 17:42 Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-24 17:42 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson

paint_down_to_common() and ahead_behind() terminate when every commit in
their priority queue is STALE. The current check, queue_has_nonstale(), does
an O(n) linear scan of the queue on every iteration, costing O(n*m) total
where n is the queue size and m is the number of commits processed. This
series replaces that scan with an O(1) counter.

Performance measurements with git merge-base --all and git for-each-ref
--format='%(ahead-behind:...)':

git.git (merge-base)
                                          Baseline  Dedup  Dedup+Ctr
seen..next, 33 merge bases:               157ms    165ms    143ms
seen..master, 1 base:                      47ms     40ms     44ms
master..next, 1 base:                      62ms     60ms     63ms

(seen=fe056fe1, next=c82f1880, master=6a4418c3)

Large monorepo, 2.4M commits (merge-base)
                                          Baseline        Dedup+Ctr
component import, wide frontier (1):      8083ms           3778ms
component import, wide frontier (2):      5664ms           4207ms
component import, wide frontier (3):      4558ms           1796ms

Large monorepo, 2.4M commits (ahead-behind)
                                          Baseline        Dedup+Ctr
component import, wide frontier (1):      8216ms           4145ms
component import, wide frontier (2):      6107ms           4528ms
component import, wide frontier (3):      4725ms           1999ms

Linear history (merge-base), no regression:
master vs HEAD~10000:                     4410ms           4180ms
master vs HEAD~50000:                     4412ms           4494ms


The improvement depends on how wide the frontier gets during the walk.
Component imports in the monorepo create wide frontiers where the queue
grows large, making the O(n) scan expensive -- up to 2.5x speedup for
merge-base and 2.4x for ahead-behind. Linear history and simple merges show
no regression.

With a very narrow frontier the counter approach adds a small constant
overhead per iteration (maintaining the counter and the ENQUEUED flag)
compared to the old scan which would return almost immediately. Both are
O(1) and cheap in that scenario, so it should not matter in practice -- the
benchmark numbers above confirm this.

Kristofer Karlsson (3):
  commit-reach: deduplicate queue entries in paint_down_to_common
  commit-reach: optimize queue scan in paint_down_to_common
  commit-reach: optimize queue scan in ahead_behind

 commit-reach.c | 58 ++++++++++++++++++++++++++++++++++++--------------
 object.h       |  2 +-
 2 files changed, 43 insertions(+), 17 deletions(-)


base-commit: 6a4418c36d6bad69a599044b3cf49dcbd049cb45
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2124%2Fspkrka%2Fqueue-has-nonstale-v3-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2124/spkrka/queue-has-nonstale-v3-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2124
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common
  2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
@ 2026-05-24 17:42 ` Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget
  2 siblings, 0 replies; 4+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-24 17:42 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson, Kristofer Karlsson

From: Kristofer Karlsson <krka@spotify.com>

paint_down_to_common() can enqueue the same commit multiple times
when it is reached through different parents with different flag
combinations. Add an ENQUEUED flag to track whether a commit is
currently in the priority queue, and skip it if already present.

This change is performance-neutral on its own: the O(n)
queue_has_nonstale() scan still dominates the per-iteration cost.
However, the deduplication guarantee (each commit appears in the
queue at most once) is a prerequisite for the next commit, which
replaces that scan with an O(1) nonstale counter.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 commit-reach.c | 19 +++++++++++++++----
 object.h       |  2 +-
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..c16d4b061c 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -17,8 +17,9 @@
 #define PARENT2		(1u<<17)
 #define STALE		(1u<<18)
 #define RESULT		(1u<<19)
+#define ENQUEUED	(1u<<20)
 
-static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT);
+static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT | ENQUEUED);
 
 static int compare_commits_by_gen(const void *_a, const void *_b)
 {
@@ -39,6 +40,14 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
 	return 0;
 }
 
+static void maybe_enqueue(struct prio_queue *queue, struct commit *c)
+{
+	if (c->object.flags & ENQUEUED)
+		return;
+	c->object.flags |= ENQUEUED;
+	prio_queue_put(queue, c);
+}
+
 static int queue_has_nonstale(struct prio_queue *queue)
 {
 	for (size_t i = 0; i < queue->nr; i++) {
@@ -70,11 +79,11 @@ static int paint_down_to_common(struct repository *r,
 		commit_list_append(one, result);
 		return 0;
 	}
-	prio_queue_put(&queue, one);
+	maybe_enqueue(&queue, one);
 
 	for (i = 0; i < n; i++) {
 		twos[i]->object.flags |= PARENT2;
-		prio_queue_put(&queue, twos[i]);
+		maybe_enqueue(&queue, twos[i]);
 	}
 
 	while (queue_has_nonstale(&queue)) {
@@ -83,6 +92,8 @@ static int paint_down_to_common(struct repository *r,
 		int flags;
 		timestamp_t generation = commit_graph_generation(commit);
 
+		commit->object.flags &= ~ENQUEUED;
+
 		if (min_generation && generation > last_gen)
 			BUG("bad generation skip %"PRItime" > %"PRItime" at %s",
 			    generation, last_gen,
@@ -124,7 +135,7 @@ static int paint_down_to_common(struct repository *r,
 					     oid_to_hex(&p->object.oid));
 			}
 			p->object.flags |= flags;
-			prio_queue_put(&queue, p);
+			maybe_enqueue(&queue, p);
 		}
 	}
 
diff --git a/object.h b/object.h
index d814647ebe..05cbf728e9 100644
--- a/object.h
+++ b/object.h
@@ -74,7 +74,7 @@ void object_array_init(struct object_array *array);
  * bundle.c:                                        16
  * http-push.c:                          11-----14
  * commit-graph.c:                                15
- * commit-reach.c:                                  16-----19
+ * commit-reach.c:                                  16-------20
  * builtin/last-modified.c:                         1617
  * sha1-name.c:                                              20
  * list-objects-filter.c:                                      21
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] commit-reach: optimize queue scan in paint_down_to_common
  2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
@ 2026-05-24 17:42 ` Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget
  2 siblings, 0 replies; 4+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-24 17:42 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson, Kristofer Karlsson

From: Kristofer Karlsson <krka@spotify.com>

paint_down_to_common() terminates when every commit remaining in its
priority queue is STALE. This was checked by queue_has_nonstale(),
which performed an O(n) linear scan of the entire queue on every
iteration, resulting in O(n*m) total overhead where n is the queue
size and m is the number of commits processed.

Replace this with an O(1) nonstale_count that tracks the number of
non-stale commits currently in the queue. The counter is incremented
by maybe_enqueue() and decremented on dequeue and by mark_stale()
when a commit transitions to STALE while still in the queue. Since
each commit appears at most once (guaranteed by the ENQUEUED flag
from the previous commit), the counter is exact.

ahead_behind() also uses queue_has_nonstale() and will be converted
in the next commit.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 commit-reach.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index c16d4b061c..356ff52d08 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -40,12 +40,25 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
 	return 0;
 }
 
-static void maybe_enqueue(struct prio_queue *queue, struct commit *c)
+static void maybe_enqueue(struct prio_queue *queue, struct commit *c,
+			  int *nonstale_count)
 {
 	if (c->object.flags & ENQUEUED)
 		return;
 	c->object.flags |= ENQUEUED;
 	prio_queue_put(queue, c);
+	if (!(c->object.flags & STALE))
+		(*nonstale_count)++;
+}
+
+static void mark_stale(struct commit *c, unsigned queued_flag,
+		       int *nonstale_count)
+{
+	if (!(c->object.flags & STALE)) {
+		if (c->object.flags & queued_flag)
+			(*nonstale_count)--;
+		c->object.flags |= STALE;
+	}
 }
 
 static int queue_has_nonstale(struct prio_queue *queue)
@@ -68,6 +81,7 @@ static int paint_down_to_common(struct repository *r,
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	int i;
+	int nonstale_count = 0;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 	struct commit_list **tail = result;
 
@@ -79,20 +93,22 @@ static int paint_down_to_common(struct repository *r,
 		commit_list_append(one, result);
 		return 0;
 	}
-	maybe_enqueue(&queue, one);
+	maybe_enqueue(&queue, one, &nonstale_count);
 
 	for (i = 0; i < n; i++) {
 		twos[i]->object.flags |= PARENT2;
-		maybe_enqueue(&queue, twos[i]);
+		maybe_enqueue(&queue, twos[i], &nonstale_count);
 	}
 
-	while (queue_has_nonstale(&queue)) {
+	while (nonstale_count > 0) {
 		struct commit *commit = prio_queue_get(&queue);
 		struct commit_list *parents;
 		int flags;
 		timestamp_t generation = commit_graph_generation(commit);
 
 		commit->object.flags &= ~ENQUEUED;
+		if (!(commit->object.flags & STALE))
+			nonstale_count--;
 
 		if (min_generation && generation > last_gen)
 			BUG("bad generation skip %"PRItime" > %"PRItime" at %s",
@@ -134,8 +150,10 @@ static int paint_down_to_common(struct repository *r,
 				return error(_("could not parse commit %s"),
 					     oid_to_hex(&p->object.oid));
 			}
+			if (flags & STALE)
+				mark_stale(p, ENQUEUED, &nonstale_count);
 			p->object.flags |= flags;
-			maybe_enqueue(&queue, p);
+			maybe_enqueue(&queue, p, &nonstale_count);
 		}
 	}
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind
  2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
  2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
@ 2026-05-24 17:42 ` Kristofer Karlsson via GitGitGadget
  2 siblings, 0 replies; 4+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-24 17:42 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson, Kristofer Karlsson

From: Kristofer Karlsson <krka@spotify.com>

Apply the same nonstale_count optimization from the previous commit
to ahead_behind(). This replaces the remaining caller of the O(n)
queue_has_nonstale() scan with an O(1) counter check, allowing
queue_has_nonstale() to be removed.

ahead_behind() already deduplicates queue entries using the PARENT2
flag (via insert_no_dup), so the counter is maintained through
insert_no_dup() and mark_stale() using PARENT2 as the queued_flag.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 commit-reach.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 356ff52d08..41deb8fc78 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -61,16 +61,6 @@ static void mark_stale(struct commit *c, unsigned queued_flag,
 	}
 }
 
-static int queue_has_nonstale(struct prio_queue *queue)
-{
-	for (size_t i = 0; i < queue->nr; i++) {
-		struct commit *commit = queue->array[i].data;
-		if (!(commit->object.flags & STALE))
-			return 1;
-	}
-	return 0;
-}
-
 /* all input commits in one and twos[] must have been parsed! */
 static int paint_down_to_common(struct repository *r,
 				struct commit *one, int n,
@@ -1051,12 +1041,15 @@ struct commit_list *get_reachable_subset(struct commit **from, size_t nr_from,
 define_commit_slab(bit_arrays, struct bitmap *);
 static struct bit_arrays bit_arrays;
 
-static void insert_no_dup(struct prio_queue *queue, struct commit *c)
+static void insert_no_dup(struct prio_queue *queue, struct commit *c,
+			  int *nonstale_count)
 {
 	if (c->object.flags & PARENT2)
 		return;
 	prio_queue_put(queue, c);
 	c->object.flags |= PARENT2;
+	if (!(c->object.flags & STALE))
+		(*nonstale_count)++;
 }
 
 static struct bitmap *get_bit_array(struct commit *c, int width)
@@ -1082,6 +1075,7 @@ void ahead_behind(struct repository *r,
 {
 	struct prio_queue queue = { .compare = compare_commits_by_gen_then_commit_date };
 	size_t width = DIV_ROUND_UP(commits_nr, BITS_IN_EWORD);
+	int nonstale_count = 0;
 
 	if (!commits_nr || !counts_nr)
 		return;
@@ -1100,14 +1094,17 @@ void ahead_behind(struct repository *r,
 		struct bitmap *bitmap = get_bit_array(c, width);
 
 		bitmap_set(bitmap, i);
-		insert_no_dup(&queue, c);
+		insert_no_dup(&queue, c, &nonstale_count);
 	}
 
-	while (queue_has_nonstale(&queue)) {
+	while (nonstale_count > 0) {
 		struct commit *c = prio_queue_get(&queue);
 		struct commit_list *p;
 		struct bitmap *bitmap_c = get_bit_array(c, width);
 
+		if (!(c->object.flags & STALE))
+			nonstale_count--;
+
 		for (size_t i = 0; i < counts_nr; i++) {
 			int reach_from_tip = !!bitmap_get(bitmap_c, counts[i].tip_index);
 			int reach_from_base = !!bitmap_get(bitmap_c, counts[i].base_index);
@@ -1136,9 +1133,9 @@ void ahead_behind(struct repository *r,
 			 * queue is STALE.
 			 */
 			if (bitmap_popcount(bitmap_p) == commits_nr)
-				p->item->object.flags |= STALE;
+				mark_stale(p->item, PARENT2, &nonstale_count);
 
-			insert_no_dup(&queue, p->item);
+			insert_no_dup(&queue, p->item, &nonstale_count);
 		}
 
 		free_bit_array(c);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-05-24 17:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox