Git development
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Derrick Stolee <stolee@gmail.com>,
	 Jeff King <peff@peff.net>,
	 Kristofer Karlsson <krka@spotify.com>
Subject: Re: [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common
Date: Tue, 26 May 2026 07:50:18 +0900	[thread overview]
Message-ID: <xmqqzf1ncded.fsf@gitster.g> (raw)
In-Reply-To: <fc38c0f856e93b80073ec3f1b9f641b9ab187e4e.1779719286.git.gitgitgadget@gmail.com> (Kristofer Karlsson via GitGitGadget's message of "Mon, 25 May 2026 14:28:04 +0000")

"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Kristofer Karlsson <krka@spotify.com>
>
> paint_down_to_common() can enqueue the same commit multiple times
> when it is reached through different parents with different flag
> combinations. Add an ENQUEUED flag to track whether a commit is
> currently in the priority queue, and skip it if already present.
>
> Introduce prio_queue_put_dedup() and prio_queue_get_dedup()
> wrappers that manage the ENQUEUED flag on enqueue and dequeue.

OK.  I guess an obvious alternative design would be to have an
associated hashtable for deduping, or tweak prio_queue_get() so
that it notices duplicated entry just before it returns (i.e.,
peek and discard until queue->array[0].data is different from
what you are going to return).  Both would not beat the cheap cost
of using a single bit per object, I guess ;-)

> This change is performance-neutral on its own: the O(n)
> queue_has_nonstale() scan still dominates the per-iteration cost.
> However, the deduplication guarantee (each commit appears in the
> queue at most once) is a prerequisite for the next commit, which
> replaces that scan with O(1) tracking.
>
> Signed-off-by: Kristofer Karlsson <krka@spotify.com>
> ---
>  commit-reach.c | 27 ++++++++++++++++++++++-----
>  object.h       |  2 +-
>  2 files changed, 23 insertions(+), 6 deletions(-)

Thanks for the clean-up in the previous step, by the way.

> diff --git a/commit-reach.c b/commit-reach.c
> index 5a52be90a6..85583ae359 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -17,8 +17,9 @@
>  #define PARENT2		(1u<<17)
>  #define STALE		(1u<<18)
>  #define RESULT		(1u<<19)
> +#define ENQUEUED	(1u<<20)
>  
> -static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT);
> +static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT | ENQUEUED);
>  
>  static int compare_commits_by_gen(const void *_a, const void *_b)
>  {
> @@ -39,6 +40,22 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
>  	return 0;
>  }
>  
> +static void prio_queue_put_dedup(struct prio_queue *queue, struct commit *c)
> +{
> +	if (c->object.flags & ENQUEUED)
> +		return;
> +	c->object.flags |= ENQUEUED;
> +	prio_queue_put(queue, c);
> +}
> +
> +static struct commit *prio_queue_get_dedup(struct prio_queue *queue)
> +{
> +	struct commit *commit = prio_queue_get(queue);
> +	if (commit)
> +		commit->object.flags &= ~ENQUEUED;
> +	return commit;
> +}
> +
>  static int queue_has_nonstale(struct prio_queue *queue)
>  {
>  	for (size_t i = 0; i < queue->nr; i++) {
> @@ -70,15 +87,15 @@ static int paint_down_to_common(struct repository *r,
>  		commit_list_append(one, result);
>  		return 0;
>  	}
> -	prio_queue_put(&queue, one);
> +	prio_queue_put_dedup(&queue, one);
>  
>  	for (i = 0; i < n; i++) {
>  		twos[i]->object.flags |= PARENT2;
> -		prio_queue_put(&queue, twos[i]);
> +		prio_queue_put_dedup(&queue, twos[i]);
>  	}
>  
>  	while (queue_has_nonstale(&queue)) {
> -		struct commit *commit = prio_queue_get(&queue);
> +		struct commit *commit = prio_queue_get_dedup(&queue);
>  		struct commit_list *parents;
>  		int flags;
>  		timestamp_t generation = commit_graph_generation(commit);
> @@ -132,7 +149,7 @@ static int paint_down_to_common(struct repository *r,
>  					     oid_to_hex(&p->object.oid));
>  			}
>  			p->object.flags |= flags;
> -			prio_queue_put(&queue, p);
> +			prio_queue_put_dedup(&queue, p);
>  		}
>  	}
>  
> diff --git a/object.h b/object.h
> index 2b26de3044..8fb03ff90a 100644
> --- a/object.h
> +++ b/object.h
> @@ -75,7 +75,7 @@ void object_array_init(struct object_array *array);
>   * bundle.c:                                        16
>   * http-push.c:                          11-----14
>   * commit-graph.c:                                15
> - * commit-reach.c:                                  16-----19
> + * commit-reach.c:                                  16-------20
>   * builtin/last-modified.c:                         1617
>   * object-name.c:                                            20
>   * list-objects-filter.c:                                      21

  reply	other threads:[~2026-05-25 22:50 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-24 23:40   ` Junio C Hamano
2026-05-25  1:43     ` Derrick Stolee
2026-05-25  6:50       ` Kristofer Karlsson
2026-05-25  7:17         ` Junio C Hamano
2026-05-25  7:53           ` Kristofer Karlsson
2026-05-25 10:02             ` Jeff King
2026-05-25  7:01   ` Jeff King
2026-05-25  7:15     ` Jeff King
2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
2026-05-25  1:59   ` Derrick Stolee
2026-05-25  8:54     ` Kristofer Karlsson
2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget
2026-05-25  7:11   ` Jeff King
2026-05-25  6:47 ` [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Jeff King
2026-05-25  7:59   ` Kristofer Karlsson
2026-05-25  8:38     ` Junio C Hamano
2026-05-25  9:55     ` Jeff King
2026-05-25 10:47       ` Kristofer Karlsson
2026-05-25 14:28 ` [PATCH v2 0/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget
2026-05-25 14:28   ` [PATCH v2 1/3] object.h: fix stale entries in object flag allocation table Kristofer Karlsson via GitGitGadget
2026-05-25 14:28   ` [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-25 22:50     ` Junio C Hamano [this message]
2026-05-25 14:28   ` [PATCH v2 3/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqzf1ncded.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=krka@spotify.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox