All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Derrick Stolee <stolee@gmail.com>,
	 Jeff King <peff@peff.net>,
	 Kristofer Karlsson <krka@spotify.com>
Subject: Re: [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common
Date: Tue, 26 May 2026 07:50:18 +0900	[thread overview]
Message-ID: <xmqqzf1ncded.fsf@gitster.g> (raw)
In-Reply-To: <fc38c0f856e93b80073ec3f1b9f641b9ab187e4e.1779719286.git.gitgitgadget@gmail.com> (Kristofer Karlsson via GitGitGadget's message of "Mon, 25 May 2026 14:28:04 +0000")

"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Kristofer Karlsson <krka@spotify.com>
>
> paint_down_to_common() can enqueue the same commit multiple times
> when it is reached through different parents with different flag
> combinations. Add an ENQUEUED flag to track whether a commit is
> currently in the priority queue, and skip it if already present.
>
> Introduce prio_queue_put_dedup() and prio_queue_get_dedup()
> wrappers that manage the ENQUEUED flag on enqueue and dequeue.

OK.  I guess an obvious alternative design would be to have an
associated hashtable for deduping, or tweak prio_queue_get() so
that it notices duplicated entry just before it returns (i.e.,
peek and discard until queue->array[0].data is different from
what you are going to return).  Both would not beat the cheap cost
of using a single bit per object, I guess ;-)

> This change is performance-neutral on its own: the O(n)
> queue_has_nonstale() scan still dominates the per-iteration cost.
> However, the deduplication guarantee (each commit appears in the
> queue at most once) is a prerequisite for the next commit, which
> replaces that scan with O(1) tracking.
>
> Signed-off-by: Kristofer Karlsson <krka@spotify.com>
> ---
>  commit-reach.c | 27 ++++++++++++++++++++++-----
>  object.h       |  2 +-
>  2 files changed, 23 insertions(+), 6 deletions(-)

Thanks for the clean-up in the previous step, by the way.

> diff --git a/commit-reach.c b/commit-reach.c
> index 5a52be90a6..85583ae359 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -17,8 +17,9 @@
>  #define PARENT2		(1u<<17)
>  #define STALE		(1u<<18)
>  #define RESULT		(1u<<19)
> +#define ENQUEUED	(1u<<20)
>  
> -static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT);
> +static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT | ENQUEUED);
>  
>  static int compare_commits_by_gen(const void *_a, const void *_b)
>  {
> @@ -39,6 +40,22 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
>  	return 0;
>  }
>  
> +static void prio_queue_put_dedup(struct prio_queue *queue, struct commit *c)
> +{
> +	if (c->object.flags & ENQUEUED)
> +		return;
> +	c->object.flags |= ENQUEUED;
> +	prio_queue_put(queue, c);
> +}
> +
> +static struct commit *prio_queue_get_dedup(struct prio_queue *queue)
> +{
> +	struct commit *commit = prio_queue_get(queue);
> +	if (commit)
> +		commit->object.flags &= ~ENQUEUED;
> +	return commit;
> +}
> +
>  static int queue_has_nonstale(struct prio_queue *queue)
>  {
>  	for (size_t i = 0; i < queue->nr; i++) {
> @@ -70,15 +87,15 @@ static int paint_down_to_common(struct repository *r,
>  		commit_list_append(one, result);
>  		return 0;
>  	}
> -	prio_queue_put(&queue, one);
> +	prio_queue_put_dedup(&queue, one);
>  
>  	for (i = 0; i < n; i++) {
>  		twos[i]->object.flags |= PARENT2;
> -		prio_queue_put(&queue, twos[i]);
> +		prio_queue_put_dedup(&queue, twos[i]);
>  	}
>  
>  	while (queue_has_nonstale(&queue)) {
> -		struct commit *commit = prio_queue_get(&queue);
> +		struct commit *commit = prio_queue_get_dedup(&queue);
>  		struct commit_list *parents;
>  		int flags;
>  		timestamp_t generation = commit_graph_generation(commit);
> @@ -132,7 +149,7 @@ static int paint_down_to_common(struct repository *r,
>  					     oid_to_hex(&p->object.oid));
>  			}
>  			p->object.flags |= flags;
> -			prio_queue_put(&queue, p);
> +			prio_queue_put_dedup(&queue, p);
>  		}
>  	}
>  
> diff --git a/object.h b/object.h
> index 2b26de3044..8fb03ff90a 100644
> --- a/object.h
> +++ b/object.h
> @@ -75,7 +75,7 @@ void object_array_init(struct object_array *array);
>   * bundle.c:                                        16
>   * http-push.c:                          11-----14
>   * commit-graph.c:                                15
> - * commit-reach.c:                                  16-----19
> + * commit-reach.c:                                  16-------20
>   * builtin/last-modified.c:                         1617
>   * object-name.c:                                            20
>   * list-objects-filter.c:                                      21

  reply	other threads:[~2026-05-25 22:50 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-24 23:40   ` Junio C Hamano
2026-05-25  1:43     ` Derrick Stolee
2026-05-25  6:50       ` Kristofer Karlsson
2026-05-25  7:17         ` Junio C Hamano
2026-05-25  7:53           ` Kristofer Karlsson
2026-05-25 10:02             ` Jeff King
2026-05-25  7:01   ` Jeff King
2026-05-25  7:15     ` Jeff King
2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
2026-05-25  1:59   ` Derrick Stolee
2026-05-25  8:54     ` Kristofer Karlsson
2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget
2026-05-25  7:11   ` Jeff King
2026-05-25  6:47 ` [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Jeff King
2026-05-25  7:59   ` Kristofer Karlsson
2026-05-25  8:38     ` Junio C Hamano
2026-05-25  9:55     ` Jeff King
2026-05-25 10:47       ` Kristofer Karlsson
2026-05-25 14:28 ` [PATCH v2 0/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget
2026-05-25 14:28   ` [PATCH v2 1/3] object.h: fix stale entries in object flag allocation table Kristofer Karlsson via GitGitGadget
2026-05-25 14:28   ` [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-25 22:50     ` Junio C Hamano [this message]
2026-05-26  6:57       ` Kristofer Karlsson
2026-05-25 14:28   ` [PATCH v2 3/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqzf1ncded.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=krka@spotify.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.