From: Junio C Hamano <gitster@pobox.com>
To: "Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>,
Jeff King <peff@peff.net>,
Kristofer Karlsson <krka@spotify.com>
Subject: Re: [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common
Date: Tue, 26 May 2026 07:50:18 +0900 [thread overview]
Message-ID: <xmqqzf1ncded.fsf@gitster.g> (raw)
In-Reply-To: <fc38c0f856e93b80073ec3f1b9f641b9ab187e4e.1779719286.git.gitgitgadget@gmail.com> (Kristofer Karlsson via GitGitGadget's message of "Mon, 25 May 2026 14:28:04 +0000")
"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Kristofer Karlsson <krka@spotify.com>
>
> paint_down_to_common() can enqueue the same commit multiple times
> when it is reached through different parents with different flag
> combinations. Add an ENQUEUED flag to track whether a commit is
> currently in the priority queue, and skip it if already present.
>
> Introduce prio_queue_put_dedup() and prio_queue_get_dedup()
> wrappers that manage the ENQUEUED flag on enqueue and dequeue.
OK. I guess an obvious alternative design would be to have an
associated hashtable for deduping, or tweak prio_queue_get() so
that it notices duplicated entry just before it returns (i.e.,
peek and discard until queue->array[0].data is different from
what you are going to return). Both would not beat the cheap cost
of using a single bit per object, I guess ;-)
> This change is performance-neutral on its own: the O(n)
> queue_has_nonstale() scan still dominates the per-iteration cost.
> However, the deduplication guarantee (each commit appears in the
> queue at most once) is a prerequisite for the next commit, which
> replaces that scan with O(1) tracking.
>
> Signed-off-by: Kristofer Karlsson <krka@spotify.com>
> ---
> commit-reach.c | 27 ++++++++++++++++++++++-----
> object.h | 2 +-
> 2 files changed, 23 insertions(+), 6 deletions(-)
Thanks for the clean-up in the previous step, by the way.
> diff --git a/commit-reach.c b/commit-reach.c
> index 5a52be90a6..85583ae359 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -17,8 +17,9 @@
> #define PARENT2 (1u<<17)
> #define STALE (1u<<18)
> #define RESULT (1u<<19)
> +#define ENQUEUED (1u<<20)
>
> -static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT);
> +static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT | ENQUEUED);
>
> static int compare_commits_by_gen(const void *_a, const void *_b)
> {
> @@ -39,6 +40,22 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
> return 0;
> }
>
> +static void prio_queue_put_dedup(struct prio_queue *queue, struct commit *c)
> +{
> + if (c->object.flags & ENQUEUED)
> + return;
> + c->object.flags |= ENQUEUED;
> + prio_queue_put(queue, c);
> +}
> +
> +static struct commit *prio_queue_get_dedup(struct prio_queue *queue)
> +{
> + struct commit *commit = prio_queue_get(queue);
> + if (commit)
> + commit->object.flags &= ~ENQUEUED;
> + return commit;
> +}
> +
> static int queue_has_nonstale(struct prio_queue *queue)
> {
> for (size_t i = 0; i < queue->nr; i++) {
> @@ -70,15 +87,15 @@ static int paint_down_to_common(struct repository *r,
> commit_list_append(one, result);
> return 0;
> }
> - prio_queue_put(&queue, one);
> + prio_queue_put_dedup(&queue, one);
>
> for (i = 0; i < n; i++) {
> twos[i]->object.flags |= PARENT2;
> - prio_queue_put(&queue, twos[i]);
> + prio_queue_put_dedup(&queue, twos[i]);
> }
>
> while (queue_has_nonstale(&queue)) {
> - struct commit *commit = prio_queue_get(&queue);
> + struct commit *commit = prio_queue_get_dedup(&queue);
> struct commit_list *parents;
> int flags;
> timestamp_t generation = commit_graph_generation(commit);
> @@ -132,7 +149,7 @@ static int paint_down_to_common(struct repository *r,
> oid_to_hex(&p->object.oid));
> }
> p->object.flags |= flags;
> - prio_queue_put(&queue, p);
> + prio_queue_put_dedup(&queue, p);
> }
> }
>
> diff --git a/object.h b/object.h
> index 2b26de3044..8fb03ff90a 100644
> --- a/object.h
> +++ b/object.h
> @@ -75,7 +75,7 @@ void object_array_init(struct object_array *array);
> * bundle.c: 16
> * http-push.c: 11-----14
> * commit-graph.c: 15
> - * commit-reach.c: 16-----19
> + * commit-reach.c: 16-------20
> * builtin/last-modified.c: 1617
> * object-name.c: 20
> * list-objects-filter.c: 21
next prev parent reply other threads:[~2026-05-25 22:50 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-24 23:40 ` Junio C Hamano
2026-05-25 1:43 ` Derrick Stolee
2026-05-25 6:50 ` Kristofer Karlsson
2026-05-25 7:17 ` Junio C Hamano
2026-05-25 7:53 ` Kristofer Karlsson
2026-05-25 10:02 ` Jeff King
2026-05-25 7:01 ` Jeff King
2026-05-25 7:15 ` Jeff King
2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
2026-05-25 1:59 ` Derrick Stolee
2026-05-25 8:54 ` Kristofer Karlsson
2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget
2026-05-25 7:11 ` Jeff King
2026-05-25 6:47 ` [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Jeff King
2026-05-25 7:59 ` Kristofer Karlsson
2026-05-25 8:38 ` Junio C Hamano
2026-05-25 9:55 ` Jeff King
2026-05-25 10:47 ` Kristofer Karlsson
2026-05-25 14:28 ` [PATCH v2 0/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget
2026-05-25 14:28 ` [PATCH v2 1/3] object.h: fix stale entries in object flag allocation table Kristofer Karlsson via GitGitGadget
2026-05-25 14:28 ` [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-25 22:50 ` Junio C Hamano [this message]
2026-05-25 14:28 ` [PATCH v2 3/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqzf1ncded.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=krka@spotify.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox