From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Thomas Rast <trast@student.ethz.ch>
Subject: Re: [PATCH 2/2] commit: use a priority queue in merge base functions
Date: Thu, 30 Aug 2012 17:31:05 -0400 [thread overview]
Message-ID: <20120830213105.GA18636@sigill.intra.peff.net> (raw)
In-Reply-To: <7v3934tkle.fsf@alter.siamese.dyndns.org>
On Thu, Aug 30, 2012 at 09:23:25AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Anyway, since this isn't yielding any performance benefit, I'm not going
> > to go down that route. But stability of the queue is something that we
> > need to consider if we ever do replace commit_list with a different data
> > structure.
> >
> > Here's the patch to make the existing priority queue stable (by wasting
> > space) in case we ever want to use it.
>
> Thanks for being thorough and showing a good analysis.
>
> If we want stability, the space to hold insertion counter is not
> "wasted", by the way.
It is if there is another data structure that will do the same job with
the same performance characteristics and without using that space.
But I'm not even sure it is an issue in practice. There are ~320K
objects in linux-2.6. Even if we inserted _all_ of them at once into a
queue (which we don't; it's usually more like a couple dozen, with a few
pathological cases being equal to the number of refs we have, which is
usually in the hundreds), then we are talking about an extra 2.5M on a
64-bit system. Compared to dropping an O(n^2) operation to O(lg n),
that's probably not even going to be noticeable.
> I actually think the generic "queue" implementation may want to
> handle elements that are not just single pointers. Just like
> qsort() is not about sorting an array of pointers but it is about
> sorting an array of elements of caller specified size, perhaps
> "queue" can hold elements of size the caller specified (set in stone
> when the queue is initialized).
>
> Then, a caller that wants a stable priority queue of commits can
> tell the queue to manage "struct { struct commit *c; int gen; }",
> use the "gen" field for stability in its comparison callback, etc.,
> while a caller that does not need stability can tell the queue to
> manage "struct commit *". That way, the generic queue layer does
> not have to worry about wasting the insertion counter space, no?
Yeah, that would be more generic. One wonky thing I didn't point out in
my implementation is this:
> +static inline int queue_cmp(struct queue *pq, unsigned i, unsigned j)
> +{
> + int cmp = pq->cmp(pq->items[i].item, pq->items[j].item);
> + if (cmp)
> + return cmp;
> + if (pq->items[i].counter < pq->items[j].counter)
> + return 1;
> + return -1;
> +}
Notice how the counter comparison is "backwards" to the regular
comparison. That's because the queue is actually a max-queue (I did it
that way since we are indexing on timestamp, and want to go in reverse
chronological order). But the counter part wants to prioritize minimums,
so you have to reverse it. You could also implement a min-queue and ask
the caller to reverse their comparison function.
But really, a caller could theoretically want to prioritize in either
direction (e.g., giving a LIFO behavior to elements with the same
priority). Pulling the counter out of the queue would allow that.
The reason I didn't, though, is that it would make the interface a huge
pain if the caller had to do this:
struct stable_commit_queue_element qe;
qe.commit = commit;
qe.counter = counter++; /* who is holding the global counter? */
queue_push(&q, &qe);
instead of:
queue_push(&q, commit);
I suspect you could build a "queue" object, and then implement a
"stable queue" on top of it with some clever use of function pointers
for the comparison function.
-Peff
next prev parent reply other threads:[~2012-08-30 21:31 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-27 23:11 [PATCH 0/5] optimize fast-forward checks Junio C Hamano
2012-08-27 23:11 ` [PATCH 1/5] in_merge_bases(): support only one "other" commit Junio C Hamano
2012-08-27 23:12 ` [PATCH 2/5] receive-pack: use in_merge_bases() for fast-forward check Junio C Hamano
2012-08-27 23:12 ` [PATCH 3/5] http-push: " Junio C Hamano
2012-08-27 23:12 ` [PATCH 4/5] in_merge_bases(): omit unnecessary redundant common ancestor reduction Junio C Hamano
2012-08-27 23:12 ` [PATCH 5/5] (BROKEN) get_merge_bases_many(): walk from many tips in parallel Junio C Hamano
2012-08-28 1:25 ` Junio C Hamano
2012-08-28 21:35 ` Junio C Hamano
2012-08-28 23:39 ` Junio C Hamano
2012-08-29 11:08 ` Jeff King
2012-08-29 11:10 ` [PATCH 1/2] basic priority queue implementation Jeff King
2012-08-29 11:11 ` [PATCH 2/2] commit: use a priority queue in merge base functions Jeff King
2012-08-29 16:36 ` Junio C Hamano
2012-08-29 20:53 ` Jeff King
2012-08-29 20:55 ` Jeff King
2012-08-29 21:00 ` Jeff King
2012-08-29 21:05 ` Jeff King
2012-08-30 12:54 ` Jeff King
2012-08-30 13:03 ` Jeff King
2012-08-30 13:24 ` Jeff King
2012-08-30 16:33 ` Junio C Hamano
2012-08-30 21:48 ` Jeff King
2012-08-30 22:16 ` Junio C Hamano
2012-08-30 16:23 ` Junio C Hamano
2012-08-30 21:31 ` Jeff King [this message]
2012-08-30 21:59 ` Junio C Hamano
2012-08-29 21:18 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120830213105.GA18636@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=trast@student.ethz.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).