From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Martin Fick <mfick@codeaurora.org>,
git@vger.kernel.org
Subject: Re: [PATCH 1/3] add mergesort() for linked lists
Date: Sun, 08 Apr 2012 22:32:38 +0200 [thread overview]
Message-ID: <4F81F5E6.2070609@lsrfire.ath.cx> (raw)
In-Reply-To: <7vpqbm56pf.fsf@alter.siamese.dyndns.org>
Am 05.04.2012 21:17, schrieb Junio C Hamano:
> After seeing "I wrote it myself due to NIH", it strikes me a bit odd that
> you still used "start from bunch of singleton sublist, elongating twice
> per round as we go" structure from the original.
It's just becasue the dumb bottom-up approach is the most simple way to
implement merge sort.
> I wonder if it would be an improvement if you structured the loop so that:
>
> (1) the first sublist 'p' grabs as many elements in the ascending order
> as you find;
>
> (2) the second sublist 'q' begins at the end of the first sublist and
> grabs as many elements in the ascending order;
>
> (3) 'p' and 'q' are merge-sorted into the result list;
>
> (4) if your two sublists did not cover "list" in its entirety, process
> the remainder (i.e. where the second sublist stopped because of an
> unordered element) by going back to step (1); and
>
> (5) if you did not need to jump back to step (1) from step (4), then you
> had only two sublists (or less), so the result is sorted. Otherwise,
> the result now has fewer ascending sublists than the original, so go
> back to (1) and iterate.
>
> If the input is in a random order, this may end up doing the same number
> of iterations as the original, but if the input is mostly sorted, wouldn't
> it allow us to take advantage of the fact by starting with a longer
> sublist in the earlier rounds?
This optimization speeds up the pre-sorted case but slows down the case of
a reversed pre-sorted list because we have to determine the length of the
sublists each time, while the dumb implementation already knows it. I
didn't measure a significant difference for Jeff's test case. Here's my
attempt at an implementation, for reference.
---
mergesort.c | 61 +++++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 41 insertions(+), 20 deletions(-)
diff --git a/mergesort.c b/mergesort.c
index c0f1874..3a61b9b 100644
--- a/mergesort.c
+++ b/mergesort.c
@@ -8,12 +8,37 @@ struct mergesort_sublist {
unsigned long len;
};
-static void *get_nth_next(void *list, unsigned long n,
- void *(*get_next_fn)(const void *))
+static unsigned long run_length(const void *list,
+ struct mergesort_sublist *next_list,
+ void *(*get_next_fn)(const void *),
+ int (*compare_fn)(const void *, const void *))
{
- while (n-- && list)
- list = get_next_fn(list);
- return list;
+ unsigned long len = 1;
+
+ if (!list)
+ return 0;
+ for (;;) {
+ void *next = get_next_fn(list);
+
+ if (!next || compare_fn(list, next) > 0) {
+ if (next_list)
+ next_list->ptr = next;
+ break;
+ }
+ list = next;
+ len++;
+ }
+ return len;
+}
+
+static void set_next_pair(struct mergesort_sublist *p,
+ struct mergesort_sublist *q, void *list,
+ void *(*get_next_fn)(const void *),
+ int (*compare_fn)(const void *, const void *))
+{
+ p->ptr = list;
+ p->len = run_length(p->ptr, q, get_next_fn, compare_fn);
+ q->len = q->ptr ? run_length(q->ptr, NULL, get_next_fn, compare_fn) : 0;
}
static void *pop_item(struct mergesort_sublist *l,
@@ -30,24 +55,16 @@ void *mergesort(void *list,
void (*set_next_fn)(void *, void *),
int (*compare_fn)(const void *, const void *))
{
- unsigned long l;
-
if (!list)
return NULL;
- for (l = 1; ; l *= 2) {
+ for (;;) {
void *curr;
struct mergesort_sublist p, q;
- p.ptr = list;
- q.ptr = get_nth_next(p.ptr, l, get_next_fn);
+ set_next_pair(&p, &q, list, get_next_fn, compare_fn);
if (!q.ptr)
break;
- p.len = q.len = l;
-
- if (compare_fn(p.ptr, q.ptr) > 0)
- list = curr = pop_item(&q, get_next_fn);
- else
- list = curr = pop_item(&p, get_next_fn);
+ list = curr = pop_item(&q, get_next_fn);
while (p.ptr) {
while (p.len || q.len) {
@@ -63,10 +80,14 @@ void *mergesort(void *list,
curr = pop_item(&p, get_next_fn);
set_next_fn(prev, curr);
}
- p.ptr = q.ptr;
- p.len = l;
- q.ptr = get_nth_next(p.ptr, l, get_next_fn);
- q.len = q.ptr ? l : 0;
+
+ set_next_pair(&p, &q, q.ptr, get_next_fn, compare_fn);
+ if (q.ptr) {
+ void *prev = curr;
+
+ curr = pop_item(&q, get_next_fn);
+ set_next_fn(prev, curr);
+ }
}
set_next_fn(curr, NULL);
--
1.7.10
next prev parent reply other threads:[~2012-04-08 20:33 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-30 0:18 Git push performance problems with ~100K refs Martin Fick
2012-03-30 2:12 ` Junio C Hamano
2012-03-30 2:43 ` Martin Fick
2012-03-30 9:32 ` Jeff King
2012-03-30 9:40 ` Jeff King
2012-03-30 14:22 ` Martin Fick
2012-03-31 22:10 ` [PATCH 1/3] add mergesort() for linked lists René Scharfe
2012-04-05 19:17 ` Junio C Hamano
2012-04-08 20:32 ` René Scharfe [this message]
2012-04-09 18:26 ` Junio C Hamano
2012-04-11 6:19 ` Stephen Boyd
2012-04-11 16:44 ` Junio C Hamano
2012-03-31 22:10 ` [PATCH 2/3] commit: use mergesort() in commit_list_sort_by_date() René Scharfe
2012-03-31 22:11 ` [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk() René Scharfe
2012-03-31 22:36 ` Martin Fick
2012-03-31 23:45 ` Junio C Hamano
2012-04-02 16:24 ` Martin Fick
2012-04-02 16:39 ` Shawn Pearce
2012-04-02 16:49 ` Martin Fick
2012-04-02 16:51 ` Shawn Pearce
2012-04-02 20:37 ` Jeff King
2012-04-02 20:51 ` Jeff King
2012-04-02 23:16 ` Martin Fick
2012-04-03 3:49 ` Nguyen Thai Ngoc Duy
2012-04-03 5:55 ` Martin Fick
2012-04-03 6:55 ` [PATCH 0/3] Commit cache Nguyễn Thái Ngọc Duy
2012-04-03 6:55 ` [PATCH 1/3] parse_commit_buffer: rename a confusing variable name Nguyễn Thái Ngọc Duy
2012-04-03 6:55 ` [PATCH 2/3] Add commit cache to help speed up commit traversal Nguyễn Thái Ngọc Duy
2012-04-03 6:55 ` [PATCH 3/3] Add parse_commit_for_rev() to take advantage of sha1-cache Nguyễn Thái Ngọc Duy
2012-04-05 13:02 ` [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk() Nguyen Thai Ngoc Duy
2012-04-06 19:21 ` Shawn Pearce
2012-04-07 4:20 ` Nguyen Thai Ngoc Duy
2012-04-03 3:44 ` Nguyen Thai Ngoc Duy
2012-04-02 20:14 ` Jeff King
2012-04-02 22:54 ` René Scharfe
2012-04-03 8:40 ` Jeff King
2012-04-03 9:19 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F81F5E6.2070609@lsrfire.ath.cx \
--to=rene.scharfe@lsrfire.ath.cx \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mfick@codeaurora.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.