From: Brian Downing <bdowning-oU/tDdhfGLReoWH0uzbU5w@public.gmane.org>
To: Junio C Hamano <gitster-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
Cc: git-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
msysgit-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org,
Edgar Toernig <froese-Mmb7MZpHnFY@public.gmane.org>,
Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org>,
Johannes Schindelin
<johannes.schindelin-Mmb7MZpHnFY@public.gmane.org>
Subject: [PATCH v2] compat: Add simplified merge sort implementation from glibc
Date: Tue, 5 Feb 2008 15:10:44 -0600 [thread overview]
Message-ID: <20080205211044.GP26392@lavos.net> (raw)
qsort in Windows 2000 (and various other C libraries) is a Quicksort
with the usual O(n^2) worst case. Unfortunately, sorting Git trees
seems to get very close to that worst case quite often:
$ /git/gitbad runstatus
# On branch master
qsort, nmemb = 30842
done, 237838087 comparisons.
This patch adds a simplified version of the merge sort that is glibc's
qsort(3). As a merge sort, this needs a temporary array equal in size
to the array that is to be sorted, but has a worst-case performance of
O(n log n).
The complexity that was removed is:
* Doing direct stores for word-size and -aligned data.
* Falling back to quicksort if the allocation required to perform the
merge sort would likely push the machine into swap.
Even with these simplifications, this seems to outperform the Windows
qsort(3) implementation, even in Windows XP (where it is "fixed" and
doesn't trigger O(n^2) complexity on trees).
[jes: moved into compat/qsort.c, as per Johannes Sixt's suggestion]
[bcd: removed gcc-ism, thanks to Edgar Toernig. renamed make variable
per Junio's comment.]
Signed-off-by: Brian Downing <bdowning-oU/tDdhfGLReoWH0uzbU5w@public.gmane.org>
Signed-off-by: Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin-Mmb7MZpHnFY@public.gmane.org>
---
Here it is again. Sorry about the mail screwup.
-bcd
Makefile | 9 ++++++++
compat/qsort.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++
git-compat-util.h | 6 +++++
3 files changed, 75 insertions(+), 0 deletions(-)
create mode 100644 compat/qsort.c
diff --git a/Makefile b/Makefile
index 92341c4..a0e5c0c 100644
--- a/Makefile
+++ b/Makefile
@@ -137,6 +137,11 @@ all::
# Define THREADED_DELTA_SEARCH if you have pthreads and wish to exploit
# parallel delta searching when packing objects.
#
+# Define INTERNAL_QSORT to use Git's implementation of qsort(), which
+# is a simplified version of the merge sort used in glibc. This is
+# recommended if Git triggers O(n^2) complexity in your
+# implementation's qsort().
+#
GIT-VERSION-FILE: .FORCE-GIT-VERSION-FILE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -722,6 +727,10 @@ ifdef NO_MEMMEM
COMPAT_CFLAGS += -DNO_MEMMEM
COMPAT_OBJS += compat/memmem.o
endif
+ifdef INTERNAL_QSORT
+ COMPAT_CFLAGS += -DINTERNAL_QSORT
+ COMPAT_OBJS += compat/qsort.o
+endif
ifdef THREADED_DELTA_SEARCH
BASIC_CFLAGS += -DTHREADED_DELTA_SEARCH
diff --git a/compat/qsort.c b/compat/qsort.c
new file mode 100644
index 0000000..8663889
--- /dev/null
+++ b/compat/qsort.c
@@ -0,0 +1,60 @@
+#include "../git-compat-util.h"
+
+/* This merge sort implementation is simplified from glibc's. */
+static void msort_with_tmp(void *b, size_t n, size_t s,
+ int (*cmp)(const void *, const void *),
+ char *t)
+{
+ char *tmp;
+ char *b1, *b2;
+ size_t n1, n2;
+
+ if (n <= 1)
+ return;
+
+ n1 = n / 2;
+ n2 = n - n1;
+ b1 = b;
+ b2 = (char *)b + (n1 * s);
+
+ msort_with_tmp(b1, n1, s, cmp, t);
+ msort_with_tmp(b2, n2, s, cmp, t);
+
+ tmp = t;
+
+ while (n1 > 0 && n2 > 0) {
+ if (cmp(b1, b2) <= 0) {
+ memcpy(tmp, b1, s);
+ tmp += s;
+ b1 += s;
+ --n1;
+ } else {
+ memcpy(tmp, b2, s);
+ tmp += s;
+ b2 += s;
+ --n2;
+ }
+ }
+ if (n1 > 0)
+ memcpy(tmp, b1, n1 * s);
+ memcpy(b, t, (n - n2) * s);
+}
+
+void git_qsort(void *b, size_t n, size_t s,
+ int (*cmp)(const void *, const void *))
+{
+ const size_t size = n * s;
+
+ if (size < 1024) {
+ char buf[1024];
+
+ /* The temporary array is small, so put it on
+ the stack. */
+ msort_with_tmp(b, n, s, cmp, buf);
+ } else {
+ /* It's somewhat large, so malloc it. */
+ char *tmp = malloc(size);
+ msort_with_tmp(b, n, s, cmp, tmp);
+ free(tmp);
+ }
+}
diff --git a/git-compat-util.h b/git-compat-util.h
index 4df90cb..0514604 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -426,4 +426,10 @@ static inline int strtol_i(char const *s, int base, int *result)
return 0;
}
+#ifdef INTERNAL_QSORT
+void git_qsort(void *base, size_t nmemb, size_t size,
+ int(*compar)(const void *, const void *));
+#define qsort git_qsort
+#endif
+
#endif
--
1.5.4.rc3
next reply other threads:[~2008-02-05 21:11 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-02-05 21:10 Brian Downing [this message]
[not found] ` <20080205211044.GP26392-oU/tDdhfGLReoWH0uzbU5w@public.gmane.org>
2008-02-05 22:21 ` [PATCH v2] compat: Add simplified merge sort implementation from glibc Johannes Schindelin
[not found] ` <alpine.LSU.1.00.0802052220500.8543-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org>
2008-02-06 2:47 ` Brian Downing
2008-02-07 4:14 ` Applying patches from gmane can be dangerous Junio C Hamano
2008-02-07 4:29 ` Nicolas Pitre
2008-02-07 9:08 ` Junio C Hamano
2008-02-10 10:51 ` 'next' will be rewound and rebuilt after feature releases Junio C Hamano
2008-02-07 8:01 ` Applying patches from gmane can be dangerous Jari Aalto
2008-02-07 8:21 ` Junio C Hamano
2008-02-07 9:05 ` Mike Hommey
2008-02-07 12:42 ` Johannes Schindelin
2008-02-07 13:32 ` Brian Downing
2008-02-07 14:50 ` Aidan Van Dyk
2008-02-07 15:03 ` Brian Downing
2008-02-07 16:10 ` Johannes Schindelin
2008-02-11 21:16 ` Johannes Schindelin
2008-02-07 14:10 ` Frank Lichtenheld
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080205211044.GP26392@lavos.net \
--to=bdowning-ou/tddhfglreowh0uzbu5w@public.gmane.org \
--cc=froese-Mmb7MZpHnFY@public.gmane.org \
--cc=git-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=gitster-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org \
--cc=johannes.schindelin-Mmb7MZpHnFY@public.gmane.org \
--cc=msysgit-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org \
--cc=prohaska-wjoc1KHpMeg@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).