From: "René Scharfe" <l.s.r@web.de>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>
Subject: [PATCH 1/2] test-mergesort: read sort input all at once
Date: Sun, 28 Aug 2022 12:34:05 +0200 [thread overview]
Message-ID: <63cae51b-877e-0ca5-c61a-bf4f72d7dc8c@web.de> (raw)
In-Reply-To: <128f0fb8-d29b-8622-0cfe-2ecadc999db5@web.de>
The sort subcommand of test-mergesort is used to test the performance of
sorting linked lists. It reads lines from stdin, sorts them and prints
the result to stdout. Two heap allocations are done per line: One for
the linked list item and one for the actual line string. That imposes a
significant amount of allocation overhead.
Reduce it by doing the same as the sort subcommand of test-string-list,
namely to read the whole input file into a single buffer and then split
it in-place.
Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version. So I hand-merged the results of separate runs before and with
this patch:
macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted 0.23(0.20+0.01) 0.22(0.20+0.01)
0071.14: DEFINE_LIST_SORT sorted 0.12(0.10+0.01) 0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.12(0.10+0.01) 0.10(0.08+0.01)
Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted 0.71(0.00+0.03) 0.54(0.00+0.06)
0071.14: DEFINE_LIST_SORT sorted 0.42(0.00+0.04) 0.21(0.03+0.03)
0071.16: DEFINE_LIST_SORT reversed 0.42(0.06+0.01) 0.21(0.01+0.04)
Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted 0.41(0.39+0.02) 0.29(0.27+0.01)
0071.14: DEFINE_LIST_SORT sorted 0.11(0.08+0.02) 0.07(0.06+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.11(0.08+0.02) 0.07(0.04+0.03)
Signed-off-by: René Scharfe <l.s.r@web.de>
---
t/helper/test-mergesort.c | 38 +++++++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 202e54a7ff..540537224f 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -22,21 +22,33 @@ static int compare_strings(const struct line *x, const struct line *y)
static int sort_stdin(void)
{
- struct line *line, *p = NULL, *lines = NULL;
+ struct line *lines;
+ struct line **tail = &lines;
struct strbuf sb = STRBUF_INIT;
-
- while (!strbuf_getline(&sb, stdin)) {
- line = xmalloc(sizeof(struct line));
- line->text = strbuf_detach(&sb, NULL);
- if (p) {
- line->next = p->next;
- p->next = line;
- } else {
- line->next = NULL;
- lines = line;
- }
- p = line;
+ char *p;
+
+ strbuf_read(&sb, 0, 0);
+
+ /*
+ * Split by newline, but don't create an item
+ * for the empty string after the last separator.
+ */
+ if (sb.len && sb.buf[sb.len - 1] == '\n')
+ strbuf_setlen(&sb, sb.len - 1);
+
+ p = sb.buf;
+ for (;;) {
+ char *eol = strchr(p, '\n');
+ struct line *line = xmalloc(sizeof(*line));
+ line->text = p;
+ *tail = line;
+ tail = &line->next;
+ if (!eol)
+ break;
+ *eol = '\0';
+ p = eol + 1;
}
+ *tail = NULL;
sort_lines(&lines, compare_strings);
--
2.30.2
next prev parent reply other threads:[~2022-08-28 10:34 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-28 10:32 [PATCH 0/2] test-mergesort: reduce memory allocation overhead of sort subcommand René Scharfe
2022-08-28 10:34 ` René Scharfe [this message]
2022-08-28 10:34 ` [PATCH 2/2] test-mergesort: use mem_pool for sort input René Scharfe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=63cae51b-877e-0ca5-c61a-bf4f72d7dc8c@web.de \
--to=l.s.r@web.de \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).