git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] test-mergesort: reduce memory allocation overhead of sort subcommand
@ 2022-08-28 10:32 René Scharfe
  2022-08-28 10:34 ` [PATCH 1/2] test-mergesort: read sort input all at once René Scharfe
  2022-08-28 10:34 ` [PATCH 2/2] test-mergesort: use mem_pool for sort input René Scharfe
  0 siblings, 2 replies; 3+ messages in thread
From: René Scharfe @ 2022-08-28 10:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

The sort subcommand of the test-mergesort helper is used by p0071 to
measure the performance of the linked list sort functions built by the
macro DEFINE_LIST_SORT.  It spends a significant amount of time
allocating memory for the test data.  This series reduces it to allow
focusing more on the actual sort performance.

  test-mergesort: read sort input all at once
  test-mergesort: use mem_pool for sort input

 t/helper/test-mergesort.c | 40 ++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

--
2.30.2

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] test-mergesort: read sort input all at once
  2022-08-28 10:32 [PATCH 0/2] test-mergesort: reduce memory allocation overhead of sort subcommand René Scharfe
@ 2022-08-28 10:34 ` René Scharfe
  2022-08-28 10:34 ` [PATCH 2/2] test-mergesort: use mem_pool for sort input René Scharfe
  1 sibling, 0 replies; 3+ messages in thread
From: René Scharfe @ 2022-08-28 10:34 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

The sort subcommand of test-mergesort is used to test the performance of
sorting linked lists.  It reads lines from stdin, sorts them and prints
the result to stdout.  Two heap allocations are done per line: One for
the linked list item and one for the actual line string.  That imposes a
significant amount of allocation overhead.

Reduce it by doing the same as the sort subcommand of test-string-list,
namely to read the whole input file into a single buffer and then split
it in-place.

Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version.  So I hand-merged the results of separate runs before and with
this patch:

macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted     0.23(0.20+0.01)     0.22(0.20+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.12(0.10+0.01)     0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.12(0.10+0.01)     0.10(0.08+0.01)

Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted     0.71(0.00+0.03)     0.54(0.00+0.06)
0071.14: DEFINE_LIST_SORT sorted       0.42(0.00+0.04)     0.21(0.03+0.03)
0071.16: DEFINE_LIST_SORT reversed     0.42(0.06+0.01)     0.21(0.01+0.04)

Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted     0.41(0.39+0.02)     0.29(0.27+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.11(0.08+0.02)     0.07(0.06+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.11(0.08+0.02)     0.07(0.04+0.03)

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 202e54a7ff..540537224f 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -22,21 +22,33 @@ static int compare_strings(const struct line *x, const struct line *y)

 static int sort_stdin(void)
 {
-	struct line *line, *p = NULL, *lines = NULL;
+	struct line *lines;
+	struct line **tail = &lines;
 	struct strbuf sb = STRBUF_INIT;
-
-	while (!strbuf_getline(&sb, stdin)) {
-		line = xmalloc(sizeof(struct line));
-		line->text = strbuf_detach(&sb, NULL);
-		if (p) {
-			line->next = p->next;
-			p->next = line;
-		} else {
-			line->next = NULL;
-			lines = line;
-		}
-		p = line;
+	char *p;
+
+	strbuf_read(&sb, 0, 0);
+
+	/*
+	 * Split by newline, but don't create an item
+	 * for the empty string after the last separator.
+	 */
+	if (sb.len && sb.buf[sb.len - 1] == '\n')
+		strbuf_setlen(&sb, sb.len - 1);
+
+	p = sb.buf;
+	for (;;) {
+		char *eol = strchr(p, '\n');
+		struct line *line = xmalloc(sizeof(*line));
+		line->text = p;
+		*tail = line;
+		tail = &line->next;
+		if (!eol)
+			break;
+		*eol = '\0';
+		p = eol + 1;
 	}
+	*tail = NULL;

 	sort_lines(&lines, compare_strings);

--
2.30.2

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] test-mergesort: use mem_pool for sort input
  2022-08-28 10:32 [PATCH 0/2] test-mergesort: reduce memory allocation overhead of sort subcommand René Scharfe
  2022-08-28 10:34 ` [PATCH 1/2] test-mergesort: read sort input all at once René Scharfe
@ 2022-08-28 10:34 ` René Scharfe
  1 sibling, 0 replies; 3+ messages in thread
From: René Scharfe @ 2022-08-28 10:34 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

The previous patch almost halved the number of heap allocations for the
sort subcommand.  Reduce it further by using a mem_pool for the line
objects.

Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version.  So I hand-merged the results of separate runs before and with
this patch:

macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted     0.22(0.20+0.01)     0.21(0.19+0.01)
0071.14: DEFINE_LIST_SORT sorted       0.10(0.08+0.01)     0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.10(0.08+0.01)     0.10(0.08+0.01)

Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted     0.54(0.00+0.06)     0.44(0.01+0.06)
0071.14: DEFINE_LIST_SORT sorted       0.21(0.03+0.03)     0.19(0.04+0.01)
0071.16: DEFINE_LIST_SORT reversed     0.21(0.01+0.04)     0.19(0.04+0.04)

Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted     0.29(0.27+0.01)     0.22(0.19+0.02)
0071.14: DEFINE_LIST_SORT sorted       0.07(0.06+0.01)     0.06(0.04+0.02)
0071.16: DEFINE_LIST_SORT reversed     0.07(0.04+0.03)     0.06(0.04+0.02)

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 540537224f..335e5bb3a9 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -25,6 +25,7 @@ static int sort_stdin(void)
 	struct line *lines;
 	struct line **tail = &lines;
 	struct strbuf sb = STRBUF_INIT;
+	struct mem_pool lines_pool;
 	char *p;

 	strbuf_read(&sb, 0, 0);
@@ -36,10 +37,11 @@ static int sort_stdin(void)
 	if (sb.len && sb.buf[sb.len - 1] == '\n')
 		strbuf_setlen(&sb, sb.len - 1);

+	mem_pool_init(&lines_pool, 0);
 	p = sb.buf;
 	for (;;) {
 		char *eol = strchr(p, '\n');
-		struct line *line = xmalloc(sizeof(*line));
+		struct line *line = mem_pool_alloc(&lines_pool, sizeof(*line));
 		line->text = p;
 		*tail = line;
 		tail = &line->next;
--
2.30.2

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-08-28 10:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-28 10:32 [PATCH 0/2] test-mergesort: reduce memory allocation overhead of sort subcommand René Scharfe
2022-08-28 10:34 ` [PATCH 1/2] test-mergesort: read sort input all at once René Scharfe
2022-08-28 10:34 ` [PATCH 2/2] test-mergesort: use mem_pool for sort input René Scharfe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).