From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: karthik nayak <karthik.188@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>,
Christian Couder <chriscool@tuxfamily.org>
Subject: [PATCH v3 08/10] reftable/writer: optimize allocations by using a scratch buffer
Date: Mon, 25 Nov 2024 07:27:13 +0100 [thread overview]
Message-ID: <20241125-pks-refs-optimize-migrations-v3-8-17bc85e33ad7@pks.im> (raw)
In-Reply-To: <20241125-pks-refs-optimize-migrations-v3-0-17bc85e33ad7@pks.im>
Both `writer_add_record()` and `reftable_writer_add_ref()` get executed
for every single ref record we're adding to the reftable writer. And as
both functions use a local buffer to write data, the allocations we have
to do here add up during larger transactions.
Refactor the code to use a scratch buffer part of the `reftable_writer`
itself such that we can reuse it. This signifcantly reduces the number
of allocations during large transactions, e.g. when migrating refs from
the "files" backend to the "reftable" backend. Before this change:
HEAP SUMMARY:
in use at exit: 80,048 bytes in 49 blocks
total heap usage: 5,032,171 allocs, 5,032,122 frees, 418,792,092 bytes allocated
After this change:
HEAP SUMMARY:
in use at exit: 80,048 bytes in 49 blocks
total heap usage: 3,025,864 allocs, 3,025,815 frees, 372,746,291 bytes allocated
This also translate into a small speedup:
Benchmark 1: migrate files:reftable (refcount = 1000000, revision = HEAD~)
Time (mean ± σ): 827.2 ms ± 16.5 ms [User: 689.4 ms, System: 124.9 ms]
Range (min … max): 809.0 ms … 924.7 ms 50 runs
Benchmark 2: migrate files:reftable (refcount = 1000000, revision = HEAD)
Time (mean ± σ): 813.6 ms ± 11.6 ms [User: 679.0 ms, System: 123.4 ms]
Range (min … max): 786.7 ms … 833.5 ms 50 runs
Summary
migrate files:reftable (refcount = 1000000, revision = HEAD) ran
1.02 ± 0.02 times faster than migrate files:reftable (refcount = 1000000, revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/writer.c | 23 +++++++++++------------
reftable/writer.h | 2 ++
2 files changed, 13 insertions(+), 12 deletions(-)
diff --git a/reftable/writer.c b/reftable/writer.c
index fd136794d5a27b33b5017f36fbd6b095ab8dac5b..be0fae6cb229411258d40b8865c2fdee88fd5bcd 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -148,6 +148,7 @@ int reftable_writer_new(struct reftable_writer **out,
reftable_buf_init(&wp->block_writer_data.last_key);
reftable_buf_init(&wp->last_key);
+ reftable_buf_init(&wp->scratch);
REFTABLE_CALLOC_ARRAY(wp->block, opts.block_size);
if (!wp->block) {
reftable_free(wp);
@@ -180,6 +181,7 @@ static void writer_release(struct reftable_writer *w)
w->block_writer = NULL;
writer_clear_index(w);
reftable_buf_release(&w->last_key);
+ reftable_buf_release(&w->scratch);
}
}
@@ -249,20 +251,19 @@ static int writer_index_hash(struct reftable_writer *w, struct reftable_buf *has
static int writer_add_record(struct reftable_writer *w,
struct reftable_record *rec)
{
- struct reftable_buf key = REFTABLE_BUF_INIT;
int err;
- err = reftable_record_key(rec, &key);
+ err = reftable_record_key(rec, &w->scratch);
if (err < 0)
goto done;
- if (reftable_buf_cmp(&w->last_key, &key) >= 0) {
+ if (reftable_buf_cmp(&w->last_key, &w->scratch) >= 0) {
err = REFTABLE_API_ERROR;
goto done;
}
reftable_buf_reset(&w->last_key);
- err = reftable_buf_add(&w->last_key, key.buf, key.len);
+ err = reftable_buf_add(&w->last_key, w->scratch.buf, w->scratch.len);
if (err < 0)
goto done;
@@ -312,7 +313,6 @@ static int writer_add_record(struct reftable_writer *w,
}
done:
- reftable_buf_release(&key);
return err;
}
@@ -325,7 +325,6 @@ int reftable_writer_add_ref(struct reftable_writer *w,
.ref = *ref
},
};
- struct reftable_buf buf = REFTABLE_BUF_INIT;
int err;
if (!ref->refname ||
@@ -340,24 +339,25 @@ int reftable_writer_add_ref(struct reftable_writer *w,
goto out;
if (!w->opts.skip_index_objects && reftable_ref_record_val1(ref)) {
- err = reftable_buf_add(&buf, (char *)reftable_ref_record_val1(ref),
+ reftable_buf_reset(&w->scratch);
+ err = reftable_buf_add(&w->scratch, (char *)reftable_ref_record_val1(ref),
hash_size(w->opts.hash_id));
if (err < 0)
goto out;
- err = writer_index_hash(w, &buf);
+ err = writer_index_hash(w, &w->scratch);
if (err < 0)
goto out;
}
if (!w->opts.skip_index_objects && reftable_ref_record_val2(ref)) {
- reftable_buf_reset(&buf);
- err = reftable_buf_add(&buf, reftable_ref_record_val2(ref),
+ reftable_buf_reset(&w->scratch);
+ err = reftable_buf_add(&w->scratch, reftable_ref_record_val2(ref),
hash_size(w->opts.hash_id));
if (err < 0)
goto out;
- err = writer_index_hash(w, &buf);
+ err = writer_index_hash(w, &w->scratch);
if (err < 0)
goto out;
}
@@ -365,7 +365,6 @@ int reftable_writer_add_ref(struct reftable_writer *w,
err = 0;
out:
- reftable_buf_release(&buf);
return err;
}
diff --git a/reftable/writer.h b/reftable/writer.h
index e8a6fbb78543e6e56920a2999601db0db9fe4d97..1f4788a430c52c5387b5e97f639e84544d0b9ba2 100644
--- a/reftable/writer.h
+++ b/reftable/writer.h
@@ -20,6 +20,8 @@ struct reftable_writer {
void *write_arg;
int pending_padding;
struct reftable_buf last_key;
+ /* Scratch buffer used to avoid allocations. */
+ struct reftable_buf scratch;
/* offset of next block to write. */
uint64_t next;
--
2.47.0.274.g962d0b743d.dirty
next prev parent reply other threads:[~2024-11-25 6:27 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-08 9:34 [PATCH 00/10] refs: optimize ref format migrations Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 01/10] refs: allow passing flags when setting up a transaction Patrick Steinhardt
2024-11-11 10:30 ` karthik nayak
2024-11-11 12:53 ` Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 02/10] refs/files: move logic to commit initial transaction Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 03/10] refs: introduce "initial" transaction flag Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 04/10] refs/files: support symbolic and root refs in initial transaction Patrick Steinhardt
2024-11-11 10:42 ` karthik nayak
2024-11-11 12:53 ` Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 05/10] refs: use "initial" transaction semantics to migrate refs Patrick Steinhardt
2024-11-11 10:43 ` karthik nayak
2024-11-08 9:34 ` [PATCH 06/10] refs: skip collision checks in initial transactions Patrick Steinhardt
2024-11-11 10:53 ` karthik nayak
2024-11-08 9:34 ` [PATCH 07/10] refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 08/10] reftable/writer: optimize allocations by using a scratch buffer Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 09/10] reftable/block: rename `block_writer::buf` variable Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 10/10] reftable/block: optimize allocations by using scratch buffer Patrick Steinhardt
2024-11-11 10:57 ` [PATCH 00/10] refs: optimize ref format migrations karthik nayak
2024-11-11 12:53 ` Patrick Steinhardt
2024-11-20 7:04 ` Junio C Hamano
2024-11-20 7:50 ` Patrick Steinhardt
2024-11-20 10:25 ` Christian Couder
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 " Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 01/10] refs: allow passing flags when setting up a transaction Patrick Steinhardt
2024-11-20 10:19 ` Christian Couder
2024-11-20 7:51 ` [PATCH v2 02/10] refs/files: move logic to commit initial transaction Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 03/10] refs: introduce "initial" transaction flag Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 04/10] refs/files: support symbolic and root refs in initial transaction Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 05/10] refs: use "initial" transaction semantics to migrate refs Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 06/10] refs: skip collision checks in initial transactions Patrick Steinhardt
2024-11-20 10:21 ` Christian Couder
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 10:42 ` Kristoffer Haugsbakk
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 07/10] refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 08/10] reftable/writer: optimize allocations by using a scratch buffer Patrick Steinhardt
2024-11-20 10:21 ` Christian Couder
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 09/10] reftable/block: rename `block_writer::buf` variable Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 10/10] reftable/block: optimize allocations by using scratch buffer Patrick Steinhardt
2024-11-20 10:22 ` Christian Couder
2024-11-25 6:27 ` [PATCH v3 00/10] refs: optimize ref format migrations Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 01/10] refs: allow passing flags when setting up a transaction Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 02/10] refs/files: move logic to commit initial transaction Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 03/10] refs: introduce "initial" transaction flag Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 04/10] refs/files: support symbolic and root refs in initial transaction Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 05/10] refs: use "initial" transaction semantics to migrate refs Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 06/10] refs: skip collision checks in initial transactions Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 07/10] refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` Patrick Steinhardt
2024-11-25 6:27 ` Patrick Steinhardt [this message]
2024-11-25 6:27 ` [PATCH v3 09/10] reftable/block: rename `block_writer::buf` variable Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 10/10] reftable/block: optimize allocations by using scratch buffer Patrick Steinhardt
2024-11-25 6:29 ` [PATCH v3 00/10] refs: optimize ref format migrations Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241125-pks-refs-optimize-migrations-v3-8-17bc85e33ad7@pks.im \
--to=ps@pks.im \
--cc=chriscool@tuxfamily.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
--cc=kristofferhaugsbakk@fastmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).