From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>
Subject: [PATCH 00/10] refs: optimize ref format migrations
Date: Fri, 08 Nov 2024 10:34:40 +0100
Message-ID: <20241108-pks-refs-optimize-migrations-v1-0-7fd37fa80e35@pks.im>
Hi,
I have recently learned that ref format migrations can take a
significant amount of time, on the order of minutes, when migrating
millions of refs. This is probably not entirely surprising: the initial
focus of the logic to migrate ref backends was on getting the basic
feature working, and I didn't yet invest any time into optimizing the
code path at all. But I was still mildly surprised that the migration
of a couple million refs was taking minutes to finish.
This patch series thus optimizes how we migrate ref formats. This is
mostly done by expanding upon the "initial transaction" semantics that
we already use for git-clone(1). These semantics allow us to assume that
the ref backend is completely empty and that there are no concurrent
writers, and thus we are free to perform certain optimizations that
wouldn't have otherwise been possible. On the one hand this allows us to
drop needless collision checks. On the other hand, it also allows us to
write regular refs directly into the "packed-refs" file when migrating
from the "reftable" backend to the "files" backend.
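To illustrate the two optimizations described above, here is a toy model in Python. It is only a sketch: none of these function names exist in Git, the collision check is deliberately naive, and the output format is a simplified imitation of "packed-refs" (no peeled entries for annotated tags). The point is merely that a writer that may assume an empty target and no concurrent writers can skip the per-ref conflict check and emit all refs as one sorted buffer.

```python
# Toy model only: these names are hypothetical and do not exist in Git.

def conflicts(existing, refname):
    """Return True if `refname` collides with a ref in `existing`.

    Two refs collide when one is a path prefix of the other, e.g.
    "refs/heads/a" and "refs/heads/a/b".
    """
    parts = refname.split("/")
    # Any proper prefix of the new name that already exists is a conflict.
    for i in range(1, len(parts)):
        if "/".join(parts[:i]) in existing:
            return True
    # Any existing ref located below the new name is a conflict as well.
    prefix = refname + "/"
    return any(name.startswith(prefix) for name in existing)


def write_checked(refs):
    """Regular path: verify every ref against what was written so far."""
    written = {}
    for name, oid in refs.items():
        if conflicts(written, name):
            raise ValueError(f"cannot create {name!r}: name conflict")
        written[name] = oid
    return written


def write_initial(refs):
    """Initial-transaction path: the target is assumed to be empty and the
    source is assumed to be consistent, so no checks are performed and all
    refs are emitted as a single sorted, packed-refs-like buffer."""
    lines = ["# pack-refs with: peeled fully-peeled sorted"]
    lines += [f"{oid} {name}" for name, oid in sorted(refs.items())]
    return "\n".join(lines) + "\n"
```

In this model `write_checked()` does work per ref that grows with the number of refs already written, whereas `write_initial()` is a single sort followed by one sequential write. The real code paths are of course more involved than this.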
This leads to some significant speedups. Migrating 1 million refs from
"files" to "reftable":
    Benchmark 1: migrate files:reftable (refcount = 1000000, revision = origin/master)
      Time (mean ± σ):      4.580 s ±  0.062 s    [User: 1.818 s, System: 2.746 s]
      Range (min … max):    4.534 s …  4.743 s    10 runs

    Benchmark 2: migrate files:reftable (refcount = 1000000, revision = pks-refs-optimize-migrations)
      Time (mean ± σ):     767.7 ms ±   9.5 ms    [User: 629.2 ms, System: 126.1 ms]
      Range (min … max):   755.8 ms … 786.9 ms    10 runs

    Summary
      migrate files:reftable (refcount = 1000000, revision = pks-refs-optimize-migrations) ran
        5.97 ± 0.11 times faster than migrate files:reftable (refcount = 1000000, revision = origin/master)
And migrating from "reftable" to "files":
    Benchmark 1: migrate reftable:files (refcount = 1000000, revision = origin/master)
      Time (mean ± σ):     35.409 s ±  0.302 s    [User: 5.061 s, System: 29.244 s]
      Range (min … max):   35.055 s … 35.898 s    10 runs

    Benchmark 2: migrate reftable:files (refcount = 1000000, revision = pks-refs-optimize-migrations)
      Time (mean ± σ):     855.9 ms ±  61.5 ms    [User: 646.7 ms, System: 187.1 ms]
      Range (min … max):   830.0 ms … 1030.3 ms    10 runs

    Summary
      migrate reftable:files (refcount = 1000000, revision = pks-refs-optimize-migrations) ran
       41.37 ± 2.99 times faster than migrate reftable:files (refcount = 1000000, revision = origin/master)
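For readers who want to cross-check the summary lines, the reported factors can be recomputed from the mean and σ values above. The following is a small sketch, assuming the "±" on the ratio comes from standard first-order error propagation for a quotient; that assumption is not stated in the output itself, but it reproduces both reported figures. The function name is made up for illustration.

```python
import math


def speedup(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, uncertainty) for slow_mean / fast_mean.

    All four arguments must use the same unit (seconds here). The
    uncertainty combines the two relative standard deviations in
    quadrature, i.e. first-order error propagation for a quotient.
    """
    ratio = slow_mean / fast_mean
    relative = math.hypot(slow_sd / slow_mean, fast_sd / fast_mean)
    return ratio, ratio * relative


# "files" -> "reftable", values taken from the benchmark output above.
files_to_reftable = speedup(4.580, 0.062, 0.7677, 0.0095)

# "reftable" -> "files", values taken from the benchmark output above.
reftable_to_files = speedup(35.409, 0.302, 0.8559, 0.0615)
```

Rounded to two decimal places, `files_to_reftable` comes out as 5.97 and 0.11, and `reftable_to_files` as 41.37 and 2.99, matching the two summaries.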
Thanks!
Patrick
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
Patrick Steinhardt (10):
      refs: allow passing flags when setting up a transaction
      refs/files: move logic to commit initial transaction
      refs: introduce "initial" transaction flag
      refs/files: support symbolic and root refs in initial transaction
      refs: use "initial" transaction semantics to migrate refs
      refs: skip collision checks in initial transactions
      refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG`
      reftable/writer: optimize allocations by using a scratch buffer
      reftable/block: rename `block_writer::buf` variable
      reftable/block: optimize allocations by using scratch buffer

 branch.c                  |   2 +-
 builtin/clone.c           |   4 +-
 builtin/fast-import.c     |   4 +-
 builtin/fetch.c           |   4 +-
 builtin/receive-pack.c    |   4 +-
 builtin/replace.c         |   2 +-
 builtin/tag.c             |   2 +-
 builtin/update-ref.c      |   4 +-
 refs.c                    |  70 ++++++-------
 refs.h                    |  45 +++++----
 refs/debug.c              |  13 ---
 refs/files-backend.c      | 244 +++++++++++++++++++++++++---------------------
 refs/packed-backend.c     |   8 --
 refs/refs-internal.h      |   2 +-
 refs/reftable-backend.c   |  14 +--
 reftable/block.c          |  33 +++----
 reftable/block.h          |   9 +-
 reftable/writer.c         |  23 +++--
 reftable/writer.h         |   1 +
 sequencer.c               |   6 +-
 t/helper/test-ref-store.c |   2 +-
 t/t1460-refs-migrate.sh   |   2 +-
 walker.c                  |   2 +-
 23 files changed, 247 insertions(+), 253 deletions(-)
---
base-commit: facbe4f633e4ad31e641f64617bc88074c659959
change-id: 20241108-pks-refs-optimize-migrations-6d0ceee4abb7
Best regards,
--
Patrick Steinhardt <ps@pks.im>
Thread overview: 54+ messages
2024-11-08 9:34 Patrick Steinhardt [this message]
2024-11-08 9:34 ` [PATCH 01/10] refs: allow passing flags when setting up a transaction Patrick Steinhardt
2024-11-11 10:30 ` karthik nayak
2024-11-11 12:53 ` Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 02/10] refs/files: move logic to commit initial transaction Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 03/10] refs: introduce "initial" transaction flag Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 04/10] refs/files: support symbolic and root refs in initial transaction Patrick Steinhardt
2024-11-11 10:42 ` karthik nayak
2024-11-11 12:53 ` Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 05/10] refs: use "initial" transaction semantics to migrate refs Patrick Steinhardt
2024-11-11 10:43 ` karthik nayak
2024-11-08 9:34 ` [PATCH 06/10] refs: skip collision checks in initial transactions Patrick Steinhardt
2024-11-11 10:53 ` karthik nayak
2024-11-08 9:34 ` [PATCH 07/10] refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 08/10] reftable/writer: optimize allocations by using a scratch buffer Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 09/10] reftable/block: rename `block_writer::buf` variable Patrick Steinhardt
2024-11-08 9:34 ` [PATCH 10/10] reftable/block: optimize allocations by using scratch buffer Patrick Steinhardt
2024-11-11 10:57 ` [PATCH 00/10] refs: optimize ref format migrations karthik nayak
2024-11-11 12:53 ` Patrick Steinhardt
2024-11-20 7:04 ` Junio C Hamano
2024-11-20 7:50 ` Patrick Steinhardt
2024-11-20 10:25 ` Christian Couder
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 " Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 01/10] refs: allow passing flags when setting up a transaction Patrick Steinhardt
2024-11-20 10:19 ` Christian Couder
2024-11-20 7:51 ` [PATCH v2 02/10] refs/files: move logic to commit initial transaction Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 03/10] refs: introduce "initial" transaction flag Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 04/10] refs/files: support symbolic and root refs in initial transaction Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 05/10] refs: use "initial" transaction semantics to migrate refs Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 06/10] refs: skip collision checks in initial transactions Patrick Steinhardt
2024-11-20 10:21 ` Christian Couder
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 10:42 ` Kristoffer Haugsbakk
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 07/10] refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 08/10] reftable/writer: optimize allocations by using a scratch buffer Patrick Steinhardt
2024-11-20 10:21 ` Christian Couder
2024-11-25 5:52 ` Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 09/10] reftable/block: rename `block_writer::buf` variable Patrick Steinhardt
2024-11-20 7:51 ` [PATCH v2 10/10] reftable/block: optimize allocations by using scratch buffer Patrick Steinhardt
2024-11-20 10:22 ` Christian Couder
2024-11-25 6:27 ` [PATCH v3 00/10] refs: optimize ref format migrations Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 01/10] refs: allow passing flags when setting up a transaction Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 02/10] refs/files: move logic to commit initial transaction Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 03/10] refs: introduce "initial" transaction flag Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 04/10] refs/files: support symbolic and root refs in initial transaction Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 05/10] refs: use "initial" transaction semantics to migrate refs Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 06/10] refs: skip collision checks in initial transactions Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 07/10] refs: don't normalize log messages with `REF_SKIP_CREATE_REFLOG` Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 08/10] reftable/writer: optimize allocations by using a scratch buffer Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 09/10] reftable/block: rename `block_writer::buf` variable Patrick Steinhardt
2024-11-25 6:27 ` [PATCH v3 10/10] reftable/block: optimize allocations by using scratch buffer Patrick Steinhardt
2024-11-25 6:29 ` [PATCH v3 00/10] refs: optimize ref format migrations Patrick Steinhardt