From: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Derrick Stolee" <stolee@gmail.com>,
"Torsten Bögershausen" <tboegi@web.de>,
"Jeff King" <peff@peff.net>, "Patrick Steinhardt" <ps@pks.im>,
"Johannes Schindelin" <johannes.schindelin@gmx.de>
Subject: [PATCH v3 07/11] test-tool synthesize: use the unsafe hash for speed
Date: Fri, 08 May 2026 08:16:45 +0000
Message-ID: <4f207c8a470af1f8cf00c704043dfa94e6e1420d.1778228209.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.2102.v3.git.1778228209.gitgitgadget@gmail.com>
From: Johannes Schindelin <johannes.schindelin@gmx.de>

Jeff King pointed out on the mailing list [1] that t5608's new >4GB
test cases dominate the entire test suite runtime: 160 seconds on his
laptop when the rest of the suite finishes in under 90 seconds, and
305-850 seconds across CI jobs. The bottleneck is that the synthesize
helper hashes roughly 8 GB of data through SHA-1 (4 GB for the pack
checksum plus 4 GB for the blob OID) for a 4 GB+1 blob.

Since the helper generates known test data, collision detection is
unnecessary. Switch from repo->hash_algo to unsafe_hash_algo(), which
uses hardware-accelerated SHA-1 (via OpenSSL or Apple CommonCrypto)
when available.

Benchmarks on an x86_64 machine generating a 4 GB+1 pack (2 runs
each, interleaved):

    SHA-1 backend       Run 1   Run 2
    SHA1DC (safe)       75s     80s
    OpenSSL (unsafe)    21s     19s

The effect scales linearly. At 64 MB with 10 randomized interleaved
runs, the OpenSSL unsafe backend shows a 5.4x improvement (median
0.202s vs 1.088s) with tight variance (stdev 0.028s vs 0.095s).
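
For readers who want to check the arithmetic, here is a minimal sketch
that recomputes the speedup factors from the timings quoted above. The
function names speedup() and print_speedups() are made up for this
illustration and are not part of the patch.

```c
#include <stdio.h>

/* Speedup factor: how many times shorter the fast timing is than the slow one. */
double speedup(double slow_seconds, double fast_seconds)
{
	return slow_seconds / fast_seconds;
}

/* Print the ratios for the timings quoted in the commit message. */
void print_speedups(void)
{
	/* 64 MB, medians over 10 runs: SHA1DC 1.088s vs OpenSSL 0.202s */
	printf("64 MB median: %.1fx\n", speedup(1.088, 0.202));
	/* 4 GB+1 pack, the two interleaved runs from the table */
	printf("4 GB+1 run 1: %.1fx\n", speedup(75.0, 21.0));
	printf("4 GB+1 run 2: %.1fx\n", speedup(80.0, 19.0));
}
```

Calling print_speedups() from a main() reports 5.4x for the 64 MB
medians and 3.6x and 4.2x for the two 4 GB+1 runs.
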
The speedup is only realized when the build has a fast unsafe backend
compiled in. The CI's linux-TEST-vars job already sets
OPENSSL_SHA1_UNSAFE=YesPlease; macOS benefits from Apple CommonCrypto
when configured. On builds without a separate unsafe backend (such as
the default Windows builds), unsafe_hash_algo() returns the regular
collision-detecting implementation and the change is a no-op.
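
The fallback described above can be modelled in a few lines. This is
only a toy sketch under stated assumptions, not Git's actual code: the
struct toy_hash_algo type, its fields, and toy_unsafe_hash_algo() are
hypothetical stand-ins that mirror the behaviour this message
describes, namely "use the fast backend if one is compiled in,
otherwise hand back the regular algorithm".

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for a hash algorithm descriptor. */
struct toy_hash_algo {
	const char *name;
	/* Fast non-collision-detecting variant; NULL if none is compiled in. */
	const struct toy_hash_algo *unsafe;
};

/*
 * Return the fast variant when the build provides one; otherwise
 * return the algorithm that was passed in, which makes the call a
 * no-op on builds without a separate unsafe backend.
 */
const struct toy_hash_algo *toy_unsafe_hash_algo(const struct toy_hash_algo *algop)
{
	return algop->unsafe ? algop->unsafe : algop;
}
```

With this shape, a caller such as the synthesize helper needs no
build-specific conditionals: it always asks for the unsafe variant and
gets whatever the build can offer.
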
[1] https://lore.kernel.org/git/20260501063805.GA2038915@coredump.intra.peff.net/

Assisted-by: Claude Opus 4.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/helper/test-synthesize.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/helper/test-synthesize.c b/t/helper/test-synthesize.c
index 3ce7078078..e2faaad7b4 100644
--- a/t/helper/test-synthesize.c
+++ b/t/helper/test-synthesize.c
@@ -217,7 +217,7 @@ static int cmd__synthesize__pack(int argc, const char **argv,
setup_git_directory_gently(&non_git);
repo = the_repository;
- algo = repo->hash_algo;
+ algo = unsafe_hash_algo(repo->hash_algo);
argc = parse_options(argc, argv, NULL, options, usage,
PARSE_OPT_KEEP_ARGV0);
--
gitgitgadget