From: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows
Date: Tue, 28 Apr 2026 16:26:14 +0000 [thread overview]
Message-ID: <pull.2102.git.1777393580.gitgitgadget@gmail.com> (raw)
On Windows, unsigned long is 32-bit even on 64-bit systems. This causes
multiple problems when Git handles objects larger than 4GB. This patch
series is a very targeted fix for a very early part of the problem: it
addresses the most fundamental truncation points that prevent a >4GB object
from surviving a clone at all.
Specifically, this fixes:
* zlib's uLong wrapping and triggering BUG() assertions in the git_zstream
wrapper
* Object sizes being truncated in pack streaming, delta headers, and
index-pack/unpack-objects
* pack-objects re-encoding reused pack entries with a truncated size,
producing corrupt packs on the wire
Many other code paths still use unsigned long for object sizes (e.g.,
cat-file -s, object_info.sizep, the delta machinery) and will need their own
conversions. This series does not attempt to fix those.
Based on work by @LordKiRon in git-for-windows/git#6076.
The last two commits add a test helper that synthesizes a pack with a >4GB
blob and regression tests that clone it via both the unpack-objects and
index-pack code paths using file:// transport.
Johannes Schindelin (6):
index-pack, unpack-objects: use size_t for object size
git-zlib: handle data streams larger than 4GB
odb, packfile: use size_t for streaming object sizes
delta, packfile: use size_t for delta header sizes
test-tool: add a helper to synthesize large packfiles
t5608: add regression test for >4GB object clone
Makefile | 1 +
builtin/index-pack.c | 9 +-
builtin/pack-objects.c | 23 +++-
builtin/unpack-objects.c | 5 +-
compat/zlib-compat.h | 2 +
delta.h | 14 +-
git-zlib.c | 25 ++--
git-zlib.h | 4 +-
object-file.c | 12 +-
odb/streaming.c | 13 +-
odb/streaming.h | 2 +-
oss-fuzz/fuzz-pack-headers.c | 2 +-
pack-bitmap.c | 2 +-
pack-check.c | 6 +-
packfile.c | 57 +++++---
packfile.h | 4 +-
t/helper/meson.build | 1 +
t/helper/test-synthesize.c | 250 +++++++++++++++++++++++++++++++++++
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 1 +
t/t5608-clone-2gb.sh | 37 ++++++
21 files changed, 418 insertions(+), 53 deletions(-)
create mode 100644 t/helper/test-synthesize.c
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2102%2Fdscho%2Ffix-large-clones-on-windows-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2102/dscho/fix-large-clones-on-windows-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2102
--
gitgitgadget
next reply other threads:[~2026-04-28 16:26 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-28 16:26 Johannes Schindelin via GitGitGadget [this message]
2026-04-28 16:26 ` [PATCH 1/6] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-04-30 14:13 ` Torsten Bögershausen
2026-05-03 14:46 ` Johannes Schindelin
2026-04-28 16:26 ` [PATCH 2/6] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 3/6] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 4/6] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-04-29 13:28 ` Derrick Stolee
2026-05-03 14:49 ` Johannes Schindelin
2026-04-28 16:26 ` [PATCH 5/6] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 6/6] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-04-29 13:34 ` Derrick Stolee
2026-05-01 6:38 ` Jeff King
2026-05-01 13:19 ` Derrick Stolee
2026-05-04 17:07 ` Johannes Schindelin
2026-04-29 13:35 ` [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows Derrick Stolee
2026-05-04 17:08 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 01/11] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-05-05 19:11 ` Torsten Bögershausen
2026-05-08 7:36 ` Johannes Schindelin
2026-05-08 19:09 ` Torsten Bögershausen
2026-05-10 2:41 ` Junio C Hamano
2026-05-10 9:14 ` Torsten Bögershausen
2026-05-04 17:08 ` [PATCH v2 02/11] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 03/11] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-05-05 19:27 ` Torsten Bögershausen
2026-05-08 7:38 ` Johannes Schindelin
2026-05-04 17:08 ` [PATCH v2 04/11] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 05/11] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 06/11] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 07/11] test-tool synthesize: use the unsafe hash for speed Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 08/11] test-tool synthesize: precompute pack for 4 GiB + 1 Johannes Schindelin via GitGitGadget
2026-05-04 18:27 ` Derrick Stolee
2026-05-05 20:54 ` Johannes Schindelin
2026-05-04 17:08 ` [PATCH v2 09/11] test-tool synthesize: add precomputed SHA-256 " Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 10/11] t5608: mark >4GB tests as EXPENSIVE Johannes Schindelin via GitGitGadget
2026-05-04 17:08 ` [PATCH v2 11/11] ci: run expensive tests on push builds to integration branches Johannes Schindelin via GitGitGadget
2026-05-04 18:35 ` Derrick Stolee
2026-05-05 12:56 ` Junio C Hamano
2026-05-05 23:07 ` Junio C Hamano
2026-05-06 8:33 ` Johannes Schindelin
2026-05-07 9:18 ` Junio C Hamano
2026-05-07 10:24 ` Patrick Steinhardt
2026-05-08 2:50 ` Junio C Hamano
2026-05-08 8:16 ` [PATCH v3 00/11] Handle cloning of objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 01/11] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 02/11] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 03/11] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 04/11] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 05/11] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 06/11] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 07/11] test-tool synthesize: use the unsafe hash for speed Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 08/11] test-tool synthesize: precompute pack for 4 GiB + 1 Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 09/11] test-tool synthesize: add precomputed SHA-256 " Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 10/11] t5608: mark >4GB tests as EXPENSIVE Johannes Schindelin via GitGitGadget
2026-05-08 8:16 ` [PATCH v3 11/11] ci: run expensive tests on push builds to integration branches Johannes Schindelin via GitGitGadget
2026-05-10 23:51 ` [PATCH] ci: enable EXPENSIVE for contributor builds Junio C Hamano
2026-05-11 7:05 ` Patrick Steinhardt
2026-05-11 8:29 ` Junio C Hamano
2026-05-11 10:02 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.2102.git.1777393580.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=johannes.schindelin@gmx.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox