Git development
 help / color / mirror / Atom feed
* [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows
@ 2026-04-28 16:26 Johannes Schindelin via GitGitGadget
  2026-04-28 16:26 ` [PATCH 1/6] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
                   ` (7 more replies)
  0 siblings, 8 replies; 60+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2026-04-28 16:26 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

On Windows, unsigned long is 32-bit even on 64-bit systems. This causes
multiple problems when Git handles objects larger than 4GB. This patch
series is a very targeted fix for a very early part of the problem: it
addresses the most fundamental truncation points that prevent a >4GB object
from surviving a clone at all.

Specifically, this fixes:

 * zlib's uLong wrapping and triggering BUG() assertions in the git_zstream
   wrapper
 * Object sizes being truncated in pack streaming, delta headers, and
   index-pack/unpack-objects
 * pack-objects re-encoding reused pack entries with a truncated size,
   producing corrupt packs on the wire

Many other code paths still use unsigned long for object sizes (e.g.,
cat-file -s, object_info.sizep, the delta machinery) and will need their own
conversions. This series does not attempt to fix those.

Based on work by @LordKiRon in git-for-windows/git#6076.

The last two commits add a test helper that synthesizes a pack with a >4GB
blob and regression tests that clone it via both the unpack-objects and
index-pack code paths using file:// transport.

Johannes Schindelin (6):
  index-pack, unpack-objects: use size_t for object size
  git-zlib: handle data streams larger than 4GB
  odb, packfile: use size_t for streaming object sizes
  delta, packfile: use size_t for delta header sizes
  test-tool: add a helper to synthesize large packfiles
  t5608: add regression test for >4GB object clone

 Makefile                     |   1 +
 builtin/index-pack.c         |   9 +-
 builtin/pack-objects.c       |  23 +++-
 builtin/unpack-objects.c     |   5 +-
 compat/zlib-compat.h         |   2 +
 delta.h                      |  14 +-
 git-zlib.c                   |  25 ++--
 git-zlib.h                   |   4 +-
 object-file.c                |  12 +-
 odb/streaming.c              |  13 +-
 odb/streaming.h              |   2 +-
 oss-fuzz/fuzz-pack-headers.c |   2 +-
 pack-bitmap.c                |   2 +-
 pack-check.c                 |   6 +-
 packfile.c                   |  57 +++++---
 packfile.h                   |   4 +-
 t/helper/meson.build         |   1 +
 t/helper/test-synthesize.c   | 250 +++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c         |   1 +
 t/helper/test-tool.h         |   1 +
 t/t5608-clone-2gb.sh         |  37 ++++++
 21 files changed, 418 insertions(+), 53 deletions(-)
 create mode 100644 t/helper/test-synthesize.c


base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2102%2Fdscho%2Ffix-large-clones-on-windows-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2102/dscho/fix-large-clones-on-windows-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2102
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2026-05-11 10:02 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 16:26 [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 1/6] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-04-30 14:13   ` Torsten Bögershausen
2026-05-03 14:46     ` Johannes Schindelin
2026-04-28 16:26 ` [PATCH 2/6] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 3/6] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 4/6] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-04-29 13:28   ` Derrick Stolee
2026-05-03 14:49     ` Johannes Schindelin
2026-04-28 16:26 ` [PATCH 5/6] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-04-28 16:26 ` [PATCH 6/6] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-04-29 13:34   ` Derrick Stolee
2026-05-01  6:38     ` Jeff King
2026-05-01 13:19       ` Derrick Stolee
2026-05-04 17:07         ` Johannes Schindelin
2026-04-29 13:35 ` [PATCH 0/6] Handle cloning of objects larger than 4GB on Windows Derrick Stolee
2026-05-04 17:08 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 01/11] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-05-05 19:11     ` Torsten Bögershausen
2026-05-08  7:36       ` Johannes Schindelin
2026-05-08 19:09         ` Torsten Bögershausen
2026-05-10  2:41           ` Junio C Hamano
2026-05-10  9:14             ` Torsten Bögershausen
2026-05-04 17:08   ` [PATCH v2 02/11] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 03/11] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-05-05 19:27     ` Torsten Bögershausen
2026-05-08  7:38       ` Johannes Schindelin
2026-05-04 17:08   ` [PATCH v2 04/11] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 05/11] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 06/11] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 07/11] test-tool synthesize: use the unsafe hash for speed Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 08/11] test-tool synthesize: precompute pack for 4 GiB + 1 Johannes Schindelin via GitGitGadget
2026-05-04 18:27     ` Derrick Stolee
2026-05-05 20:54       ` Johannes Schindelin
2026-05-04 17:08   ` [PATCH v2 09/11] test-tool synthesize: add precomputed SHA-256 " Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 10/11] t5608: mark >4GB tests as EXPENSIVE Johannes Schindelin via GitGitGadget
2026-05-04 17:08   ` [PATCH v2 11/11] ci: run expensive tests on push builds to integration branches Johannes Schindelin via GitGitGadget
2026-05-04 18:35     ` Derrick Stolee
2026-05-05 12:56       ` Junio C Hamano
2026-05-05 23:07         ` Junio C Hamano
2026-05-06  8:33           ` Johannes Schindelin
2026-05-07  9:18             ` Junio C Hamano
2026-05-07 10:24               ` Patrick Steinhardt
2026-05-08  2:50         ` Junio C Hamano
2026-05-08  8:16   ` [PATCH v3 00/11] Handle cloning of objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 01/11] index-pack, unpack-objects: use size_t for object size Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 02/11] git-zlib: handle data streams larger than 4GB Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 03/11] odb, packfile: use size_t for streaming object sizes Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 04/11] delta, packfile: use size_t for delta header sizes Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 05/11] test-tool: add a helper to synthesize large packfiles Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 06/11] t5608: add regression test for >4GB object clone Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 07/11] test-tool synthesize: use the unsafe hash for speed Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 08/11] test-tool synthesize: precompute pack for 4 GiB + 1 Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 09/11] test-tool synthesize: add precomputed SHA-256 " Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 10/11] t5608: mark >4GB tests as EXPENSIVE Johannes Schindelin via GitGitGadget
2026-05-08  8:16     ` [PATCH v3 11/11] ci: run expensive tests on push builds to integration branches Johannes Schindelin via GitGitGadget
2026-05-10 23:51       ` [PATCH] ci: enable EXPENSIVE for contributor builds Junio C Hamano
2026-05-11  7:05         ` Patrick Steinhardt
2026-05-11  8:29           ` Junio C Hamano
2026-05-11 10:02             ` Patrick Steinhardt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox