* [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
@ 2026-04-24 19:14 Scott Bauersfeld via GitGitGadget
2026-04-25 10:21 ` Junio C Hamano
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
0 siblings, 2 replies; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-24 19:14 UTC (permalink / raw)
To: git; +Cc: Scott Bauersfeld, Scott Bauersfeld
From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
Both index-pack and unpack-objects read pack data from stdin through
a 4 KiB static buffer (input_buffer[4096]). On each fill(), consumed
bytes are flushed to the output pack file via write_or_die(), so
every write(2) moves at most 4 KiB.
On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB, matching the default
already used by the hashfile layer in csum-file.c.
Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
per variant, isolated builds from the same v2.54.0 source) shows:
index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
All clones produce identical HEAD, file count, and pass fsck.
Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
Both index-pack and unpack-objects read pack data from stdin through a 4
KiB static buffer (input_buffer[4096]). On each fill(), consumed bytes
are flushed to the output pack file via write_or_die(), so every
write(2) moves at most 4 KiB.
On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.
This change increases the buffer from 4 KiB to 128 KiB, matching the
default already used by the hashfile layer in csum-file.c.
Benchmarked with 5 HTTPS clones per version of
https://github.com/sbauersfeld/git.git (~296 MB pack), using strace -f
to count write() syscalls. Both binaries built from the same v2.54.0
source tree in isolated directories to ensure the bin-wrappers resolve
to the correct binary.
Correctness verified via git fsck --no-dangling, rev-parse HEAD, and
working tree file count — all 10 clones match.
Results:
Metric                                  Unpatched (4 KiB)  Patched (128 KiB)  Change
index-pack writes to pack file          72,465 avg         24,943 avg         −66%
Total write() syscalls (all processes)  310,192 avg        259,530 avg        −17%
Writes of exactly 4096 bytes            ~40,077 avg        0                  eliminated
HEAD / file count / fsck                ✓                  ✓                  None
Raw data:
unpatched (input_buffer[4096]):
run 1: total_writes=311787 ip_pack_writes=72353 ip_4k=35311
run 2: total_writes=310252 ip_pack_writes=72348 ip_4k=38024
run 3: total_writes=309737 ip_pack_writes=72303 ip_4k=43003
run 4: total_writes=309801 ip_pack_writes=72661 ip_4k=42349
run 5: total_writes=309383 ip_pack_writes=72662 ip_4k=41702
patched (input_buffer[128 * 1024]):
run 1: total_writes=264659 ip_pack_writes=26605 ip_4k=0
run 2: total_writes=264276 ip_pack_writes=26568 ip_4k=0
run 3: total_writes=227796 ip_pack_writes=9762 ip_4k=0
run 4: total_writes=262464 ip_pack_writes=27830 ip_4k=0
run 5: total_writes=278455 ip_pack_writes=33952 ip_4k=0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v1
Pull-Request: https://github.com/git/git/pull/2282
builtin/index-pack.c | 4 ++--
builtin/unpack-objects.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..81a628bf34 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,8 @@ static int check_self_contained_and_connected;
 
 static struct progress *progress;
 
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+#define INPUT_BUFFER_SIZE (128 * 1024)
+static unsigned char input_buffer[INPUT_BUFFER_SIZE];
 static unsigned int input_offset, input_len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..535c019f82 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,8 @@
 static int dry_run, quiet, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
 
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+#define INPUT_BUFFER_SIZE (128 * 1024)
+static unsigned char buffer[INPUT_BUFFER_SIZE];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static off_t max_input_size;
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
^ permalink raw reply related [flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-24 19:14 [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB Scott Bauersfeld via GitGitGadget
@ 2026-04-25 10:21 ` Junio C Hamano
2026-04-27 12:36 ` Derrick Stolee
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2026-04-25 10:21 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget; +Cc: git, Scott Bauersfeld

"Scott Bauersfeld via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>
> Both index-pack and unpack-objects read pack data from stdin through
> a 4 KiB static buffer (input_buffer[4096]). On each fill(), consumed
> bytes are flushed to the output pack file via write_or_die(), so
> every write(2) moves at most 4 KiB.

Micronit. Output of unpack-objects obviously does not get flushed
to "the output pack file".

> On FUSE-backed filesystems every write(2) is a synchronous round
> trip through the FUSE protocol (userspace -> kernel -> userspace ->
> back), so the 4 KiB buffer turns a clone into many unnecessary tiny
> writes with noticeable latency overhead.
>
> Increase the buffer from 4 KiB to 128 KiB, matching the default
> already used by the hashfile layer in csum-file.c.

Quite sensible reasoning presented very nicely.

It may probably be a #leftoverbit but these three instances of (128
* 1024) may want to have a common symbolic constant, like

#define DEFAULT_IOBUFFER_SIZE_IN_BYTES (128 * 1024)

in a bit more central header file. Especially for the one in
csum-file.c where there is no symbolic constant used for that
purpose.

> Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
> per variant, isolated builds from the same v2.54.0 source) shows:
>
> index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
> total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
> writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)

Hmph, I would have expected more like (1 - 4/128) ~ 97% reduction.
The difference between that and 66% is coming from where? There are
inherently short writes that do not utilize the new larger buffer
beyond 4kB? If so, another number of interest might be the number
of writes smaller than 4096 bytes, perhaps?

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-25 10:21 ` Junio C Hamano
@ 2026-04-27 12:36 ` Derrick Stolee
2026-04-28 1:46 ` Junio C Hamano
0 siblings, 1 reply; 12+ messages in thread
From: Derrick Stolee @ 2026-04-27 12:36 UTC (permalink / raw)
To: Junio C Hamano, Scott Bauersfeld via GitGitGadget; +Cc: git, Scott Bauersfeld

On 4/25/2026 6:21 AM, Junio C Hamano wrote:
> "Scott Bauersfeld via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>>
>> On FUSE-backed filesystems every write(2) is a synchronous round
>> trip through the FUSE protocol (userspace -> kernel -> userspace ->
>> back), so the 4 KiB buffer turns a clone into many unnecessary tiny
>> writes with noticeable latency overhead.
>>
>> Increase the buffer from 4 KiB to 128 KiB, matching the default
>> already used by the hashfile layer in csum-file.c.
>
> Quite sensible reasoning presented very nicely.
>
> It may probably be a #leftoverbit but these three instances of (128
> * 1024) may want to have a common symbolic constant, like
>
> #define DEFAULT_IOBUFFER_SIZE_IN_BYTES (128 * 1024)
>
> in a bit more central header file. Especially for the one in
> csum-file.c where there is no symbolic constant used for that
> purpose.

I also had this thought. Would environment.h be the best place?

>> Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
>> per variant, isolated builds from the same v2.54.0 source) shows:
>>
>> index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
>> total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
>> writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
>
> Hmph, I would have expected more like (1 - 4/128) ~ 97% reduction.
> The difference between that and 66% is coming from where? There are
> inherently short writes that do not utilize the new larger buffer
> beyond 4kB? If so, another number of interest might be the number
> of writes smaller than 4096 bytes, perhaps?

One way to reword what you're asking is to measure "number of writes
not using the whole buffer" which is basically going to be "the
number of flush events from the application layer". Every time the
application intends to flush, the current buffer is likely to not
be exactly full. I would expect this number to not change between
implementations in real experiments.

The improvement here comes from the reduced number of flushes due
to buffer limits. I see that this can be measured in the number of
system-level events, but what impact does this have on the end-to-
end time of 'git index-pack' or 'git unpack-objects'? Is there a
t/perf/ test that can demonstrate this improvement for a variety
of real repos using GIT_PERF_REPO?

Thanks,
-Stolee

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 12:36 ` Derrick Stolee
@ 2026-04-28 1:46 ` Junio C Hamano
2026-04-28 2:09 ` Jeff King
0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2026-04-28 1:46 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Scott Bauersfeld via GitGitGadget, git, Scott Bauersfeld

Derrick Stolee <stolee@gmail.com> writes:

>> The difference between that and 66% is coming from where? There are
>> inherently short writes that do not utilize the new larger buffer
>> beyond 4kB? If so, another number of interest might be the number
>> of writes smaller than 4096 bytes, perhaps?
>
> One way to reword what you're asking is to measure "number of writes
> not using the whole buffer" which is basically going to be "the
> number of flush events from the application layer".

I do not think it would differ between the old and the new
implementation.

> Every time the
> application intends to flush, the current buffer is likely to not
> be exactly full. I would expect this number to not change between
> implementations in real experiments.

Yes, I agree. But what I was trying to get at was a bit different.

The application may have produced only 2kB before it issues a
"flush". Whether the buffer size is 4kB or 128kB, such a flush will
only write out 2kB, and the larger buffer size does not help at all.
But if the application has produced 90kB before it issues a "flush",
the larger buffer size would give us a great improvement. With 4kB
buffer, before such an application level "flush", we would have seen
22 = floor(90/4) calls of write(2) to flush the buffer, plus a 2kB
write(2). With 128kB buffer, we would see a single 90kB write(2).

So the apparently lower improvement than I naively have expected may
be attributable to the fact that many application level "flush" was
not large enough to benefit from 128kB buffer? How much of the
total number of bytes written came in large batches, vs tiny ones?

> The improvement here comes from the reduced number of flushes due
> to buffer limits.

Yes.

> I see that this can be measured in the number of
> system-level events, but what impact does this have on the end-to-
> end time of 'git index-pack' or 'git unpack-objects'? Is there a
> t/perf/ test that can demonstrate this improvement for a variety
> of real repos using GIT_PERF_REPO?

Interesting thought, but the number of system-level events (or the
number of write(2) system calls) is not reduced by 97% because we
apparently are issuing too many of them, and the reason is?

I suspect the reason why we still issue too many write(2) is because
we often do not send enough data between application-level flushes.

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-28 1:46 ` Junio C Hamano
@ 2026-04-28 2:09 ` Jeff King
0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2026-04-28 2:09 UTC (permalink / raw)
To: Junio C Hamano
Cc: Derrick Stolee, Scott Bauersfeld via GitGitGadget, git, Scott Bauersfeld

On Tue, Apr 28, 2026 at 10:46:44AM +0900, Junio C Hamano wrote:

> The application may have produced only 2kB before it issues a
> "flush". Whether the buffer size is 4kB or 128kB, such a flush will
> only write out 2kB, and the larger buffer size does not help at all.
> But if the application has produced 90kB before it issues a "flush",
> the larger buffer size would give us a great improvement. With 4kB
> buffer, before such an application level "flush", we would have seen
> 22 = floor(90/4) calls of write(2) to flush the buffer, plus a 2kB
> write(2). With 128kB buffer, we would see a single 90kB write(2).
>
> So the apparently lower improvement than I naively have expected may
> be attributable to the fact that many application level "flush" was
> not large enough to benefit from 128kB buffer? How much of the
> total number of bytes written came in large batches, vs tiny ones?

The input to index-pack in a fetch is going to be the demuxing of the
sideband via git-fetch. So it's probably flushing 64k or less each
time (because that's the max size of a packet), and unless index-pack
is going much slower than the input, that maximizes how much it will
read.

Depending on the source, though, it may be possible to go faster than
index-pack (which has to at least update the pack checksum for every
byte, and may even zlib inflate and hash the object itself if it's a
non-delta). In which case the sideband demuxer would start filling
the pipe and index-pack may get larger reads.

We could actually reduce the number of syscalls further if index-pack
did the demuxing itself, and we just handed it the descriptor. That
probably doesn't help all that much in this case, though, if the
problem is not raw reads/writes on pipes, but rather ones that go to
the slow FUSE filesystem. And as long as those pipe reads/writes are
"wide" (allowing the eventual filesystem writes to also be wide),
then the exact number may not be as important.

But the demuxing may also explain why the total number of writes did
not decrease as much as you expected, since those ones will probably
not be reduced by the patch in question. So the improvement is a
percentage of only a smaller portion of the total (but not
necessarily half, because they may have been larger writes in the
first place).

-Peff

^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-24 19:14 [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB Scott Bauersfeld via GitGitGadget
2026-04-25 10:21 ` Junio C Hamano
@ 2026-04-27 16:08 ` Scott Bauersfeld via GitGitGadget
2026-04-27 17:23 ` Derrick Stolee
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
1 sibling, 2 replies; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-27 16:08 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Scott Bauersfeld, Scott Bauersfeld

From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>

index-pack and unpack-objects both read pack data from stdin through
a 4 KiB static buffer. In index-pack, each fill() flushes consumed
bytes to the pack file via write_or_die(), capping every write(2)
at 4 KiB. unpack-objects uses the same buffer pattern for reads.

On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.

Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).

Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
5 runs per variant, isolated builds from the same v2.54.0 source):

  index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
  total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
  writes of exactly 4096 bytes: ~40,077 -> 0

Wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled, 3 runs per variant:

  vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
  git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)

Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB

index-pack and unpack-objects read pack data from stdin through a 4 KiB
static buffer. In index-pack, each fill() flushes consumed bytes to the
pack file via write_or_die(), capping every write(2) at 4 KiB.
unpack-objects uses the same buffer pattern for reads.

On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.

Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile
layer in csum-file (which already used 128 KiB but hardcoded the value).

Syscall reduction
=================

Measured via strace -f on HTTPS clones of git/git (~296 MB pack, 5 runs
per variant, isolated builds from the same v2.54.0 source):

Metric                                  Unpatched (4 KiB)  Patched (128 KiB)  Change
index-pack writes to pack file          72,465 avg         24,943 avg         −65%
Total write() syscalls (all processes)  310,192 avg        259,530 avg        −16%
Writes of exactly 4096 bytes            ~40,077 avg        0                  eliminated
HEAD / file count / fsck                ✓                  ✓                  identical

Wall-clock time on FUSE
=======================

Measured wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled. 3 runs per variant:

Repo                              Unpatched avg  Patched avg  Change
microsoft/vscode (~1.26 GB pack)  84.5s          75.7s        −10%
git/git (~306 MB pack)            22.6s          20.0s        −11%

Changes since v1
================

* Introduced shared DEFAULT_PACKFILE_BUFFER_SIZE constant in
  git-compat-util.h (next to MAX_IO_SIZE), replacing per-file #define
  and the hardcoded value in csum-file.c. Placed here rather than
  environment.h since it is an I/O buffer size, not an environment
  variable or repo config.
* Added wall-clock timing on a FUSE filesystem.
* Cleaned up the commit description a bit.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v2
Pull-Request: https://github.com/git/git/pull/2282

Range-diff vs v1:

 1:  c388e1dc2f ! 1:  ac2559ccb5 index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
     @@ Metadata
      ## Commit message ##
         index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB

     -    Both index-pack and unpack-objects read pack data from stdin through
     -    a 4 KiB static buffer (input_buffer[4096]). On each fill(), consumed
     -    bytes are flushed to the output pack file via write_or_die(), so
     -    every write(2) moves at most 4 KiB.
     +    index-pack and unpack-objects both read pack data from stdin through
     +    a 4 KiB static buffer. In index-pack, each fill() flushes consumed
     +    bytes to the pack file via write_or_die(), capping every write(2)
     +    at 4 KiB. unpack-objects uses the same buffer pattern for reads.

          On FUSE-backed filesystems every write(2) is a synchronous round
          trip through the FUSE protocol (userspace -> kernel -> userspace ->
          back), so the 4 KiB buffer turns a clone into many unnecessary tiny
          writes with noticeable latency overhead.

     -    Increase the buffer from 4 KiB to 128 KiB, matching the default
     -    already used by the hashfile layer in csum-file.c.
     +    Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
     +    DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
     +    MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
     +    hashfile layer in csum-file (which already used 128 KiB but
     +    hardcoded the value).

     -    Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
     -    per variant, isolated builds from the same v2.54.0 source) shows:
     +    Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
     +    5 runs per variant, isolated builds from the same v2.54.0 source):

     -      index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
     -      total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
     -      writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
     +      index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
     +      total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
     +      writes of exactly 4096 bytes: ~40,077 -> 0

     -    All clones produce identical HEAD, file count, and pass fsck.
     +    Wall-clock time of git clone over HTTPS onto a FUSE passthrough
     +    filesystem with writeback caching disabled, 3 runs per variant:
     +
     +      vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
     +      git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)

          Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>

     @@ builtin/index-pack.c: static int check_self_contained_and_connected;

      -/* We always read in 4kB chunks. */
      -static unsigned char input_buffer[4096];
     -+#define INPUT_BUFFER_SIZE (128 * 1024)
     -+static unsigned char input_buffer[INPUT_BUFFER_SIZE];
     ++static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
       static unsigned int input_offset, input_len;
       static off_t consumed_bytes;
       static off_t max_input_size;
     @@ builtin/unpack-objects.c

      -/* We always read in 4kB chunks. */
      -static unsigned char buffer[4096];
     -+#define INPUT_BUFFER_SIZE (128 * 1024)
     -+static unsigned char buffer[INPUT_BUFFER_SIZE];
     ++static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
       static unsigned int offset, len;
       static off_t consumed_bytes;
       static off_t max_input_size;
     +
     + ## csum-file.c ##
     +@@ csum-file.c: struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
     + 	f->algop = unsafe_hash_algo(algop);
     + 	f->algop->init_fn(&f->ctx);
     +
     +-	f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
     ++	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
     + 	f->buffer = xmalloc(f->buffer_len);
     + 	f->check_buffer = NULL;
     +
     +
     + ## git-compat-util.h ##
     +@@ git-compat-util.h: static inline uint64_t u64_add(uint64_t a, uint64_t b)
     + # endif
     + #endif
     +
     ++/*
     ++ * Default buffer size for buffered I/O in pack file operations (index-pack,
     ++ * unpack-objects) and the hashfile layer in csum-file.
     ++ */
     ++#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
     ++
     + #ifdef HAVE_ALLOCA_H
     + # include <alloca.h>
     + # define xalloca(size) (alloca(size))

builtin/index-pack.c | 3 +--
builtin/unpack-objects.c | 3 +--
csum-file.c | 2 +-
git-compat-util.h | 6 ++++++
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..d86476676f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,7 @@ static int check_self_contained_and_connected;
 
 static struct progress *progress;
 
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
 static unsigned int input_offset, input_len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..da8ec83d9f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,7 @@
 static int dry_run, quiet, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
 
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/csum-file.c b/csum-file.c
index 9558177a11..c1aeaf587a 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
 	f->algop = unsafe_hash_algo(algop);
 	f->algop->init_fn(&f->ctx);
 
-	f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
+	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
 	f->buffer = xmalloc(f->buffer_len);
 	f->check_buffer = NULL;
 
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..a2f037811c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
 # endif
 #endif
 
+/*
+ * Default buffer size for buffered I/O in pack file operations (index-pack,
+ * unpack-objects) and the hashfile layer in csum-file.
+ */
+#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size) (alloca(size))
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget

^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
@ 2026-04-27 17:23 ` Derrick Stolee
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
1 sibling, 0 replies; 12+ messages in thread
From: Derrick Stolee @ 2026-04-27 17:23 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget, git; +Cc: Junio C Hamano, Scott Bauersfeld

On 4/27/2026 12:08 PM, Scott Bauersfeld via GitGitGadget wrote:
> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
...
> Wall-clock time of git clone over HTTPS onto a FUSE passthrough
> filesystem with writeback caching disabled, 3 runs per variant:
>
>   vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
>   git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)

Wow! This is much higher than I expected. Great find.

I imagine that other platforms or non-FUSE setups will not have
the same benefits. As long as they aren't _regressions_ then this
is a great find.

> -/* We always read in 4kB chunks. */
> -static unsigned char input_buffer[4096];
> +static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];

> -/* We always read in 4kB chunks. */
> -static unsigned char buffer[4096];
> +static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];

These changes are what I expected in v2.

> diff --git a/csum-file.c b/csum-file.c
> index 9558177a11..c1aeaf587a 100644
> --- a/csum-file.c
> +++ b/csum-file.c
> @@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
>  	f->algop = unsafe_hash_algo(algop);
>  	f->algop->init_fn(&f->ctx);
>
> -	f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
> +	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
>  	f->buffer = xmalloc(f->buffer_len);
>  	f->check_buffer = NULL;

This one surprised me, as this hunk wasn't in your v1 patch. I think
using this replacement makes sense, since it _is_ an exact value.

It did make me think as to how we landed on 128K for this example.
The previous line is due to a1118c0a446 (csum-file: introduce
`hashfd_ext()`, 2026-03-13), but it only moved the 128K default from
hashfd(). Notably, hashfd_throughput() still uses an 8K setting in
opt->buffer_len.

Hilariously, I went spelunking for the original reason for the 128K
and it was 2ca245f8be5 (csum-file.h: increase hashfile buffer size,
2021-05-18) written by...me. The motivation was due to using the
hashfile logic for the .git/index file which also used 128K buffers
in f279894 (read-cache: make the index write buffer size 128K,
2021-02-18).

All this is to say that we now have two constants of identical value,
where WRITE_BUFFER_SIZE in read-cache.c could be replaced with your
new DEFAULT_PACKFILE_BUFFER_SIZE.

This does make me think that maybe DEFAULT_PACKFILE_BUFFER_SIZE is
misnamed? Should it be DEFAULT_HASHFILE_BUFFER_SIZE or
DEFAULT_FILESYSTEM_BUFFER_SIZE to better fit this size value being
used in both packfiles and index files?

> diff --git a/git-compat-util.h b/git-compat-util.h
> index ae1bdc90a4..a2f037811c 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
>  # endif
>  #endif
>
> +/*
> + * Default buffer size for buffered I/O in pack file operations (index-pack,
> + * unpack-objects) and the hashfile layer in csum-file.
> + */
> +#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
> +

I see. Putting this in git-compat-util.h makes the rest of the
changes good without any need to add a new include.

Thanks,
-Stolee

^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v3] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB 2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget 2026-04-27 17:23 ` Derrick Stolee @ 2026-04-27 19:26 ` Scott Bauersfeld via GitGitGadget 2026-04-27 20:12 ` Derrick Stolee 2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget 1 sibling, 2 replies; 12+ messages in thread From: Scott Bauersfeld via GitGitGadget @ 2026-04-27 19:26 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Derrick Stolee, Scott Bauersfeld, Scott Bauersfeld From: Scott Bauersfeld <sbauersfeld@g.ucla.edu> index-pack and unpack-objects both read pack data from stdin through a 4 KiB static buffer. In index-pack, each fill() flushes consumed bytes to the pack file via write_or_die(), capping every write(2) at 4 KiB. unpack-objects uses the same buffer pattern for reads. On FUSE-backed filesystems every write(2) is a synchronous round trip through the FUSE protocol (userspace -> kernel -> userspace -> back), so the 4 KiB buffer turns a clone into many unnecessary tiny writes with noticeable latency overhead. Increase the buffer from 4 KiB to 128 KiB. Introduce a shared DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile layer in csum-file (which already used 128 KiB but hardcoded the value). 
Syscall counts via strace on HTTPS clones of git/git (~296 MB pack, 5 runs per variant, isolated builds from the same v2.54.0 source): index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer) total write() syscalls: 310,192 -> 259,530 avg (16% fewer) writes of exactly 4096 bytes: ~40,077 -> 0 Wall-clock time of git clone over HTTPS onto a FUSE passthrough filesystem with writeback caching disabled, 3 runs per variant: vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster) git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster) Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu> --- index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB index-pack and unpack-objects read pack data from stdin through a 4 KiB static buffer. In index-pack, each fill() flushes consumed bytes to the pack file via write_or_die(), capping every write(2) at 4 KiB. unpack-objects uses the same buffer pattern for reads. On FUSE-backed filesystems every write(2) is a synchronous round trip through the FUSE protocol (userspace → kernel → userspace → back), so the 4 KiB buffer turns a clone into many unnecessary tiny writes with noticeable latency overhead. Increase the buffer from 4 KiB to 128 KiB. Introduce a shared DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile layer in csum-file (which already used 128 KiB but hardcoded the value). 

Syscall reduction
=================

Measured via strace -f on HTTPS clones of git/git (~296 MB pack, 5 runs
per variant, isolated builds from the same v2.54.0 source):

Metric                                  Unpatched (4 KiB)  Patched (128 KiB)  Change
index-pack writes to pack file          72,465 avg         24,943 avg         −65%
Total write() syscalls (all processes)  310,192 avg        259,530 avg        −16%
Writes of exactly 4096 bytes            ~40,077 avg        0                  eliminated
HEAD / file count / fsck                ✓                  ✓                  identical

Wall-clock time on FUSE
=======================

Measured wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled. 3 runs per variant:

Repo                              Unpatched avg  Patched avg  Change
microsoft/vscode (~1.26 GB pack)  84.5s          75.7s        −10%
git/git (~306 MB pack)            22.6s          20.0s        −11%

Changes since v2
================

 * Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
   Stolee's feedback. The constant is not packfile-specific, since it is
   also used by the hashfile layer.
 * Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
   consolidated. That constant was already removed in f6e2cd0625
   ("read-cache: delete unused hashing methods", 2021-05-18) when
   read-cache.c was converted to use the hashfile API, so there is
   nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
   account for the multiple usages of this constant.

Changes since v1
================

 * Introduced shared DEFAULT_PACKFILE_BUFFER_SIZE constant in
   git-compat-util.h (next to MAX_IO_SIZE), replacing per-file #define
   and the hardcoded value in csum-file.c. Placed here rather than
   environment.h since it is an I/O buffer size, not an environment
   variable or repo config.
 * Added wall-clock timing on a FUSE filesystem.
 * Cleaned up the commit description a bit.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v3
Pull-Request: https://github.com/git/git/pull/2282

Range-diff vs v2:

 1:  ac2559ccb5 ! 1:  df754ac879 index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
     @@ Commit message
          writes with noticeable latency overhead.
      
          Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
     -    DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
     +    DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
          MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
          hashfile layer in csum-file (which already used 128 KiB but
          hardcoded the value).
     @@ builtin/index-pack.c: static int check_self_contained_and_connected;
      
      -/* We always read in 4kB chunks. */
      -static unsigned char input_buffer[4096];
     -+static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
     ++static unsigned char input_buffer[DEFAULT_IO_BUFFER_SIZE];
       static unsigned int input_offset, input_len;
       static off_t consumed_bytes;
       static off_t max_input_size;
     @@ builtin/unpack-objects.c
      
      -/* We always read in 4kB chunks. */
      -static unsigned char buffer[4096];
     -+static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
     ++static unsigned char buffer[DEFAULT_IO_BUFFER_SIZE];
       static unsigned int offset, len;
       static off_t consumed_bytes;
       static off_t max_input_size;
     @@ csum-file.c: struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
       	f->algop->init_fn(&f->ctx);
      
      -	f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
     -+	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
     ++	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_IO_BUFFER_SIZE;
       	f->buffer = xmalloc(f->buffer_len);
       	f->check_buffer = NULL;
     @@ git-compat-util.h: static inline uint64_t u64_add(uint64_t a, uint64_t b)
       #endif
      
      +/*
     -+ * Default buffer size for buffered I/O in pack file operations (index-pack,
     -+ * unpack-objects) and the hashfile layer in csum-file.
     ++ * Default buffer size for buffered I/O in index-pack, unpack-objects,
     ++ * and the hashfile layer in csum-file.
      + */
     -+#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
     ++#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
      +
       #ifdef HAVE_ALLOCA_H
       # include <alloca.h>

 builtin/index-pack.c     | 3 +--
 builtin/unpack-objects.c | 3 +--
 csum-file.c              | 2 +-
 git-compat-util.h        | 6 ++++++
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..bb3639641c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,7 @@ static int check_self_contained_and_connected;
 
 static struct progress *progress;
 
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+static unsigned char input_buffer[DEFAULT_IO_BUFFER_SIZE];
 static unsigned int input_offset, input_len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..af67d1a1d3 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,7 @@ static int dry_run, quiet, recover, has_errors, strict;
 
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
 
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+static unsigned char buffer[DEFAULT_IO_BUFFER_SIZE];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/csum-file.c b/csum-file.c
index 9558177a11..d7a682c2b6 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
 	f->algop = unsafe_hash_algo(algop);
 	f->algop->init_fn(&f->ctx);
 
-	f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
+	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_IO_BUFFER_SIZE;
 	f->buffer = xmalloc(f->buffer_len);
 	f->check_buffer = NULL;
 
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..5024814bd4 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
 # endif
 #endif
 
+/*
+ * Default buffer size for buffered I/O in index-pack, unpack-objects,
+ * and the hashfile layer in csum-file.
+ */
+#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size) (alloca(size))

base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget

^ permalink raw reply related [flat|nested] 12+ messages in thread
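
Editor's illustration (not part of the patch): the commit message says that flushing through a fixed buffer caps the size of every write(2). The sketch below shows that read-then-flush pattern in isolation. It is not git's fill(), which also tracks consumed bytes, hashing and progress; the name copy_counting_writes and its parameters are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not git code: copy in_fd to out_fd through a
 * buffer of buf_size bytes and return the number of write(2) calls
 * issued, or -1 on error. Each chunk returned by read(2) is flushed
 * before the next read, so no single write can exceed buf_size.
 */
static long copy_counting_writes(int in_fd, int out_fd, size_t buf_size)
{
	unsigned char *buf;
	long n_writes = 0;

	assert(buf_size > 0);
	buf = malloc(buf_size);
	if (!buf)
		return -1;

	for (;;) {
		ssize_t got = read(in_fd, buf, buf_size);
		size_t off = 0;

		if (got < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the read */
			free(buf);
			return -1;
		}
		if (!got)
			break; /* EOF */

		/* Flush the chunk; a short write needs another call. */
		while (off < (size_t)got) {
			ssize_t put = write(out_fd, buf + off, (size_t)got - off);

			if (put < 0) {
				if (errno == EINTR)
					continue;
				free(buf);
				return -1;
			}
			off += (size_t)put;
			n_writes++;
		}
	}
	free(buf);
	return n_writes;
}
```

With 10,000 bytes already queued in a pipe, a 4096-byte buffer drains it in three writes (4096 + 4096 + 1808), while a 128 KiB buffer needs one. A larger buffer only raises the ceiling on write size; how much each read(2) actually returns still depends on the input source.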
* Re: [PATCH v3] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
@ 2026-04-27 20:12 ` Derrick Stolee
2026-04-28 1:47 ` Junio C Hamano
2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget
1 sibling, 1 reply; 12+ messages in thread
From: Derrick Stolee @ 2026-04-27 20:12 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget, git; +Cc: Junio C Hamano, Scott Bauersfeld

On 4/27/2026 3:26 PM, Scott Bauersfeld via GitGitGadget wrote:
> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>

> Changes since v2
> ================
>
> * Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
>   Stolee's feedback. The constant is not packfile-specific, since it is
>   also used by the hashfile layer.
> * Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
>   consolidated. That constant was already removed in f6e2cd0625
>   ("read-cache: delete unused hashing methods", 2021-05-18) when
>   read-cache.c was converted to use the hashfile API, so there is
>   nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
>   account for the multiple usages of this constant.

Thank you for discovering this context which made my recommendation
non-actionable. I was looking at the commit that added the 128K limit,
which had that in its context, but not at the latest code. My mistake!

I'm very happy with this version and look forward to the performance
benefits!

Thanks,
-Stolee

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 20:12 ` Derrick Stolee
@ 2026-04-28 1:47 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-04-28 1:47 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Scott Bauersfeld via GitGitGadget, git, Scott Bauersfeld

Derrick Stolee <stolee@gmail.com> writes:

> On 4/27/2026 3:26 PM, Scott Bauersfeld via GitGitGadget wrote:
>> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>
>> Changes since v2
>> ================
>>
>> * Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
>>   Stolee's feedback. The constant is not packfile-specific, since it is
>>   also used by the hashfile layer.
>> * Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
>>   consolidated. That constant was already removed in f6e2cd0625
>>   ("read-cache: delete unused hashing methods", 2021-05-18) when
>>   read-cache.c was converted to use the hashfile API, so there is
>>   nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
>>   account for the multiple usages of this constant.
>
> Thank you for discovering this context which made my recommendation
> non-actionable. I was looking at the commit that added the 128K limit,
> which had that in its context, but not at the latest code. My mistake!
>
> I'm very happy with this version and look forward to the performance
> benefits!
>
> Thanks,
> -Stolee

Yes, this version was very pleasant to read. Thanks both.

^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v4] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
2026-04-27 20:12 ` Derrick Stolee
@ 2026-04-28 14:47 ` Scott Bauersfeld via GitGitGadget
2026-05-12 5:51 ` Junio C Hamano
1 sibling, 1 reply; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-28 14:47 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Derrick Stolee, Jeff King, Scott Bauersfeld, Scott Bauersfeld

From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>

index-pack and unpack-objects both read pack data from stdin through
a 4 KiB static buffer. In index-pack, each fill() flushes consumed
bytes to the pack file via write_or_die(), capping every write(2)
at 4 KiB. unpack-objects uses the same buffer pattern for reads.

On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.

Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).

Pack file writes to a FUSE filesystem with writeback caching
disabled during HTTPS clones of git/git (~293 MB pack):

  74,958 -> 4,687 (94% fewer)

Wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled, 3 runs per variant:

  vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
  git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)

Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB

index-pack and unpack-objects read pack data from stdin through a 4 KiB
static buffer.
In index-pack, each fill() flushes consumed bytes to the pack file via
write_or_die(), capping every write(2) at 4 KiB. unpack-objects uses the
same buffer pattern for reads.

On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.

Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile
layer in csum-file (which already used 128 KiB but hardcoded the value).

Pack file write reduction
=========================

Pack file writes to a FUSE filesystem with writeback caching disabled
during HTTPS clones of git/git (~293 MB pack):

Unpatched avg  Patched avg  Change
74,958         4,687        −94%

Write counts measured by logging writes in a FUSE passthrough daemon
(libfuse 3.10.5, writeback cache off).

Wall-clock time on FUSE
=======================

Measured wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled. 3 runs per variant:

Repo                              Unpatched avg  Patched avg  Change
microsoft/vscode (~1.26 GB pack)  84.5s          75.7s        −10%
git/git (~306 MB pack)            22.6s          20.0s        −11%

Changes since v3
================

 * Replaced strace-based syscall measurements with FUSE daemon write
   logging. The earlier strace numbers (72,465 → 24,943, 65% reduction)
   were distorted: strace -f ptrace intercepts every syscall in all
   traced processes and added enough overhead to distort the
   measurements. The FUSE daemon logging captures write sizes without
   perturbing the traced processes, showing the true reduction is 94%
   (74,958 → 4,687).
 * Note: Why 4,687 writes instead of ~2k writes as would be expected
   with a 128 KiB buffer size? It appears that fill() is calling xread()
   on a pipe and the Linux default buffer size for pipes is 64 KiB. I
   also tested using fcntl(F_SETPIPE_SZ) to increase the pipe's buffer
   size to 128 KiB, which does indeed reduce total pack file writes to
   ~2.4K.

Changes since v2
================

 * Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
   Stolee's feedback. The constant is not packfile-specific, since it is
   also used by the hashfile layer.
 * Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
   consolidated. That constant was already removed in f6e2cd0625
   ("read-cache: delete unused hashing methods", 2021-05-18) when
   read-cache.c was converted to use the hashfile API, so there is
   nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
   account for the multiple usages of this constant.

Changes since v1
================

 * Introduced shared DEFAULT_PACKFILE_BUFFER_SIZE constant in
   git-compat-util.h (next to MAX_IO_SIZE), replacing per-file #define
   and the hardcoded value in csum-file.c. Placed here rather than
   environment.h since it is an I/O buffer size, not an environment
   variable or repo config.
 * Added wall-clock timing on a FUSE filesystem.
 * Cleaned up the commit description a bit.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v4
Pull-Request: https://github.com/git/git/pull/2282

Range-diff vs v3:

 1:  df754ac879 ! 1:  146b1846a5 index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
     @@ Commit message
          hashfile layer in csum-file (which already used 128 KiB but
          hardcoded the value).
      
     -    Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
     -    5 runs per variant, isolated builds from the same v2.54.0 source):
     +    Pack file writes to a FUSE filesystem with writeback caching
     +    disabled during HTTPS clones of git/git (~293 MB pack):
      
     -      index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
     -      total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
     -      writes of exactly 4096 bytes: ~40,077 -> 0
     +      74,958 -> 4,687 (94% fewer)
      
          Wall-clock time of git clone over HTTPS onto a FUSE passthrough
          filesystem with writeback caching disabled, 3 runs per variant:

 builtin/index-pack.c     | 3 +--
 builtin/unpack-objects.c | 3 +--
 csum-file.c              | 2 +-
 git-compat-util.h        | 6 ++++++
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..bb3639641c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,7 @@ static int check_self_contained_and_connected;
 
 static struct progress *progress;
 
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+static unsigned char input_buffer[DEFAULT_IO_BUFFER_SIZE];
 static unsigned int input_offset, input_len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..af67d1a1d3 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,7 @@ static int dry_run, quiet, recover, has_errors, strict;
 
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
 
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+static unsigned char buffer[DEFAULT_IO_BUFFER_SIZE];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static off_t max_input_size;
diff --git a/csum-file.c b/csum-file.c
index 9558177a11..d7a682c2b6 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
 	f->algop = unsafe_hash_algo(algop);
 	f->algop->init_fn(&f->ctx);
 
-	f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
+	f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_IO_BUFFER_SIZE;
 	f->buffer = xmalloc(f->buffer_len);
 	f->check_buffer = NULL;
 
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..5024814bd4 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
 # endif
 #endif
 
+/*
+ * Default buffer size for buffered I/O in index-pack, unpack-objects,
+ * and the hashfile layer in csum-file.
+ */
+#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size) (alloca(size))

base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget

^ permalink raw reply related [flat|nested] 12+ messages in thread
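
Editor's illustration (not part of the patch): the v4 note asks why the patched run shows 4,687 writes rather than roughly 2k. A back-of-the-envelope check supports the pipe explanation. The sketch assumes the "~293 MB" pack is about 293 MiB and that every read fills the whole effective chunk, which real reads need not do. The helper names min_writes and effective_chunk are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the fewest writes needed to move `total` bytes
 * when each write carries at most `chunk` bytes (ceiling division).
 */
static uint64_t min_writes(uint64_t total, uint64_t chunk)
{
	assert(chunk > 0);
	return (total + chunk - 1) / chunk;
}

/*
 * Hypothetical helper: when the input is a pipe, one read cannot
 * return more than the pipe holds, so the effective chunk is the
 * smaller of the buffer size and the pipe capacity.
 */
static uint64_t effective_chunk(uint64_t buf_size, uint64_t pipe_capacity)
{
	return buf_size < pipe_capacity ? buf_size : pipe_capacity;
}
```

For a 293 MiB pack, min_writes() gives 75,008 writes at 4 KiB, 4,688 at 64 KiB and 2,344 at 128 KiB. Those figures sit close to the measured 74,958, 4,687 and ~2.4K, which is consistent with a 64 KiB pipe, rather than the 128 KiB buffer, limiting each write in the patched run.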
* Re: [PATCH v4] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget
@ 2026-05-12 5:51 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-05-12 5:51 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget
Cc: git, Derrick Stolee, Jeff King, Scott Bauersfeld

"Scott Bauersfeld via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>
> index-pack and unpack-objects both read pack data from stdin through
> a 4 KiB static buffer. In index-pack, each fill() flushes consumed
> bytes to the pack file via write_or_die(), capping every write(2)
> at 4 KiB. unpack-objects uses the same buffer pattern for reads.
>
> On FUSE-backed filesystems every write(2) is a synchronous round
> trip through the FUSE protocol (userspace -> kernel -> userspace ->
> back), so the 4 KiB buffer turns a clone into many unnecessary tiny
> writes with noticeable latency overhead.
>
> Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
> DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
> MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
> hashfile layer in csum-file (which already used 128 KiB but
> hardcoded the value).
>
> Pack file writes to a FUSE filesystem with writeback caching
> disabled during HTTPS clones of git/git (~293 MB pack):
>
>   74,958 -> 4,687 (94% fewer)
>
> Wall-clock time of git clone over HTTPS onto a FUSE passthrough
> filesystem with writeback caching disabled, 3 runs per variant:
>
>   vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
>   git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
>
> Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
> ---
>...
>
> Changes since v3
> ================
>
> * Replaced strace-based syscall measurements with FUSE daemon write
>   logging. The earlier strace numbers (72,465 → 24,943, 65% reduction)
>   were distorted: strace -f ptrace intercepts every syscall in all
>   traced processes and added enough overhead to distort the
>   measurements. The FUSE daemon logging captures write sizes without
>   perturbing the traced processes, showing the true reduction is 94%
>   (74,958 → 4,687).
> * Note: Why 4,687 writes instead of ~2k writes as would be expected
>   with a 128 KiB buffer size? It appears that fill() is calling xread()
>   on a pipe and the Linux default buffer size for pipes is 64 KiB. I
>   also tested using fcntl(F_SETPIPE_SZ) to increase the pipe's buffer
>   size to 128 KiB, which does indeed reduce total pack file writes to
>   ~2.4K.

It seems that everybody was happy with v3 already, so let's merge it
down to 'next'.

Thanks.

^ permalink raw reply [flat|nested] 12+ messages in thread
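
Editor's illustration (not part of the patch or the thread): the quoted note mentions an experiment with fcntl(F_SETPIPE_SZ) but shows no code. The sketch below is one guess at what such a call could look like on Linux. The helper name grow_pipe is hypothetical, and F_SETPIPE_SZ is a Linux-specific extension, so the code degrades to returning -1 where it is unavailable.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* F_SETPIPE_SZ and F_GETPIPE_SZ are Linux extensions */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: ask the kernel to make
 * the pipe behind fd at least `want` bytes large. Returns the
 * resulting capacity in bytes, or -1 if fd is not a pipe, the request
 * is refused, or the platform lacks F_SETPIPE_SZ.
 */
static int grow_pipe(int fd, int want)
{
#if defined(F_SETPIPE_SZ) && defined(F_GETPIPE_SZ)
	int cur = fcntl(fd, F_GETPIPE_SZ);

	if (cur < 0)
		return -1;
	if (cur >= want)
		return cur; /* already large enough */
	/* The kernel may round the size up; it returns the real capacity. */
	return fcntl(fd, F_SETPIPE_SZ, want);
#else
	(void)fd;
	(void)want;
	return -1;
#endif
}
```

A caller would invoke something like grow_pipe(0, 128 * 1024) on stdin before the first read and carry on unchanged if it returns -1. Unprivileged processes are limited by /proc/sys/fs/pipe-max-size, so the request can be refused.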