* [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
@ 2026-04-24 19:14 Scott Bauersfeld via GitGitGadget
2026-04-25 10:21 ` Junio C Hamano
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
0 siblings, 2 replies; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-24 19:14 UTC (permalink / raw)
To: git; +Cc: Scott Bauersfeld, Scott Bauersfeld
From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
Both index-pack and unpack-objects read pack data from stdin through
a 4 KiB static buffer (input_buffer[4096]). On each fill(), consumed
bytes are flushed to the output pack file via write_or_die(), so
every write(2) moves at most 4 KiB.
On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB, matching the default
already used by the hashfile layer in csum-file.c.
Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
per variant, isolated builds from the same v2.54.0 source) shows:
index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
All clones produce identical HEAD, file count, and pass fsck.
Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
Both index-pack and unpack-objects read pack data from stdin through a 4
KiB static buffer (input_buffer[4096]). On each fill(), consumed bytes
are flushed to the output pack file via write_or_die(), so every
write(2) moves at most 4 KiB.
On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.
This change increases the buffer from 4 KiB to 128 KiB, matching the
default already used by the hashfile layer in csum-file.c.
Benchmarked with 5 HTTPS clones per version of
https://github.com/sbauersfeld/git.git (~296 MB pack), using strace -f
to count write() syscalls. Both binaries built from the same v2.54.0
source tree in isolated directories to ensure the bin-wrappers resolve
to the correct binary.
Correctness verified via git fsck --no-dangling, rev-parse HEAD, and
working tree file count — all 10 clones match.
Results:
Metric                                  Unpatched (4 KiB)  Patched (128 KiB)  Change
index-pack writes to pack file          72,465 avg         24,943 avg         −66%
Total write() syscalls (all processes)  310,192 avg        259,530 avg        −17%
Writes of exactly 4096 bytes            ~40,077 avg        0                  eliminated
HEAD / file count / fsck                ✓                  ✓                  None
Raw data:
unpatched (input_buffer[4096]):
  run 1: total_writes=311787 ip_pack_writes=72353 ip_4k=35311
  run 2: total_writes=310252 ip_pack_writes=72348 ip_4k=38024
  run 3: total_writes=309737 ip_pack_writes=72303 ip_4k=43003
  run 4: total_writes=309801 ip_pack_writes=72661 ip_4k=42349
  run 5: total_writes=309383 ip_pack_writes=72662 ip_4k=41702
patched (input_buffer[128 * 1024]):
  run 1: total_writes=264659 ip_pack_writes=26605 ip_4k=0
  run 2: total_writes=264276 ip_pack_writes=26568 ip_4k=0
  run 3: total_writes=227796 ip_pack_writes=9762 ip_4k=0
  run 4: total_writes=262464 ip_pack_writes=27830 ip_4k=0
  run 5: total_writes=278455 ip_pack_writes=33952 ip_4k=0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v1
Pull-Request: https://github.com/git/git/pull/2282
builtin/index-pack.c | 4 ++--
builtin/unpack-objects.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..81a628bf34 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,8 @@ static int check_self_contained_and_connected;
static struct progress *progress;
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+#define INPUT_BUFFER_SIZE (128 * 1024)
+static unsigned char input_buffer[INPUT_BUFFER_SIZE];
static unsigned int input_offset, input_len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..535c019f82 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,8 @@
static int dry_run, quiet, recover, has_errors, strict;
static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+#define INPUT_BUFFER_SIZE (128 * 1024)
+static unsigned char buffer[INPUT_BUFFER_SIZE];
static unsigned int offset, len;
static off_t consumed_bytes;
static off_t max_input_size;
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-24 19:14 [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB Scott Bauersfeld via GitGitGadget
@ 2026-04-25 10:21 ` Junio C Hamano
2026-04-27 12:36 ` Derrick Stolee
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2026-04-25 10:21 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget; +Cc: git, Scott Bauersfeld
"Scott Bauersfeld via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>
> Both index-pack and unpack-objects read pack data from stdin through
> a 4 KiB static buffer (input_buffer[4096]). On each fill(), consumed
> bytes are flushed to the output pack file via write_or_die(), so
> every write(2) moves at most 4 KiB.
Micronit. Output of unpack-objects obviously does not get flushed
to "the output pack file".
> On FUSE-backed filesystems every write(2) is a synchronous round
> trip through the FUSE protocol (userspace -> kernel -> userspace ->
> back), so the 4 KiB buffer turns a clone into many unnecessary tiny
> writes with noticeable latency overhead.
>
> Increase the buffer from 4 KiB to 128 KiB, matching the default
> already used by the hashfile layer in csum-file.c.
Quite sensible reasoning presented very nicely.
It may probably be a #leftoverbit but these three instances of (128
* 1024) may want to have a common symbolic constant, like
#define DEFAULT_IOBUFFER_SIZE_IN_BYTES (128 * 1024)
in a bit more central header file. Especially for the one in
csum-file.c where there is no symbolic constant used for that
purpose.
> Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
> per variant, isolated builds from the same v2.54.0 source) shows:
>
> index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
> total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
> writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
Hmph, I would have expected more like (1 - 4/128) ~ 97% reduction.
The difference between that and 66% is coming from where? There are
inherently short writes that do not utilize the new larger buffer
beyond 4kB? If so, another number of interest might be the number
of writes smaller than 4096 bytes, perhaps?
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-25 10:21 ` Junio C Hamano
@ 2026-04-27 12:36 ` Derrick Stolee
2026-04-28 1:46 ` Junio C Hamano
0 siblings, 1 reply; 12+ messages in thread
From: Derrick Stolee @ 2026-04-27 12:36 UTC (permalink / raw)
To: Junio C Hamano, Scott Bauersfeld via GitGitGadget; +Cc: git, Scott Bauersfeld
On 4/25/2026 6:21 AM, Junio C Hamano wrote:
> "Scott Bauersfeld via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>>
>> On FUSE-backed filesystems every write(2) is a synchronous round
>> trip through the FUSE protocol (userspace -> kernel -> userspace ->
>> back), so the 4 KiB buffer turns a clone into many unnecessary tiny
>> writes with noticeable latency overhead.
>>
>> Increase the buffer from 4 KiB to 128 KiB, matching the default
>> already used by the hashfile layer in csum-file.c.
>
> Quite sensible reasoning presented very nicely.
>
> It may probably be a #leftoverbit but these three instances of (128
> * 1024) may want to have a common symbolic constant, like
>
> #define DEFAULT_IOBUFFER_SIZE_IN_BYTES (128 * 1024)
>
> in a bit more central header file. Especially for the one in
> csum-file.c where there is no symbolic constant used for that
> purpose.
I also had this thought. Would environment.h be the best place?
>> Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
>> per variant, isolated builds from the same v2.54.0 source) shows:
>>
>> index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
>> total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
>> writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
>
> Hmph, I would have expected more like (1 - 4/128) ~ 97% reduction.
> The difference between that and 66% is coming from where? There are
> inherently short writes that do not utilize the new larger buffer
> beyond 4kB? If so, another number of interest might be the number
> of writes smaller than 4096 bytes, perhaps?
One way to reword what you're asking is to measure "number of writes
not using the whole buffer" which is basically going to be "the
number of flush events from the application layer". Every time the
application intends to flush, the current buffer is likely to not
be exactly full. I would expect this number to not change between
implementations in real experiments.
The improvement here comes from the reduced number of flushes due
to buffer limits. I see that this can be measured in the number of
system-level events, but what impact does this have on the end-to-
end time of 'git index-pack' or 'git unpack-objects'? Is there a
t/perf/ test that can demonstrate this improvement for a variety
of real repos using GIT_PERF_REPO?
Thanks,
-Stolee
* [PATCH v2] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-24 19:14 [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB Scott Bauersfeld via GitGitGadget
2026-04-25 10:21 ` Junio C Hamano
@ 2026-04-27 16:08 ` Scott Bauersfeld via GitGitGadget
2026-04-27 17:23 ` Derrick Stolee
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
1 sibling, 2 replies; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-27 16:08 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Scott Bauersfeld,
Scott Bauersfeld
From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
index-pack and unpack-objects both read pack data from stdin through
a 4 KiB static buffer. In index-pack, each fill() flushes consumed
bytes to the pack file via write_or_die(), capping every write(2)
at 4 KiB. unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).
Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
5 runs per variant, isolated builds from the same v2.54.0 source):
index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
writes of exactly 4096 bytes: ~40,077 -> 0
Wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled, 3 runs per variant:
vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
index-pack and unpack-objects read pack data from stdin through a 4 KiB
static buffer. In index-pack, each fill() flushes consumed bytes to the
pack file via write_or_die(), capping every write(2) at 4 KiB.
unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile
layer in csum-file (which already used 128 KiB but hardcoded the value).
Syscall reduction
=================
Measured via strace -f on HTTPS clones of git/git (~296 MB pack, 5 runs
per variant, isolated builds from the same v2.54.0 source):
Metric                                  Unpatched (4 KiB)  Patched (128 KiB)  Change
index-pack writes to pack file          72,465 avg         24,943 avg         −65%
Total write() syscalls (all processes)  310,192 avg        259,530 avg        −16%
Writes of exactly 4096 bytes            ~40,077 avg        0                  eliminated
HEAD / file count / fsck                ✓                  ✓                  identical
Wall-clock time on FUSE
=======================
Measured wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled. 3 runs per variant:
Repo                              Unpatched avg  Patched avg  Change
microsoft/vscode (~1.26 GB pack)  84.5s          75.7s        −10%
git/git (~306 MB pack)            22.6s          20.0s        −11%
Changes since v1
================
* Introduced shared DEFAULT_PACKFILE_BUFFER_SIZE constant in
git-compat-util.h (next to MAX_IO_SIZE), replacing per-file #define
and the hardcoded value in csum-file.c. Placed here rather than
environment.h since it is an I/O buffer size, not an environment
variable or repo config.
* Added wall-clock timing on a FUSE filesystem.
* Cleaned up the commit description a bit.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v2
Pull-Request: https://github.com/git/git/pull/2282
Range-diff vs v1:
1: c388e1dc2f ! 1: ac2559ccb5 index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
@@ Metadata
## Commit message ##
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
- Both index-pack and unpack-objects read pack data from stdin through
- a 4 KiB static buffer (input_buffer[4096]). On each fill(), consumed
- bytes are flushed to the output pack file via write_or_die(), so
- every write(2) moves at most 4 KiB.
+ index-pack and unpack-objects both read pack data from stdin through
+ a 4 KiB static buffer. In index-pack, each fill() flushes consumed
+ bytes to the pack file via write_or_die(), capping every write(2)
+ at 4 KiB. unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.
- Increase the buffer from 4 KiB to 128 KiB, matching the default
- already used by the hashfile layer in csum-file.c.
+ Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
+ DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
+ MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
+ hashfile layer in csum-file (which already used 128 KiB but
+ hardcoded the value).
- Testing with strace on HTTPS clones of git/git (~296 MB pack, 5 runs
- per variant, isolated builds from the same v2.54.0 source) shows:
+ Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
+ 5 runs per variant, isolated builds from the same v2.54.0 source):
- index-pack pack file writes: 72,465 -> 24,943 avg (66% reduction)
- total write() syscalls: 310,192 -> 259,530 avg (17% reduction)
- writes of exactly 4096 bytes: ~40,077 -> 0 (eliminated)
+ index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
+ total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
+ writes of exactly 4096 bytes: ~40,077 -> 0
- All clones produce identical HEAD, file count, and pass fsck.
+ Wall-clock time of git clone over HTTPS onto a FUSE passthrough
+ filesystem with writeback caching disabled, 3 runs per variant:
+
+ vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
+ git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
@@ builtin/index-pack.c: static int check_self_contained_and_connected;
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
-+#define INPUT_BUFFER_SIZE (128 * 1024)
-+static unsigned char input_buffer[INPUT_BUFFER_SIZE];
++static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
static unsigned int input_offset, input_len;
static off_t consumed_bytes;
static off_t max_input_size;
@@ builtin/unpack-objects.c
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
-+#define INPUT_BUFFER_SIZE (128 * 1024)
-+static unsigned char buffer[INPUT_BUFFER_SIZE];
++static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
static unsigned int offset, len;
static off_t consumed_bytes;
static off_t max_input_size;
+
+ ## csum-file.c ##
+@@ csum-file.c: struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
+ f->algop = unsafe_hash_algo(algop);
+ f->algop->init_fn(&f->ctx);
+
+- f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
++ f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
+ f->buffer = xmalloc(f->buffer_len);
+ f->check_buffer = NULL;
+
+
+ ## git-compat-util.h ##
+@@ git-compat-util.h: static inline uint64_t u64_add(uint64_t a, uint64_t b)
+ # endif
+ #endif
+
++/*
++ * Default buffer size for buffered I/O in pack file operations (index-pack,
++ * unpack-objects) and the hashfile layer in csum-file.
++ */
++#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
++
+ #ifdef HAVE_ALLOCA_H
+ # include <alloca.h>
+ # define xalloca(size) (alloca(size))
builtin/index-pack.c | 3 +--
builtin/unpack-objects.c | 3 +--
csum-file.c | 2 +-
git-compat-util.h | 6 ++++++
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..d86476676f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,7 @@ static int check_self_contained_and_connected;
static struct progress *progress;
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
static unsigned int input_offset, input_len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..da8ec83d9f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,7 @@
static int dry_run, quiet, recover, has_errors, strict;
static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
static unsigned int offset, len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/csum-file.c b/csum-file.c
index 9558177a11..c1aeaf587a 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
f->algop = unsafe_hash_algo(algop);
f->algop->init_fn(&f->ctx);
- f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
+ f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
f->buffer = xmalloc(f->buffer_len);
f->check_buffer = NULL;
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..a2f037811c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
# endif
#endif
+/*
+ * Default buffer size for buffered I/O in pack file operations (index-pack,
+ * unpack-objects) and the hashfile layer in csum-file.
+ */
+#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
+
#ifdef HAVE_ALLOCA_H
# include <alloca.h>
# define xalloca(size) (alloca(size))
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
* Re: [PATCH v2] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
@ 2026-04-27 17:23 ` Derrick Stolee
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
1 sibling, 0 replies; 12+ messages in thread
From: Derrick Stolee @ 2026-04-27 17:23 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget, git; +Cc: Junio C Hamano, Scott Bauersfeld
On 4/27/2026 12:08 PM, Scott Bauersfeld via GitGitGadget wrote:
> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
...
> Wall-clock time of git clone over HTTPS onto a FUSE passthrough
> filesystem with writeback caching disabled, 3 runs per variant:
>
> vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
> git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
Wow! This is much higher than I expected. Great find.
I imagine that other platforms or non-FUSE setups will not
have the same benefits. As long as they aren't _regressions_
then this is a great find.
> -/* We always read in 4kB chunks. */
> -static unsigned char input_buffer[4096];
> +static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
> -/* We always read in 4kB chunks. */
> -static unsigned char buffer[4096];
> +static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
These changes are what I expected in v2.
> diff --git a/csum-file.c b/csum-file.c
> index 9558177a11..c1aeaf587a 100644
> --- a/csum-file.c
> +++ b/csum-file.c
> @@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
> f->algop = unsafe_hash_algo(algop);
> f->algop->init_fn(&f->ctx);
>
> - f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
> + f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
> f->buffer = xmalloc(f->buffer_len);
> f->check_buffer = NULL;
This one surprised me, as this hunk wasn't in your v1 patch.
I think using this replacement makes sense, since it _is_ an
exact value. It did make me think as to how we landed on 128K
for this example.
The previous line is due to a1118c0a446 (csum-file: introduce
`hashfd_ext()`, 2026-03-13), but it only moved the 128K default
from hashfd(). Notably, hashfd_throughput() still uses an 8K
setting in opt->buffer_len.
Hilariously, I went spelunking for the original reason for the
128K and it was 2ca245f8be5 (csum-file.h: increase hashfile
buffer size, 2021-05-18) written by...me. The motivation was
due to using the hashfile logic for the .git/index file which
also used 128K buffers in f279894 (read-cache: make the index
write buffer size 128K, 2021-02-18).
All this is to say that we now have two constants of identical
value, where WRITE_BUFFER_SIZE in read-cache.c could be replaced
with your new DEFAULT_PACKFILE_BUFFER_SIZE.
This does make me think that maybe DEFAULT_PACKFILE_BUFFER_SIZE
is misnamed? Should it be DEFAULT_HASHFILE_BUFFER_SIZE or
DEFAULT_FILESYSTEM_BUFFER_SIZE to better fit this size value
being used in both packfiles and index files?
> diff --git a/git-compat-util.h b/git-compat-util.h
> index ae1bdc90a4..a2f037811c 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
> # endif
> #endif
>
> +/*
> + * Default buffer size for buffered I/O in pack file operations (index-pack,
> + * unpack-objects) and the hashfile layer in csum-file.
> + */
> +#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
> +
I see. Putting this in git-compat-util.h makes the rest
of the changes good without any need to add a new include.
Thanks,
-Stolee
* [PATCH v3] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
2026-04-27 17:23 ` Derrick Stolee
@ 2026-04-27 19:26 ` Scott Bauersfeld via GitGitGadget
2026-04-27 20:12 ` Derrick Stolee
2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget
1 sibling, 2 replies; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-27 19:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Scott Bauersfeld,
Scott Bauersfeld
From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
index-pack and unpack-objects both read pack data from stdin through
a 4 KiB static buffer. In index-pack, each fill() flushes consumed
bytes to the pack file via write_or_die(), capping every write(2)
at 4 KiB. unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).
Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
5 runs per variant, isolated builds from the same v2.54.0 source):
index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
writes of exactly 4096 bytes: ~40,077 -> 0
Wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled, 3 runs per variant:
vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
index-pack and unpack-objects read pack data from stdin through a 4 KiB
static buffer. In index-pack, each fill() flushes consumed bytes to the
pack file via write_or_die(), capping every write(2) at 4 KiB.
unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile
layer in csum-file (which already used 128 KiB but hardcoded the value).
Syscall reduction
=================
Measured via strace -f on HTTPS clones of git/git (~296 MB pack, 5 runs
per variant, isolated builds from the same v2.54.0 source):
Metric                                  Unpatched (4 KiB)  Patched (128 KiB)  Change
index-pack writes to pack file          72,465 avg         24,943 avg         −65%
Total write() syscalls (all processes)  310,192 avg        259,530 avg        −16%
Writes of exactly 4096 bytes            ~40,077 avg        0                  eliminated
HEAD / file count / fsck                ✓                  ✓                  identical
Wall-clock time on FUSE
=======================
Measured wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled. 3 runs per variant:
Repo                              Unpatched avg  Patched avg  Change
microsoft/vscode (~1.26 GB pack)  84.5s          75.7s        −10%
git/git (~306 MB pack)            22.6s          20.0s        −11%
Changes since v2
================
* Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
Stolee's feedback. The constant is not packfile-specific, since it is
also used by the hashfile layer.
* Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
consolidated. That constant was already removed in f6e2cd0625
("read-cache: delete unused hashing methods", 2021-05-18) when
read-cache.c was converted to use the hashfile API, so there is
nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
account for the multiple usages of this constant.
Changes since v1
================
* Introduced shared DEFAULT_PACKFILE_BUFFER_SIZE constant in
git-compat-util.h (next to MAX_IO_SIZE), replacing per-file #define
and the hardcoded value in csum-file.c. Placed here rather than
environment.h since it is an I/O buffer size, not an environment
variable or repo config.
* Added wall-clock timing on a FUSE filesystem.
* Cleaned up the commit description a bit.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v3
Pull-Request: https://github.com/git/git/pull/2282
Range-diff vs v2:
1: ac2559ccb5 ! 1: df754ac879 index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
@@ Commit message
writes with noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
- DEFAULT_PACKFILE_BUFFER_SIZE constant in git-compat-util.h (next to
+ DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).
@@ builtin/index-pack.c: static int check_self_contained_and_connected;
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
-+static unsigned char input_buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
++static unsigned char input_buffer[DEFAULT_IO_BUFFER_SIZE];
static unsigned int input_offset, input_len;
static off_t consumed_bytes;
static off_t max_input_size;
@@ builtin/unpack-objects.c
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
-+static unsigned char buffer[DEFAULT_PACKFILE_BUFFER_SIZE];
++static unsigned char buffer[DEFAULT_IO_BUFFER_SIZE];
static unsigned int offset, len;
static off_t consumed_bytes;
static off_t max_input_size;
@@ csum-file.c: struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
f->algop->init_fn(&f->ctx);
- f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
-+ f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_PACKFILE_BUFFER_SIZE;
++ f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_IO_BUFFER_SIZE;
f->buffer = xmalloc(f->buffer_len);
f->check_buffer = NULL;
@@ git-compat-util.h: static inline uint64_t u64_add(uint64_t a, uint64_t b)
#endif
+/*
-+ * Default buffer size for buffered I/O in pack file operations (index-pack,
-+ * unpack-objects) and the hashfile layer in csum-file.
++ * Default buffer size for buffered I/O in index-pack, unpack-objects,
++ * and the hashfile layer in csum-file.
+ */
-+#define DEFAULT_PACKFILE_BUFFER_SIZE (128 * 1024)
++#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
+
#ifdef HAVE_ALLOCA_H
# include <alloca.h>
builtin/index-pack.c | 3 +--
builtin/unpack-objects.c | 3 +--
csum-file.c | 2 +-
git-compat-util.h | 6 ++++++
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..bb3639641c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,7 @@ static int check_self_contained_and_connected;
static struct progress *progress;
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+static unsigned char input_buffer[DEFAULT_IO_BUFFER_SIZE];
static unsigned int input_offset, input_len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..af67d1a1d3 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,7 @@
static int dry_run, quiet, recover, has_errors, strict;
static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+static unsigned char buffer[DEFAULT_IO_BUFFER_SIZE];
static unsigned int offset, len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/csum-file.c b/csum-file.c
index 9558177a11..d7a682c2b6 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
f->algop = unsafe_hash_algo(algop);
f->algop->init_fn(&f->ctx);
- f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
+ f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_IO_BUFFER_SIZE;
f->buffer = xmalloc(f->buffer_len);
f->check_buffer = NULL;
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..5024814bd4 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
# endif
#endif
+/*
+ * Default buffer size for buffered I/O in index-pack, unpack-objects,
+ * and the hashfile layer in csum-file.
+ */
+#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
+
#ifdef HAVE_ALLOCA_H
# include <alloca.h>
# define xalloca(size) (alloca(size))
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
* Re: [PATCH v3] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
@ 2026-04-27 20:12 ` Derrick Stolee
2026-04-28 1:47 ` Junio C Hamano
2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget
1 sibling, 1 reply; 12+ messages in thread
From: Derrick Stolee @ 2026-04-27 20:12 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget, git; +Cc: Junio C Hamano, Scott Bauersfeld
On 4/27/2026 3:26 PM, Scott Bauersfeld via GitGitGadget wrote:
> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
> Changes since v2
> ================
>
> * Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
> Stolee's feedback. The constant is not packfile-specific, since it is
> also used by the hashfile layer.
> * Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
> consolidated. That constant was already removed in f6e2cd0625
> ("read-cache: delete unused hashing methods", 2021-05-18) when
> read-cache.c was converted to use the hashfile API, so there is
> nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
> account for the multiple usages of this constant.
Thank you for discovering this context which made my recommendation
non-actionable. I was looking at the commit that added the 128K limit,
which had that in its context, but not at the latest code. My mistake!
I'm very happy with this version and look forward to the performance
benefits!
Thanks,
-Stolee
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 12:36 ` Derrick Stolee
@ 2026-04-28 1:46 ` Junio C Hamano
2026-04-28 2:09 ` Jeff King
0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2026-04-28 1:46 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Scott Bauersfeld via GitGitGadget, git, Scott Bauersfeld
Derrick Stolee <stolee@gmail.com> writes:
>> The difference between that and 66% is coming from where? There are
>> inherently short writes that do not utilize the new larger buffer
>> beyond 4kB? If so, another number of interest might be the number
>> of writes smaller than 4096 bytes, perhaps?
>
> One way to reword what you're asking is to measure "number of writes
> not using the whole buffer" which is basically going to be "the
> number of flush events from the application layer".
I do not think it would differ between the old and the new
implementation.
> Every time the
> application intends to flush, the current buffer is likely to not
> be exactly full. I would expect this number to not change between
> implementations in real experiments.
Yes, I agree.
But what I was trying to get at was a bit different.
The application may have produced only 2kB before it issues a
"flush". Whether the buffer size is 4kB or 128kB, such a flush will
only write out 2kB, and the larger buffer size does not help at all.
But if the application has produced 90kB before it issues a "flush",
the larger buffer size would give us a great improvement. With 4kB
buffer, before such an application level "flush", we would have seen
22 = floor(90/4) calls of write(2) to flush the buffer, plus a 2kB
write(2). With 128kB buffer, we would see a single 90kB write(2).
So the apparently lower improvement than I naively expected may
be attributable to the fact that many application-level flushes were
not large enough to benefit from the 128kB buffer? How much of the
total number of bytes written came in large batches, vs tiny ones?
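As a rough illustration of the arithmetic above (a sketch only;
count_writes is a hypothetical helper, not something in git), the
number of write(2) calls caused by one application-level flush can be
modeled like this, assuming the buffer is written out each time it
fills and once more for any partial remainder:

```c
#include <stddef.h>

/*
 * Hypothetical model, not git code: how many write(2) calls it takes to
 * push `produced` bytes through a `bufsz`-byte buffer when the buffer is
 * written out every time it fills, and the application-level flush
 * writes out whatever partial contents remain.
 */
static size_t count_writes(size_t produced, size_t bufsz)
{
	size_t full = produced / bufsz;            /* writes forced by a full buffer */
	size_t tail = (produced % bufsz) ? 1 : 0;  /* one partial write at the flush */

	return full + tail;
}
```

Under this model, 90kB through a 4kB buffer costs 22 full-buffer writes
plus one 2kB write, 90kB through a 128kB buffer costs a single write,
and a 2kB flush costs one write with either buffer size.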
> The improvement here comes from the reduced number of flushes due
> to buffer limits.
Yes.
> I see that this can be measured in the number of
> system-level events, but what impact does this have on the end-to-
> end time of 'git index-pack' or 'git unpack-objects'? Is there a
> t/perf/ test that can demonstrate this improvement for a variety
> of real repos using GIT_PERF_REPO?
Interesting thought, but the number of system-level events (or the
number of write(2) system calls) is not reduced by 97% because we
apparently are issuing too many of them, and the reason is? I
suspect the reason why we still issue too many write(2) is because
we often do not send enough data between application-level flushes.
* Re: [PATCH v3] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 20:12 ` Derrick Stolee
@ 2026-04-28 1:47 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-04-28 1:47 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Scott Bauersfeld via GitGitGadget, git, Scott Bauersfeld
Derrick Stolee <stolee@gmail.com> writes:
> On 4/27/2026 3:26 PM, Scott Bauersfeld via GitGitGadget wrote:
>> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>
>> Changes since v2
>> ================
>>
>> * Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
>> Stolee's feedback. The constant is not packfile-specific, since it is
>> also used by the hashfile layer.
>> * Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
>> consolidated. That constant was already removed in f6e2cd0625
>> ("read-cache: delete unused hashing methods", 2021-05-18) when
>> read-cache.c was converted to use the hashfile API, so there is
>> nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
>> account for the multiple usages of this constant.
>
> Thank you for discovering this context which made my recommendation
> non-actionable. I was looking at the commit that added the 128K limit,
> which had that in its context, but not at the latest code. My mistake!
>
> I'm very happy with this version and look forward to the performance
> benefits!
>
> Thanks,
> -Stolee
Yes, this version was very pleasant to read. Thanks both.
* Re: [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-28 1:46 ` Junio C Hamano
@ 2026-04-28 2:09 ` Jeff King
0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2026-04-28 2:09 UTC (permalink / raw)
To: Junio C Hamano
Cc: Derrick Stolee, Scott Bauersfeld via GitGitGadget, git,
Scott Bauersfeld
On Tue, Apr 28, 2026 at 10:46:44AM +0900, Junio C Hamano wrote:
> The application may have produced only 2kB before it issues a
> "flush". Whether the buffer size is 4kB or 128kB, such a flush will
> only write out 2kB, and the larger buffer size does not help at all.
> But if the application has produced 90kB before it issues a "flush",
> the larger buffer size would give us a great improvement. With 4kB
> buffer, before such an application level "flush", we would have seen
> 22 = floor(90/4) calls of write(2) to flush the buffer, plus a 2kB
> write(2). With 128kB buffer, we would see a single 90kB write(2).
>
> So the apparently lower improvement than I naively have expected may
> be attributable to the fact that many application level "flush" was
> not large enough to benefit from 128kB buffer? How much of the
> total number of bytes written came in large batches, vs tiny ones?
The input to index-pack in a fetch is going to be the demuxing of the
sideband via git-fetch. So it's probably flushing 64k or less each time
(because that's the max size of a packet), and unless index-pack is
going much slower than the input, that maximizes how much it will read.
Depending on the source, though, it may be possible to go faster than
index-pack (which has to at least update the pack checksum for every
byte, and may even zlib inflate and hash the object itself if it's a
non-delta). In which case the sideband demuxer would start filling the
pipe and index-pack may get larger reads.
We could actually reduce the number of syscalls further if index-pack
did the demuxing itself, and we just handed it the descriptor. That
probably doesn't help all that much in this case, though, if the problem
is not raw reads/writes on pipes, but rather ones that go to the slow
FUSE filesystem. And as long as those pipe reads/writes are "wide"
(allowing the eventual filesystem writes to also be wide), then the
exact number may not be as important.
But the demuxing may also explain why the total number of writes did not
decrease as much as you expected, since those ones will probably not be
reduced by the patch in question. So the improvement is a percentage of
only a smaller portion of the total (but not necessarily half, because
they may have been larger writes in the first place).
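The effect of the pipe on read sizes can be seen in isolation with a
small sketch. It assumes a POSIX system, and read_once_from_pipe is a
hypothetical helper, not code from git. It puts `available` bytes into
a fresh pipe and then issues one read(2) with a `bufsz`-byte buffer:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: queue `available` bytes in a new pipe, then
 * do a single read(2) with a buffer of `bufsz` bytes and return what
 * read(2) returned. Returns -1 on error, on zero-sized arguments, or if
 * `available` does not fit into the pipe in one write.
 */
static ssize_t read_once_from_pipe(size_t available, size_t bufsz)
{
	int fds[2];
	ssize_t got = -1;
	char *src, *dst;

	if (!available || !bufsz)
		return -1;
	src = calloc(available, 1);
	dst = malloc(bufsz);
	if (src && dst && pipe(fds) == 0) {
		/* Non-blocking writer, so an oversized request fails instead of hanging. */
		fcntl(fds[1], F_SETFL, O_NONBLOCK);
		if (write(fds[1], src, available) == (ssize_t)available)
			got = read(fds[0], dst, bufsz);
		close(fds[0]);
		close(fds[1]);
	}
	free(src);
	free(dst);
	return got;
}
```

With 1000 bytes queued and a 128 KiB buffer, the single read returns
1000. A larger reader buffer only pays off when the writer has already
queued more data than the old buffer could hold.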
-Peff
* [PATCH v4] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
2026-04-27 20:12 ` Derrick Stolee
@ 2026-04-28 14:47 ` Scott Bauersfeld via GitGitGadget
2026-05-12 5:51 ` Junio C Hamano
1 sibling, 1 reply; 12+ messages in thread
From: Scott Bauersfeld via GitGitGadget @ 2026-04-28 14:47 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Derrick Stolee, Jeff King, Scott Bauersfeld,
Scott Bauersfeld
From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
index-pack and unpack-objects both read pack data from stdin through
a 4 KiB static buffer. In index-pack, each fill() flushes consumed
bytes to the pack file via write_or_die(), capping every write(2)
at 4 KiB. unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round
trip through the FUSE protocol (userspace -> kernel -> userspace ->
back), so the 4 KiB buffer turns a clone into many unnecessary tiny
writes with noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).
Pack file writes to a FUSE filesystem with writeback caching
disabled during HTTPS clones of git/git (~293 MB pack):
74,958 -> 4,687 (94% fewer)
Wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled, 3 runs per variant:
vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
---
index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
index-pack and unpack-objects read pack data from stdin through a 4 KiB
static buffer. In index-pack, each fill() flushes consumed bytes to the
pack file via write_or_die(), capping every write(2) at 4 KiB.
unpack-objects uses the same buffer pattern for reads.
On FUSE-backed filesystems every write(2) is a synchronous round trip
through the FUSE protocol (userspace → kernel → userspace → back), so
the 4 KiB buffer turns a clone into many unnecessary tiny writes with
noticeable latency overhead.
Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the hashfile
layer in csum-file (which already used 128 KiB but hardcoded the value).
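The read-then-flush pattern described above can be sketched as follows.
This is a simplified stand-in, not the actual fill() from
builtin/index-pack.c: copy_counting_writes is a hypothetical helper
that reads up to bufsz bytes at a time and writes out whatever it read,
so no single write(2) can exceed bufsz.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not git code: copy everything from `in` to `out`
 * through a `bufsz`-byte buffer, writing out each chunk as soon as it
 * has been read. Returns the number of write(2) calls that succeeded,
 * or -1 on error.
 */
static long copy_counting_writes(int in, int out, size_t bufsz)
{
	unsigned char *buf = malloc(bufsz);
	long writes = 0;
	ssize_t n;

	if (!buf)
		return -1;
	while ((n = read(in, buf, bufsz)) != 0) {
		ssize_t off = 0;

		if (n < 0) {
			if (errno == EINTR)
				continue;
			writes = -1;
			goto done;
		}
		/* Each write moves at most n <= bufsz bytes. */
		while (off < n) {
			ssize_t w = write(out, buf + off, (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				writes = -1;
				goto done;
			}
			off += w;
			writes++;
		}
	}
done:
	free(buf);
	return writes;
}
```

Fed 10,000 bytes that are already sitting in a pipe, this sketch issues
three writes with a 4096-byte buffer and one with a 128 KiB buffer.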
Pack file write reduction
=========================
Pack file writes to a FUSE filesystem with writeback caching disabled
during HTTPS clones of git/git (~293 MB pack):
Unpatched avg    Patched avg    Change
74,958           4,687          −94%
Write counts measured by logging writes in a FUSE passthrough daemon
(libfuse 3.10.5, writeback cache off).
Wall-clock time on FUSE
=======================
Measured wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled. 3 runs per variant:
Repo                               Unpatched avg    Patched avg    Change
microsoft/vscode (~1.26 GB pack)   84.5s            75.7s          −10%
git/git (~306 MB pack)             22.6s            20.0s          −11%
Changes since v3
================
* Replaced strace-based syscall measurements with FUSE daemon write
logging. The earlier strace numbers (72,465 → 24,943, 65% reduction)
were unreliable: strace -f uses ptrace to intercept every syscall in
all traced processes, and that overhead skewed the measurements. The
FUSE daemon logging captures write sizes without perturbing the git
processes, showing the true reduction is 94% (74,958 → 4,687).
* Note: Why 4,687 writes instead of the ~2k that a 128 KiB buffer
would suggest? It appears that fill() is calling xread() on a pipe,
and the Linux default buffer size for pipes is 64 KiB, so each read
(and hence each pack file write) is capped at 64 KiB. I also tested
using fcntl(F_SETPIPE_SZ) to increase the pipe's buffer size to
128 KiB, which does indeed reduce total pack file writes to ~2.4K.
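The pipe-size experiment mentioned in the note can be sketched as
below. This is only an illustration and not part of the patch:
grow_pipe is a hypothetical helper, F_SETPIPE_SZ and F_GETPIPE_SZ are
Linux-specific, and the kernel may round the size up or refuse the
request depending on /proc/sys/fs/pipe-max-size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes F_SETPIPE_SZ / F_GETPIPE_SZ in glibc */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Fallback to the Linux values if the headers did not expose them. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/*
 * Hypothetical helper (Linux only): make sure the pipe behind `fd` can
 * hold at least `want` bytes. Returns the resulting capacity in bytes,
 * or -1 if `fd` is not a pipe or the kernel rejects the request.
 */
static int grow_pipe(int fd, int want)
{
	int cur = fcntl(fd, F_GETPIPE_SZ);

	if (cur < 0)
		return -1;
	if (cur >= want)
		return cur;	/* already large enough */
	if (fcntl(fd, F_SETPIPE_SZ, want) < 0)
		return -1;
	return fcntl(fd, F_GETPIPE_SZ);
}
```

Calling grow_pipe(fd, 128 * 1024) on either end of a pipe asks for the
128 KiB capacity used in the experiment. For scale, a ~293 MB pack
split into 128 KiB chunks is a little over 2,000 writes, and split into
64 KiB chunks a little over 4,000.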
Changes since v2
================
* Renamed DEFAULT_PACKFILE_BUFFER_SIZE → DEFAULT_IO_BUFFER_SIZE per
Stolee's feedback. The constant is not packfile-specific, since it is
also used by the hashfile layer.
* Stolee noted that WRITE_BUFFER_SIZE in read-cache.c could be
consolidated. That constant was already removed in f6e2cd0625
("read-cache: delete unused hashing methods", 2021-05-18) when
read-cache.c was converted to use the hashfile API, so there is
nothing left to unify. The rename to DEFAULT_IO_BUFFER_SIZE helps
account for the multiple usages of this constant.
Changes since v1
================
* Introduced shared DEFAULT_PACKFILE_BUFFER_SIZE constant in
git-compat-util.h (next to MAX_IO_SIZE), replacing per-file #define
and the hardcoded value in csum-file.c. Placed here rather than
environment.h since it is an I/O buffer size, not an environment
variable or repo config.
* Added wall-clock timing on a FUSE filesystem.
* Cleaned up the commit description a bit.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2282%2Fsbauersfeld%2Fsb%2Fincrease-index-pack-input-buffer-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2282/sbauersfeld/sb/increase-index-pack-input-buffer-v4
Pull-Request: https://github.com/git/git/pull/2282
Range-diff vs v3:
1: df754ac879 ! 1: 146b1846a5 index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
@@ Commit message
hashfile layer in csum-file (which already used 128 KiB but
hardcoded the value).
- Syscall counts via strace on HTTPS clones of git/git (~296 MB pack,
- 5 runs per variant, isolated builds from the same v2.54.0 source):
+ Pack file writes to a FUSE filesystem with writeback caching
+ disabled during HTTPS clones of git/git (~293 MB pack):
- index-pack pack file writes: 72,465 -> 24,943 avg (65% fewer)
- total write() syscalls: 310,192 -> 259,530 avg (16% fewer)
- writes of exactly 4096 bytes: ~40,077 -> 0
+ 74,958 -> 4,687 (94% fewer)
Wall-clock time of git clone over HTTPS onto a FUSE passthrough
filesystem with writeback caching disabled, 3 runs per variant:
builtin/index-pack.c | 3 +--
builtin/unpack-objects.c | 3 +--
csum-file.c | 2 +-
git-compat-util.h | 6 ++++++
4 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ca7784dc2c..bb3639641c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -145,8 +145,7 @@ static int check_self_contained_and_connected;
static struct progress *progress;
-/* We always read in 4kB chunks. */
-static unsigned char input_buffer[4096];
+static unsigned char input_buffer[DEFAULT_IO_BUFFER_SIZE];
static unsigned int input_offset, input_len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e01cf6e360..af67d1a1d3 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,8 +23,7 @@
static int dry_run, quiet, recover, has_errors, strict;
static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]";
-/* We always read in 4kB chunks. */
-static unsigned char buffer[4096];
+static unsigned char buffer[DEFAULT_IO_BUFFER_SIZE];
static unsigned int offset, len;
static off_t consumed_bytes;
static off_t max_input_size;
diff --git a/csum-file.c b/csum-file.c
index 9558177a11..d7a682c2b6 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -178,7 +178,7 @@ struct hashfile *hashfd_ext(const struct git_hash_algo *algop,
f->algop = unsafe_hash_algo(algop);
f->algop->init_fn(&f->ctx);
- f->buffer_len = opts->buffer_len ? opts->buffer_len : 128 * 1024;
+ f->buffer_len = opts->buffer_len ? opts->buffer_len : DEFAULT_IO_BUFFER_SIZE;
f->buffer = xmalloc(f->buffer_len);
f->check_buffer = NULL;
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..5024814bd4 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -712,6 +712,12 @@ static inline uint64_t u64_add(uint64_t a, uint64_t b)
# endif
#endif
+/*
+ * Default buffer size for buffered I/O in index-pack, unpack-objects,
+ * and the hashfile layer in csum-file.
+ */
+#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
+
#ifdef HAVE_ALLOCA_H
# include <alloca.h>
# define xalloca(size) (alloca(size))
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
* Re: [PATCH v4] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB
2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget
@ 2026-05-12 5:51 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-05-12 5:51 UTC (permalink / raw)
To: Scott Bauersfeld via GitGitGadget
Cc: git, Derrick Stolee, Jeff King, Scott Bauersfeld
"Scott Bauersfeld via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
>
> index-pack and unpack-objects both read pack data from stdin through
> a 4 KiB static buffer. In index-pack, each fill() flushes consumed
> bytes to the pack file via write_or_die(), capping every write(2)
> at 4 KiB. unpack-objects uses the same buffer pattern for reads.
>
> On FUSE-backed filesystems every write(2) is a synchronous round
> trip through the FUSE protocol (userspace -> kernel -> userspace ->
> back), so the 4 KiB buffer turns a clone into many unnecessary tiny
> writes with noticeable latency overhead.
>
> Increase the buffer from 4 KiB to 128 KiB. Introduce a shared
> DEFAULT_IO_BUFFER_SIZE constant in git-compat-util.h (next to
> MAX_IO_SIZE) and use it in index-pack, unpack-objects, and the
> hashfile layer in csum-file (which already used 128 KiB but
> hardcoded the value).
>
> Pack file writes to a FUSE filesystem with writeback caching
> disabled during HTTPS clones of git/git (~293 MB pack):
>
> 74,958 -> 4,687 (94% fewer)
>
> Wall-clock time of git clone over HTTPS onto a FUSE passthrough
> filesystem with writeback caching disabled, 3 runs per variant:
>
> vscode (~1.26 GB pack): 84.5s -> 75.7s avg (10% faster)
> git/git (~306 MB pack): 22.6s -> 20.0s avg (11% faster)
>
> Signed-off-by: Scott Bauersfeld <sbauersfeld@g.ucla.edu>
> ---
>...
>
> Changes since v3
> ================
>
> * Replaced strace-based syscall measurements with FUSE daemon write
> logging. The earlier strace numbers (72,465 → 24,943, 65% reduction)
> were distorted: strace -f ptrace intercepts every syscall in all
> traced processes and added enough overhead to distort the
> measurements. The FUSE daemon logging captures write sizes without
> perturbing the traced processes, showing the true reduction is 94%
> (74,958 → 4,687).
> * Note: Why 4,687 writes instead of ~2k writes as would be expected
> with a 128 KiB buffer size? It appears that fill() is calling xread()
> on a pipe and the linux default buffer size for pipes is 64KiB. I
> also tested using fcntl(F_SETPIPE_SZ) to increase the pipe's buffer
> size to 128KiB, which does indeed reduce total pack file writes to
> ~2.4K.
It seems that everybody was happy with v3 already, so let's merge it
down to 'next'.
Thanks.
end of thread, other threads:[~2026-05-12 5:51 UTC | newest]
Thread overview: 12+ messages
2026-04-24 19:14 [PATCH] index-pack, unpack-objects: increase input buffer from 4 KiB to 128 KiB Scott Bauersfeld via GitGitGadget
2026-04-25 10:21 ` Junio C Hamano
2026-04-27 12:36 ` Derrick Stolee
2026-04-28 1:46 ` Junio C Hamano
2026-04-28 2:09 ` Jeff King
2026-04-27 16:08 ` [PATCH v2] " Scott Bauersfeld via GitGitGadget
2026-04-27 17:23 ` Derrick Stolee
2026-04-27 19:26 ` [PATCH v3] " Scott Bauersfeld via GitGitGadget
2026-04-27 20:12 ` Derrick Stolee
2026-04-28 1:47 ` Junio C Hamano
2026-04-28 14:47 ` [PATCH v4] " Scott Bauersfeld via GitGitGadget
2026-05-12 5:51 ` Junio C Hamano