From: "Arijit Banerjee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Junio C Hamano" <gitster@pobox.com>,
"Derrick Stolee" <stolee@gmail.com>,
"Arijit Banerjee" <arijit91@gmail.com>,
"Arijit Banerjee" <arijit@effectiveailabs.com>
Subject: [PATCH] index-pack: retain child bases in delta cache
Date: Fri, 29 May 2026 16:06:02 +0000 [thread overview]
Message-ID: <pull.2131.git.1780070763044.gitgitgadget@gmail.com> (raw)
From: Arijit Banerjee <arijit@effectiveailabs.com>
When resolving a delta whose result has children of its own,
index-pack adds the result to work_head, accounts its data in
base_cache_used, and calls prune_base_data(). It then immediately
frees that same data.
This bypasses the existing delta base cache policy and can force later
descendants to reconstruct the queued base again. Let the existing
delta_base_cache_limit pruning policy decide whether to keep or evict
the data instead.
Signed-off-by: Arijit Banerjee <arijit@effectiveailabs.com>
---
index-pack: retain child bases in delta cache
Speed up the local pack indexing phase of clone/fetch for large
delta-compressed packs by keeping reconstructed delta bases available
for reuse when they are queued for later delta resolution.
When index-pack reconstructs a child base and queues it for resolving
descendant deltas, it currently frees that data immediately. This can
force the same base to be reconstructed again. Instead, keep it in the
existing delta base cache and let the existing delta_base_cache_limit
policy decide whether to retain or evict it.
This does not add a new cache or increase the cache limit. The object
data is already accounted in base_cache_used, and prune_base_data() is
already called at this point.
Correctness:
* t/t5302-pack-index.sh passed all 36 tests.
Benchmarks on a quiet Ubuntu 24.04 VM, 16 vCPU, 32 GiB RAM, local SSD:
pack baseline patched wall-time change RSS change linux blobless 69.17s
57.98s 16.2% faster -0.0% linux full 280.72s 236.32s 15.8% faster +1.9%
Five-repeat public-repo medians also improved: git.git 13.1%, libgit2
14.0%, redis 13.5%, cpython 4.8%.
Perf on the linux blobless pack showed the same direction under
profiling: 76.64s baseline vs 61.09s patched, with similar RSS.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2131%2Farijit91%2Findex-pack-retain-child-base-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2131/arijit91/index-pack-retain-child-base-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2131
builtin/index-pack.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index cf0bd8280d..027c64b522 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1212,7 +1212,6 @@ static void *threaded_second_pass(void *data)
list_add(&child->list, &work_head);
base_cache_used += child->size;
prune_base_data(NULL);
- free_base_data(child);
} else if (child) {
/*
* This child does not have its own children. It may be
base-commit: c69baaf57ba26cf117c2b6793802877f19738b0d
--
gitgitgadget
next reply other threads:[~2026-05-29 16:06 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 16:06 Arijit Banerjee via GitGitGadget [this message]
2026-06-01 12:50 ` [PATCH] index-pack: retain child bases in delta cache Derrick Stolee
2026-06-01 16:13 ` [PATCH v2] " Arijit Banerjee via GitGitGadget
2026-06-02 6:45 ` Jeff King
2026-06-03 0:05 ` [PATCH v3] " Arijit Banerjee via GitGitGadget
2026-06-03 12:24 ` Derrick Stolee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.2131.git.1780070763044.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=arijit91@gmail.com \
--cc=arijit@effectiveailabs.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox