From: "Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Christian Couder <christian.couder@gmail.com>,
Han Xin <hanxin.hx@bytedance.com>,
Paul Tarjan <paul@paultarjan.com>, Patrick Steinhardt <ps@pks.im>,
Paul Tarjan <github@paulisageek.com>,
Paul Tarjan <github@paulisageek.com>
Subject: [PATCH v3] promisor-remote: prevent lazy-fetch recursion in child fetch
Date: Wed, 11 Mar 2026 14:19:38 +0000 [thread overview]
Message-ID: <pull.2224.v3.git.git.1773238778894.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2224.v2.git.git.1772648846009.gitgitgadget@gmail.com>
From: Paul Tarjan <github@paulisageek.com>
fetch_objects() spawns a child `git fetch` to lazily fill in missing
objects. That child's index-pack, when it receives a thin pack
containing a REF_DELTA against a still-missing base, calls
promisor_remote_get_direct() -- which is fetch_objects() again.
With negotiationAlgorithm=noop the client advertises no "have"
lines, so a well-behaved server sends requested objects
un-deltified or deltified only against objects in the same pack.
A server that nevertheless sends REF_DELTA against a base the
client does not have is misbehaving; however the client should
not recurse unboundedly in response.
Propagate GIT_NO_LAZY_FETCH=1 into the child fetch's environment
so that if the child's index-pack encounters such a REF_DELTA, it
hits the existing guard at the top of fetch_objects() and fails
fast instead of recursing. Depth-1 lazy fetch (the whole point
of fetch_objects()) is unaffected: only the child and its
descendants see the variable.
Add a test that injects a thin pack containing a REF_DELTA against
a missing base via HTTP, triggering the recursion path through
index-pack's promisor_remote_get_direct() call. With the fix, the
child's fetch_objects() sees GIT_NO_LAZY_FETCH=1 and blocks the
depth-2 fetch with a "lazy fetching disabled" warning.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
---
promisor-remote: prevent recursive lazy-fetch during index-pack
Propagate GIT_NO_LAZY_FETCH=1 into the child fetch spawned by
fetch_objects() so that index-pack cannot recurse back into lazy-fetch
when resolving REF_DELTA bases.
We hit this in production: 276 GB of promisor packs written in 90
minutes against a ~10 GB monorepo with ~61K stale prefetch refs pointing
at GC'd commits.
Changes since v2:
* Replaced env-var-proxy test with behavioral test that injects a thin
pack containing a REF_DELTA against a missing base via HTTP,
triggering the actual recursion path through index-pack's
promisor_remote_get_direct() call
* Moved test from t0411-clone-from-partial.sh to t5616-partial-clone.sh
(requires HTTP infrastructure)
* Dropped commit-graph comparison from commit message per review
feedback
Changes since v1:
* Dropped CC: trailers from commit message (moved here for
GitGitGadget)
* Moved test into t0411-clone-from-partial.sh instead of a new file
* Removed duplicate commit-message summary from PR description
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2224%2Fptarjan%2Fclaude%2Ffix-lazy-fetch-recursion-KP9Hl-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2224/ptarjan/claude/fix-lazy-fetch-recursion-KP9Hl-v3
Pull-Request: https://github.com/git/git/pull/2224
Range-diff vs v2:
1: 907ca7a0ac ! 1: d58ccb858f promisor-remote: prevent lazy-fetch recursion in child fetch
@@ Commit message
fetch_objects() spawns a child `git fetch` to lazily fill in missing
objects. That child's index-pack, when it receives a thin pack
- containing a REF_DELTA against a still-missing base, explicitly
- calls promisor_remote_get_direct() — which is fetch_objects() again.
- If the base is truly unavailable (e.g. because many refs in the
- local store point at objects that have been garbage-collected on the
- server), each recursive lazy-fetch can trigger another, leading to
- unbounded recursion with runaway disk and process consumption.
+ containing a REF_DELTA against a still-missing base, calls
+ promisor_remote_get_direct() -- which is fetch_objects() again.
- The GIT_NO_LAZY_FETCH guard (introduced by e6d5479e7a (git: add
- --no-lazy-fetch option, 2021-08-31)) already exists at the top of
- fetch_objects(); the missing piece is propagating it into the child
- fetch's environment. Add that propagation so the child's
- index-pack, if it encounters a REF_DELTA against a missing base,
- hits the guard and fails fast instead of recursing.
-
- Depth-1 lazy fetch (the whole point of fetch_objects()) is
- unaffected: only the child and its descendants see the variable.
With negotiationAlgorithm=noop the client advertises no "have"
lines, so a well-behaved server sends requested objects
- un-deltified or deltified only against objects in the same pack;
- the child's index-pack should never need a depth-2 fetch. If it
- does, the server response was broken or the local store is already
- corrupt, and further fetching would not help.
+ un-deltified or deltified only against objects in the same pack.
+ A server that nevertheless sends REF_DELTA against a base the
+ client does not have is misbehaving; however the client should
+ not recurse unboundedly in response.
- This is the same bug shape that 3a1ea94a49 (commit-graph.c: no lazy
- fetch in lookup_commit_in_graph(), 2022-07-01) addressed at a
- different entry point.
+ Propagate GIT_NO_LAZY_FETCH=1 into the child fetch's environment
+ so that if the child's index-pack encounters such a REF_DELTA, it
+ hits the existing guard at the top of fetch_objects() and fails
+ fast instead of recursing. Depth-1 lazy fetch (the whole point
+ of fetch_objects()) is unaffected: only the child and its
+ descendants see the variable.
- Add a test that verifies the child fetch environment contains
- GIT_NO_LAZY_FETCH=1 via a reference-transaction hook.
+ Add a test that injects a thin pack containing a REF_DELTA against
+ a missing base via HTTP, triggering the recursion path through
+ index-pack's promisor_remote_get_direct() call. With the fix, the
+ child's fetch_objects() sees GIT_NO_LAZY_FETCH=1 and blocks the
+ depth-2 fetch with a "lazy fetching disabled" warning.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
@@ promisor-remote.c: static int fetch_objects(struct repository *repo,
"fetch", remote_name, "--no-tags",
"--no-write-fetch-head", "--recurse-submodules=no",
- ## t/t0411-clone-from-partial.sh ##
-@@ t/t0411-clone-from-partial.sh: test_expect_success 'promisor lazy-fetching can be re-enabled' '
- test_path_is_file script-executed
+ ## t/t5616-partial-clone.sh ##
+@@ t/t5616-partial-clone.sh: test_expect_success PERL_TEST_HELPERS 'tolerate server sending REF_DELTA against
+ ! test -e "$HTTPD_ROOT_PATH/one-time-script"
'
-+test_expect_success 'lazy-fetch child has GIT_NO_LAZY_FETCH=1' '
-+ test_create_repo nolazy-server &&
-+ test_commit -C nolazy-server foo &&
-+ git -C nolazy-server repack -a -d --write-bitmap-index &&
++test_expect_success PERL_TEST_HELPERS 'lazy-fetch of REF_DELTA with missing base does not recurse' '
++ SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
++ rm -rf "$SERVER" repo &&
++ test_create_repo "$SERVER" &&
++ test_config -C "$SERVER" uploadpack.allowfilter 1 &&
++ test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
++
++ # Create a commit with 2 blobs to be used as delta base and content.
++ for i in $(test_seq 10)
++ do
++ echo "this is a line" >>"$SERVER/foo.txt" &&
++ echo "this is another line" >>"$SERVER/bar.txt" || return 1
++ done &&
++ git -C "$SERVER" add foo.txt bar.txt &&
++ git -C "$SERVER" commit -m initial &&
++ BLOB_FOO=$(git -C "$SERVER" rev-parse HEAD:foo.txt) &&
++ BLOB_BAR=$(git -C "$SERVER" rev-parse HEAD:bar.txt) &&
+
-+ git clone "file://$(pwd)/nolazy-server" nolazy-client &&
-+ HASH=$(git -C nolazy-client rev-parse foo) &&
-+ rm -rf nolazy-client/.git/objects/* &&
++ # Partial clone with blob:none. The client has commits and
++ # trees but no blobs.
++ test_config -C "$SERVER" protocol.version 2 &&
++ git -c protocol.version=2 clone --no-checkout \
++ --filter=blob:none $HTTPD_URL/one_time_script/server repo &&
+
-+ git -C nolazy-client config core.repositoryformatversion 1 &&
-+ git -C nolazy-client config extensions.partialclone "origin" &&
++ # Sanity check: client does not have either blob locally.
++ git -C repo rev-list --objects --ignore-missing \
++ -- $BLOB_FOO >objlist &&
++ test_line_count = 0 objlist &&
+
-+ # Install a reference-transaction hook to record the env var
-+ # as seen by processes inside the child fetch.
-+ test_hook -C nolazy-client reference-transaction <<-\EOF &&
-+ echo "$GIT_NO_LAZY_FETCH" >>../env-in-child
++ # Craft a thin pack where BLOB_FOO is a REF_DELTA against
++ # BLOB_BAR. Since the client has neither blob (blob:none
++ # filter), the delta base will be missing. This simulates a
++ # misbehaving server that sends REF_DELTA against an object
++ # the client does not have.
++ test-tool -C "$SERVER" pack-deltas --num-objects=1 >thin.pack <<-EOF &&
++ REF_DELTA $BLOB_FOO $BLOB_BAR
+ EOF
+
-+ rm -f env-in-child &&
-+ git -C nolazy-client cat-file -p "$HASH" &&
++ replace_packfile thin.pack &&
+
-+ # The hook runs inside the child fetch, which should have
-+ # GIT_NO_LAZY_FETCH=1 in its environment.
-+ grep "^1$" env-in-child
++ # Trigger a lazy fetch for BLOB_FOO. The child fetch spawned
++ # by fetch_objects() receives our crafted thin pack. Its
++ # index-pack encounters the missing delta base (BLOB_BAR) and
++ # tries to lazy-fetch it via promisor_remote_get_direct().
++ #
++ # With the fix: fetch_objects() propagates GIT_NO_LAZY_FETCH=1
++ # to the child, so the depth-2 fetch is blocked and we see the
++ # "lazy fetching disabled" warning. The object cannot be
++ # resolved, so cat-file fails.
++ #
++ # Without the fix: the depth-2 fetch would proceed, potentially
++ # recursing unboundedly with a persistently misbehaving server.
++ test_must_fail git -C repo -c protocol.version=2 \
++ cat-file -p $BLOB_FOO 2>err &&
++ test_grep "lazy fetching disabled" err &&
++
++ # Ensure that the one-time-script was used.
++ ! test -e "$HTTPD_ROOT_PATH/one-time-script"
+'
+
- test_done
+ # DO NOT add non-httpd-specific tests here, because the last part of this
+ # test script is only executed when httpd is available and enabled.
+
promisor-remote.c | 7 +++++
t/t5616-partial-clone.sh | 60 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 96fa215b06..35c7aab93d 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -42,6 +42,13 @@ static int fetch_objects(struct repository *repo,
child.in = -1;
if (repo != the_repository)
prepare_other_repo_env(&child.env, repo->gitdir);
+ /*
+ * Prevent the child's index-pack from recursing back into
+ * fetch_objects() when resolving REF_DELTA bases it does not
+ * have. With noop negotiation the server should never need
+ * to send such deltas, so a depth-2 fetch would not help.
+ */
+ strvec_pushf(&child.env, "%s=1", NO_LAZY_FETCH_ENVIRONMENT);
strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
"fetch", remote_name, "--no-tags",
"--no-write-fetch-head", "--recurse-submodules=no",
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..27f131c8d9 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -907,6 +907,66 @@ test_expect_success PERL_TEST_HELPERS 'tolerate server sending REF_DELTA against
! test -e "$HTTPD_ROOT_PATH/one-time-script"
'
+test_expect_success PERL_TEST_HELPERS 'lazy-fetch of REF_DELTA with missing base does not recurse' '
+ SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
+ rm -rf "$SERVER" repo &&
+ test_create_repo "$SERVER" &&
+ test_config -C "$SERVER" uploadpack.allowfilter 1 &&
+ test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
+
+ # Create a commit with 2 blobs to be used as delta base and content.
+ for i in $(test_seq 10)
+ do
+ echo "this is a line" >>"$SERVER/foo.txt" &&
+ echo "this is another line" >>"$SERVER/bar.txt" || return 1
+ done &&
+ git -C "$SERVER" add foo.txt bar.txt &&
+ git -C "$SERVER" commit -m initial &&
+ BLOB_FOO=$(git -C "$SERVER" rev-parse HEAD:foo.txt) &&
+ BLOB_BAR=$(git -C "$SERVER" rev-parse HEAD:bar.txt) &&
+
+ # Partial clone with blob:none. The client has commits and
+ # trees but no blobs.
+ test_config -C "$SERVER" protocol.version 2 &&
+ git -c protocol.version=2 clone --no-checkout \
+ --filter=blob:none $HTTPD_URL/one_time_script/server repo &&
+
+ # Sanity check: client does not have either blob locally.
+ git -C repo rev-list --objects --ignore-missing \
+ -- $BLOB_FOO >objlist &&
+ test_line_count = 0 objlist &&
+
+ # Craft a thin pack where BLOB_FOO is a REF_DELTA against
+ # BLOB_BAR. Since the client has neither blob (blob:none
+ # filter), the delta base will be missing. This simulates a
+ # misbehaving server that sends REF_DELTA against an object
+ # the client does not have.
+ test-tool -C "$SERVER" pack-deltas --num-objects=1 >thin.pack <<-EOF &&
+ REF_DELTA $BLOB_FOO $BLOB_BAR
+ EOF
+
+ replace_packfile thin.pack &&
+
+ # Trigger a lazy fetch for BLOB_FOO. The child fetch spawned
+ # by fetch_objects() receives our crafted thin pack. Its
+ # index-pack encounters the missing delta base (BLOB_BAR) and
+ # tries to lazy-fetch it via promisor_remote_get_direct().
+ #
+ # With the fix: fetch_objects() propagates GIT_NO_LAZY_FETCH=1
+ # to the child, so the depth-2 fetch is blocked and we see the
+ # "lazy fetching disabled" warning. The object cannot be
+ # resolved, so cat-file fails.
+ #
+ # Without the fix: the depth-2 fetch would proceed, potentially
+ # recursing unboundedly with a persistently misbehaving server.
+ test_must_fail git -C repo -c protocol.version=2 \
+ cat-file -p $BLOB_FOO 2>err &&
+ test_grep "lazy fetching disabled" err &&
+
+ # Ensure that the one-time-script was used.
+ ! test -e "$HTTPD_ROOT_PATH/one-time-script"
+'
+
# DO NOT add non-httpd-specific tests here, because the last part of this
# test script is only executed when httpd is available and enabled.
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
prev parent reply other threads:[~2026-03-11 14:19 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-04 16:57 [PATCH] promisor-remote: prevent lazy-fetch recursion in child fetch Paul Tarjan via GitGitGadget
2026-03-04 17:41 ` Junio C Hamano
2026-03-04 18:20 ` Paul Tarjan
2026-03-04 18:27 ` [PATCH v2] " Paul Tarjan via GitGitGadget
2026-03-11 10:52 ` Patrick Steinhardt
2026-03-11 14:18 ` Paul Tarjan
2026-03-12 7:27 ` Patrick Steinhardt
2026-03-13 1:43 ` Jeff King
2026-03-13 12:43 ` [PATCH v3] " Paul Tarjan
2026-03-13 12:43 ` Paul Tarjan
2026-03-11 14:19 ` Paul Tarjan via GitGitGadget [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.2224.v3.git.git.1773238778894.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=github@paulisageek.com \
--cc=hanxin.hx@bytedance.com \
--cc=paul@paultarjan.com \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox