From: "Paul Tarjan via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Christian Couder <christian.couder@gmail.com>,
Han Xin <hanxin.hx@bytedance.com>,
Paul Tarjan <paul@paultarjan.com>, Patrick Steinhardt <ps@pks.im>,
Paul Tarjan <github@paulisageek.com>,
Paul Tarjan <github@paulisageek.com>
Subject: [PATCH v3] promisor-remote: prevent lazy-fetch recursion in child fetch
Date: Wed, 11 Mar 2026 14:19:38 +0000 [thread overview]
Message-ID: <pull.2224.v3.git.git.1773238778894.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2224.v2.git.git.1772648846009.gitgitgadget@gmail.com>
From: Paul Tarjan <github@paulisageek.com>
fetch_objects() spawns a child `git fetch` to lazily fill in missing
objects. That child's index-pack, when it receives a thin pack
containing a REF_DELTA against a still-missing base, calls
promisor_remote_get_direct() -- which is fetch_objects() again.
With negotiationAlgorithm=noop the client advertises no "have"
lines, so a well-behaved server sends requested objects
un-deltified or deltified only against objects in the same pack.
A server that nevertheless sends REF_DELTA against a base the
client does not have is misbehaving; however the client should
not recurse unboundedly in response.
Propagate GIT_NO_LAZY_FETCH=1 into the child fetch's environment
so that if the child's index-pack encounters such a REF_DELTA, it
hits the existing guard at the top of fetch_objects() and fails
fast instead of recursing. Depth-1 lazy fetch (the whole point
of fetch_objects()) is unaffected: only the child and its
descendants see the variable.
Add a test that injects a thin pack containing a REF_DELTA against
a missing base via HTTP, triggering the recursion path through
index-pack's promisor_remote_get_direct() call. With the fix, the
child's fetch_objects() sees GIT_NO_LAZY_FETCH=1 and blocks the
depth-2 fetch with a "lazy fetching disabled" warning.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
---
promisor-remote: prevent recursive lazy-fetch during index-pack
Propagate GIT_NO_LAZY_FETCH=1 into the child fetch spawned by
fetch_objects() so that index-pack cannot recurse back into lazy-fetch
when resolving REF_DELTA bases.
We hit this in production: 276 GB of promisor packs written in 90
minutes against a ~10 GB monorepo with ~61K stale prefetch refs pointing
at GC'd commits.
Changes since v2:
* Replaced env-var-proxy test with behavioral test that injects a thin
pack containing a REF_DELTA against a missing base via HTTP,
triggering the actual recursion path through index-pack's
promisor_remote_get_direct() call
* Moved test from t0411-clone-from-partial.sh to t5616-partial-clone.sh
(requires HTTP infrastructure)
* Dropped commit-graph comparison from commit message per review
feedback
Changes since v1:
* Dropped CC: trailers from commit message (moved here for
GitGitGadget)
* Moved test into t0411-clone-from-partial.sh instead of a new file
* Removed duplicate commit-message summary from PR description
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2224%2Fptarjan%2Fclaude%2Ffix-lazy-fetch-recursion-KP9Hl-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2224/ptarjan/claude/fix-lazy-fetch-recursion-KP9Hl-v3
Pull-Request: https://github.com/git/git/pull/2224
Range-diff vs v2:
1: 907ca7a0ac ! 1: d58ccb858f promisor-remote: prevent lazy-fetch recursion in child fetch
@@ Commit message
fetch_objects() spawns a child `git fetch` to lazily fill in missing
objects. That child's index-pack, when it receives a thin pack
- containing a REF_DELTA against a still-missing base, explicitly
- calls promisor_remote_get_direct() — which is fetch_objects() again.
- If the base is truly unavailable (e.g. because many refs in the
- local store point at objects that have been garbage-collected on the
- server), each recursive lazy-fetch can trigger another, leading to
- unbounded recursion with runaway disk and process consumption.
+ containing a REF_DELTA against a still-missing base, calls
+ promisor_remote_get_direct() -- which is fetch_objects() again.
- The GIT_NO_LAZY_FETCH guard (introduced by e6d5479e7a (git: add
- --no-lazy-fetch option, 2021-08-31)) already exists at the top of
- fetch_objects(); the missing piece is propagating it into the child
- fetch's environment. Add that propagation so the child's
- index-pack, if it encounters a REF_DELTA against a missing base,
- hits the guard and fails fast instead of recursing.
-
- Depth-1 lazy fetch (the whole point of fetch_objects()) is
- unaffected: only the child and its descendants see the variable.
With negotiationAlgorithm=noop the client advertises no "have"
lines, so a well-behaved server sends requested objects
- un-deltified or deltified only against objects in the same pack;
- the child's index-pack should never need a depth-2 fetch. If it
- does, the server response was broken or the local store is already
- corrupt, and further fetching would not help.
+ un-deltified or deltified only against objects in the same pack.
+ A server that nevertheless sends REF_DELTA against a base the
+ client does not have is misbehaving; however the client should
+ not recurse unboundedly in response.
- This is the same bug shape that 3a1ea94a49 (commit-graph.c: no lazy
- fetch in lookup_commit_in_graph(), 2022-07-01) addressed at a
- different entry point.
+ Propagate GIT_NO_LAZY_FETCH=1 into the child fetch's environment
+ so that if the child's index-pack encounters such a REF_DELTA, it
+ hits the existing guard at the top of fetch_objects() and fails
+ fast instead of recursing. Depth-1 lazy fetch (the whole point
+ of fetch_objects()) is unaffected: only the child and its
+ descendants see the variable.
- Add a test that verifies the child fetch environment contains
- GIT_NO_LAZY_FETCH=1 via a reference-transaction hook.
+ Add a test that injects a thin pack containing a REF_DELTA against
+ a missing base via HTTP, triggering the recursion path through
+ index-pack's promisor_remote_get_direct() call. With the fix, the
+ child's fetch_objects() sees GIT_NO_LAZY_FETCH=1 and blocks the
+ depth-2 fetch with a "lazy fetching disabled" warning.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
@@ promisor-remote.c: static int fetch_objects(struct repository *repo,
"fetch", remote_name, "--no-tags",
"--no-write-fetch-head", "--recurse-submodules=no",
- ## t/t0411-clone-from-partial.sh ##
-@@ t/t0411-clone-from-partial.sh: test_expect_success 'promisor lazy-fetching can be re-enabled' '
- test_path_is_file script-executed
+ ## t/t5616-partial-clone.sh ##
+@@ t/t5616-partial-clone.sh: test_expect_success PERL_TEST_HELPERS 'tolerate server sending REF_DELTA against
+ ! test -e "$HTTPD_ROOT_PATH/one-time-script"
'
-+test_expect_success 'lazy-fetch child has GIT_NO_LAZY_FETCH=1' '
-+ test_create_repo nolazy-server &&
-+ test_commit -C nolazy-server foo &&
-+ git -C nolazy-server repack -a -d --write-bitmap-index &&
++test_expect_success PERL_TEST_HELPERS 'lazy-fetch of REF_DELTA with missing base does not recurse' '
++ SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
++ rm -rf "$SERVER" repo &&
++ test_create_repo "$SERVER" &&
++ test_config -C "$SERVER" uploadpack.allowfilter 1 &&
++ test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
++
++ # Create a commit with 2 blobs to be used as delta base and content.
++ for i in $(test_seq 10)
++ do
++ echo "this is a line" >>"$SERVER/foo.txt" &&
++ echo "this is another line" >>"$SERVER/bar.txt" || return 1
++ done &&
++ git -C "$SERVER" add foo.txt bar.txt &&
++ git -C "$SERVER" commit -m initial &&
++ BLOB_FOO=$(git -C "$SERVER" rev-parse HEAD:foo.txt) &&
++ BLOB_BAR=$(git -C "$SERVER" rev-parse HEAD:bar.txt) &&
+
-+ git clone "file://$(pwd)/nolazy-server" nolazy-client &&
-+ HASH=$(git -C nolazy-client rev-parse foo) &&
-+ rm -rf nolazy-client/.git/objects/* &&
++ # Partial clone with blob:none. The client has commits and
++ # trees but no blobs.
++ test_config -C "$SERVER" protocol.version 2 &&
++ git -c protocol.version=2 clone --no-checkout \
++ --filter=blob:none $HTTPD_URL/one_time_script/server repo &&
+
-+ git -C nolazy-client config core.repositoryformatversion 1 &&
-+ git -C nolazy-client config extensions.partialclone "origin" &&
++ # Sanity check: client does not have either blob locally.
++ git -C repo rev-list --objects --ignore-missing \
++ -- $BLOB_FOO >objlist &&
++ test_line_count = 0 objlist &&
+
-+ # Install a reference-transaction hook to record the env var
-+ # as seen by processes inside the child fetch.
-+ test_hook -C nolazy-client reference-transaction <<-\EOF &&
-+ echo "$GIT_NO_LAZY_FETCH" >>../env-in-child
++ # Craft a thin pack where BLOB_FOO is a REF_DELTA against
++ # BLOB_BAR. Since the client has neither blob (blob:none
++ # filter), the delta base will be missing. This simulates a
++ # misbehaving server that sends REF_DELTA against an object
++ # the client does not have.
++ test-tool -C "$SERVER" pack-deltas --num-objects=1 >thin.pack <<-EOF &&
++ REF_DELTA $BLOB_FOO $BLOB_BAR
+ EOF
+
-+ rm -f env-in-child &&
-+ git -C nolazy-client cat-file -p "$HASH" &&
++ replace_packfile thin.pack &&
+
-+ # The hook runs inside the child fetch, which should have
-+ # GIT_NO_LAZY_FETCH=1 in its environment.
-+ grep "^1$" env-in-child
++ # Trigger a lazy fetch for BLOB_FOO. The child fetch spawned
++ # by fetch_objects() receives our crafted thin pack. Its
++ # index-pack encounters the missing delta base (BLOB_BAR) and
++ # tries to lazy-fetch it via promisor_remote_get_direct().
++ #
++ # With the fix: fetch_objects() propagates GIT_NO_LAZY_FETCH=1
++ # to the child, so the depth-2 fetch is blocked and we see the
++ # "lazy fetching disabled" warning. The object cannot be
++ # resolved, so cat-file fails.
++ #
++ # Without the fix: the depth-2 fetch would proceed, potentially
++ # recursing unboundedly with a persistently misbehaving server.
++ test_must_fail git -C repo -c protocol.version=2 \
++ cat-file -p $BLOB_FOO 2>err &&
++ test_grep "lazy fetching disabled" err &&
++
++ # Ensure that the one-time-script was used.
++ ! test -e "$HTTPD_ROOT_PATH/one-time-script"
+'
+
- test_done
+ # DO NOT add non-httpd-specific tests here, because the last part of this
+ # test script is only executed when httpd is available and enabled.
+
promisor-remote.c | 7 +++++
t/t5616-partial-clone.sh | 60 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 96fa215b06..35c7aab93d 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -42,6 +42,13 @@ static int fetch_objects(struct repository *repo,
child.in = -1;
if (repo != the_repository)
prepare_other_repo_env(&child.env, repo->gitdir);
+ /*
+ * Prevent the child's index-pack from recursing back into
+ * fetch_objects() when resolving REF_DELTA bases it does not
+ * have. With noop negotiation the server should never need
+ * to send such deltas, so a depth-2 fetch would not help.
+ */
+ strvec_pushf(&child.env, "%s=1", NO_LAZY_FETCH_ENVIRONMENT);
strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
"fetch", remote_name, "--no-tags",
"--no-write-fetch-head", "--recurse-submodules=no",
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1e354e057f..27f131c8d9 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -907,6 +907,66 @@ test_expect_success PERL_TEST_HELPERS 'tolerate server sending REF_DELTA against
! test -e "$HTTPD_ROOT_PATH/one-time-script"
'
+test_expect_success PERL_TEST_HELPERS 'lazy-fetch of REF_DELTA with missing base does not recurse' '
+ SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
+ rm -rf "$SERVER" repo &&
+ test_create_repo "$SERVER" &&
+ test_config -C "$SERVER" uploadpack.allowfilter 1 &&
+ test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
+
+ # Create a commit with 2 blobs to be used as delta base and content.
+ for i in $(test_seq 10)
+ do
+ echo "this is a line" >>"$SERVER/foo.txt" &&
+ echo "this is another line" >>"$SERVER/bar.txt" || return 1
+ done &&
+ git -C "$SERVER" add foo.txt bar.txt &&
+ git -C "$SERVER" commit -m initial &&
+ BLOB_FOO=$(git -C "$SERVER" rev-parse HEAD:foo.txt) &&
+ BLOB_BAR=$(git -C "$SERVER" rev-parse HEAD:bar.txt) &&
+
+ # Partial clone with blob:none. The client has commits and
+ # trees but no blobs.
+ test_config -C "$SERVER" protocol.version 2 &&
+ git -c protocol.version=2 clone --no-checkout \
+ --filter=blob:none $HTTPD_URL/one_time_script/server repo &&
+
+ # Sanity check: client does not have either blob locally.
+ git -C repo rev-list --objects --ignore-missing \
+ -- $BLOB_FOO >objlist &&
+ test_line_count = 0 objlist &&
+
+ # Craft a thin pack where BLOB_FOO is a REF_DELTA against
+ # BLOB_BAR. Since the client has neither blob (blob:none
+ # filter), the delta base will be missing. This simulates a
+ # misbehaving server that sends REF_DELTA against an object
+ # the client does not have.
+ test-tool -C "$SERVER" pack-deltas --num-objects=1 >thin.pack <<-EOF &&
+ REF_DELTA $BLOB_FOO $BLOB_BAR
+ EOF
+
+ replace_packfile thin.pack &&
+
+ # Trigger a lazy fetch for BLOB_FOO. The child fetch spawned
+ # by fetch_objects() receives our crafted thin pack. Its
+ # index-pack encounters the missing delta base (BLOB_BAR) and
+ # tries to lazy-fetch it via promisor_remote_get_direct().
+ #
+ # With the fix: fetch_objects() propagates GIT_NO_LAZY_FETCH=1
+ # to the child, so the depth-2 fetch is blocked and we see the
+ # "lazy fetching disabled" warning. The object cannot be
+ # resolved, so cat-file fails.
+ #
+ # Without the fix: the depth-2 fetch would proceed, potentially
+ # recursing unboundedly with a persistently misbehaving server.
+ test_must_fail git -C repo -c protocol.version=2 \
+ cat-file -p $BLOB_FOO 2>err &&
+ test_grep "lazy fetching disabled" err &&
+
+ # Ensure that the one-time-script was used.
+ ! test -e "$HTTPD_ROOT_PATH/one-time-script"
+'
+
# DO NOT add non-httpd-specific tests here, because the last part of this
# test script is only executed when httpd is available and enabled.
base-commit: 7b2bccb0d58d4f24705bf985de1f4612e4cf06e5
--
gitgitgadget
next prev parent reply other threads:[~2026-03-11 14:19 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-04 16:57 [PATCH] promisor-remote: prevent lazy-fetch recursion in child fetch Paul Tarjan via GitGitGadget
2026-03-04 17:41 ` Junio C Hamano
2026-03-04 18:20 ` Paul Tarjan
2026-03-04 18:27 ` [PATCH v2] " Paul Tarjan via GitGitGadget
2026-03-11 10:52 ` Patrick Steinhardt
2026-03-11 14:18 ` Paul Tarjan
2026-03-12 7:27 ` Patrick Steinhardt
2026-03-13 1:43 ` Jeff King
2026-03-13 12:43 ` [PATCH v3] " Paul Tarjan
2026-03-13 12:43 ` Paul Tarjan
2026-04-15 18:05 ` Junio C Hamano
2026-03-11 14:19 ` Paul Tarjan via GitGitGadget [this message]
[not found] <xmqqse6p0cvm.fsf@gitster.g>
2026-04-17 2:21 ` Paul Tarjan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.2224.v3.git.git.1773238778894.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=github@paulisageek.com \
--cc=hanxin.hx@bytedance.com \
--cc=paul@paultarjan.com \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.