From: Paul Tarjan <paul@paultarjan.com>
To: git@vger.kernel.org
Cc: ps@pks.im, peff@peff.net, gitgitgadget@gmail.com,
christian.couder@gmail.com, hanxin.hx@bytedance.com
Subject: Re: [PATCH v3] promisor-remote: prevent lazy-fetch recursion in child fetch
Date: Fri, 13 Mar 2026 06:43:29 -0600 [thread overview]
Message-ID: <20260313124329.75626-1-github@paulisageek.com> (raw)
In-Reply-To: <abJqySqfdFoY8cEu@pks.im>
Patrick Steinhardt <ps@pks.im> writes:
> I must be misunderstanding something here, but how is it that a commit
> can be garbage collected if a ref points to it? That shouldn't ever
> happen, as reachable commits should not be pruned.
>
> Or do you mean to say that the commits don't exist on the server side
> anymore?
Sloppy wording on my part — "GC'd" is wrong. These refs pointed at
commits that were promised but never materialized on the partial
clone. The ~77K broken refs looked like:
error: refs/prefetch/remotes/origin/claude/add-azure-dependencies-EaaDn \
does not point to a valid object!
They were created by git maintenance --task=prefetch, which runs
git fetch --prefetch --prune and writes refs/prefetch/remotes/origin/<branch>
pointing at the remote tip. On a blob:none partial clone it fetches
commit/tree metadata, but some referenced commits were never
actually downloaded before the upstream branches (ephemeral
CI/automation branches, force-pushed and deleted within days)
disappeared.
This is a red herring for the patch though. The stale prefetch refs
explain why the outer fetch got a thin pack — the client advertised
haves from promised-but-absent commits. But the recursion (depth-1
to depth-2+) is entirely inside fetch_objects() with noop
negotiation, independent of any refs.
I'll fix the commit message wording in a v4 if you'd like.
> That's certainly curious. Do you maybe have multiple remotes attached to
> the repository, or are you dropping/modifying the object filter at some
> point?
>
> All subsequent fetches need to use the same object filter as you've used
> during the initial clone, otherwise you may run into a situation as you
> have described. But in theory, Git knows to continue using the filter.
Nobody intentionally changed the filter. What happened is the
lazy-fetch child kept re-writing it. fetch_objects() hardcodes
--filter=blob:none on the child argv, and the child's
builtin/fetch.c writes the active filter to config.
23547c40 ("fetch: do not override partial clone filter", 2020)
guards this write behind a check for an already-set filter. But I
was unsetting remote.origin.partialclonefilter manually trying to
stop the storm, so the guard passed and the next lazy-fetch child
wrote it right back:
21:45 (unset) manual git config --unset
22:05 blob:none re-written by a lazy-fetch child
22:11 (unset) manual unset again
23:11 blob:none re-written again
23:13 (unset) manual unset
23:28 blob:none re-written (caught live by a config-mtime trap)
The actual mitigation was unsetting remote.origin.promisor too —
with no promisor remote, fix_unresolved_deltas() skips the prefetch
entirely.
This is arguably a separate bug: fetch_objects() should probably
pass -c remote.<name>.partialclonefilter=blob:none to override for
the single invocation, rather than --filter=blob:none which
persists to config. Not in scope for this patch, but I could follow
up separately if there's interest.
Paul
next prev parent reply other threads:[~2026-03-13 12:43 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-04 16:57 [PATCH] promisor-remote: prevent lazy-fetch recursion in child fetch Paul Tarjan via GitGitGadget
2026-03-04 17:41 ` Junio C Hamano
2026-03-04 18:20 ` Paul Tarjan
2026-03-04 18:27 ` [PATCH v2] " Paul Tarjan via GitGitGadget
2026-03-11 10:52 ` Patrick Steinhardt
2026-03-11 14:18 ` Paul Tarjan
2026-03-12 7:27 ` Patrick Steinhardt
2026-03-13 1:43 ` Jeff King
2026-03-13 12:43 ` [PATCH v3] " Paul Tarjan
2026-03-13 12:43 ` Paul Tarjan [this message]
2026-03-11 14:19 ` Paul Tarjan via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260313124329.75626-1-github@paulisageek.com \
--to=paul@paultarjan.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=hanxin.hx@bytedance.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox