From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
Elijah Newren <newren@gmail.com>, Patrick Steinhardt <ps@pks.im>
Subject: [PATCH 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow`
Date: Thu, 19 Mar 2026 18:24:12 -0400
Message-ID: <cover.1773959041.git.me@ttaylorr.com>
This series came from an issue I saw in GitHub's infrastructure where a
particular repository was failing to repack with the following in its
log output:
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object XYZ is missing)
This was following a geometric repack that generated a MIDX and
attempted to generate its corresponding MIDX bitmap. Ordinarily, we
should never expect the above when generating MIDX bitmaps, unless:
- Object XYZ is indeed missing from the repository (though we should
have seen an earlier failure during the MIDX write itself), or
- Object XYZ is somehow not included in the MIDX, but is reachable from
another object which is.
After validating that object XYZ was indeed present at the time the
repack failed, I tried to figure out why we would ever generate a MIDX
whose set of objects was not closed under reachability.
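To make "closed under reachability" concrete, here is a minimal sketch
in Python. It is only a toy model, not how Git checks this: the
function name, the `edges` mapping, and the object names are all
hypothetical. A set of objects is closed when every object referenced
by a member is itself a member.

```python
def closure_violations(midx_objects, edges):
    """Return objects referenced by a member of `midx_objects` but absent from it.

    `edges` maps an object name to the objects it points at (for example,
    a commit to its parents). An empty result means the set is closed.
    """
    return {
        dst
        for src in midx_objects
        for dst in edges.get(src, ())
        if dst not in midx_objects
    }


# Hypothetical history: C -> B -> XYZ
edges = {"C": ["B"], "B": ["XYZ"], "XYZ": []}

closed_set = {"C", "B", "XYZ"}
open_set = {"C", "B"}  # contains B, but not the object B points at

closed_result = closure_violations(closed_set, edges)
open_result = closure_violations(open_set, edges)
```

With these inputs the first call reports no violations, and the second
reports XYZ, which is the shape of the failure in the warning above.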
The precise details are laid out in the fourth patch, but the gist is
that:
1. A pack whose objects are *not* closed under reachability was deemed
large enough by the geometric repack machinery so as to not need a
repack.
2. When generating the new pack with the non-closed pack marked as
excluded, some object in an included pack reached an object (XYZ) in
an unknown pack whose only reachability path involved walking
through the parents of an object in another excluded pack.
3. We wrote a MIDX containing the new pack, along with all of the
large-enough packs from the previous step.
4. Because XYZ was in an unknown (likely cruft) pack, the resulting
MIDX does not contain a copy of object XYZ, but does contain a copy
of an object which reaches it, making it impossible to write
bitmaps.
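The four steps above can be replayed on a toy object graph. This is a
simplified model under stated assumptions, not the pack-objects
implementation: the pack names ("small", "large", "cruft"), the object
names, and the helper `walk_stopping_at_excluded()` are hypothetical,
and the traversal simply refuses to look past any object found in an
excluded pack.

```python
def walk_stopping_at_excluded(tips, parents, excluded_objects):
    """Collect objects reachable from `tips`, never walking past an excluded object."""
    picked, seen = set(), set()
    stack = list(tips)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        if oid in excluded_objects:
            # The excluded pack is assumed to be closed, so its
            # objects' parents are never examined.
            continue
        picked.add(oid)
        stack.extend(parents.get(oid, ()))
    return picked


# Hypothetical history: C -> B -> XYZ
parents = {"C": ["B"], "B": ["XYZ"], "XYZ": []}
packs = {
    "small": {"C"},    # included: below the geometric split
    "large": {"B"},    # step 1: excluded, large enough, but not closed
    "cruft": {"XYZ"},  # unknown to the repack
}

# Step 2: build the new pack while treating "large" as excluded.
new_pack = walk_stopping_at_excluded(packs["small"], parents, packs["large"])

# Step 3: the MIDX covers the new pack plus the large-enough pack.
midx = new_pack | packs["large"]

# Step 4: B is in the MIDX, but the object it reaches is not.
missing = {p for oid in midx for p in parents[oid] if p not in midx}
```

In this model the new pack contains only C, the MIDX contains C and B,
and XYZ ends up in `missing`.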
This series introduces a special denotation for packs which are excluded
but not guaranteed to be closed under reachability. By marking a pack as
excluded via '!' (as opposed to the traditional '^'), we will now pick
up copies of objects reachable from objects in those pack(s), but
exclude objects which appear in excluded packs (open or closed).
In practice this means that the resulting pack contains:
1. All objects in the set difference between the included packs vs. the
excluded ones (open or closed).
2. All objects reachable from at least one object in either an included
pack or an excluded-open pack which (a) do not appear in an excluded
pack (of either kind), and (b) have a reachability path that does not
involve objects in the excluded-closed packs.
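As a rough illustration of the two rules, the following Python sketch
computes the resulting object set for a toy graph. It models only the
set semantics described here; the function name, its parameters, and
the object names are hypothetical, and it says nothing about how
pack-objects implements the traversal.

```python
def follow_pack_contents(included, excluded_closed, excluded_open, parents):
    """Model the contents of the new pack as a set of object names.

    `included`, `excluded_closed`, and `excluded_open` are sets holding
    the objects found in packs of each kind. `parents` maps an object
    to the objects it points at.
    """
    excluded = excluded_closed | excluded_open

    # Rule 1: included objects that appear in no excluded pack.
    result = set(included) - excluded

    # Rule 2: walk from included and excluded-open objects.
    seen = set()
    stack = list(included | excluded_open)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        if oid in excluded_closed:
            continue  # (b): never walk through an excluded-closed pack
        if oid not in excluded:
            result.add(oid)  # (a): not present in any excluded pack
        stack.extend(parents.get(oid, ()))
    return result


# Hypothetical history: C -> B -> XYZ
parents = {"C": ["B"], "B": ["XYZ"], "XYZ": []}

# B's pack marked excluded-open ('!'): the walk continues through B.
as_open = follow_pack_contents({"C"}, set(), {"B"}, parents)

# B's pack marked excluded-closed ('^'): the walk stops at B.
as_closed = follow_pack_contents({"C"}, {"B"}, set(), parents)
```

In this model, marking B's pack as open yields C and XYZ (B itself
stays out, since it already lives in an excluded pack), while marking
it as closed yields only C.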
The series contains a couple of minor fixups and some refactoring that I
did along the way to make the substantive changes easier to read. The
first commit is a cleanup, the second is a refactoring, and the
remaining patches demonstrate, explain, and fix the bug.
(As a general side-note, I am somewhat unhappy with the growing number
of ways to mark a pack as "kept", and tried to untangle this by letting
packs hold an arbitrary bitset of flags that is opaque to the object
traversal machinery. This ended up being doable but rather complicated,
so I ended up punting on it for now.)
Thanks in advance for your review!
Taylor Blau (5):
  pack-objects: plug leak in `read_stdin_packs()`
  pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap`
  t7704: demonstrate failure with once-cruft objects above the geometric
    split
  pack-objects: support excluded-open packs with --stdin-packs
  repack: mark non-MIDX packs above the split as excluded-open
 Documentation/git-pack-objects.adoc |  25 ++-
 builtin/pack-objects.c              | 234 ++++++++++++++++++++--------
 builtin/repack.c                    |  19 ++-
 packfile.c                          |   3 +-
 packfile.h                          |   2 +
 t/t5331-pack-objects-stdin.sh       | 105 +++++++++++++
 t/t7704-repack-cruft.sh             |  22 +++
 7 files changed, 332 insertions(+), 78 deletions(-)
base-commit: 7ff1e8dc1e1680510c96e69965b3fa81372c5037
--
2.53.0.614.gc4fd52e751a