public inbox for git@vger.kernel.org
* [PATCH 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow`
@ 2026-03-19 22:24 Taylor Blau
From: Taylor Blau @ 2026-03-19 22:24 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren, Patrick Steinhardt

This series came from an issue I saw in GitHub's infrastructure where a
particular repository was failing to repack with the following in its
log output:

    warning: Failed to write bitmap index. Packfile doesn't have full closure (object XYZ is missing)

This was following a geometric repack that generated a MIDX and
attempted to generate its corresponding MIDX bitmap. Ordinarily, we
should never expect the above when generating MIDX bitmaps, unless:

 - Object XYZ is indeed missing from the repository (though we should
   have seen an earlier failure during the MIDX write itself), or

 - Object XYZ is somehow not included in the MIDX, but is reachable from
   another object which is.

After validating that object XYZ was indeed present at the time the
repack failed, I tried to figure out why we would ever generate a MIDX
whose set of objects was not closed under reachability.

The precise details are laid out in the fourth patch, but the gist is
that:

 1. A pack whose objects are *not* closed under reachability was deemed
    large enough by the geometric repack machinery so as to not need a
    repack.

 2. When generating the new pack with the non-closed pack marked as
    excluded, some object in an included pack reached an object (XYZ) in
    an unknown pack whose only reachability path involved walking
    through the parents of an object in another excluded pack.

 3. We wrote a MIDX containing the new pack, along with all of the
    large-enough packs from the previous step.

 4. Because XYZ was in an unknown (likely cruft) pack, the resulting
    MIDX does not contain a copy of object XYZ, but does contain a copy
    of an object which reaches it, making it impossible to write
    bitmaps.
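
As a rough illustration of the four steps above (a toy model, not code
from this series), the failing state can be reproduced with a tiny
object graph. The object names "C" and "P" and the helper
`missing_for_closure()` are hypothetical; only XYZ comes from the
description above.

```python
def missing_for_closure(present, edges):
    """Return objects referenced from `present` that are absent from it.

    A set of objects is closed under reachability exactly when this
    returns the empty set.
    """
    missing = set()
    for obj in present:
        for target in edges.get(obj, ()):
            if target not in present:
                missing.add(target)
    return missing


# Hypothetical graph: C (new pack) -> P (large, excluded pack) -> XYZ
# (unknown pack, e.g. a cruft pack that is not part of the MIDX).
edges = {"C": ["P"], "P": ["XYZ"], "XYZ": []}

new_pack = {"C"}     # written by the geometric repack
large_pack = {"P"}   # deemed large enough, so left alone
midx = new_pack | large_pack

print(sorted(missing_for_closure(midx, edges)))  # ['XYZ']
```

In this model the MIDX holds P, which reaches XYZ, but not XYZ itself,
which is the condition the bitmap writer complains about.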

This series introduces a special denotation for packs which are excluded
but not guaranteed to be closed under reachability. By marking a pack as
excluded via '!' (as opposed to the traditional '^'), we will now pick
up copies of objects reachable from objects in those pack(s), but
exclude objects which appear in excluded packs (open or closed).
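
A minimal sketch of how the two markers partition the list of packs,
assuming one pack name per line with an optional '^' or '!' prefix.
The function name and the pack names are made up for illustration;
this is not the parser from the series.

```python
def classify_pack_lines(lines):
    """Split pack names into included / excluded-closed / excluded-open."""
    groups = {"included": [], "excluded_closed": [], "excluded_open": []}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("^"):
            # Traditional exclusion: pack is assumed closed.
            groups["excluded_closed"].append(line[1:])
        elif line.startswith("!"):
            # New marker: excluded, but not guaranteed to be closed.
            groups["excluded_open"].append(line[1:])
        else:
            groups["included"].append(line)
    return groups


# Hypothetical input; real pack names look like pack-<hash>.pack.
example = ["pack-aaa.pack", "^pack-bbb.pack", "!pack-ccc.pack"]
print(classify_pack_lines(example))
```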

In practice this means that the resulting pack contains:

 1. All objects in the set difference between the included packs and the
    excluded ones (open or closed).

 2. All objects reachable from at least one object in either an included
    pack or an excluded-open pack which (a) do not appear in an excluded
    pack (of either kind), and (b) have a reachability path that does not
    involve objects in the excluded-closed packs.
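
The two rules above can be modelled on a toy object graph. This is
only a sketch of the rules as stated here, with objects as strings and
packs as sets; `objects_to_pack()` and the object names other than XYZ
are hypothetical, and the actual traversal in pack-objects works
differently.

```python
from collections import deque


def objects_to_pack(edges, included, excluded_closed, excluded_open):
    """Model the contents of the resulting pack under the two rules."""
    excluded = excluded_closed | excluded_open
    # Rule 1: included objects minus anything in an excluded pack.
    result = set(included) - excluded
    # Rule 2: walk from included and excluded-open objects, never
    # stepping onto (or through) an object in an excluded-closed pack.
    starts = sorted((included | excluded_open) - excluded_closed)
    queue = deque(starts)
    seen = set(starts)
    while queue:
        obj = queue.popleft()
        if obj not in excluded:
            result.add(obj)
        for target in edges.get(obj, ()):
            if target in excluded_closed or target in seen:
                continue
            seen.add(target)
            queue.append(target)
    return result


# Hypothetical graph: C -> P -> XYZ, and C -> K -> Q.
edges = {"C": ["P", "K"], "P": ["XYZ"], "K": ["Q"]}

# P's pack marked excluded-open ('!'): XYZ is picked up.
print(sorted(objects_to_pack(edges, {"C"}, {"K", "Q"}, {"P"})))
# ['C', 'XYZ']

# P's pack marked excluded-closed ('^'): the walk stops at P.
print(sorted(objects_to_pack(edges, {"C"}, {"P", "K", "Q"}, set())))
# ['C']
```

In the second call the model never walks past P, so XYZ is left out
of the new pack, matching the failure described earlier.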

The series contains a couple of minor fixups and some refactoring that I
did along the way to make the substantive changes easier to read. The
first commit is a cleanup, the second is a refactoring, and the
remaining patches demonstrate, explain, and fix the bug.

(As a general side-note, I am somewhat unhappy with the growing number
of ways to mark a pack as "kept", and tried to untangle this by letting
packs hold an arbitrary bitset of flags that is opaque to the object
traversal machinery. This ended up being doable but rather complicated,
so I ended up punting on it for now.)

Thanks in advance for your review!

Taylor Blau (5):
  pack-objects: plug leak in `read_stdin_packs()`
  pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap`
  t7704: demonstrate failure with once-cruft objects above the geometric
    split
  pack-objects: support excluded-open packs with --stdin-packs
  repack: mark non-MIDX packs above the split as excluded-open

 Documentation/git-pack-objects.adoc |  25 ++-
 builtin/pack-objects.c              | 234 ++++++++++++++++++++--------
 builtin/repack.c                    |  19 ++-
 packfile.c                          |   3 +-
 packfile.h                          |   2 +
 t/t5331-pack-objects-stdin.sh       | 105 +++++++++++++
 t/t7704-repack-cruft.sh             |  22 +++
 7 files changed, 332 insertions(+), 78 deletions(-)


base-commit: 7ff1e8dc1e1680510c96e69965b3fa81372c5037
-- 
2.53.0.614.gc4fd52e751a
