From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
Elijah Newren <newren@gmail.com>, Patrick Steinhardt <ps@pks.im>,
Derrick Stolee <stolee@gmail.com>
Subject: [PATCH v3 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow`
Date: Fri, 27 Mar 2026 16:06:40 -0400 [thread overview]
Message-ID: <cover.1774641999.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1773959041.git.me@ttaylorr.com>
This is another small reroll of my series to fix an issue where MIDX
bitmaps fail to generate after a geometric repack in certain scenarios
where the set of MIDX'd objects is not closed under reachability.
The main changes since last time are:
* Named enum stdin_pack_info_kind.
* Refactored how we handle reading incoming packs via stdin.
* Fixed a nasty case where sorting the packs in order of mtime happened
to work on some systems, but ASan detected a very legitimate bug.
As usual, a range-diff is included below for convenience.
Thanks again for your review!
Taylor Blau (5):
pack-objects: plug leak in `read_stdin_packs()`
pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap`
t7704: demonstrate failure with once-cruft objects above the geometric
split
pack-objects: support excluded-open packs with --stdin-packs
repack: mark non-MIDX packs above the split as excluded-open
Documentation/git-pack-objects.adoc | 25 ++-
builtin/pack-objects.c | 301 +++++++++++++++++++---------
builtin/repack.c | 19 +-
packfile.c | 3 +-
packfile.h | 2 +
t/t5331-pack-objects-stdin.sh | 105 ++++++++++
t/t7704-repack-cruft.sh | 22 ++
7 files changed, 373 insertions(+), 104 deletions(-)
Range-diff against v2:
1: 1fabd88f5e3 = 1: d6ff4e801ab pack-objects: plug leak in `read_stdin_packs()`
2: d5cb793f0eb ! 2: dd9ff1ede4a pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap`
@@ builtin/pack-objects.c
#include "list.h"
#include "packfile.h"
#include "object-file.h"
-@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b)
+@@ builtin/pack-objects.c: static void show_commit_pack_hint(struct commit *commit, void *data)
+
+ }
+
++/*
++ * stdin_pack_info_kind specifies how a pack specified over stdin
++ * should be treated when pack-objects is invoked with --stdin-packs.
++ *
++ * - STDIN_PACK_INCLUDE: objects in any packs with this flag bit set
++ * should be included in the output pack, unless they appear in an
++ * excluded pack.
++ *
++ * - STDIN_PACK_EXCLUDE_CLOSED: objects in any packs with this flag
++ * bit set should be excluded from the output pack.
++ *
++ * Objects in packs whose 'kind' bits include STDIN_PACK_INCLUDE are
++ * used as traversal tips when invoked with --stdin-packs=follow.
++ */
++enum stdin_pack_info_kind {
++ STDIN_PACK_INCLUDE = (1<<0),
++ STDIN_PACK_EXCLUDE_CLOSED = (1<<1),
++};
++
++struct stdin_pack_info {
++ struct packed_git *p;
++ enum stdin_pack_info_kind kind;
++};
++
+ static int pack_mtime_cmp(const void *_a, const void *_b)
+ {
+- struct packed_git *a = ((const struct string_list_item*)_a)->util;
+- struct packed_git *b = ((const struct string_list_item*)_b)->util;
++ struct stdin_pack_info *a = ((const struct string_list_item*)_a)->util;
++ struct stdin_pack_info *b = ((const struct string_list_item*)_b)->util;
+
+ /*
+ * order packs by descending mtime so that objects are laid out
+ * roughly as newest-to-oldest
+ */
+- if (a->mtime < b->mtime)
++ if (a->p->mtime < b->p->mtime)
+ return 1;
+- else if (b->mtime < a->mtime)
++ else if (b->p->mtime < a->p->mtime)
+ return -1;
+ else
return 0;
}
-static void read_packs_list_from_stdin(struct rev_info *revs)
-+struct stdin_pack_info {
-+ struct packed_git *p;
-+ enum {
-+ STDIN_PACK_INCLUDE = (1<<0),
-+ STDIN_PACK_EXCLUDE_CLOSED = (1<<1),
-+ } kind;
-+};
-+
+static void stdin_packs_add_pack_entries(struct strmap *packs,
+ struct rev_info *revs)
-+{
+ {
+- struct strbuf buf = STRBUF_INIT;
+- struct string_list include_packs = STRING_LIST_INIT_DUP;
+- struct string_list exclude_packs = STRING_LIST_INIT_DUP;
+- struct string_list_item *item = NULL;
+- struct packed_git *p;
+ struct string_list keys = STRING_LIST_INIT_NODUP;
+ struct string_list_item *item;
+ struct hashmap_iter iter;
+ struct strmap_entry *entry;
-+
+
+- while (strbuf_getline(&buf, stdin) != EOF) {
+- if (!buf.len)
+- continue;
+ strmap_for_each_entry(packs, &iter, entry) {
+ struct stdin_pack_info *info = entry->value;
+ if (!info->p)
+ die(_("could not find pack '%s'"), entry->key);
-+
+
+- if (*buf.buf == '^')
+- string_list_append(&exclude_packs, buf.buf + 1);
+- else
+- string_list_append(&include_packs, buf.buf);
+-
+- strbuf_reset(&buf);
+- }
+-
+- string_list_sort_u(&include_packs, 0);
+- string_list_sort_u(&exclude_packs, 0);
+-
+- repo_for_each_pack(the_repository, p) {
+- const char *pack_name = pack_basename(p);
+-
+- if ((item = string_list_lookup(&include_packs, pack_name))) {
+- if (exclude_promisor_objects && p->pack_promisor)
+- die(_("packfile %s is a promisor but --exclude-promisor-objects was given"), p->pack_name);
+- item->util = p;
+- }
+- if ((item = string_list_lookup(&exclude_packs, pack_name)))
+- item->util = p;
+- }
+-
+- /*
+- * Arguments we got on stdin may not even be packs. First
+- * check that to avoid segfaulting later on in
+- * e.g. pack_mtime_cmp(), excluded packs are handled below.
+- *
+- * Since we first parsed our STDIN and then sorted the input
+- * lines the pack we error on will be whatever line happens to
+- * sort first. This is lazy, it's enough that we report one
+- * bad case here, we don't need to report the first/last one,
+- * or all of them.
+- */
+- for_each_string_list_item(item, &include_packs) {
+- struct packed_git *p = item->util;
+- if (!p)
+- die(_("could not find pack '%s'"), item->string);
+- if (!is_pack_valid(p))
+- die(_("packfile %s cannot be accessed"), p->pack_name);
+- }
+-
+- /*
+- * Then, handle all of the excluded packs, marking them as
+- * kept in-core so that later calls to add_object_entry()
+- * discards any objects that are also found in excluded packs.
+- */
+- for_each_string_list_item(item, &exclude_packs) {
+- struct packed_git *p = item->util;
+- if (!p)
+- die(_("could not find pack '%s'"), item->string);
+- p->pack_keep_in_core = 1;
+ string_list_append(&keys, entry->key)->util = info;
-+ }
-+
-+ /*
-+ * Order packs by ascending mtime; use QSORT directly to access the
-+ * string_list_item's ->util pointer, which string_list_sort() does not
-+ * provide.
-+ */
+ }
+
+ /*
+@@ builtin/pack-objects.c: static void read_packs_list_from_stdin(struct rev_info *revs)
+ * string_list_item's ->util pointer, which string_list_sort() does not
+ * provide.
+ */
+- QSORT(include_packs.items, include_packs.nr, pack_mtime_cmp);
+-
+- for_each_string_list_item(item, &include_packs) {
+- struct packed_git *p = item->util;
+- for_each_object_in_pack(p,
+- add_object_entry_from_pack,
+- revs,
+- ODB_FOR_EACH_OBJECT_PACK_ORDER);
+ QSORT(keys.items, keys.nr, pack_mtime_cmp);
+
+ for_each_string_list_item(item, &keys) {
@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b
+ add_object_entry_from_pack,
+ revs,
+ ODB_FOR_EACH_OBJECT_PACK_ORDER);
-+ }
-+
+ }
+
+ string_list_clear(&keys, 0);
+}
+
+static void stdin_packs_read_input(struct rev_info *revs)
- {
- struct strbuf buf = STRBUF_INIT;
-- struct string_list include_packs = STRING_LIST_INIT_DUP;
-- struct string_list exclude_packs = STRING_LIST_INIT_DUP;
-- struct string_list_item *item = NULL;
++{
++ struct strbuf buf = STRBUF_INIT;
+ struct strmap packs = STRMAP_INIT;
- struct packed_git *p;
-
- while (strbuf_getline(&buf, stdin) != EOF) {
-- if (!buf.len)
++ struct packed_git *p;
++
++ while (strbuf_getline(&buf, stdin) != EOF) {
+ struct stdin_pack_info *info;
++ enum stdin_pack_info_kind kind = STDIN_PACK_INCLUDE;
+ const char *key = buf.buf;
+
+ if (!*key)
- continue;
-+ if (*key == '^')
++ continue;
++ else if (*key == '^')
++ kind = STDIN_PACK_EXCLUDE_CLOSED;
++
++ if (kind != STDIN_PACK_INCLUDE)
+ key++;
+
+ info = strmap_get(&packs, key);
@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b
+ CALLOC_ARRAY(info, 1);
+ strmap_put(&packs, key, info);
+ }
-
- if (*buf.buf == '^')
-- string_list_append(&exclude_packs, buf.buf + 1);
-+ info->kind |= STDIN_PACK_EXCLUDE_CLOSED;
- else
-- string_list_append(&include_packs, buf.buf);
-+ info->kind |= STDIN_PACK_INCLUDE;
-
- strbuf_reset(&buf);
- }
-
-- string_list_sort_u(&include_packs, 0);
-- string_list_sort_u(&exclude_packs, 0);
--
- repo_for_each_pack(the_repository, p) {
-- const char *pack_name = pack_basename(p);
++
++ info->kind |= kind;
++
++ strbuf_reset(&buf);
++ }
++
++ repo_for_each_pack(the_repository, p) {
+ struct stdin_pack_info *info;
-
-- if ((item = string_list_lookup(&include_packs, pack_name))) {
++
+ info = strmap_get(&packs, pack_basename(p));
+ if (!info)
+ continue;
+
+ if (info->kind & STDIN_PACK_INCLUDE) {
- if (exclude_promisor_objects && p->pack_promisor)
- die(_("packfile %s is a promisor but --exclude-promisor-objects was given"), p->pack_name);
-- item->util = p;
++ if (exclude_promisor_objects && p->pack_promisor)
++ die(_("packfile %s is a promisor but --exclude-promisor-objects was given"), p->pack_name);
+
+ /*
+ * Arguments we got on stdin may not even be
@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b
+ */
+ if (!is_pack_valid(p))
+ die(_("packfile %s cannot be accessed"), p->pack_name);
- }
-- if ((item = string_list_lookup(&exclude_packs, pack_name)))
-- item->util = p;
-- }
-
-- /*
-- * Arguments we got on stdin may not even be packs. First
-- * check that to avoid segfaulting later on in
-- * e.g. pack_mtime_cmp(), excluded packs are handled below.
-- *
-- * Since we first parsed our STDIN and then sorted the input
-- * lines the pack we error on will be whatever line happens to
-- * sort first. This is lazy, it's enough that we report one
-- * bad case here, we don't need to report the first/last one,
-- * or all of them.
-- */
-- for_each_string_list_item(item, &include_packs) {
-- struct packed_git *p = item->util;
-- if (!p)
-- die(_("could not find pack '%s'"), item->string);
-- if (!is_pack_valid(p))
-- die(_("packfile %s cannot be accessed"), p->pack_name);
-- }
++ }
++
+ if (info->kind & STDIN_PACK_EXCLUDE_CLOSED) {
+ /*
+ * Marking excluded packs as kept in-core so
@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b
+ */
+ p->pack_keep_in_core = 1;
+ }
-
-- /*
-- * Then, handle all of the excluded packs, marking them as
-- * kept in-core so that later calls to add_object_entry()
-- * discards any objects that are also found in excluded packs.
-- */
-- for_each_string_list_item(item, &exclude_packs) {
-- struct packed_git *p = item->util;
-- if (!p)
-- die(_("could not find pack '%s'"), item->string);
-- p->pack_keep_in_core = 1;
++
+ info->p = p;
- }
-
-- /*
-- * Order packs by ascending mtime; use QSORT directly to access the
-- * string_list_item's ->util pointer, which string_list_sort() does not
-- * provide.
-- */
-- QSORT(include_packs.items, include_packs.nr, pack_mtime_cmp);
--
-- for_each_string_list_item(item, &include_packs) {
-- struct packed_git *p = item->util;
-- for_each_object_in_pack(p,
-- add_object_entry_from_pack,
-- revs,
-- ODB_FOR_EACH_OBJECT_PACK_ORDER);
-- }
++ }
++
+ stdin_packs_add_pack_entries(&packs, revs);
-
++
strbuf_release(&buf);
- string_list_clear(&include_packs, 0);
- string_list_clear(&exclude_packs, 0);
3: d8f0577077c = 3: 5a5090f8da2 t7704: demonstrate failure with once-cruft objects above the geometric split
4: e028dfbc9fb ! 4: efba1ab93d8 pack-objects: support excluded-open packs with --stdin-packs
@@ builtin/pack-objects.c: static int add_object_entry_from_pack(const struct objec
create_object_entry(oid, type, 0, 0, 0, p, ofs);
return 0;
+@@ builtin/pack-objects.c: static void show_commit_pack_hint(struct commit *commit, void *data)
+ * - STDIN_PACK_EXCLUDE_CLOSED: objects in any packs with this flag
+ * bit set should be excluded from the output pack.
+ *
+- * Objects in packs whose 'kind' bits include STDIN_PACK_INCLUDE are
+- * used as traversal tips when invoked with --stdin-packs=follow.
++ * - STDIN_PACK_EXCLUDE_OPEN: objects in any packs with this flag
++ * bit set should be excluded from the output pack, but are not
++ * guaranteed to be closed under reachability.
++ *
++ * Objects in packs whose 'kind' bits include STDIN_PACK_INCLUDE or
++ * STDIN_PACK_EXCLUDE_OPEN are used as traversal tips when invoked
++ * with --stdin-packs=follow.
+ */
+ enum stdin_pack_info_kind {
+ STDIN_PACK_INCLUDE = (1<<0),
+ STDIN_PACK_EXCLUDE_CLOSED = (1<<1),
++ STDIN_PACK_EXCLUDE_OPEN = (1<<2),
+ };
+
+ struct stdin_pack_info {
@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b)
return 0;
}
@@ builtin/pack-objects.c: static int pack_mtime_cmp(const void *_a, const void *_b
+ return stdin_packs_include_check_obj((struct object *)commit, data);
+}
+
- struct stdin_pack_info {
- struct packed_git *p;
- enum {
- STDIN_PACK_INCLUDE = (1<<0),
- STDIN_PACK_EXCLUDE_CLOSED = (1<<1),
-+ STDIN_PACK_EXCLUDE_OPEN = (1<<2),
- } kind;
- };
-
+ static void stdin_packs_add_pack_entries(struct strmap *packs,
+ struct rev_info *revs)
+ {
@@ builtin/pack-objects.c: static void stdin_packs_add_pack_entries(struct strmap *packs,
for_each_string_list_item(item, &keys) {
struct stdin_pack_info *info = item->util;
@@ builtin/pack-objects.c: static void stdin_packs_add_pack_entries(struct strmap *
struct strbuf buf = STRBUF_INIT;
struct strmap packs = STRMAP_INIT;
@@ builtin/pack-objects.c: static void stdin_packs_read_input(struct rev_info *revs)
-
- if (!*key)
continue;
-- if (*key == '^')
-+ if (*key == '^' ||
-+ (*key == '!' && mode == STDIN_PACKS_MODE_FOLLOW))
+ else if (*key == '^')
+ kind = STDIN_PACK_EXCLUDE_CLOSED;
++ else if (*key == '!' && mode == STDIN_PACKS_MODE_FOLLOW)
++ kind = STDIN_PACK_EXCLUDE_OPEN;
+
+ if (kind != STDIN_PACK_INCLUDE)
key++;
-
- info = strmap_get(&packs, key);
-@@ builtin/pack-objects.c: static void stdin_packs_read_input(struct rev_info *revs)
-
- if (*buf.buf == '^')
- info->kind |= STDIN_PACK_EXCLUDE_CLOSED;
-+ else if (*buf.buf == '!' && mode == STDIN_PACKS_MODE_FOLLOW)
-+ info->kind |= STDIN_PACK_EXCLUDE_OPEN;
- else
- info->kind |= STDIN_PACK_INCLUDE;
-
@@ builtin/pack-objects.c: static void stdin_packs_read_input(struct rev_info *revs)
p->pack_keep_in_core = 1;
}
5: 23cb9f33dba = 5: c9ad9a0c4ae repack: mark non-MIDX packs above the split as excluded-open
base-commit: 41688c1a2312f62f44435e1a6d03b4b904b5b0ec
--
2.53.0.724.gb20b077944a
next prev parent reply other threads:[~2026-03-27 20:06 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-19 22:24 [PATCH 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow` Taylor Blau
2026-03-19 22:24 ` [PATCH 1/5] pack-objects: plug leak in `read_stdin_packs()` Taylor Blau
2026-03-24 7:39 ` Patrick Steinhardt
2026-03-25 23:03 ` Taylor Blau
2026-03-19 22:24 ` [PATCH 2/5] pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap` Taylor Blau
2026-03-24 7:39 ` Patrick Steinhardt
2026-03-25 23:13 ` Taylor Blau
2026-03-19 22:24 ` [PATCH 3/5] t7704: demonstrate failure with once-cruft objects above the geometric split Taylor Blau
2026-03-19 22:24 ` [PATCH 4/5] pack-objects: support excluded-open packs with --stdin-packs Taylor Blau
2026-03-21 16:57 ` Jeff King
2026-03-22 18:09 ` Taylor Blau
2026-03-25 23:19 ` Taylor Blau
2026-03-19 22:24 ` [PATCH 5/5] repack: mark non-MIDX packs above the split as excluded-open Taylor Blau
2026-03-25 23:51 ` [PATCH v2 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow` Taylor Blau
2026-03-25 23:51 ` [PATCH v2 1/5] pack-objects: plug leak in `read_stdin_packs()` Taylor Blau
2026-03-25 23:51 ` [PATCH v2 2/5] pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap` Taylor Blau
2026-03-26 20:40 ` Derrick Stolee
2026-03-26 21:44 ` Taylor Blau
2026-03-26 22:11 ` Junio C Hamano
2026-03-26 22:32 ` Taylor Blau
2026-03-27 0:29 ` Derrick Stolee
2026-03-27 17:51 ` Taylor Blau
2026-03-27 18:34 ` Derrick Stolee
2026-03-27 15:52 ` Junio C Hamano
2026-03-26 22:37 ` Taylor Blau
2026-03-25 23:51 ` [PATCH v2 3/5] t7704: demonstrate failure with once-cruft objects above the geometric split Taylor Blau
2026-03-25 23:51 ` [PATCH v2 4/5] pack-objects: support excluded-open packs with --stdin-packs Taylor Blau
2026-03-26 20:48 ` Derrick Stolee
2026-03-25 23:51 ` [PATCH v2 5/5] repack: mark non-MIDX packs above the split as excluded-open Taylor Blau
2026-03-26 20:49 ` Derrick Stolee
2026-03-26 21:44 ` Taylor Blau
2026-03-26 20:51 ` [PATCH v2 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow` Derrick Stolee
2026-03-26 21:46 ` Taylor Blau
2026-03-27 20:06 ` Taylor Blau [this message]
2026-03-27 20:06 ` [PATCH v3 1/5] pack-objects: plug leak in `read_stdin_packs()` Taylor Blau
2026-03-27 20:06 ` [PATCH v3 2/5] pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap` Taylor Blau
2026-03-27 20:06 ` [PATCH v3 3/5] t7704: demonstrate failure with once-cruft objects above the geometric split Taylor Blau
2026-03-27 20:06 ` [PATCH v3 4/5] pack-objects: support excluded-open packs with --stdin-packs Taylor Blau
2026-03-27 20:06 ` [PATCH v3 5/5] repack: mark non-MIDX packs above the split as excluded-open Taylor Blau
2026-03-27 20:16 ` [PATCH v3 0/5] pack-objects: handle excluded-but-open packs via `--stdin-packs=follow` Derrick Stolee
2026-03-27 20:43 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1774641999.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox