git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>, Jeff King <peff@peff.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v4 0/9] repack: avoid MIDX'ing cruft pack(s) where possible
Date: Wed, 28 May 2025 19:20:07 -0400	[thread overview]
Message-ID: <cover.1748473889.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1744413969.git.me@ttaylorr.com>

Here is a tiny reroll of my series to explore creating MIDXs while
repacking that don't include the cruft pack.

The only changes since last time are as follows:

 - removed a stray whitespace change in patch 2/9 caught by Junio and
   Elijah

 - reworded the final commit message based on helpful feedback from
   Elijah.

The rest of the series is unchanged, and a range-diff is included below
as usual for convenience.

A meta-note on something new since last time: I have deployed a
cherry-picked version of this series to GitHub's infrastructure a few
weeks ago. The testing repository (whose maintenance failure many years
ago precipitated commit ddee3703b3 (builtin/repack.c: add cruft packs to
MIDX during geometric repack, 2022-05-20)) has been happily running
maintenance tasks without any issues since then, and the cruft pack has
been excluded from the MIDX.

Thanks in advance for any review :-).

Taylor Blau (9):
  pack-objects: use standard option incompatibility functions
  pack-objects: limit scope in 'add_object_entry_from_pack()'
  pack-objects: factor out handling '--stdin-packs'
  pack-objects: declare 'rev_info' for '--stdin-packs' earlier
  pack-objects: perform name-hash traversal for unpacked objects
  pack-objects: fix typo in 'show_object_pack_hint()'
  pack-objects: swap 'show_{object,commit}_pack_hint'
  pack-objects: introduce '--stdin-packs=follow'
  repack: exclude cruft pack(s) from the MIDX where possible

 Documentation/config/repack.adoc    |   7 +
 Documentation/git-pack-objects.adoc |  10 +-
 builtin/pack-objects.c              | 190 ++++++++++++++++++----------
 builtin/repack.c                    | 163 +++++++++++++++++++++---
 t/t5331-pack-objects-stdin.sh       |  84 +++++++++++-
 t/t7704-repack-cruft.sh             |  90 +++++++++++++
 6 files changed, 455 insertions(+), 89 deletions(-)

Range-diff against v3:
 1:  986bef29b5 !  1:  f8b31c6a8d pack-objects: limit scope in 'add_object_entry_from_pack()'
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>

      ## Commit message ##
    -    pack-objects: limit scope in 'add_object_entry_from_pack()'
    +    pack-objects: use standard option incompatibility functions

    -    In add_object_entry_from_pack() we declare 'revs' (given to us through
    -    the miscellaneous context argument) earlier in the "if (p)" conditional
    -    than is necessary.  Move it down as far as it can go to reduce its
    -    scope.
    +    pack-objects has a handful of explicit checks for pairs of command-line
    +    options which are mutually incompatible. Many of these pre-date
    +    a699367bb8 (i18n: factorize more 'incompatible options' messages,
    +    2022-01-31).
    +
    +    Convert the explicit checks into die_for_incompatible_opt2() calls,
    +    which simplifies the implementation and standardizes pack-objects'
    +    output when given incompatible options (e.g., --stdin-packs with
    +    --filter gives different output than --keep-unreachable with
    +    --unpack-unreachable).
    +
    +    There is one minor piece of test fallout in t5331 that expects the old
    +    format, which has been corrected.

         Signed-off-by: Taylor Blau <me@ttaylorr.com>

      ## builtin/pack-objects.c ##
    -@@ builtin/pack-objects.c: static int add_object_entry_from_pack(const struct object_id *oid,
    - 		return 0;
    +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
    + 		strvec_push(&rp, "--unpacked");
    + 	}

    - 	if (p) {
    --		struct rev_info *revs = _data;
    - 		struct object_info oi = OBJECT_INFO_INIT;
    --
    - 		oi.typep = &type;
    +-	if (exclude_promisor_objects && exclude_promisor_objects_best_effort)
    +-		die(_("options '%s' and '%s' cannot be used together"),
    +-		    "--exclude-promisor-objects", "--exclude-promisor-objects-best-effort");
    ++	die_for_incompatible_opt2(exclude_promisor_objects,
    ++				  "--exclude-promisor-objects",
    ++				  exclude_promisor_objects_best_effort,
    ++				  "--exclude-promisor-objects-best-effort");
    + 	if (exclude_promisor_objects) {
    + 		use_internal_rev_list = 1;
    + 		fetch_if_missing = 0;
    +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
    + 	if (!pack_to_stdout && thin)
    + 		die(_("--thin cannot be used to build an indexable pack"));
    +
    +-	if (keep_unreachable && unpack_unreachable)
    +-		die(_("options '%s' and '%s' cannot be used together"), "--keep-unreachable", "--unpack-unreachable");
    ++	die_for_incompatible_opt2(keep_unreachable, "--keep-unreachable",
    ++				  unpack_unreachable, "--unpack-unreachable");
    + 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
    + 		unpack_unreachable_expiration = 0;
    +
    +-	if (stdin_packs && filter_options.choice)
    +-		die(_("cannot use --filter with --stdin-packs"));
    ++	die_for_incompatible_opt2(stdin_packs, "--stdin-packs",
    ++				  filter_options.choice, "--filter");
     +
    - 		if (packed_object_info(the_repository, p, ofs, &oi) < 0) {
    - 			die(_("could not get type of object %s in pack %s"),
    - 			    oid_to_hex(oid), p->pack_name);
    - 		} else if (type == OBJ_COMMIT) {
    -+			struct rev_info *revs = _data;
    - 			/*
    - 			 * commits in included packs are used as starting points for the
    - 			 * subsequent revision walk
    +
    + 	if (stdin_packs && use_internal_rev_list)
    + 		die(_("cannot use internal rev list with --stdin-packs"));
    +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
    + 	if (cruft) {
    + 		if (use_internal_rev_list)
    + 			die(_("cannot use internal rev list with --cruft"));
    +-		if (stdin_packs)
    +-			die(_("cannot use --stdin-packs with --cruft"));
    ++		die_for_incompatible_opt2(stdin_packs, "--stdin-packs",
    ++					  cruft, "--cruft");
    + 	}
    +
    + 	/*
    +
    + ## t/t5331-pack-objects-stdin.sh ##
    +@@ t/t5331-pack-objects-stdin.sh: test_expect_success '--stdin-packs is incompatible with --filter' '
    + 		cd stdin-packs &&
    + 		test_must_fail git pack-objects --stdin-packs --stdout \
    + 			--filter=blob:none </dev/null 2>err &&
    +-		test_grep "cannot use --filter with --stdin-packs" err
    ++		test_grep "options .--stdin-packs. and .--filter. cannot be used together" err
    + 	)
    + '
    +
 -:  ---------- >  2:  2753e29648 pack-objects: limit scope in 'add_object_entry_from_pack()'
 2:  6f8fe8a4e1 =  3:  32b49d9073 pack-objects: factor out handling '--stdin-packs'
 3:  2a235461a6 =  4:  a797ff3a83 pack-objects: declare 'rev_info' for '--stdin-packs' earlier
 4:  240e90b68d =  5:  29bf05633a pack-objects: perform name-hash traversal for unpacked objects
 5:  9a18fa2e52 =  6:  0696fa1736 pack-objects: fix typo in 'show_object_pack_hint()'
 6:  6c997853f1 =  7:  1cc45b4472 pack-objects: swap 'show_{object,commit}_pack_hint'
 7:  0ff699f056 =  8:  3e3d929bd0 pack-objects: introduce '--stdin-packs=follow'
 8:  58891101f3 !  9:  52a069ef48 repack: exclude cruft pack(s) from the MIDX where possible
    @@ Commit message
         MIDX with '--write-midx' to ensure that the resulting MIDX was always
         closed under reachability in order to generate reachability bitmaps.

    -    Suppose (prior to this patch) you have a once-unreachable object packed
    -    in a cruft pack, which later on becomes reachable from one or more
    -    objects in a geometrically repacked pack. That once-unreachable object
    -    *won't* appear in the new pack, since the cruft pack was specified as
    -    neither included nor excluded to 'pack-objects --stdin-packs'. If the
    +    While the previous patch added the '--stdin-packs=follow' option to
    +    pack-objects, it is not yet on by default. Given that, suppose you have
    +    a once-unreachable object packed in a cruft pack, which later becomes
    +    reachable from one or more objects in a geometrically repacked pack.
    +    That once-unreachable object *won't* appear in the new pack, since the
    +    cruft pack was not specified as included or excluded when the
    +    geometrically repacked pack was created with 'pack-objects
    +    --stdin-packs' (*not* '--stdin-packs=follow', which is not on). If that
         new pack is included in a MIDX without the cruft pack, then trying to
         generate bitmaps for that MIDX may fail. This happens when the bitmap
         selection process picks one or more commits which reach the
    -    once-unreachable objects, commit ddee3703b3 ensures that the MIDX will
    -    be closed under reachability. Without it, we would fail to generate a
    -    MIDX bitmap.
    +    once-unreachable objects.

    +    To mitigate this failure mode, commit ddee3703b3 ensures that the MIDX
    +    will be closed under reachability by including cruft pack(s). If cruft
    +    pack(s) were not included, we would fail to generate a MIDX bitmap. But
         ddee3703b3 alludes to the fact that this is sub-optimal by saying

             [...] it's desirable to avoid including cruft packs in the MIDX

base-commit: 485f5f863615e670fd97ae40af744e14072cfe18
--
2.49.0.640.ga4de40e6a8

  parent reply	other threads:[~2025-05-28 23:20 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-11 23:26 [RFC PATCH 0/8] repack: avoid MIDX'ing cruft pack(s) where possible Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 1/8] pack-objects: use standard option incompatibility functions Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 2/8] pack-objects: limit scope in 'add_object_entry_from_pack()' Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 3/8] pack-objects: factor out handling '--stdin-packs' Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 4/8] pack-objects: declare 'rev_info' for '--stdin-packs' earlier Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 5/8] pack-objects: perform name-hash traversal for unpacked objects Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 6/8] pack-objects: introduce '--stdin-packs=follow' Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 7/8] repack: keep track of existing MIDX'd packs Taylor Blau
2025-04-11 23:26 ` [RFC PATCH 8/8] repack: exclude cruft pack(s) from the MIDX where possible Taylor Blau
2025-04-14 20:06 ` [PATCH v2 0/8] repack: avoid MIDX'ing cruft pack(s) " Taylor Blau
2025-04-14 20:06   ` [PATCH v2 1/8] pack-objects: use standard option incompatibility functions Taylor Blau
2025-04-14 20:41     ` Junio C Hamano
2025-04-15 19:32       ` Taylor Blau
2025-04-15 19:48         ` Junio C Hamano
2025-04-15 22:27           ` Taylor Blau
2025-04-14 20:06   ` [PATCH v2 2/8] object-store-ll.h: add note about designated initializers Taylor Blau
2025-04-14 21:07     ` Junio C Hamano
2025-04-15 19:51       ` Taylor Blau
2025-04-15  2:57     ` Elijah Newren
2025-04-15 19:47       ` Taylor Blau
2025-04-14 20:06   ` [PATCH v2 3/8] pack-objects: limit scope in 'add_object_entry_from_pack()' Taylor Blau
2025-04-15  3:10     ` Elijah Newren
2025-04-14 20:06   ` [PATCH v2 4/8] pack-objects: factor out handling '--stdin-packs' Taylor Blau
2025-04-14 20:06   ` [PATCH v2 5/8] pack-objects: declare 'rev_info' for '--stdin-packs' earlier Taylor Blau
2025-04-14 20:06   ` [PATCH v2 6/8] pack-objects: perform name-hash traversal for unpacked objects Taylor Blau
2025-04-15  3:10     ` Elijah Newren
2025-04-15 19:57       ` Taylor Blau
2025-04-14 20:06   ` [PATCH v2 7/8] pack-objects: introduce '--stdin-packs=follow' Taylor Blau
2025-04-15  3:11     ` Elijah Newren
2025-04-15 20:45       ` Taylor Blau
2025-04-16  5:26         ` Elijah Newren
2025-04-14 20:06   ` [PATCH v2 8/8] repack: exclude cruft pack(s) from the MIDX where possible Taylor Blau
2025-04-15  3:11     ` Elijah Newren
2025-04-15 20:51       ` Taylor Blau
2025-04-15  2:57   ` [PATCH v2 0/8] repack: avoid MIDX'ing cruft pack(s) " Elijah Newren
2025-04-15 22:05     ` Taylor Blau
2025-04-15 22:46 ` [PATCH v3 0/9] " Taylor Blau
2025-04-15 22:46   ` [PATCH v3 1/9] pack-objects: use standard option incompatibility functions Taylor Blau
2025-04-15 22:46   ` [PATCH v3 2/9] pack-objects: limit scope in 'add_object_entry_from_pack()' Taylor Blau
2025-04-16  0:58     ` Junio C Hamano
2025-04-16 22:07       ` Taylor Blau
2025-04-16  5:31     ` Elijah Newren
2025-04-16 22:07       ` Taylor Blau
2025-04-15 22:46   ` [PATCH v3 3/9] pack-objects: factor out handling '--stdin-packs' Taylor Blau
2025-04-16  0:59     ` Junio C Hamano
2025-04-15 22:46   ` [PATCH v3 4/9] pack-objects: declare 'rev_info' for '--stdin-packs' earlier Taylor Blau
2025-04-15 22:47   ` [PATCH v3 5/9] pack-objects: perform name-hash traversal for unpacked objects Taylor Blau
2025-04-16  9:21     ` Junio C Hamano
2025-04-15 22:47   ` [PATCH v3 6/9] pack-objects: fix typo in 'show_object_pack_hint()' Taylor Blau
2025-04-16  5:36     ` Elijah Newren
2025-04-15 22:47   ` [PATCH v3 7/9] pack-objects: swap 'show_{object,commit}_pack_hint' Taylor Blau
2025-04-15 22:47   ` [PATCH v3 8/9] pack-objects: introduce '--stdin-packs=follow' Taylor Blau
2025-04-15 22:47   ` [PATCH v3 9/9] repack: exclude cruft pack(s) from the MIDX where possible Taylor Blau
2025-04-16  5:56     ` Elijah Newren
2025-04-16 22:16       ` Taylor Blau
2025-05-13  3:34         ` Elijah Newren
2025-05-28 23:20 ` Taylor Blau [this message]
2025-05-28 23:20   ` [PATCH v4 1/9] pack-objects: use standard option incompatibility functions Taylor Blau
2025-05-28 23:20   ` [PATCH v4 2/9] pack-objects: limit scope in 'add_object_entry_from_pack()' Taylor Blau
2025-05-28 23:20   ` [PATCH v4 3/9] pack-objects: factor out handling '--stdin-packs' Taylor Blau
2025-05-28 23:20   ` [PATCH v4 4/9] pack-objects: declare 'rev_info' for '--stdin-packs' earlier Taylor Blau
2025-05-28 23:20   ` [PATCH v4 5/9] pack-objects: perform name-hash traversal for unpacked objects Taylor Blau
2025-05-28 23:20   ` [PATCH v4 6/9] pack-objects: fix typo in 'show_object_pack_hint()' Taylor Blau
2025-05-28 23:20   ` [PATCH v4 7/9] pack-objects: swap 'show_{object,commit}_pack_hint' Taylor Blau
2025-05-28 23:20   ` [PATCH v4 8/9] pack-objects: introduce '--stdin-packs=follow' Taylor Blau
2025-05-28 23:20   ` [PATCH v4 9/9] repack: exclude cruft pack(s) from the MIDX where possible Taylor Blau
2025-06-19 11:33     ` Carlo Marcelo Arenas Belón
2025-06-19 13:08     ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
2025-06-19 17:07       ` Junio C Hamano
2025-06-19 23:26         ` Taylor Blau
2025-05-29  0:07   ` [PATCH v4 0/9] repack: avoid MIDX'ing cruft pack(s) " Taylor Blau
2025-05-29  0:15     ` Elijah Newren
2025-06-19 23:30 ` [PATCH v5 " Taylor Blau
2025-06-19 23:30   ` [PATCH v5 1/9] pack-objects: use standard option incompatibility functions Taylor Blau
2025-06-19 23:30   ` [PATCH v5 2/9] pack-objects: limit scope in 'add_object_entry_from_pack()' Taylor Blau
2025-06-19 23:30   ` [PATCH v5 3/9] pack-objects: factor out handling '--stdin-packs' Taylor Blau
2025-06-19 23:30   ` [PATCH v5 4/9] pack-objects: declare 'rev_info' for '--stdin-packs' earlier Taylor Blau
2025-06-19 23:30   ` [PATCH v5 5/9] pack-objects: perform name-hash traversal for unpacked objects Taylor Blau
2025-06-19 23:30   ` [PATCH v5 6/9] pack-objects: fix typo in 'show_object_pack_hint()' Taylor Blau
2025-06-19 23:30   ` [PATCH v5 7/9] pack-objects: swap 'show_{object,commit}_pack_hint' Taylor Blau
2025-06-19 23:30   ` [PATCH v5 8/9] pack-objects: introduce '--stdin-packs=follow' Taylor Blau
2025-06-20 15:27     ` Junio C Hamano
2025-06-19 23:30   ` [PATCH v5 9/9] repack: exclude cruft pack(s) from the MIDX where possible Taylor Blau
2025-06-21  4:35     ` Jeff King
2025-06-23 18:47       ` Taylor Blau
2025-06-24 10:54         ` Jeff King
2025-06-24 16:05           ` Taylor Blau
2025-06-23 22:32 ` [PATCH v6 0/9] repack: avoid MIDX'ing cruft pack(s) " Taylor Blau
2025-06-23 22:32   ` [PATCH v6 1/9] pack-objects: use standard option incompatibility functions Taylor Blau
2025-06-24 15:52     ` Junio C Hamano
2025-06-24 16:06       ` Taylor Blau
2025-06-23 22:32   ` [PATCH v6 2/9] pack-objects: limit scope in 'add_object_entry_from_pack()' Taylor Blau
2025-06-23 22:49     ` Junio C Hamano
2025-06-23 22:32   ` [PATCH v6 3/9] pack-objects: factor out handling '--stdin-packs' Taylor Blau
2025-06-23 22:32   ` [PATCH v6 4/9] pack-objects: declare 'rev_info' for '--stdin-packs' earlier Taylor Blau
2025-06-23 22:59     ` Junio C Hamano
2025-06-23 22:32   ` [PATCH v6 5/9] pack-objects: perform name-hash traversal for unpacked objects Taylor Blau
2025-06-23 23:08     ` Junio C Hamano
2025-06-24 16:08       ` Taylor Blau
2025-06-23 22:32   ` [PATCH v6 6/9] pack-objects: fix typo in 'show_object_pack_hint()' Taylor Blau
2025-06-23 22:32   ` [PATCH v6 7/9] pack-objects: swap 'show_{object,commit}_pack_hint' Taylor Blau
2025-06-23 22:32   ` [PATCH v6 8/9] pack-objects: introduce '--stdin-packs=follow' Taylor Blau
2025-06-23 23:35     ` Junio C Hamano
2025-06-24 16:10       ` Taylor Blau
2025-06-23 22:32   ` [PATCH v6 9/9] repack: exclude cruft pack(s) from the MIDX where possible Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1748473889.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).