git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Jeff King <peff@peff.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 0/4] midx: improve prepare_midx_pack() ergonomics
Date: Wed, 28 May 2025 18:58:57 -0400	[thread overview]
Message-ID: <cover.1748473122.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1748198489.git.me@ttaylorr.com>

This series has been updated to be based on v3 of
ps/midx-negative-packfile-cache, which is 1f34bf3e08 (midx: stop
repeatedly looking up nonexistent packfiles, 2025-05-28) at the time of
writing.

There are a couple of minor changes from last time, including:

 - using the new MIDX_PACK_ERROR macro instead of casting '-1' to a void
   pointer

 - converting some direct m->pack_names lookups (outside of midx.c and
   midx-write.c with one exception) to use a similar pattern to access
   the nth pack name

As usual, a range-diff against v1 is below for convenience.

Thanks again for your review!

Taylor Blau (4):
  midx: access pack names through `nth_midxed_pack_name()`
  midx-write.c: guard against incremental MIDXs in want_included_pack()
  midx-write.c: extract inner loop from fill_packs_from_midx()
  midx: return a `packed_git` pointer from `prepare_midx_pack()`

 midx-write.c              | 101 +++++++++++++++++++++-----------------
 midx.c                    |  88 +++++++++++++++++----------------
 midx.h                    |   6 ++-
 pack-bitmap.c             |   6 +--
 t/helper/test-read-midx.c |   7 +--
 5 files changed, 116 insertions(+), 92 deletions(-)

Range-diff against v1:
1:  a7082dc7f0 < -:  ---------- packfile: explain ordering of how we look up auxiliary pack files
2:  73fe0882ee < -:  ---------- midx: stop repeatedly looking up nonexistent packfiles
3:  ad7295b11b < -:  ---------- pack-bitmap.c: fix broken warning() when missing MIDX'd pack
5:  5d97b706e1 ! 1:  d3508d3cfb midx-write.c: simplify fill_packs_from_midx()
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>
     
      ## Commit message ##
    -    midx-write.c: simplify fill_packs_from_midx()
    +    midx: access pack names through `nth_midxed_pack_name()`
     
    -    When enumerating packs within fill_packs_from_midx(), we open the pack's
    -    index iff we are either writing a reverse index, have a preferred pack
    -    (by name) or both.
    +    Accessing a MIDX's 'pack_names' array is somewhat error-prone when
    +    dealing with incremental MIDX chains, where the (global) pack_int_id for
    +    some pack may differ from the containing layer's index for that pack.
     
    -    Let's simplify this into a single case by setting the
    -    MIDX_WRITE_REV_INDEX flag bit when we have a preferred_pack_name. This
    -    is a little bit of a shortcut to reduce the line length in the loop
    -    below. But it sets us up nicely to extract the inner loop of this
    -    function out into its own function, where we will no longer have to pass
    -    the preferred_pack_name.
    +    Introduce `nth_midxed_pack_name()` in an effort to reduce a common
    +    source of errors by discouraging external callers from accessing a
    +    layer's `pack_names` array directly.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
    - ## midx-write.c ##
    -@@ midx-write.c: static int fill_packs_from_midx(struct write_midx_context *ctx,
    - {
    - 	struct multi_pack_index *m;
    + ## midx.c ##
    +@@ midx.c: struct packed_git *nth_midxed_pack(struct multi_pack_index *m,
    + 	return m->packs[local_pack_int_id];
    + }
      
    -+	if (preferred_pack_name) {
    -+		/*
    -+		 * If a preferred pack is specified, need to have
    -+		 * packed_git's loaded to ensure the chosen preferred
    -+		 * pack has a non-zero object count.
    -+		 *
    -+		 * Trick ourselves into thinking that we're writing a
    -+		 * reverse index in this case in order to open up the
    -+		 * pack index file.
    -+		 */
    -+		flags |= MIDX_WRITE_REV_INDEX;
    -+	}
    ++const char *nth_midxed_pack_name(struct multi_pack_index *m,
    ++				 uint32_t pack_int_id)
    ++{
    ++	uint32_t local_pack_int_id = midx_for_pack(&m, pack_int_id);
    ++	return m->pack_names[local_pack_int_id];
    ++}
     +
    - 	for (m = ctx->m; m; m = m->base_midx) {
    - 		uint32_t i;
    + #define MIDX_CHUNK_BITMAPPED_PACKS_WIDTH (2 * sizeof(uint32_t))
      
    -@@ midx-write.c: static int fill_packs_from_midx(struct write_midx_context *ctx,
    - 			 * If generating a reverse index, need to have
    - 			 * packed_git's loaded to compare their
    - 			 * mtimes and object count.
    --			 *
    --			 * If a preferred pack is specified, need to
    --			 * have packed_git's loaded to ensure the chosen
    --			 * preferred pack has a non-zero object count.
    - 			 */
    --			if (flags & MIDX_WRITE_REV_INDEX ||
    --			    preferred_pack_name) {
    -+			if (flags & MIDX_WRITE_REV_INDEX) {
    - 				if (prepare_midx_pack(ctx->repo, m,
    - 						      m->num_packs_in_base + i)) {
    - 					error(_("could not load pack"));
    + int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m,
    +
    + ## midx.h ##
    +@@ midx.h: struct multi_pack_index *load_multi_pack_index(struct repository *r,
    + int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
    + struct packed_git *nth_midxed_pack(struct multi_pack_index *m,
    + 				   uint32_t pack_int_id);
    ++const char *nth_midxed_pack_name(struct multi_pack_index *m,
    ++				 uint32_t pack_int_id);
    + int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m,
    + 		       struct bitmapped_pack *bp, uint32_t pack_int_id);
    + int bsearch_one_midx(const struct object_id *oid, struct multi_pack_index *m,
    +
    + ## pack-bitmap.c ##
    +@@ pack-bitmap.c: static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
    + 	for (i = 0; i < bitmap_git->midx->num_packs + bitmap_git->midx->num_packs_in_base; i++) {
    + 		if (prepare_midx_pack(bitmap_repo(bitmap_git), bitmap_git->midx, i)) {
    + 			warning(_("could not open pack %s"),
    +-				bitmap_git->midx->pack_names[i]);
    ++				nth_midxed_pack_name(bitmap_git->midx, i));
    + 			goto cleanup;
    + 		}
    + 	}
    +@@ pack-bitmap.c: void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
    + 			struct bitmapped_pack pack;
    + 			if (nth_bitmapped_pack(r, bitmap_git->midx, &pack, i) < 0) {
    + 				warning(_("unable to load pack: '%s', disabling pack-reuse"),
    +-					bitmap_git->midx->pack_names[i]);
    ++					nth_midxed_pack_name(bitmap_git->midx, i));
    + 				free(packs);
    + 				return;
    + 			}
    +
    + ## t/helper/test-read-midx.c ##
    +@@ t/helper/test-read-midx.c: static int read_midx_file(const char *object_dir, const char *checksum,
    + 	printf("\nnum_objects: %d\n", m->num_objects);
    + 
    + 	printf("packs:\n");
    +-	for (i = 0; i < m->num_packs; i++)
    +-		printf("%s\n", m->pack_names[i]);
    ++	for (i = m->num_packs_in_base; i < m->num_packs + m->num_packs_in_base;
    ++	     i++)
    ++		printf("%s\n", nth_midxed_pack_name(m, i));
    + 
    + 	printf("object-dir: %s\n", m->object_dir);
    + 
    +@@ t/helper/test-read-midx.c: static int read_midx_preferred_pack(const char *object_dir)
    + 		return 1;
    + 	}
    + 
    +-	printf("%s\n", midx->pack_names[preferred_pack]);
    ++	printf("%s\n", nth_midxed_pack_name(midx, preferred_pack));
    + 	close_midx(midx);
    + 	return 0;
    + }
4:  d2f9645aa1 ! 2:  cf3ab81a08 midx-write.c: guard against incremental MIDXs in want_included_pack()
    @@ Commit message
             if (m->base_midx)
                 die(_("cannot repack an incremental multi-pack-index"));
     
    -    So want_included_pack() is OK becuase it will never encounter a
    +    So want_included_pack() is OK because it will never encounter a
         situation where it has to chase backwards through the '->base_midx'
         pointer. But that is not immediately clear from reading the code, and is
         too fragile for my comfort. Make this more clear by adding an ASSERT()
6:  a4080c12df ! 3:  e54988bfea midx-write.c: extract inner loop from fill_packs_from_midx()
    @@ midx-write.c: static struct multi_pack_index *lookup_multi_pack_index(struct rep
      
     +static int fill_packs_from_midx_1(struct write_midx_context *ctx,
     +				  struct multi_pack_index *m,
    -+				  uint32_t flags)
    ++				  int prepare_packs)
     +{
     +	for (uint32_t i = 0; i < m->num_packs; i++) {
    -+		ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
    -+
     +		/*
     +		 * If generating a reverse index, need to have
     +		 * packed_git's loaded to compare their
     +		 * mtimes and object count.
     +		 */
    -+		if (flags & MIDX_WRITE_REV_INDEX) {
    ++		if (prepare_packs) {
     +			if (prepare_midx_pack(ctx->repo, m,
     +					      m->num_packs_in_base + i)) {
     +				error(_("could not load pack"));
    @@ midx-write.c: static struct multi_pack_index *lookup_multi_pack_index(struct rep
      static int fill_packs_from_midx(struct write_midx_context *ctx,
      				const char *preferred_pack_name, uint32_t flags)
      {
    -@@ midx-write.c: static int fill_packs_from_midx(struct write_midx_context *ctx,
    - 	}
    + 	struct multi_pack_index *m;
    ++	int prepare_packs;
    ++
    ++	/*
    ++	 * If generating a reverse index, need to have packed_git's
    ++	 * loaded to compare their mtimes and object count.
    ++	 */
    ++	prepare_packs = !!(flags & MIDX_WRITE_REV_INDEX || preferred_pack_name);
      
      	for (m = ctx->m; m; m = m->base_midx) {
     -		uint32_t i;
    @@ midx-write.c: static int fill_packs_from_midx(struct write_midx_context *ctx,
     -			 * If generating a reverse index, need to have
     -			 * packed_git's loaded to compare their
     -			 * mtimes and object count.
    +-			 *
    +-			 * If a preferred pack is specified, need to
    +-			 * have packed_git's loaded to ensure the chosen
    +-			 * preferred pack has a non-zero object count.
     -			 */
    --			if (flags & MIDX_WRITE_REV_INDEX) {
    +-			if (flags & MIDX_WRITE_REV_INDEX ||
    +-			    preferred_pack_name) {
     -				if (prepare_midx_pack(ctx->repo, m,
     -						      m->num_packs_in_base + i)) {
     -					error(_("could not load pack"));
    @@ midx-write.c: static int fill_packs_from_midx(struct write_midx_context *ctx,
     -				       m->pack_names[i],
     -				       m->num_packs_in_base + i);
     -		}
    -+		int ret = fill_packs_from_midx_1(ctx, m, flags);
    ++		int ret = fill_packs_from_midx_1(ctx, m, prepare_packs);
     +		if (ret)
     +			return ret;
      	}
7:  80699bb3ee ! 4:  e3e21db673 midx: return a `packed_git` pointer from `prepare_midx_pack()`
    @@ Commit message
     
      ## midx-write.c ##
     @@ midx-write.c: static int fill_packs_from_midx_1(struct write_midx_context *ctx,
    - 				  uint32_t flags)
    + 				  int prepare_packs)
      {
      	for (uint32_t i = 0; i < m->num_packs; i++) {
    +-		/*
    +-		 * If generating a reverse index, need to have
    +-		 * packed_git's loaded to compare their
    +-		 * mtimes and object count.
    +-		 */
     +		struct packed_git *p = NULL;
     +
    - 		ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
    - 
    - 		/*
    -@@ midx-write.c: static int fill_packs_from_midx_1(struct write_midx_context *ctx,
    - 		 * mtimes and object count.
    - 		 */
    - 		if (flags & MIDX_WRITE_REV_INDEX) {
    ++		ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
    + 		if (prepare_packs) {
     -			if (prepare_midx_pack(ctx->repo, m,
     -					      m->num_packs_in_base + i)) {
     +			p = prepare_midx_pack(ctx->repo, m,
    @@ midx-write.c: static int fill_packs_from_midx_1(struct write_midx_context *ctx,
      		}
      
     -		fill_pack_info(&ctx->info[ctx->nr++], m->packs[i],
    -+		fill_pack_info(&ctx->info[ctx->nr++], p,
    - 			       m->pack_names[i],
    +-			       m->pack_names[i],
    ++		fill_pack_info(&ctx->info[ctx->nr++], p, m->pack_names[i],
      			       m->num_packs_in_base + i);
      	}
    + 
     @@ midx-write.c: int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
      					  _("Finding and deleting unreferenced packfiles"),
      					  m->num_packs);
    @@ midx.c: static uint32_t midx_for_pack(struct multi_pack_index **_m,
     +		struct strbuf key = STRBUF_INIT;
     +		struct packed_git *p;
      
    --	if (m->packs[pack_int_id] == (void *)(intptr_t)-1)
    +-	if (m->packs[pack_int_id] == MIDX_PACK_ERROR)
     -		return 1;
     -	if (m->packs[pack_int_id])
     -		return 0;
    @@ midx.c: static uint32_t midx_for_pack(struct multi_pack_index **_m,
     +		strbuf_release(&key);
      
     -	if (!p) {
    --		m->packs[pack_int_id] = (void *)(intptr_t)-1;
    +-		m->packs[pack_int_id] = MIDX_PACK_ERROR;
     -		return 1;
    -+		m->packs[pack_pos] = p ? p : (void *)(intptr_t)-1;
    ++		m->packs[pack_pos] = p ? p : MIDX_PACK_ERROR;
     +		if (p)
     +			p->multi_pack_index = 1;
      	}
    @@ midx.c: static uint32_t midx_for_pack(struct multi_pack_index **_m,
     -	m->packs[pack_int_id] = p;
     -
     -	return 0;
    -+	if (m->packs[pack_pos] == (void *)(intptr_t)-1)
    ++	if (m->packs[pack_pos] == MIDX_PACK_ERROR)
     +		return NULL;
     +	return m->packs[pack_pos];
      }
    @@ midx.h: void get_split_midx_filename_ext(const struct git_hash_algo *hash_algo,
     +				     uint32_t pack_int_id);
      struct packed_git *nth_midxed_pack(struct multi_pack_index *m,
      				   uint32_t pack_int_id);
    - int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m,
    + const char *nth_midxed_pack_name(struct multi_pack_index *m,
     
      ## pack-bitmap.c ##
     @@ pack-bitmap.c: static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
    @@ pack-bitmap.c: static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
     -		if (prepare_midx_pack(bitmap_repo(bitmap_git), bitmap_git->midx, i)) {
     +		if (!prepare_midx_pack(bitmap_repo(bitmap_git), bitmap_git->midx, i)) {
      			warning(_("could not open pack %s"),
    - 				bitmap_git->midx->pack_names[i - bitmap_git->midx->num_packs_in_base]);
    + 				nth_midxed_pack_name(bitmap_git->midx, i));
      			goto cleanup;

base-commit: 1f34bf3e082741e053d25b76a0ffe31d9d967594
-- 
2.49.0.640.ga4de40e6a8

  parent reply	other threads:[~2025-05-28 22:58 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-16  8:55 [PATCH] packfile: avoid access(3p) calls for missing packs Patrick Steinhardt
2025-05-16 18:34 ` Junio C Hamano
2025-05-19  6:52   ` Jeff King
2025-05-19 15:46     ` Junio C Hamano
2025-05-20  6:45     ` Patrick Steinhardt
2025-05-22  5:28       ` Jeff King
2025-05-23  1:02     ` Taylor Blau
2025-05-23  2:03       ` Jeff King
2025-05-20  9:53 ` [PATCH v2 0/2] " Patrick Steinhardt
2025-05-20  9:53   ` [PATCH v2 1/2] packfile: explain ordering of how we look up auxiliary pack files Patrick Steinhardt
2025-05-23  1:03     ` Taylor Blau
2025-05-20  9:53   ` [PATCH v2 2/2] midx: stop repeatedly looking up nonexistent packfiles Patrick Steinhardt
2025-05-22  5:32     ` Jeff King
2025-05-22 15:47       ` Junio C Hamano
2025-05-22 16:59         ` Jeff King
2025-05-22 18:44           ` Junio C Hamano
2025-05-23  1:22           ` Taylor Blau
2025-05-23  2:08             ` Jeff King
2025-05-23 17:46               ` Taylor Blau
2025-05-25 18:41                 ` [PATCH 0/5] midx: improve prepare_midx_pack() ergonomics Taylor Blau
2025-05-25 18:41                   ` [PATCH 1/5] pack-bitmap.c: fix broken warning() when missing MIDX'd pack Taylor Blau
2025-05-26  7:23                     ` Patrick Steinhardt
2025-05-28  2:00                       ` Taylor Blau
2025-05-25 18:41                   ` [PATCH 2/5] midx-write.c: guard against incremental MIDXs in want_included_pack() Taylor Blau
2025-05-26  7:23                     ` Patrick Steinhardt
2025-05-28  2:08                       ` Taylor Blau
2025-05-25 18:41                   ` [PATCH 3/5] midx-write.c: simplify fill_packs_from_midx() Taylor Blau
2025-05-26  7:23                     ` Patrick Steinhardt
2025-05-28  2:15                       ` Taylor Blau
2025-05-25 18:42                   ` [PATCH 4/5] midx-write.c: extract inner loop from fill_packs_from_midx() Taylor Blau
2025-05-25 18:42                   ` [PATCH 5/5] midx: return a `packed_git` pointer from `prepare_midx_pack()` Taylor Blau
2025-05-26  7:24                     ` Patrick Steinhardt
2025-05-28  2:18                       ` Taylor Blau
2025-05-28 11:53                         ` Patrick Steinhardt
2025-05-28 22:58                   ` Taylor Blau [this message]
2025-05-28 22:59                     ` [PATCH v2 1/4] midx: access pack names through `nth_midxed_pack_name()` Taylor Blau
2025-05-29 20:47                       ` Junio C Hamano
2025-06-03 22:22                         ` Taylor Blau
2025-05-29 20:51                       ` Junio C Hamano
2025-06-03 22:23                         ` Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 2/4] midx-write.c: guard against incremental MIDXs in want_included_pack() Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 3/4] midx-write.c: extract inner loop from fill_packs_from_midx() Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 4/4] midx: return a `packed_git` pointer from `prepare_midx_pack()` Taylor Blau
2025-05-30  6:50                       ` Jeff King
2025-06-03 22:27                         ` Taylor Blau
2025-08-28 23:25                           ` Junio C Hamano
2025-05-23  1:31       ` [PATCH v2 2/2] midx: stop repeatedly looking up nonexistent packfiles Taylor Blau
2025-05-23  2:18         ` Jeff King
2025-05-21 13:24   ` [PATCH v2 0/2] packfile: avoid access(3p) calls for missing packs Junio C Hamano
2025-05-28 12:24 ` [PATCH v3 " Patrick Steinhardt
2025-05-28 12:24   ` [PATCH v3 1/2] packfile: explain ordering of how we look up auxiliary pack files Patrick Steinhardt
2025-05-28 12:24   ` [PATCH v3 2/2] midx: stop repeatedly looking up nonexistent packfiles Patrick Steinhardt
2025-05-30  6:27   ` [PATCH v3 0/2] packfile: avoid access(3p) calls for missing packs Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1748473122.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).