From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Elijah Newren <newren@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v3 00/19] midx: incremental multi-pack indexes, part one
Date: Tue, 6 Aug 2024 11:36:36 -0400
Message-ID: <cover.1722958595.git.me@ttaylorr.com>
In-Reply-To: <cover.1717715060.git.me@ttaylorr.com>

This series implements incremental MIDXs, which allow for storing
a MIDX across multiple layers, each with their own distinct set of
packs.
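
To make the layering concrete, here is a minimal sketch of how a
position counted across the whole chain can be translated into a
position within a single layer. It is not the code from this series:
`struct midx_layer` and `local_object_pos()` are hypothetical names
used only for illustration, and the fields are a simplified stand-in
modeled on the `base_midx`, `num_objects` and `num_objects_in_base`
fields that appear in the range-diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one MIDX layer; not Git's struct. */
struct midx_layer {
	struct midx_layer *base;      /* previous (older) layer, or NULL */
	uint32_t num_objects;         /* objects stored in this layer only */
	uint32_t num_objects_in_base; /* objects in all layers below this one */
};

/*
 * Given the tip of a chain in *m and a zero-based position counted across
 * the whole chain, point *m at the layer that holds that position and
 * return the position relative to the start of that layer.
 */
static uint32_t local_object_pos(struct midx_layer **m, uint32_t pos)
{
	struct midx_layer *cur = *m;

	/* The position must fall somewhere inside the chain. */
	assert(pos < cur->num_objects + cur->num_objects_in_base);

	/* Walk towards the base until the position belongs to this layer. */
	while (cur->base && pos < cur->num_objects_in_base)
		cur = cur->base;

	*m = cur;
	return pos - cur->num_objects_in_base;
}
```

With a base layer of 4 objects and a second layer of 3 objects,
zero-based position 5 (the sixth object overall) resolves to local
position 1 in the second layer, while position 2 resolves to local
position 2 in the base layer.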

This round is also similar to the previous one, but is rebased on
current 'master' (406f326d27 (The second batch, 2024-08-01)) and has
been updated in response to review from Peff on the previous round.

As usual, a range-diff is below, but the main changes since last time
are as follows:

  - Documentation improvements to clarify what happens when both an
    incremental and a non-incremental MIDX are present in a
    repository.

  - Commit message typofix in 3/19 to correct an error in one of the
    technical examples.

  - Dropped a custom 'local_pack_int_id' in 4/19 to make the remaining
    diff easier to read.

  - Minor bugfix in 7/19 where we incorrectly terminated the object
    abbreviation disambiguation step for incremental MIDXs.

  - Various additional bits of information in the commit messages to
    explain anything that was subtle.

Thanks in advance for any review! :-)

Taylor Blau (19):
  Documentation: describe incremental MIDX format
  midx: add new fields for incremental MIDX chains
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: implement verification support for incremental MIDXs
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  midx: implement support for writing incremental MIDX chains

 Documentation/git-multi-pack-index.txt       |  11 +-
 Documentation/technical/multi-pack-index.txt | 103 +++++
 builtin/multi-pack-index.c                   |   2 +
 builtin/repack.c                             |   8 +-
 ci/run-build-and-tests.sh                    |   2 +-
 midx-write.c                                 | 326 ++++++++++++---
 midx.c                                       | 405 ++++++++++++++++---
 midx.h                                       |  26 +-
 object-name.c                                |  99 ++---
 packfile.c                                   |  21 +-
 packfile.h                                   |   4 +
 t/README                                     |   6 +-
 t/helper/test-read-midx.c                    |  24 +-
 t/lib-bitmap.sh                              |   6 +-
 t/lib-midx.sh                                |  28 ++
 t/t0410-partial-clone.sh                     |   2 -
 t/t5310-pack-bitmaps.sh                      |   4 -
 t/t5313-pack-bounds-checks.sh                |   8 +-
 t/t5319-multi-pack-index.sh                  |  30 +-
 t/t5326-multi-pack-bitmaps.sh                |   4 +-
 t/t5327-multi-pack-bitmaps-rev.sh            |   6 +-
 t/t5332-multi-pack-reuse.sh                  |   2 +
 t/t5334-incremental-multi-pack-index.sh      |  46 +++
 t/t7700-repack.sh                            |  48 +--
 24 files changed, 960 insertions(+), 261 deletions(-)
 create mode 100755 t/t5334-incremental-multi-pack-index.sh

Range-diff against v2:
 1:  014588b3ec !  1:  90b21b11ed Documentation: describe incremental MIDX format
    @@ Documentation/technical/multi-pack-index.txt: Design Details
     +extending the incremental MIDX format to support reachability bitmaps.
     +The design below specifically takes this into account, and support for
     +reachability bitmaps will be added in a future patch series. It is
    -+omitted from this series for the same reason as above.
    ++omitted from the current implementation for the same reason as above.
     ++
     +In brief, to support reachability bitmaps with the incremental MIDX
     +feature, the concept of the pseudo-pack order is extended across each
    @@ Documentation/technical/multi-pack-index.txt: Design Details
     +multi-pack-index chain. The `multi-pack-index-$H2.midx` file contains
     +the second layer of the chain, and so on.
     +
    ++When both an incremental- and non-incremental MIDX are present, the
    ++non-incremental MIDX is always read first.
    ++
     +=== Object positions for incremental MIDXs
     +
     +In the original multi-pack-index design, we refer to objects via their
 2:  337ebc6de7 =  2:  0d3b19c59f midx: add new fields for incremental MIDX chains
 3:  f449a72877 !  3:  5cd742b677 midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
    @@ Commit message
           objects contained in all layers of the incremental MIDX chain, not any
           particular layer. For example, consider MIDX chain with two individual
           MIDXs, one with 4 objects and another with 3 objects. If the MIDX with
    -      4 objects appears earlier in the chain, then asking for pack "6" would
    +      4 objects appears earlier in the chain, then asking for object 6 would
           return the second object in the MIDX with 3 objects.
     
         [^2]: Building on the previous example, asking for object 6 in a MIDX
 4:  f88569c819 !  4:  372104c73d midx: teach `prepare_midx_pack()` about incremental MIDXs
    @@ midx.c: static uint32_t midx_for_object(struct multi_pack_index **_m, uint32_t p
      		die(_("bad pack-int-id: %u (%u total packs)"),
     -		    pack_int_id, m->num_packs);
     +		    pack_int_id, m->num_packs + m->num_packs_in_base);
    - 
    --	if (m->packs[pack_int_id])
    ++
     +	*_m = m;
     +
     +	return pack_int_id - m->num_packs_in_base;
    @@ midx.c: static uint32_t midx_for_object(struct multi_pack_index **_m, uint32_t p
     +{
     +	struct strbuf pack_name = STRBUF_INIT;
     +	struct packed_git *p;
    -+	uint32_t local_pack_int_id = midx_for_pack(&m, pack_int_id);
     +
    -+	if (m->packs[local_pack_int_id])
    ++	pack_int_id = midx_for_pack(&m, pack_int_id);
    + 
    + 	if (m->packs[pack_int_id])
      		return 0;
    - 
    - 	strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
    --		    m->pack_names[pack_int_id]);
    -+		    m->pack_names[local_pack_int_id]);
    - 
    - 	p = add_packed_git(pack_name.buf, pack_name.len, m->local);
    - 	strbuf_release(&pack_name);
    -@@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t
    - 		return 1;
    - 
    - 	p->multi_pack_index = 1;
    --	m->packs[pack_int_id] = p;
    -+	m->packs[local_pack_int_id] = p;
    - 	install_packed_git(r, p);
    - 	list_add_tail(&p->mru, &r->objects->packed_git_mru);
    - 
 5:  ec57ff4349 =  5:  e68a3ceff9 midx: teach `nth_midxed_object_oid()` about incremental MIDXs
 6:  650b8c8c21 !  6:  ff2d7bc5ca midx: teach `nth_bitmapped_pack()` about incremental MIDXs
    @@ Commit message
         ID. Likewise, when reading the 'BTMP' chunk, use the MIDX-local offset
         when accessing the data within that chunk.
     
    +    (Note that the both the call to prepare_midx_pack() and the assignment
    +    of bp->pack_int_id both care about the global pack_int_id, so avoid
    +    shadowing the given 'pack_int_id' parameter).
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## midx.c ##
 7:  bfd1dadbf1 !  7:  32c3fceada midx: introduce `bsearch_one_midx()`
    @@ object-name.c: static int match_hash(unsigned len, const unsigned char *a, const
      
     -	if (!num)
     -		return;
    -+		num = m->num_objects + m->num_objects_in_base;
    ++		if (!m->num_objects)
    ++			continue;
      
     -	bsearch_midx(&ds->bin_pfx, m, &first);
    -+		if (!num)
    -+			continue;
    ++		num = m->num_objects + m->num_objects_in_base;
      
     -	/*
     -	 * At this point, "first" is the location of the lowest object
 8:  38bd45bd24 =  8:  16db6c98ce midx: teach `bsearch_midx()` about incremental MIDXs
 9:  342ed56033 =  9:  761c7c59ba midx: teach `nth_midxed_offset()` about incremental MIDXs
10:  2b335c45ae = 10:  8366456d29 midx: teach `fill_midx_entry()` about incremental MIDXs
11:  22de5898f3 = 11:  909d927c47 midx: remove unused `midx_locate_pack()`
12:  fb60f2b022 = 12:  71127601b5 midx: teach `midx_contains_pack()` about incremental MIDXs
13:  38b642d404 = 13:  2f98ebb141 midx: teach `midx_preferred_pack()` about incremental MIDXs
14:  594386da10 ! 14:  550ae2dc93 midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
    @@ Commit message
             MIDX layers when dealing with an incremental MIDX chain by calling
             itself when given a MIDX with a non-NULL `base_midx`.
     
    +    Note that after 0c5a62f14b (midx-write.c: do not read existing MIDX with
    +    `packs_to_include`, 2024-06-11), we do not use this function with an
    +    existing MIDX (incremental or not) when generating a MIDX with
    +    --stdin-packs, and likewise for incremental MIDXs.
    +
    +    But it is still used when adding the fanout table from an incremental
    +    MIDX when generating a non-incremental MIDX (without --stdin-packs, of
    +    course).
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## midx-write.c ##
15:  dad130799c ! 15:  9ae1bc415e midx: support reading incremental MIDX chains
    @@ Commit message
         in the commit after next.)
     
         The core of this change involves following the order specified in the
    -    MIDX chain and opening up MIDXs in the chain one-by-one, adding them to
    -    the previous layer's `->base_midx` pointer at each step.
    +    MIDX chain in reverse and opening up MIDXs in the chain one-by-one,
    +    adding them to the previous layer's `->base_midx` pointer at each step.
     
         In order to implement this, the `load_multi_pack_index()` function is
         taught to call a new `load_multi_pack_index_chain()` function if loading
16:  ad976ef413 = 16:  3d4181df51 midx: implement verification support for incremental MIDXs
17:  23912425bf = 17:  3b268f91bf t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
18:  814da1916d = 18:  09d74f8942 t/t5313-pack-bounds-checks.sh: prepare for sub-directories
19:  e2b5961b45 = 19:  5d467d38a8 midx: implement support for writing incremental MIDX chains

base-commit: 406f326d271e0bacecdb00425422c5fa3f314930
-- 
2.46.0.46.g406f326d27.dirty
