From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Elijah Newren <newren@gmail.com>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 00/13] midx: incremental multi-pack indexes, part two
Date: Thu, 15 Aug 2024 18:28:46 -0400 [thread overview]
Message-ID: <cover.1723760847.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1723755667.git.me@ttaylorr.com>
== Changes since last time
[The first round did not include commit messages for four intermediate
commits which could benefit from additional description, which has been
corrected here. The code is unchanged from the previous round, and a
range-diff is below.]
== Original cover letter
This series is based on 'master', with an additional merge between
tb/incremental-midx-part-1[1] and my newer series to fix a handful of
bugs related to pseudo-merge bitmaps[2].
This is the second of three series to implement support for incremental
multi-pack indexes (MIDXs). This series brings support for bitmaps that
are tied to incremental MIDXs in addition to regular MIDX bitmaps.
The details are laid out in the commits themselves, but the high-level
approach is as follows:
- Each layer in the incremental MIDX chain has its own corresponding
*.bitmap file. Each bitmap contains commits / pseudo-merges which
are selected only from the commits in that layer. Likewise, only
that layer's objects are written in the type-level bitmaps.
- The reachability traversal is only conducted on the top-most bitmap
corresponding to the most recent layer in the incremental MIDX
chain. Earlier layers may be consulted to retrieve commit /
pseudo-merge reachability bitmaps, but only the top-most bitmap's
"result" and "haves" fields are used.
- In essence, the top-most bitmap is the only one that "matters", and
earlier bitmaps are merely used to look up commit and pseudo-merge
bitmaps from that layer.
- Whenever we need to look at the type-level bitmaps corresponding to
the whole incremental MIDX chain, a new "ewah_or_iterator" is used.
This works in concept like a typical ewah_iterator, except works
over many EWAH bitmaps in parallel, OR-ing their results together
before returning them to the user.
In effect, this allows us to treat the union of all type-level
bitmaps (each of which only stores information about the objects its
corresponding layer within the incremental MIDX chain) as a single
type-level bitmap corresponding to all of the objects across every
layer of the incremental MIDX chain.
The sum total of this series is that we are able to append new commits /
pseudo-merges to a repository's reachability bitmaps without having to
rewrite existing bitmaps, making the operation much cheaper to perform
in large repositories.
The series is laid out roughly as follows:
- The first patch describes the technical details of incremental MIDX
bitmaps.
- The second patch adjusts the pack-revindex internals to prepare for
incremental MIDX bitmaps.
- The next seven patches adjust various components of the pack-bitmap
internals to do the same.
- The next three patches introduce and adjust callers to use the
ewah_or_iterator (as above).
- The final patch implements writing incremental MIDX bitmaps, and
introduces tests.
After this series, the remaining goals for this project include being
able to compact contiguous runs of incremental MIDX layers into a single
layer to support growing the chain of MIDX layers without the chain
itself becoming too long.
Thanks in advance for your review!
[1]: https://lore.kernel.org/git/cover.1722958595.git.me@ttaylorr.com/
[2]: https://lore.kernel.org/git/cover.1723743050.git.me@ttaylorr.com/
Taylor Blau (13):
Documentation: describe incremental MIDX bitmaps
pack-revindex: prepare for incremental MIDX bitmaps
pack-bitmap.c: open and store incremental bitmap layers
pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
pack-bitmap.c: compute disk-usage with incremental MIDXs
pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
ewah: implement `struct ewah_or_iterator`
pack-bitmap.c: keep track of each layer's type bitmaps
pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
midx: implement writing incremental MIDX bitmaps
Documentation/technical/multi-pack-index.txt | 64 ++++
builtin/pack-objects.c | 3 +-
ewah/ewah_bitmap.c | 33 ++
ewah/ewok.h | 12 +
midx-write.c | 35 +-
pack-bitmap-write.c | 65 +++-
pack-bitmap.c | 328 ++++++++++++++-----
pack-bitmap.h | 4 +-
pack-revindex.c | 32 +-
t/t5334-incremental-multi-pack-index.sh | 84 +++++
10 files changed, 548 insertions(+), 112 deletions(-)
Range-diff against v1:
-: ---------- > 1: d1b8d11b37 Documentation: describe incremental MIDX bitmaps
-: ---------- > 2: f5d0866e5c pack-revindex: prepare for incremental MIDX bitmaps
-: ---------- > 3: 43444efc21 pack-bitmap.c: open and store incremental bitmap layers
-: ---------- > 4: 4487130648 pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
1: b7eae5dc61 ! 5: b720fe56da pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
@@ Metadata
## Commit message ##
pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
+ Since we may ask for a pack_id that is in an earlier MIDX layer relative
+ to the one corresponding to our bitmap, use nth_midxed_pack() instead of
+ accessing the ->packs array directly.
+
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## pack-bitmap.c ##
2: 01b8bd22cd ! 6: 9716d022e0 pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
@@ Metadata
## Commit message ##
pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
+ In a similar fashion as previous commits in the first phase of
+ incremental MIDXs, enumerate not just the packs in the current
+ incremental MIDX layer, but previous ones as well.
+
+ Likewise, in reuse_partial_packfile_from_bitmap(), when reusing only a
+ single pack from a MIDX, use the oldest layer's preferred pack as it is
+ likely to contain the most amount of reusable sections.
+
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## pack-bitmap.c ##
3: 928a4eabc8 ! 7: 6baece3175 pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
@@ Metadata
## Commit message ##
pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
+ Implement support for the special `--test-bitmap` mode of `git rev-list`
+ when using incremental MIDXs.
+
+ The bitmap_test_data structure is extended to contain a "base" pointer
+ that mirrors the structure of the bitmap chain that it is being used to
+ test.
+
+ When we find a commit to test, we first chase down the ->base pointer to
+ find the appropriate bitmap_test_data for the bitmap layer that the
+ given commit is contained within, and then perform the test on that
+ bitmap.
+
+ In order to implement this, light modifications are made to
+ bitmap_for_commit() to reimplement it in terms of a new function,
+ find_bitmap_for_commit(), which fills out a pointer which indicates the
+ bitmap layer which contains the given commit.
+
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## pack-bitmap.c ##
4: 129f55ac28 ! 8: 5c909df38a pack-bitmap.c: compute disk-usage with incremental MIDXs
@@ Metadata
## Commit message ##
pack-bitmap.c: compute disk-usage with incremental MIDXs
+ In a similar fashion as previous commits, use nth_midxed_pack() instead
+ of accessing the MIDX's ->packs array directly to support incremental
+ MIDXs.
+
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## pack-bitmap.c ##
5: 81839fc1d1 = 9: f9ae10fce9 pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
6: 63019e952a = 10: 04042981c1 ewah: implement `struct ewah_or_iterator`
7: 01508e4ff5 = 11: c4d543d43d pack-bitmap.c: keep track of each layer's type bitmaps
8: 59a50a2ea2 = 12: c6730b4107 pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
9: da34cc9441 = 13: afefb45557 midx: implement writing incremental MIDX bitmaps
base-commit: f28e4ef872c247b0b35f48b1c3d2c5f77753b908
--
2.46.0.86.ge766d390f0.dirty
next prev parent reply other threads:[~2024-08-15 22:28 UTC|newest]
Thread overview: 136+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-15 21:01 [PATCH 00/13] midx: incremental multi-pack indexes, part two Taylor Blau
2024-08-15 21:01 ` [PATCH 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2024-08-15 21:01 ` [PATCH 02/13] pack-revindex: prepare for " Taylor Blau
2024-08-15 21:01 ` [PATCH 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2024-08-15 21:01 ` [PATCH 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2024-08-15 21:01 ` [PATCH 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2024-08-15 21:01 ` [PATCH 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2024-08-15 21:01 ` [PATCH 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2024-08-15 21:01 ` [PATCH 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2024-08-15 21:01 ` [PATCH 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2024-08-15 21:01 ` [PATCH 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2024-08-15 21:01 ` [PATCH 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2024-08-15 21:01 ` [PATCH 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2024-08-15 21:01 ` [PATCH 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2024-08-15 22:28 ` Taylor Blau [this message]
2024-08-15 22:28 ` [PATCH v2 01/13] Documentation: describe " Taylor Blau
2024-08-15 22:28 ` [PATCH v2 02/13] pack-revindex: prepare for " Taylor Blau
2024-08-15 22:28 ` [PATCH v2 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2024-08-15 22:29 ` [PATCH v2 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2024-08-15 22:29 ` [PATCH v2 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2024-08-15 22:29 ` [PATCH v2 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2024-08-15 22:29 ` [PATCH v2 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2024-08-15 22:29 ` [PATCH v2 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2024-08-28 17:55 ` [PATCH] fixup! " Junio C Hamano
2024-08-28 18:33 ` Jeff King
2024-08-29 18:57 ` Taylor Blau
2024-08-29 19:27 ` Jeff King
2024-11-19 20:56 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 00/13] midx: incremental multi-pack indexes, part two Taylor Blau
2024-11-19 22:07 ` [PATCH v3 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-02-28 23:26 ` Taylor Blau
2025-03-03 10:54 ` Patrick Steinhardt
2024-11-19 22:07 ` [PATCH v3 02/13] pack-revindex: prepare for " Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-02-28 23:39 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-02-28 23:49 ` Taylor Blau
2025-03-03 10:55 ` Patrick Steinhardt
2024-11-19 22:07 ` [PATCH v3 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:12 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2024-11-19 22:07 ` [PATCH v3 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:16 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:19 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2024-11-19 22:07 ` [PATCH v3 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2024-11-19 22:07 ` [PATCH v3 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:22 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:26 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:28 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:31 ` Taylor Blau
2024-11-20 8:49 ` [PATCH v3 00/13] midx: incremental multi-pack indexes, part two Junio C Hamano
2025-03-14 20:18 ` [PATCH v4 " Taylor Blau
2025-03-14 20:18 ` [PATCH v4 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2025-03-18 1:16 ` Jeff King
2025-03-18 23:11 ` Taylor Blau
2025-03-18 2:42 ` Elijah Newren
2025-03-18 23:19 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 02/13] pack-revindex: prepare for " Taylor Blau
2025-03-18 1:27 ` Jeff King
2025-03-19 0:02 ` Taylor Blau
2025-03-19 0:07 ` Taylor Blau
2025-03-26 18:08 ` Jeff King
2025-03-18 2:43 ` Elijah Newren
2025-03-19 0:03 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2025-03-18 4:13 ` Elijah Newren
2025-03-19 0:08 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2025-03-18 1:38 ` Jeff King
2025-03-19 0:13 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2025-03-14 20:18 ` [PATCH v4 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2025-03-18 4:13 ` Elijah Newren
2025-03-19 0:17 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2025-03-18 5:31 ` Elijah Newren
2025-03-19 0:30 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2025-03-18 1:41 ` Jeff King
2025-03-19 0:30 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2025-03-14 20:18 ` [PATCH v4 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2025-03-18 1:44 ` Jeff King
2025-03-19 0:33 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2025-03-18 2:01 ` Jeff King
2025-03-19 0:38 ` Taylor Blau
2025-03-18 6:43 ` Elijah Newren
2025-03-19 0:39 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2025-03-18 2:05 ` Jeff King
2025-03-19 23:02 ` Taylor Blau
2025-03-14 20:19 ` [PATCH v4 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2025-03-18 2:16 ` Jeff King
2025-03-20 0:14 ` Taylor Blau
2025-03-18 17:13 ` Elijah Newren
2025-03-20 0:16 ` Taylor Blau
2025-03-18 2:21 ` [PATCH v4 00/13] midx: incremental multi-pack indexes, part two Jeff King
2025-03-20 0:18 ` Taylor Blau
2025-03-20 17:56 ` [PATCH v5 00/14] " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 01/14] Documentation: remove a "future work" item from the MIDX docs Taylor Blau
2025-03-20 17:56 ` [PATCH v5 02/14] Documentation: describe incremental MIDX bitmaps Taylor Blau
2025-03-20 17:56 ` [PATCH v5 03/14] pack-revindex: prepare for " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 04/14] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2025-03-20 17:56 ` [PATCH v5 05/14] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2025-03-20 17:56 ` [PATCH v5 06/14] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 07/14] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 08/14] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2025-03-20 17:56 ` Taylor Blau
2025-03-20 17:58 ` Taylor Blau
2025-03-20 17:56 ` [PATCH v5 09/14] pack-bitmap.c: compute disk-usage with " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 10/14] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 11/14] ewah: implement `struct ewah_or_iterator` Taylor Blau
2025-03-20 17:57 ` [PATCH v5 12/14] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2025-03-20 17:57 ` [PATCH v5 13/14] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2025-03-20 17:57 ` [PATCH v5 14/14] midx: implement writing incremental MIDX bitmaps Taylor Blau
2025-03-20 20:00 ` [PATCH v5 00/14] midx: incremental multi-pack indexes, part two Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1723760847.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).