From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v4 01/13] Documentation: describe incremental MIDX bitmaps
Date: Tue, 18 Mar 2025 19:11:09 -0400 [thread overview]
Message-ID: <Z9n9jf7GDBMPyM2R@nand.local> (raw)
In-Reply-To: <20250318011618.GA1471939@coredump.intra.peff.net>
On Mon, Mar 17, 2025 at 09:16:18PM -0400, Jeff King wrote:
> On Fri, Mar 14, 2025 at 04:18:20PM -0400, Taylor Blau wrote:
>
> > +In the incremental MIDX design, we extend this definition to include
> > +objects from multiple layers of the MIDX chain. The pseudo-pack order
> > +for incremental MIDXs is determined by concatenating the pseudo-pack
> > +ordering for each layer of the MIDX chain in order. Formally two objects
> > +`o1` and `o2` are compared as follows:
> > +
> > +1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
> > + `o1` is considered less than `o2`.
> > +
> > +2. Otherwise, if `o1` and `o2` appear in the same MIDX layer, and that
> > + MIDX layer has no base, then if one of `pack(o1)` and `pack(o2)` is
> > + preferred and the other is not, then the preferred one sorts first. If
> > + there is a base layer (i.e. the MIDX layer is not the first layer in
> > + the chain), then if `pack(o1)` appears earlier in that MIDX layer's
> > + pack order, than `o1` is less than `o2`. Likewise if `pack(o2)`
> > + appears earlier, than the opposite is true.
> > +
> > +3. Otherwise, `o1` and `o2` appear in the same pack, and thus in the
> > + same MIDX layer. Sort `o1` and `o2` by their offset within their
> > + containing packfile.
>
> OK, I think this ordering makes sense. I had to read this description
> over several times to make sure I wasn't missing something. The earlier
> part that says "it's just concatenating the pack order of the layers" is
> a much more intuitive way of looking at it (modulo that you might need
> to remove duplicates found in earlier layers).
>
> But I think an even more basic way of thinking about it is that it's the
> same as the pseudo-pack order you would get if you had a single midx of
> all of the packs in all of the layers (in their layer order). We already
> have to deal with (and have documented) duplicates in that case.
>
> Not really suggesting any wording change here, just making sure I
> grokked it all.
Yeah, those are both excellent ways to think about it. I hadn't
considered the "the new ordering is the same as the pseudo-pack order
you'd get if you had a single MIDX of all the packs in layer order"
thing before, but it's quite intuitive.
As a side note, it's somewhat hilarious to me that we could really
write:
"The new ordering is the same as the pseudo-pack order you'd get if
you had a single MIDX of all the packs in layer order, which is the
same order you'd get if you had a single pack containing all of the
objects in MIDX order."
;-)
> > +Note that the preferred pack is a property of the MIDX chain, not the
> > +individual layers themselves. Fundamentally we could introduce a
> > +per-layer preferred pack, but this is less relevant now that we can
> > +perform multi-pack reuse across the set of packs in a MIDX.
>
> Calling this out explicitly is good, since it's an obvious question
> for somebody to have.
Thanks, I think this was an addition from Patrick's earlier review of
the series.
> OK, so each layer's bitmap does depend on the layers above/before it.
> That obviously needs to happen because each incremental midx is not
> likely to be a complete reachability set anyway.
>
> But I also wondered what would happen with a situation like this:
>
> A -- B
> \
> -- C
>
> stored like this:
>
> base midx:
> - pack 1:
> - object A
> - object B, which can reach A
> incremental midx:
> - pack 2:
> - object A
> - object C, which can reach A
>
> That is, two objects B and C both depend on A, which is duplicated in
> two midx layers. Even if the incremental midx is complete in the sense
> that C only depends on A, its bitmap cannot just be "11". Because the
> bit position for object A in the incremental midx does not exist in the
> pseudo-pack order at all! It must refer to the copy of "A" in the base
> midx, so it's correct bitmap is "101" (A and C, but not B).
>
Right. Since the base MIDX has objects A and B, B's bitmap here would be
"11". C's bit position in the subsequent layer is a function of where it
sits not just in that MIDX layer, but how many (de-duplicated) objects
exist in all prior layers. There are two, so the earliest bit position
possible to allocate towards C is the third bit. And since C reaches A,
its bitmap would indeed be "101".
> Again, just talking through it here.
Heh, thanks for saying so. It's good to know when we're just talking
through examples versus asking for changes. (Of course, the mere fact of
talking through an example is sometimes enough to suggest a change by
virtue of that example being confusing enough to need to be talked
through in the first place).
> > +Note also that only the bitmap pertaining to the most recent layer in an
> > +incremental MIDX chain is used to store reachability information about
> > +the interesting and uninteresting objects in a reachability query.
> > +Earlier bitmap layers are only used to look up commit and pseudo-merge
> > +bitmaps from that layer, as well as the type-level bitmaps for objects
> > +in that layer.
>
> I'm not quite sure what this means, but I guess you're saying that
> internally as we produce a bitmap, we'll always use the complete bitmap
> over all of the layers?
That's exactly right.
> > +To simplify the implementation, type-level bitmaps are iterated
> > +simultaneously, and their results are OR'd together to avoid recursively
> > +calling internal bitmap functions.
>
> OK, I guess we'll see what this means in the patches. ;)
>
> The general rules for the data structure make sense to me, though.
Great, and thanks in advance for the review as I work through the rest
of your emails :-).
Thanks,
Taylor
next prev parent reply other threads:[~2025-03-18 23:11 UTC|newest]
Thread overview: 136+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-15 21:01 [PATCH 00/13] midx: incremental multi-pack indexes, part two Taylor Blau
2024-08-15 21:01 ` [PATCH 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2024-08-15 21:01 ` [PATCH 02/13] pack-revindex: prepare for " Taylor Blau
2024-08-15 21:01 ` [PATCH 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2024-08-15 21:01 ` [PATCH 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2024-08-15 21:01 ` [PATCH 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2024-08-15 21:01 ` [PATCH 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2024-08-15 21:01 ` [PATCH 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2024-08-15 21:01 ` [PATCH 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2024-08-15 21:01 ` [PATCH 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2024-08-15 21:01 ` [PATCH 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2024-08-15 21:01 ` [PATCH 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2024-08-15 21:01 ` [PATCH 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2024-08-15 21:01 ` [PATCH 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2024-08-15 22:28 ` [PATCH v2 00/13] midx: incremental multi-pack indexes, part two Taylor Blau
2024-08-15 22:28 ` [PATCH v2 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2024-08-15 22:28 ` [PATCH v2 02/13] pack-revindex: prepare for " Taylor Blau
2024-08-15 22:28 ` [PATCH v2 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2024-08-15 22:29 ` [PATCH v2 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2024-08-15 22:29 ` [PATCH v2 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2024-08-15 22:29 ` [PATCH v2 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2024-08-15 22:29 ` [PATCH v2 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2024-08-15 22:29 ` [PATCH v2 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2024-08-15 22:29 ` [PATCH v2 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2024-08-28 17:55 ` [PATCH] fixup! " Junio C Hamano
2024-08-28 18:33 ` Jeff King
2024-08-29 18:57 ` Taylor Blau
2024-08-29 19:27 ` Jeff King
2024-11-19 20:56 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 00/13] midx: incremental multi-pack indexes, part two Taylor Blau
2024-11-19 22:07 ` [PATCH v3 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-02-28 23:26 ` Taylor Blau
2025-03-03 10:54 ` Patrick Steinhardt
2024-11-19 22:07 ` [PATCH v3 02/13] pack-revindex: prepare for " Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-02-28 23:39 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-02-28 23:49 ` Taylor Blau
2025-03-03 10:55 ` Patrick Steinhardt
2024-11-19 22:07 ` [PATCH v3 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:12 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2024-11-19 22:07 ` [PATCH v3 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:16 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:19 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2024-11-19 22:07 ` [PATCH v3 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2024-11-19 22:07 ` [PATCH v3 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:22 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:26 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:28 ` Taylor Blau
2024-11-19 22:07 ` [PATCH v3 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2025-02-28 10:01 ` Patrick Steinhardt
2025-03-01 0:31 ` Taylor Blau
2024-11-20 8:49 ` [PATCH v3 00/13] midx: incremental multi-pack indexes, part two Junio C Hamano
2025-03-14 20:18 ` [PATCH v4 " Taylor Blau
2025-03-14 20:18 ` [PATCH v4 01/13] Documentation: describe incremental MIDX bitmaps Taylor Blau
2025-03-18 1:16 ` Jeff King
2025-03-18 23:11 ` Taylor Blau [this message]
2025-03-18 2:42 ` Elijah Newren
2025-03-18 23:19 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 02/13] pack-revindex: prepare for " Taylor Blau
2025-03-18 1:27 ` Jeff King
2025-03-19 0:02 ` Taylor Blau
2025-03-19 0:07 ` Taylor Blau
2025-03-26 18:08 ` Jeff King
2025-03-18 2:43 ` Elijah Newren
2025-03-19 0:03 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 03/13] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2025-03-18 4:13 ` Elijah Newren
2025-03-19 0:08 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2025-03-18 1:38 ` Jeff King
2025-03-19 0:13 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 05/13] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2025-03-14 20:18 ` [PATCH v4 06/13] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2025-03-18 4:13 ` Elijah Newren
2025-03-19 0:17 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2025-03-18 5:31 ` Elijah Newren
2025-03-19 0:30 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 08/13] pack-bitmap.c: compute disk-usage with " Taylor Blau
2025-03-18 1:41 ` Jeff King
2025-03-19 0:30 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 09/13] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2025-03-14 20:18 ` [PATCH v4 10/13] ewah: implement `struct ewah_or_iterator` Taylor Blau
2025-03-18 1:44 ` Jeff King
2025-03-19 0:33 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 11/13] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2025-03-18 2:01 ` Jeff King
2025-03-19 0:38 ` Taylor Blau
2025-03-18 6:43 ` Elijah Newren
2025-03-19 0:39 ` Taylor Blau
2025-03-14 20:18 ` [PATCH v4 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2025-03-18 2:05 ` Jeff King
2025-03-19 23:02 ` Taylor Blau
2025-03-14 20:19 ` [PATCH v4 13/13] midx: implement writing incremental MIDX bitmaps Taylor Blau
2025-03-18 2:16 ` Jeff King
2025-03-20 0:14 ` Taylor Blau
2025-03-18 17:13 ` Elijah Newren
2025-03-20 0:16 ` Taylor Blau
2025-03-18 2:21 ` [PATCH v4 00/13] midx: incremental multi-pack indexes, part two Jeff King
2025-03-20 0:18 ` Taylor Blau
2025-03-20 17:56 ` [PATCH v5 00/14] " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 01/14] Documentation: remove a "future work" item from the MIDX docs Taylor Blau
2025-03-20 17:56 ` [PATCH v5 02/14] Documentation: describe incremental MIDX bitmaps Taylor Blau
2025-03-20 17:56 ` [PATCH v5 03/14] pack-revindex: prepare for " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 04/14] pack-bitmap.c: open and store incremental bitmap layers Taylor Blau
2025-03-20 17:56 ` [PATCH v5 05/14] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs Taylor Blau
2025-03-20 17:56 ` [PATCH v5 06/14] pack-bitmap.c: teach `show_objects_for_type()` " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 07/14] pack-bitmap.c: support bitmap pack-reuse with " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 08/14] pack-bitmap.c: teach `rev-list --test-bitmap` about " Taylor Blau
2025-03-20 17:56 ` Taylor Blau
2025-03-20 17:58 ` Taylor Blau
2025-03-20 17:56 ` [PATCH v5 09/14] pack-bitmap.c: compute disk-usage with " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 10/14] pack-bitmap.c: apply pseudo-merge commits " Taylor Blau
2025-03-20 17:56 ` [PATCH v5 11/14] ewah: implement `struct ewah_or_iterator` Taylor Blau
2025-03-20 17:57 ` [PATCH v5 12/14] pack-bitmap.c: keep track of each layer's type bitmaps Taylor Blau
2025-03-20 17:57 ` [PATCH v5 13/14] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Taylor Blau
2025-03-20 17:57 ` [PATCH v5 14/14] midx: implement writing incremental MIDX bitmaps Taylor Blau
2025-03-20 20:00 ` [PATCH v5 00/14] midx: incremental multi-pack indexes, part two Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z9n9jf7GDBMPyM2R@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).