git.vger.kernel.org archive mirror
From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
	Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 01/13] Documentation: describe incremental MIDX bitmaps
Date: Mon, 3 Mar 2025 11:54:58 +0100
Message-ID: <Z8WKgisnKb5zc1xO@pks.im>
In-Reply-To: <Z8JGNQZolfs7fm65@nand.local>

On Fri, Feb 28, 2025 at 06:26:45PM -0500, Taylor Blau wrote:
> On Fri, Feb 28, 2025 at 11:01:04AM +0100, Patrick Steinhardt wrote:
> > On Tue, Nov 19, 2024 at 05:07:19PM -0500, Taylor Blau wrote:
> > > diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt
> > > index cc063b30bea..a063262c360 100644
> > > --- a/Documentation/technical/multi-pack-index.txt
> > > +++ b/Documentation/technical/multi-pack-index.txt
> > > @@ -164,6 +164,70 @@ objects_nr($H2) + objects_nr($H1) + i
> > >  (in the C implementation, this is often computed as `i +
> > >  m->num_objects_in_base`).
> > >
> > > +=== Pseudo-pack order for incremental MIDXs
> > > +
> > > +The original implementation of multi-pack reachability bitmaps defined
> > > +the pseudo-pack order in linkgit:gitformat-pack[5] (see the section
> > > +titled "multi-pack-index reverse indexes") roughly as follows:
> > > +
> > > +____
> > > +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
> > > +objects in packs stored by the MIDX, laid out in pack order, and the
> > > +packs arranged in MIDX order (with the preferred pack coming first).
> > > +____
> > > +
> > > +In the incremental MIDX design, we extend this definition to include
> > > +objects from multiple layers of the MIDX chain. The pseudo-pack order
> > > +for incremental MIDXs is determined by concatenating the pseudo-pack
> > > +ordering for each layer of the MIDX chain in order. Formally two objects
> > > +`o1` and `o2` are compared as follows:
> > > +
> > > +1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
> > > +  `o1` is considered less than `o2`.
> >
> > Just as a refresher for myself: what is the consequence of an object
> > `o1` sorting earlier than `o2`? In the case where those refer to
> > different objects it is only used to establish the pseudo-pack order so
> > that we know how to interpret the bitmaps. But in the case where those
> > two objects refer to the same underlying object, e.g. because the object
> > is contained in two packs, it also impacts which of both objects would
> > be preferred e.g. during a clone, right?
> 
> Great question -- the pseudo-pack order here is how we translate the set
> of objects in a MIDX into their corresponding bit positions in the
> bitmap.
> 
> So if "o1" sorts ahead of "o2", that means that "o1" will appear in an
> earlier bit position than "o2". But note that we're talking about
> objects in a MIDX chain here, comprised of objects from each MIDX'd layer of
> that chain. So by that point the duplicates have already been filtered
> out, since:
> 
>   - The MIDX only stores one copy of an object in any given MIDX, and
> 
>   - The incremental MIDX design avoids putting objects from earlier
>     layers in later ones.
> 
> I tried to get at this a few lines up with "[...] a MIDX's pseudo-pack
> is the de-duplicated concatenation of [...]" to make clear that o1 != o2
> here. But let me know if you think I should clarify or emphasize that
> point further.

Okay, the de-duplication part was subtle, so I missed it. With that in
mind my question makes less sense, as I had assumed that an object may
appear multiple times in the same MIDX chain.
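
As a minimal sketch of the ordering discussed above (not Git's actual
implementation): the Python below models a MIDX chain as a list of
layers, each layer as a list of packs in MIDX order with the preferred
pack first, and each pack as a list of object IDs in pack order. The
names `pseudo_pack_order` and `bit_positions` are hypothetical, and
"first occurrence wins" is a simplifying assumption standing in for the
de-duplication that the MIDX and the incremental design already provide.

```python
from typing import Dict, List

Pack = List[str]    # object IDs in pack order
Layer = List[Pack]  # packs in MIDX order, preferred pack first


def pseudo_pack_order(chain: List[Layer]) -> List[str]:
    """Concatenate each layer's pseudo-pack order, earliest layer first.

    Assumption for illustration: an object that was already emitted
    (from an earlier pack or an earlier layer) is skipped, so every
    object appears exactly once in the result.
    """
    seen = set()
    order: List[str] = []
    for layer in chain:          # earlier layers sort first
        for pack in layer:       # then MIDX pack order within a layer
            for oid in pack:     # then pack order within a pack
                if oid in seen:
                    continue
                seen.add(oid)
                order.append(oid)
    return order


def bit_positions(chain: List[Layer]) -> Dict[str, int]:
    """Map each object to its bit position in the bitmap."""
    return {oid: pos for pos, oid in enumerate(pseudo_pack_order(chain))}


# Hypothetical two-layer chain. "b" is stored in two packs of the base
# layer, but it still gets only a single bit position.
chain = [
    [["a", "b"], ["b", "c"]],  # base layer
    [["d"], ["e"]],            # second layer: only objects new to the chain
]
bits = bit_positions(chain)
print(bits)
```

Under these assumptions an object in the second layer gets its position
within that layer plus the number of objects in the base layer, which
matches the `i + m->num_objects_in_base` computation quoted in the patch
context.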

Patrick

