From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 07/17] midx-write.c: don't use `pack_perm` when assigning `bitmap_pos`
Date: Mon, 8 Dec 2025 20:59:54 -0500 [thread overview]
Message-ID: <aTeCmrLnRtYRm/ah@nand.local> (raw)
In-Reply-To: <aTcYU_yVYyXL9TXv@pks.im>
On Mon, Dec 08, 2025 at 07:26:27PM +0100, Patrick Steinhardt wrote:
> On Sat, Dec 06, 2025 at 03:31:19PM -0500, Taylor Blau wrote:
> > In midx_pack_order(), we compute for each bitampped pack the first bit
>
> s/bitampped/bitmapped/
Ugh. My "bitamp" typo strikes again, thanks for spotting!
> > to correspond to an object in that pack, along with how many bits were
> > assigned to object(s) in that pack.
> >
> > Initially, each bitmap_nr value is set to zero, and each bitmap_pos
>
> I assume `bitmap_nr` is the number of bits, whereas `bitmap_pos` is the
> position of the first bit?
That's right!
> > However, we enumerate the bitmapped packs in order of `ctx->pack_perm`.
>
> Which is the "permutation between pack-int-ids from the previous
> multi-pack-index to the new one we are writing"'. So it's basically
> tracking which new packs correspond to the old packs.
Ditto.
> So obviously, the permutation will only ever be different in case we've
> got at least one dropped pack, and that only happens when we expire any
> packs. So the explanation matches.
>
> Of course it may be a bit more fragile now if we ever added a caller
> of this function that _does_ expire data. But we don't have any, so that
> enters the territory of overthinking things.
I think that with incremental MIDXs we will never have such a caller
without a mechanism to tombstone objects in existing packs, but
definitely worth calling out.
> > diff --git a/midx-write.c b/midx-write.c
> > index 73d24fabbc6..c30f6a70d37 100644
> > --- a/midx-write.c
> > +++ b/midx-write.c
> > @@ -637,7 +637,7 @@ static uint32_t *midx_pack_order(struct write_midx_context *ctx)
> > pack_order[i] = data[i].nr;
> > }
> > for (i = 0; i < ctx->nr; i++) {
> > - struct pack_info *pack = &ctx->info[ctx->pack_perm[i]];
> > + struct pack_info *pack = &ctx->info[i];
> > if (pack->bitmap_pos == BITMAP_POS_UNKNOWN)
> > pack->bitmap_pos = 0;
> > }
>
> The change looks simple enough.
Yeah, I almost wonder if the commit message was more harmful than not.
The main points that I wanted to get across were:
- Ultimately we want to enumerate a list, and there's no reason to do
that in a permuted order.
- Iterating in that permuted order is fine today because the array of
values in ctx->pack_perm are always addressable indices into
ctx->info.
- That won't be the case in the future when we are combining packs from
MIDX layers that have a non-zero m->num_packs_in_base, so adjusting
the implementation now prevents us from running into that pitfall in
such a future.
Let me know if you think that I should adjust the commit message here.
It's hard to know whether something resembling the above is better or
worse than the current version of the commit message from a reviewer's
perspective, so I'm happy to do whatever you think is cleaner ;-).
Thanks,
Taylor
next prev parent reply other threads:[~2025-12-09 1:59 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-06 20:30 [PATCH 00/17] midx: incremental MIDX/bitmap layer compaction Taylor Blau
2025-12-06 20:31 ` [PATCH 01/17] midx: mark `get_midx_checksum()` arguments as const Taylor Blau
2025-12-08 18:26 ` Patrick Steinhardt
2025-12-09 1:41 ` Taylor Blau
2025-12-06 20:31 ` [PATCH 02/17] midx: split `get_midx_checksum()` by adding `get_midx_hash()` Taylor Blau
2025-12-08 18:25 ` Patrick Steinhardt
2025-12-09 1:42 ` Taylor Blau
2025-12-09 1:50 ` Taylor Blau
2025-12-09 6:27 ` Patrick Steinhardt
2025-12-06 20:31 ` [PATCH 03/17] builtin/multi-pack-index.c: make '--progress' a common option Taylor Blau
2025-12-06 20:31 ` [PATCH 04/17] git-multi-pack-index(1): remove non-existent incompatibility Taylor Blau
2025-12-06 20:31 ` [PATCH 05/17] git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h' Taylor Blau
2025-12-06 20:31 ` [PATCH 06/17] t/t5319-multi-pack-index.sh: fix copy-and-paste error in t5319.39 Taylor Blau
2025-12-06 20:31 ` [PATCH 07/17] midx-write.c: don't use `pack_perm` when assigning `bitmap_pos` Taylor Blau
2025-12-08 18:26 ` Patrick Steinhardt
2025-12-09 1:59 ` Taylor Blau [this message]
2025-12-06 20:31 ` [PATCH 08/17] midx-write.c: introduce `struct write_midx_opts` Taylor Blau
2025-12-08 18:26 ` Patrick Steinhardt
2025-12-09 2:04 ` Taylor Blau
2025-12-06 20:31 ` [PATCH 09/17] midx: do not require packs to be sorted in lexicographic order Taylor Blau
2025-12-08 18:26 ` Patrick Steinhardt
2025-12-09 2:07 ` Taylor Blau
2025-12-09 2:11 ` Taylor Blau
2025-12-06 20:31 ` [PATCH 10/17] git-compat-util.h: introduce `u32_add()` Taylor Blau
2025-12-08 18:27 ` Patrick Steinhardt
2025-12-09 2:13 ` Taylor Blau
2025-12-06 20:31 ` [PATCH 11/17] midx-write.c: introduce `midx_pack_perm()` helper Taylor Blau
2025-12-06 20:31 ` [PATCH 12/17] midx-write.c: extract `fill_pack_from_midx()` Taylor Blau
2025-12-06 20:31 ` [PATCH 13/17] midx-write.c: enumerate `pack_int_id` values directly Taylor Blau
2025-12-08 18:27 ` Patrick Steinhardt
2025-12-09 2:14 ` Taylor Blau
2025-12-06 20:31 ` [PATCH 14/17] midx-write.c: factor fanout layering from `compute_sorted_entries()` Taylor Blau
2025-12-06 20:31 ` [PATCH 15/17] t/helper/test-read-midx.c: plug memory leak when selecting layer Taylor Blau
2025-12-08 18:27 ` Patrick Steinhardt
2025-12-09 2:16 ` Taylor Blau
2025-12-06 20:31 ` [PATCH 16/17] midx: implement MIDX compaction Taylor Blau
2025-12-09 7:21 ` Patrick Steinhardt
2025-12-06 20:31 ` [PATCH 17/17] midx: enable reachability bitmaps during " Taylor Blau
2025-12-09 7:21 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aTeCmrLnRtYRm/ah@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).