git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
	Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 07/17] midx-write.c: don't use `pack_perm` when assigning `bitmap_pos`
Date: Mon, 8 Dec 2025 20:59:54 -0500	[thread overview]
Message-ID: <aTeCmrLnRtYRm/ah@nand.local> (raw)
In-Reply-To: <aTcYU_yVYyXL9TXv@pks.im>

On Mon, Dec 08, 2025 at 07:26:27PM +0100, Patrick Steinhardt wrote:
> On Sat, Dec 06, 2025 at 03:31:19PM -0500, Taylor Blau wrote:
> > In midx_pack_order(), we compute for each bitampped pack the first bit
>
> s/bitampped/bitmapped/

Ugh. My "bitamp" typo strikes again, thanks for spotting!

> > to correspond to an object in that pack, along with how many bits were
> > assigned to object(s) in that pack.
> >
> > Initially, each bitmap_nr value is set to zero, and each bitmap_pos
>
> I assume `bitmap_nr` is the number of bits, whereas `bitmap_pos` is the
> position of the first bit?

That's right!

> > However, we enumerate the bitmapped packs in order of `ctx->pack_perm`.
>
> Which is the "permutation between pack-int-ids from the previous
> multi-pack-index to the new one we are writing"'. So it's basically
> tracking which new packs correspond to the old packs.

Ditto.

> So obviously, the permutation will only ever be different in case we've
> got at least one dropped pack, and that only happens when we expire any
> packs. So the explanation matches.
>
> Of course it may be a bit more fragile now if we ever added a caller
> of this function that _does_ expire data. But we don't have any, so that
> enters the territory of overthinking things.

I think that with incremental MIDXs we will never have such a caller
without a mechanism to tombstone objects in existing packs, but
definitely worth calling out.

> > diff --git a/midx-write.c b/midx-write.c
> > index 73d24fabbc6..c30f6a70d37 100644
> > --- a/midx-write.c
> > +++ b/midx-write.c
> > @@ -637,7 +637,7 @@ static uint32_t *midx_pack_order(struct write_midx_context *ctx)
> >  		pack_order[i] = data[i].nr;
> >  	}
> >  	for (i = 0; i < ctx->nr; i++) {
> > -		struct pack_info *pack = &ctx->info[ctx->pack_perm[i]];
> > +		struct pack_info *pack = &ctx->info[i];
> >  		if (pack->bitmap_pos == BITMAP_POS_UNKNOWN)
> >  			pack->bitmap_pos = 0;
> >  	}
>
> The change looks simple enough.

Yeah, I almost wonder if the commit message was more harmful than not.
The main points that I wanted to get across were:

 - Ultimately we want to enumerate a list, and there's no reason to do
   that in a permuted order.

 - Iterating in that permuted order is fine today because the array of
   values in ctx->pack_perm are always addressable indices into
   ctx->info.

 - That won't be the case in the future when we are combining packs from
   MIDX layers that have a non-zero m->num_packs_in_base, so adjusting
   the implementation now prevents us from running into that pitfall in
   such a future.

Let me know if you think that I should adjust the commit message here.
It's hard to know whether something resembling the above is better or
worse than the current version of the commit message from a reviewer's
perspective, so I'm happy to do whatever you think is cleaner ;-).

Thanks,
Taylor

  reply	other threads:[~2025-12-09  1:59 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-06 20:30 [PATCH 00/17] midx: incremental MIDX/bitmap layer compaction Taylor Blau
2025-12-06 20:31 ` [PATCH 01/17] midx: mark `get_midx_checksum()` arguments as const Taylor Blau
2025-12-08 18:26   ` Patrick Steinhardt
2025-12-09  1:41     ` Taylor Blau
2025-12-06 20:31 ` [PATCH 02/17] midx: split `get_midx_checksum()` by adding `get_midx_hash()` Taylor Blau
2025-12-08 18:25   ` Patrick Steinhardt
2025-12-09  1:42     ` Taylor Blau
2025-12-09  1:50       ` Taylor Blau
2025-12-09  6:27         ` Patrick Steinhardt
2025-12-06 20:31 ` [PATCH 03/17] builtin/multi-pack-index.c: make '--progress' a common option Taylor Blau
2025-12-06 20:31 ` [PATCH 04/17] git-multi-pack-index(1): remove non-existent incompatibility Taylor Blau
2025-12-06 20:31 ` [PATCH 05/17] git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h' Taylor Blau
2025-12-06 20:31 ` [PATCH 06/17] t/t5319-multi-pack-index.sh: fix copy-and-paste error in t5319.39 Taylor Blau
2025-12-06 20:31 ` [PATCH 07/17] midx-write.c: don't use `pack_perm` when assigning `bitmap_pos` Taylor Blau
2025-12-08 18:26   ` Patrick Steinhardt
2025-12-09  1:59     ` Taylor Blau [this message]
2025-12-06 20:31 ` [PATCH 08/17] midx-write.c: introduce `struct write_midx_opts` Taylor Blau
2025-12-08 18:26   ` Patrick Steinhardt
2025-12-09  2:04     ` Taylor Blau
2025-12-06 20:31 ` [PATCH 09/17] midx: do not require packs to be sorted in lexicographic order Taylor Blau
2025-12-08 18:26   ` Patrick Steinhardt
2025-12-09  2:07     ` Taylor Blau
2025-12-09  2:11       ` Taylor Blau
2025-12-06 20:31 ` [PATCH 10/17] git-compat-util.h: introduce `u32_add()` Taylor Blau
2025-12-08 18:27   ` Patrick Steinhardt
2025-12-09  2:13     ` Taylor Blau
2025-12-06 20:31 ` [PATCH 11/17] midx-write.c: introduce `midx_pack_perm()` helper Taylor Blau
2025-12-06 20:31 ` [PATCH 12/17] midx-write.c: extract `fill_pack_from_midx()` Taylor Blau
2025-12-06 20:31 ` [PATCH 13/17] midx-write.c: enumerate `pack_int_id` values directly Taylor Blau
2025-12-08 18:27   ` Patrick Steinhardt
2025-12-09  2:14     ` Taylor Blau
2025-12-06 20:31 ` [PATCH 14/17] midx-write.c: factor fanout layering from `compute_sorted_entries()` Taylor Blau
2025-12-06 20:31 ` [PATCH 15/17] t/helper/test-read-midx.c: plug memory leak when selecting layer Taylor Blau
2025-12-08 18:27   ` Patrick Steinhardt
2025-12-09  2:16     ` Taylor Blau
2025-12-06 20:31 ` [PATCH 16/17] midx: implement MIDX compaction Taylor Blau
2025-12-09  7:21   ` Patrick Steinhardt
2025-12-06 20:31 ` [PATCH 17/17] midx: enable reachability bitmaps during " Taylor Blau
2025-12-09  7:21   ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aTeCmrLnRtYRm/ah@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).