From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] pack-bitmap: gracefully handle missing BTMP chunks
Date: Mon, 15 Apr 2024 08:34:33 +0200 [thread overview]
Message-ID: <ZhzKeR9C4UVpEm29@tanuki> (raw)
In-Reply-To: <Zhap8iyMYytCM+on@nand.local>
[-- Attachment #1: Type: text/plain, Size: 4764 bytes --]
On Wed, Apr 10, 2024 at 11:02:10AM -0400, Taylor Blau wrote:
> On Tue, Apr 09, 2024 at 07:59:25AM +0200, Patrick Steinhardt wrote:
[snip]
> > diff --git a/midx.c b/midx.c
> > index 41521e019c..6903e9dfd2 100644
> > --- a/midx.c
> > +++ b/midx.c
> > @@ -1661,9 +1661,10 @@ static int write_midx_internal(const char *object_dir,
> > add_chunk(cf, MIDX_CHUNKID_REVINDEX,
> > st_mult(ctx.entries_nr, sizeof(uint32_t)),
> > write_midx_revindex);
> > - add_chunk(cf, MIDX_CHUNKID_BITMAPPEDPACKS,
> > - bitmapped_packs_concat_len,
> > - write_midx_bitmapped_packs);
> > + if (git_env_bool("GIT_TEST_MIDX_WRITE_BTMP", 1))
> > + add_chunk(cf, MIDX_CHUNKID_BITMAPPEDPACKS,
> > + bitmapped_packs_concat_len,
> > + write_midx_bitmapped_packs);
>
> I wish that this were possible to exercise without a new
> GIT_TEST_-variable. I think there are a couple of alternatives:
>
> You could introduce a new GIT_TEST_MIDX_READ_BTMP variable, and then set
> that to control whether or not we read the BTMP chunk. This is what we
> did in:
>
> - 28cd730680d (pack-bitmap: prepare to read lookup table extension,
> 2022-08-14), as well as in
>
> - 7f514b7a5e7 (midx: read `RIDX` chunk when present, 2022-01-25)
>
> . I have a vague preference towards controlling whether or not we read
> the BTMP chunk (as opposed to whether or not we write it) as this
> removes a potential footgun for users who might accidentally disable
> writing a BTMP chunk (in which case you have to rewrite the whole MIDX)
> as opposed to reading it (in which case you just change your environment
> variable).
>
> Of course, that is still using a GIT_TEST_-variable, which is less than
> ideal IMHO. The other alternative would be to store a MIDX file as a
> test fixture in the tree (which we do in a couple of places). But with
> the recent xz shenanigans, I'm not sure that's a great idea either ;-).
I'm happy to convert this to use `GIT_TEST_MIDX_READ_BTMP` instead.
> > diff --git a/pack-bitmap.c b/pack-bitmap.c
> > index 2baeabacee..f286805724 100644
> > --- a/pack-bitmap.c
> > +++ b/pack-bitmap.c
> > @@ -2049,7 +2049,25 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
> >
> > load_reverse_index(r, bitmap_git);
> >
> > - if (bitmap_is_midx(bitmap_git)) {
> > + if (bitmap_is_midx(bitmap_git) &&
> > + (!multi_pack_reuse || !bitmap_git->midx->chunk_bitmapped_packs)) {
> > + uint32_t preferred_pack_pos;
> > + struct packed_git *pack;
> > +
> > + if (midx_preferred_pack(bitmap_git->midx, &preferred_pack_pos) < 0) {
> > + warning(_("unable to compute preferred pack, disabling pack-reuse"));
> > + return;
> > + }
> > +
> > + pack = bitmap_git->midx->packs[preferred_pack_pos];
> > +
> > + ALLOC_GROW(packs, packs_nr + 1, packs_alloc);
> > + packs[packs_nr].p = pack;
> > + packs[packs_nr].bitmap_nr = pack->num_objects;
> > + packs[packs_nr].bitmap_pos = 0;
> > +
> > + objects_nr = packs[packs_nr++].bitmap_nr;
> > + } else if (bitmap_is_midx(bitmap_git)) {
> > for (i = 0; i < bitmap_git->midx->num_packs; i++) {
> > struct bitmapped_pack pack;
> > if (nth_bitmapped_pack(r, bitmap_git->midx, &pack, i) < 0) {
>
> This all makes sense to me. I think we could make the result slightly
> more readable by handling the case where we're doing multi-pack reuse
> separately from the case where we're not.
>
> I tried to make that change locally to see if it was a good idea, and
> I'm reasonably happy with the result. I can't think of a great way to
> talk about it without just showing the resulting patch (as the
> inter-diff is fairly difficult to read IMHO). So here is the resulting
> patch (forging your s-o-b):
Yup, the result indeed looks nicer, thanks!
[snip]
> The way I would structure this series is to first apply the portion of
> the above patch *without* these lines:
>
> - if (bitmap_is_midx(bitmap_git)) {
> + if (!bitmap_is_midx(bitmap_git) ||
> + !bitmap_git->midx->chunk_bitmapped_packs)
> + multi_pack_reuse = 0;
> +
>
> , so we're still able to reproduce the issue. Then, apply the remaining
> portions (the above diff, the test, and the GIT_TEST_MIDX_WRITE_BTMP
> stuff) to demonstrate that the issue is fixed via a separate commit.
>
> I'm happy to write that up, and equally happy to not do it ;-). Sorry
> for the lengthy review, but thank you very much for spotting and fixing
> this issue.
I'd prefer to leave it as a single patch. Junio has expressed at times
that he doesn't see much value in these splits only to demonstrate a
broken test. An interested reader can easily revert the fix and see that
the test would fail.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-04-15 6:34 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-09 5:59 [PATCH] pack-bitmap: gracefully handle missing BTMP chunks Patrick Steinhardt
2024-04-10 15:02 ` Taylor Blau
2024-04-15 6:34 ` Patrick Steinhardt [this message]
2024-04-15 22:42 ` Taylor Blau
2024-04-15 6:41 ` [PATCH v2] " Patrick Steinhardt
2024-04-15 8:51 ` Patrick Steinhardt
2024-04-15 17:41 ` Junio C Hamano
2024-04-15 22:51 ` Taylor Blau
2024-04-15 23:46 ` Junio C Hamano
2024-04-15 22:51 ` Taylor Blau
2024-04-16 4:47 ` Patrick Steinhardt
2024-04-16 5:12 ` Jeff King
2024-04-16 5:14 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZhzKeR9C4UVpEm29@tanuki \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=me@ttaylorr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.