git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] pack-bitmap: gracefully handle missing BTMP chunks
Date: Mon, 15 Apr 2024 08:34:33 +0200	[thread overview]
Message-ID: <ZhzKeR9C4UVpEm29@tanuki> (raw)
In-Reply-To: <Zhap8iyMYytCM+on@nand.local>

[-- Attachment #1: Type: text/plain, Size: 4764 bytes --]

On Wed, Apr 10, 2024 at 11:02:10AM -0400, Taylor Blau wrote:
> On Tue, Apr 09, 2024 at 07:59:25AM +0200, Patrick Steinhardt wrote:
[snip]
> > diff --git a/midx.c b/midx.c
> > index 41521e019c..6903e9dfd2 100644
> > --- a/midx.c
> > +++ b/midx.c
> > @@ -1661,9 +1661,10 @@ static int write_midx_internal(const char *object_dir,
> >  		add_chunk(cf, MIDX_CHUNKID_REVINDEX,
> >  			  st_mult(ctx.entries_nr, sizeof(uint32_t)),
> >  			  write_midx_revindex);
> > -		add_chunk(cf, MIDX_CHUNKID_BITMAPPEDPACKS,
> > -			  bitmapped_packs_concat_len,
> > -			  write_midx_bitmapped_packs);
> > +		if (git_env_bool("GIT_TEST_MIDX_WRITE_BTMP", 1))
> > +			add_chunk(cf, MIDX_CHUNKID_BITMAPPEDPACKS,
> > +				  bitmapped_packs_concat_len,
> > +				  write_midx_bitmapped_packs);
> 
> I wish that this were possible to exercise without a new
> GIT_TEST_-variable. I think there are a couple of alternatives:
> 
> You could introduce a new GIT_TEST_MIDX_READ_BTMP variable, and then set
> that to control whether or not we read the BTMP chunk. This is what we
> did in:
> 
>   - 28cd730680d (pack-bitmap: prepare to read lookup table extension,
>     2022-08-14), as well as in
> 
>   - 7f514b7a5e7 (midx: read `RIDX` chunk when present, 2022-01-25)
> 
> . I have a vague preference towards controlling whether or not we read
> the BTMP chunk (as opposed to whether or not we write it) as this
> removes a potential footgun for users who might accidentally disable
> writing a BTMP chunk (in which case you have to rewrite the whole MIDX)
> as opposed to reading it (in which case you just change your environment
> variable).
> 
> Of course, that is still using a GIT_TEST_-variable, which is less than
> ideal IMHO. The other alternative would be to store a MIDX file as a
> test fixture in the tree (which we do in a couple of places). But with
> the recent xz shenanigans, I'm not sure that's a great idea either ;-).

I'm happy to convert this to use `GIT_TEST_MIDX_READ_BTMP` instead.

> > diff --git a/pack-bitmap.c b/pack-bitmap.c
> > index 2baeabacee..f286805724 100644
> > --- a/pack-bitmap.c
> > +++ b/pack-bitmap.c
> > @@ -2049,7 +2049,25 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
> >
> >  	load_reverse_index(r, bitmap_git);
> >
> > -	if (bitmap_is_midx(bitmap_git)) {
> > +	if (bitmap_is_midx(bitmap_git) &&
> > +	    (!multi_pack_reuse || !bitmap_git->midx->chunk_bitmapped_packs)) {
> > +		uint32_t preferred_pack_pos;
> > +		struct packed_git *pack;
> > +
> > +		if (midx_preferred_pack(bitmap_git->midx, &preferred_pack_pos) < 0) {
> > +			warning(_("unable to compute preferred pack, disabling pack-reuse"));
> > +			return;
> > +		}
> > +
> > +		pack = bitmap_git->midx->packs[preferred_pack_pos];
> > +
> > +		ALLOC_GROW(packs, packs_nr + 1, packs_alloc);
> > +		packs[packs_nr].p = pack;
> > +		packs[packs_nr].bitmap_nr = pack->num_objects;
> > +		packs[packs_nr].bitmap_pos = 0;
> > +
> > +		objects_nr = packs[packs_nr++].bitmap_nr;
> > +	} else if (bitmap_is_midx(bitmap_git)) {
> >  		for (i = 0; i < bitmap_git->midx->num_packs; i++) {
> >  			struct bitmapped_pack pack;
> >  			if (nth_bitmapped_pack(r, bitmap_git->midx, &pack, i) < 0) {
> 
> This all makes sense to me. I think we could make the result slightly
> more readable by handling the case where we're doing multi-pack reuse
> separately from the case where we're not.
> 
> I tried to make that change locally to see if it was a good idea, and
> I'm reasonably happy with the result. I can't think of a great way to
> talk about it without just showing the resulting patch (as the
> inter-diff is fairly difficult to read IMHO). So here is the resulting
> patch (forging your s-o-b):

Yup, the result indeed looks nicer, thanks!

[snip]
> The way I would structure this series is to first apply the portion of
> the above patch *without* these lines:
> 
> -	if (bitmap_is_midx(bitmap_git)) {
> +	if (!bitmap_is_midx(bitmap_git) ||
> +	    !bitmap_git->midx->chunk_bitmapped_packs)
> +		multi_pack_reuse = 0;
> +
> 
> , so we're still able to reproduce the issue. Then, apply the remaining
> portions (the above diff, the test, and the GIT_TEST_MIDX_WRITE_BTMP
> stuff) to demonstrate that the issue is fixed via a separate commit.
> 
> I'm happy to write that up, and equally happy to not do it ;-). Sorry
> for the lengthy review, but thank you very much for spotting and fixing
> this issue.

I'd prefer to leave it as a single patch. Junio has expressed at times
that he doesn't see much value in these splits only to demonstrate a
broken test. An interested reader can easily revert the fix and see that
the test would fail.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-04-15  6:34 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-09  5:59 [PATCH] pack-bitmap: gracefully handle missing BTMP chunks Patrick Steinhardt
2024-04-10 15:02 ` Taylor Blau
2024-04-15  6:34   ` Patrick Steinhardt [this message]
2024-04-15 22:42     ` Taylor Blau
2024-04-15  6:41 ` [PATCH v2] " Patrick Steinhardt
2024-04-15  8:51   ` Patrick Steinhardt
2024-04-15 17:41     ` Junio C Hamano
2024-04-15 22:51       ` Taylor Blau
2024-04-15 23:46         ` Junio C Hamano
2024-04-15 22:51   ` Taylor Blau
2024-04-16  4:47     ` Patrick Steinhardt
2024-04-16  5:12       ` Jeff King
2024-04-16  5:14         ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZhzKeR9C4UVpEm29@tanuki \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).