All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, peff@peff.net, dstolee@microsoft.com
Subject: Re: [PATCH v2 2/8] repack: fix trying to use preferred pack in alternates
Date: Thu, 13 Apr 2023 11:31:43 +0200	[thread overview]
Message-ID: <ZDfL_6G-ySm94Vla@ncase> (raw)
In-Reply-To: <ZDb6gZycwTdsaB6o@nand.local>

[-- Attachment #1: Type: text/plain, Size: 5395 bytes --]

On Wed, Apr 12, 2023 at 02:37:53PM -0400, Taylor Blau wrote:
> On Wed, Apr 12, 2023 at 12:22:35PM +0200, Patrick Steinhardt wrote:
> > When doing a geometric repack with multi-pack-indices, then we ask
> > git-multi-pack-index(1) to use the largest packfile as the preferred
> > pack. It can happen though that the largest packfile is not part of the
> > main object database, but instead part of an alternate object database.
> > The result is that git-multi-pack-index(1) will not be able to find the
> > preferred pack and print a warning. It then falls back to use the first
> > packfile that the multi-pack-index shall reference.
> >
> > Fix this bug by only considering packfiles as preferred pack that are
> > local. This is the right thing to do given that a multi-pack-index
> > should never reference packfiles borrowed from an alternate.
> >
> > While at it, rename the function `get_largest_active_packfile()` to
> > `get_preferred_pack()` to better document its intent.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> 
> > @@ -464,7 +466,16 @@ static struct packed_git *get_largest_active_pack(struct pack_geometry *geometry
> >  	}
> >  	if (geometry->split == geometry->pack_nr)
> >  		return NULL;
> > -	return geometry->pack[geometry->pack_nr - 1];
> > +
> > +	for (i = geometry->pack_nr; i > 0; i--)
> > +		/*
> > +		 * A pack that is not local would never be included in a
> > +		 * multi-pack index. We thus skip over any non-local packs.
> > +		 */
> > +		if (geometry->pack[i - 1]->pack_local)
> > +			return geometry->pack[i - 1];
> > +
> > +	return NULL;
> >  }
> 
> Looking good, we want to go through this list in reverse order, since
> the packs are ordered largest to smallest. I think that you could
> slightly rewrite the loop condition to avoid having to always access
> `geometry->pack` at `i-1` instead of `i`, like so:
> 
> --- 8< ---
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 9d36dc8b84..ba468ac44e 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -467,13 +467,15 @@ static struct packed_git *get_preferred_pack(struct pack_geometry *geometry)
>  	if (geometry->split == geometry->pack_nr)
>  		return NULL;
> 
> -	for (i = geometry->pack_nr; i > 0; i--)
> +	for (i = geometry->pack_nr - 1; i >= 0; i--) {
>  		/*
>  		 * A pack that is not local would never be included in a
>  		 * multi-pack index. We thus skip over any non-local packs.
>  		 */
> -		if (geometry->pack[i - 1]->pack_local)
> -			return geometry->pack[i - 1];
> +		struct packed_git *p = geometry->pack[i];
> +		if (p->pack_local)
> +			return p;
> +	}
> 
>  	return NULL;
>  }
> --- >8 ---

There's a gotcha: `i` is an `unit32_t`, so `i >= 0` would always be true
and thus we wrap around and would try to access the pack array at an
out-of-bound index.

> but I'm not sure that the loop condition is quite right to begin with.
> We don't want to iterate all the way down to the beginning of the pack
> list, since some of those packs may be below the "split" line, IOW their
> contents are going to be rolled up and the packs destroyed.
> 
> So I think the right loop condition would be:
> 
>     for (i = geometry->pack_nr - 1; i >= geometry->split; i--)
> 
> which I think makes the "if (geometry->split == geometry->pack_nr)"
> condition above redundant with this loop, so you could probably drop
> that.
> 
> I would definitely appreciate a second set of eye here. The pack *at*
> the split line is considered frozen (IOW, we create a new pack
> consisting of the packs in range [0, geometry->split), and leave the
> packs in range [geometry->split, geometry->nr) alone).
> 
> So it should be fine to enter that loop when "i == geometry->split",
> because it's OK to return that as a potential preferred pack.

That makes sense indeed. Will amend.

[snip]
> > +	test $(wc -l <packs) = 1 &&
> > +
> > +	# We should also see a multi-pack-index. This multi-pack-index should
> > +	# never refer to any packfiles in the alternate object database.
> > +	# Consequentially, it should be valid even with the alternate ODB
> > +	# deleted.
> > +	rm -r shared &&
> > +	git -C member multi-pack-index verify
> 
> To test this even more directly, I think that you could ensure that the
> set of packs (which should just be the single entry found in "packs"
> above) matches what we expect. That should look something like:
> 
>     test-tool read-midx member/.git/objects >packs.midx &&
>     grep "^pack-.*\.idx$" packs.midx | sort >actual &&
>     xargs -n 1 basename <packs | sort >expect
>     test_cmp expect actual
> 
> Note that the read-midx test helper outputs packs with the "*.idx"
> suffix, so you'd probably want to alter your find invocation a few lines
> above accordingly, or rewrite it before writing into "actual".
> 
> Alternatively, since we expect there to be just a single pack here, I
> think we could just as easily write something like:
> 
>     test-tool read-midx member/.git/objects >packs.midx &&
>     grep "^pack-.*\.idx$" packs.midx >actual &&
>     test_line_count = 1 actual &&
>     grep $(cat actual) packs
> 
> though I slightly prefer the former since it is less brittle to changing
> this test. Either way, though.

Thanks, I've adopted that approach with a slight modification.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-04-13  9:31 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-04 11:08 [PATCH] repack: fix geometric repacking with gitalternates Patrick Steinhardt
2023-04-04 18:55 ` Taylor Blau
2023-04-04 19:00   ` Taylor Blau
2023-04-05  7:08   ` Patrick Steinhardt
2023-04-10 15:06     ` Derrick Stolee
2023-04-10 23:49       ` Taylor Blau
2023-04-11 17:13         ` Patrick Steinhardt
2023-04-11 21:13           ` Taylor Blau
2023-04-12  9:37             ` Patrick Steinhardt
2023-04-11 17:06       ` Patrick Steinhardt
2023-04-11 17:26         ` Patrick Steinhardt
2023-04-11 21:14           ` Taylor Blau
2023-04-10 23:29     ` Taylor Blau
2023-04-12 10:22 ` [PATCH v2 0/8] " Patrick Steinhardt
2023-04-12 10:22   ` [PATCH v2 1/8] midx: fix segfault with no packs and invalid preferred pack Patrick Steinhardt
2023-04-12 17:56     ` Taylor Blau
2023-04-13  9:28       ` Patrick Steinhardt
2023-04-12 10:22   ` [PATCH v2 2/8] repack: fix trying to use preferred pack in alternates Patrick Steinhardt
2023-04-12 18:37     ` Taylor Blau
2023-04-13  9:31       ` Patrick Steinhardt [this message]
2023-04-12 10:22   ` [PATCH v2 3/8] repack: fix generating multi-pack-index with only non-local packs Patrick Steinhardt
2023-04-12 20:39     ` Taylor Blau
2023-04-12 10:22   ` [PATCH v2 4/8] pack-objects: fix error when packing same pack twice Patrick Steinhardt
2023-04-12 21:33     ` Taylor Blau
2023-04-12 10:22   ` [PATCH v2 5/8] pack-objects: fix error when same packfile is included and excluded Patrick Steinhardt
2023-04-12 21:52     ` Taylor Blau
2023-04-12 10:22   ` [PATCH v2 6/8] pack-objects: extend test coverage of `--stdin-packs` with alternates Patrick Steinhardt
2023-04-12 10:22   ` [PATCH v2 7/8] repack: honor `-l` when calculating pack geometry Patrick Steinhardt
2023-04-12 23:56     ` Junio C Hamano
2023-04-13  5:11       ` Junio C Hamano
2023-04-13  6:41         ` Patrick Steinhardt
2023-04-12 10:23   ` [PATCH v2 8/8] repack: disable writing bitmaps when doing a local geometric repack Patrick Steinhardt
2023-04-12 22:01     ` Taylor Blau
2023-04-13  9:54       ` Patrick Steinhardt
2023-04-13 10:14         ` Patrick Steinhardt
2023-04-12 22:02   ` [PATCH v2 0/8] repack: fix geometric repacking with gitalternates Taylor Blau
2023-04-13 11:16 ` [PATCH v3 00/10] " Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 01/10] midx: fix segfault with no packs and invalid preferred pack Patrick Steinhardt
2023-04-13 13:49     ` Derrick Stolee
2023-04-13 11:16   ` [PATCH v3 02/10] repack: fix trying to use preferred pack in alternates Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 03/10] repack: fix generating multi-pack-index with only non-local packs Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 04/10] pack-objects: split out `--stdin-packs` tests into separate file Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 05/10] pack-objects: fix error when packing same pack twice Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 06/10] pack-objects: fix error when same packfile is included and excluded Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 07/10] pack-objects: extend test coverage of `--stdin-packs` with alternates Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 08/10] t/helper: allow chmtime to print verbosely without modifying mtime Patrick Steinhardt
2023-04-13 11:16   ` [PATCH v3 09/10] repack: honor `-l` when calculating pack geometry Patrick Steinhardt
2023-04-13 13:59     ` Derrick Stolee
2023-04-13 14:13       ` Patrick Steinhardt
2023-04-13 15:40       ` Junio C Hamano
2023-04-13 11:16   ` [PATCH v3 10/10] repack: disable writing bitmaps when doing a local repack Patrick Steinhardt
2023-04-13 14:54   ` [PATCH v3 00/10] repack: fix geometric repacking with gitalternates Derrick Stolee
2023-04-14  2:03   ` Junio C Hamano
2023-04-14  5:42     ` Patrick Steinhardt
2023-04-14  6:01 ` [PATCH v4 " Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 01/10] midx: fix segfault with no packs and invalid preferred pack Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 02/10] repack: fix trying to use preferred pack in alternates Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 03/10] repack: fix generating multi-pack-index with only non-local packs Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 04/10] pack-objects: split out `--stdin-packs` tests into separate file Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 05/10] pack-objects: fix error when packing same pack twice Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 06/10] pack-objects: fix error when same packfile is included and excluded Patrick Steinhardt
2023-04-14  6:01   ` [PATCH v4 07/10] pack-objects: extend test coverage of `--stdin-packs` with alternates Patrick Steinhardt
2023-04-14  6:02   ` [PATCH v4 08/10] t/helper: allow chmtime to print verbosely without modifying mtime Patrick Steinhardt
2023-04-14  6:02   ` [PATCH v4 09/10] repack: honor `-l` when calculating pack geometry Patrick Steinhardt
2023-04-14  6:02   ` [PATCH v4 10/10] repack: disable writing bitmaps when doing a local repack Patrick Steinhardt
2023-04-14 13:23   ` [PATCH v4 00/10] repack: fix geometric repacking with gitalternates Derrick Stolee
2023-04-14 17:29     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZDfL_6G-ySm94Vla@ncase \
    --to=ps@pks.im \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.