git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, peff@peff.net
Subject: Re: [PATCH 1/5] builtin/repack.c: do not repack single packs with --geometric
Date: Fri, 05 Mar 2021 11:15:58 -0800	[thread overview]
Message-ID: <xmqqv9a59ztd.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <80bc7fa8397491d015b80a39168813d2019e262d.1614957681.git.me@ttaylorr.com> (Taylor Blau's message of "Fri, 5 Mar 2021 10:21:37 -0500")

Taylor Blau <me@ttaylorr.com> writes:

> Loosen the guard to only stop when there aren't any packs, and let the
> rest of the code do the right thing. Add a test to ensure that this is
> the case.
>
> Noticed-by: Junio C Hamano <gitster@pobox.com>

I do not think I "noticed" anything, though.

> -	if (geometry->pack_nr <= 1) {
> +	if (!geometry->pack_nr) {
>  		geometry->split = geometry->pack_nr;
>  		return;
>  	}

When pack_nr is 0, split is set to 0.  Otherwise we compute the
split the usual way in the new code.  Let's see the post-context of
the above code and figure out what happens when pack_nr is 1.

	split = geometry->pack_nr - 1;

split set to 0 here.

	/*
	 * First, count the number of packs (in descending order of size) which
	 * already form a geometric progression.
	 */
	for (i = geometry->pack_nr - 1; i > 0; i--) {

Iterate from i = 0 while i is positive, which means this entire loop
is skipped.

		struct packed_git *ours = geometry->pack[i];
		struct packed_git *prev = geometry->pack[i - 1];
		if (geometry_pack_weight(ours) >= factor * geometry_pack_weight(prev))
			split--;
		else
			break;
	}

	if (split) {
		/*
		 * Move the split one to the right, since the top element in the
		 * last-compared pair can't be in the progression. Only do this
		 * when we split in the middle of the array (otherwise if we got
		 * to the end, then the split is in the right place).
		 */
		split++;
	}

And split is still 0.


	/*
	 * Then, anything to the left of 'split' must be in a new pack. But,
	 * creating that new pack may cause packs in the heavy half to no longer
	 * form a geometric progression.
	 *
	 * Compute an expected size of the new pack, and then determine how many
	 * packs in the heavy half need to be joined into it (if any) to restore
	 * the geometric progression.
	 */
	for (i = 0; i < split; i++)
		total_size += geometry_pack_weight(geometry->pack[i]);

The loop is skipped, as split is 0.

	for (i = split; i < geometry->pack_nr; i++) {

Iterate from i = 0 and for just once.

		struct packed_git *ours = geometry->pack[i];
		if (geometry_pack_weight(ours) < factor * total_size) {

But total_size is 0 so split is not incremented.

			split++;
			total_size += geometry_pack_weight(ours);
		} else
			break;
	}

	geometry->split = split;

Hence we assign 0 to .split member.

I however wonder if it expresses the intent more clearly if you did
this upfront, instead of forcing the readers to go through the code.

	if (geometry->pack_nr <= 1) {
-		geometry->split = geometry->pack_nr;
+		geometry->split = 0;
 		return;
 	}

That is, "when there is no existing packs, or just one pack, we
split at 0"

The code that gets affected by the setting of "split" is this piece
in the caller, cmd_repack():

	if (geometry) {
		FILE *in = xfdopen(cmd.in, "w");
		for (i = 0; i < geometry->split; i++)
			fprintf(in, "%s\n", pack_basename(geometry->pack[i]));
		for (i = geometry->split; i < geometry->pack_nr; i++)
			fprintf(in, "^%s\n", pack_basename(geometry->pack[i]));
		fclose(in);
	}

When split == 0, we end up feeding no positive packs and all
negative packs, which results in nothing to pack.  I wonder if we
can optimize out the spawning of the pack-object process, but that
is probalby optimizing for a wrong case.

> diff --git a/t/t7703-repack-geometric.sh b/t/t7703-repack-geometric.sh
> index 96917fc163..4a1952a054 100755
> --- a/t/t7703-repack-geometric.sh
> +++ b/t/t7703-repack-geometric.sh
> @@ -20,6 +20,21 @@ test_expect_success '--geometric with no packs' '
>  	)
>  '
>  
> +test_expect_success '--geometric with one pack' '
> +	git init geometric &&
> +	test_when_finished "rm -fr geometric" &&
> +	(
> +		cd geometric &&
> +
> +		test_commit "base" &&
> +		git repack -d &&
> +
> +		git repack --geometric 2 >out &&
> +
> +		test_i18ngrep "Nothing new to pack" out
> +	)
> +'
> +
>  test_expect_success '--geometric with an intact progression' '
>  	git init geometric &&
>  	test_when_finished "rm -fr geometric" &&

  reply	other threads:[~2021-03-05 19:16 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-05 15:21 [PATCH 0/5] clean-ups to geometric repacking Taylor Blau
2021-03-05 15:21 ` [PATCH 1/5] builtin/repack.c: do not repack single packs with --geometric Taylor Blau
2021-03-05 19:15   ` Junio C Hamano [this message]
2021-03-05 19:27     ` Taylor Blau
2021-03-05 19:30       ` Taylor Blau
2021-03-05 15:21 ` [PATCH 2/5] t7703: test --geometric repack with loose objects Taylor Blau
2021-03-05 15:21 ` [PATCH 3/5] builtin/repack.c: assign pack split later Taylor Blau
2021-03-05 15:21 ` [PATCH 4/5] builtin/repack.c: be more conservative with unsigned overflows Taylor Blau
2021-03-05 19:32   ` Junio C Hamano
2021-03-05 19:41     ` Taylor Blau
2021-03-10 21:00   ` Jeff King
2021-03-05 15:22 ` [PATCH 5/5] builtin/repack.c: reword comment around pack-objects flags Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqv9a59ztd.fsf@gitster.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).