Git development
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Elijah Newren <newren@gmail.com>, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH 04/16] midx: use `strvec` for `keep_hashes`
Date: Mon, 30 Mar 2026 19:01:30 -0400
Message-ID: <20260330230130.GD41843@coredump.intra.peff.net>
In-Reply-To: <5fc72d5049a602ae5ede6bb243f44546f02d995d.1774820449.git.me@ttaylorr.com>

On Sun, Mar 29, 2026 at 05:41:00PM -0400, Taylor Blau wrote:

> -	CALLOC_ARRAY(keep_hashes, keep_hashes_nr);
> +	strvec_init_alloc(&keep_hashes, keep_hashes_nr);
>  
>  	if (ctx.incremental) {
>  		FILE *chainf = fdopen_lock_file(&lk, "w");
> @@ -1760,39 +1760,45 @@ static int write_midx_internal(struct write_midx_opts *opts)
>  			for (i = 0; i < num_layers_before_from; i++) {
>  				uint32_t j = num_layers_before_from - i - 1;
>  
> -				keep_hashes[j] = xstrdup(midx_get_checksum_hex(m));
> +				keep_hashes.v[j] = xstrdup(midx_get_checksum_hex(m));
> +				keep_hashes.nr++;

Gross, we are just manipulating the innards of the strvec ourselves?

Is it really worth doing this (and adding init_alloc()) versus just:

  strvec_init(&keep_hashes);
  for (...)
	strvec_push(&keep_hashes, midx_get_checksum_hex(m));

? That's amortized linear-time, and it's not like the number of midx
layers is going to be large anyway.
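
To illustrate the "amortized linear-time" point, here is a minimal standalone sketch of a push-only string array that doubles its capacity when full. The names str_array, str_array_push() and str_array_clear() are hypothetical stand-ins invented for this example; this is not git's strvec code, only the same general idea (copy the string, keep the array NULL-terminated, grow geometrically).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's strvec; NOT the real implementation. */
struct str_array {
	char **v;          /* NULL-terminated once anything is pushed */
	size_t nr, alloc;
};

/* Append a copy of "s", doubling capacity when full (amortized O(1)). */
static void str_array_push(struct str_array *a, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy;

	if (a->nr + 2 > a->alloc) { /* room for the entry plus the NULL */
		size_t want = a->alloc ? a->alloc * 2 : 8;
		char **grown = realloc(a->v, want * sizeof(*grown));
		if (!grown)
			abort();
		a->v = grown;
		a->alloc = want;
	}

	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);

	a->v[a->nr++] = copy;
	a->v[a->nr] = NULL;
	assert(a->nr < a->alloc); /* the terminator always fits */
}

/* Free every copied string and the array itself. */
static void str_array_clear(struct str_array *a)
{
	size_t i;

	for (i = 0; i < a->nr; i++)
		free(a->v[i]);
	free(a->v);
	a->v = NULL;
	a->nr = a->alloc = 0;
}
```

Because the capacity doubles, n pushes trigger only O(log n) reallocations, so callers never need to know the final size up front or touch .v and .nr directly.
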

>  				m = m->base_midx;
>  			}
>  
> -			keep_hashes[i] = xstrdup(hash_to_hex_algop(midx_hash,
> +			keep_hashes.v[i] = xstrdup(hash_to_hex_algop(midx_hash,
>  								   r->hash_algo));
> +			keep_hashes.nr++;

OK, this could be a push, too.

>  
>  			i = 0;
>  			for (m = ctx.m;
>  			     m && midx_hashcmp(m, ctx.compact_to, r->hash_algo);
>  			     m = m->base_midx) {
> -				keep_hashes[keep_hashes_nr - i - 1] =
> +				keep_hashes.v[keep_hashes_nr - i - 1] =
>  					xstrdup(midx_get_checksum_hex(m));
> +				keep_hashes.nr++;

But what is this? We're filling in from the back side of the array? I
mean...yeah, that's something that strvec_push() can't do. But I can't
help but feel like it might be simpler and more obvious to adjust the
iteration to build the array in order.

I dunno. Maybe that is hard to do. But if so, I question whether moving
to a strvec is worth it here, since we are not treating it as an opaque
type anymore. And it is not buying us much to use it (we get to pass one
parameter versus two, though that is easily solved with a struct, and we
get to use _clear() instead of our own free loop).
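
One possible way to "build the array in order" without writing to arbitrary indices is to append in traversal order (newest layer first) and then reverse the collected span once. The sketch below is a standalone illustration under that assumption; struct layer, collect_hashes() and reverse_hashes() are hypothetical names, and this is not necessarily how the patch should restructure its loops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chain of layers, newest first, linked through "base". */
struct layer {
	const char *hash;
	struct layer *base;
};

/* Append hashes in traversal order (tip to base); returns the count. */
static size_t collect_hashes(const struct layer *tip,
			     const char **out, size_t max)
{
	size_t n = 0;

	for (; tip && n < max; tip = tip->base)
		out[n++] = tip->hash;
	return n;
}

/* Reverse in place so the oldest (base) layer ends up first. */
static void reverse_hashes(const char **v, size_t n)
{
	size_t i;

	assert(v || !n);
	for (i = 0; i < n / 2; i++) {
		const char *tmp = v[i];
		v[i] = v[n - i - 1];
		v[n - i - 1] = tmp;
	}
}
```

With this shape the collection step is a plain sequence of appends, so it could use a push-style API, and the ordering concern is isolated in a single reversal.
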

>  void clear_incremental_midx_files_ext(struct odb_source *source, const char *ext,
> -				      char **keep_hashes,
> -				      uint32_t hashes_nr)
> +				      const struct strvec *keep_hashes)
>  {
>  	struct clear_midx_data data = {
> -		.keep = STRING_LIST_INIT_NODUP,
> +		.keep = STRING_LIST_INIT_DUP,
>  		.ext = ext,
>  	};
> -	uint32_t i;
>  
> -	for (i = 0; i < hashes_nr; i++)
> -		string_list_append(&data.keep,
> -				   xstrfmt("multi-pack-index-%s.%s",
> -					   keep_hashes[i], ext));
> -	string_list_sort(&data.keep);
> +	if (keep_hashes) {
> +		struct strbuf buf = STRBUF_INIT;
> +		for (size_t i = 0; i < keep_hashes->nr; i++) {
> +			strbuf_reset(&buf);
> +
> +			strbuf_addf(&buf, "multi-pack-index-%s.%s",
> +				    keep_hashes->v[i], ext);
> +			string_list_append(&data.keep, buf.buf);
> +		}
> +
> +		string_list_sort(&data.keep);
> +		strbuf_release(&buf);
> +	}

This hunk was unexpected. We move from xstrfmt() to a strbuf, but does
that have anything to do with the rest of the patch?

Also, I don't think using a strbuf really buys us anything. We are
reusing the strbuf for each format operation, but then we copy into the
string list anyway. So there is one allocation per loop iteration either
way.

Also also, the original was leaking the strings, right? The string_list
was initialized as NODUP, but we assigned allocated xstrfmt() results to
it. But because of the nodup, string_list_clear() won't free them.
It should have been:

  .keep = STRING_LIST_INIT_DUP,
  [...]
  string_list_append_nodup(&data.keep, xstrfmt(...));
  [...]
  string_list_clear(&data.keep, 0);

in patch 2.
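
The ownership rule behind that leak can be modeled in a few lines. This is a simplified standalone sketch, not git's string_list: owned_list, list_append(), list_append_nodup() and list_clear() are hypothetical names, and list_clear() returns the number of strings it freed purely so the behavior is observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified model of a string list's ownership flag. */
struct owned_list {
	char **items;
	size_t nr, alloc;
	int strdup_strings; /* 1: the list owns its strings and frees them */
};

/* Store the pointer as-is, without copying. */
static void list_append_nodup(struct owned_list *l, char *s)
{
	if (l->nr == l->alloc) {
		size_t want = l->alloc ? l->alloc * 2 : 8;
		char **grown = realloc(l->items, want * sizeof(*grown));
		if (!grown)
			abort();
		l->items = grown;
		l->alloc = want;
	}
	l->items[l->nr++] = s;
}

/* Copy first if the list owns its strings; otherwise store the pointer. */
static void list_append(struct owned_list *l, char *s)
{
	if (l->strdup_strings) {
		size_t len = strlen(s) + 1;
		char *copy = malloc(len);
		if (!copy)
			abort();
		memcpy(copy, s, len);
		s = copy;
	}
	list_append_nodup(l, s);
}

/* Free strings only when the list owns them; returns how many were freed. */
static size_t list_clear(struct owned_list *l)
{
	size_t i, freed = 0;

	if (l->strdup_strings) {
		for (i = 0; i < l->nr; i++) {
			free(l->items[i]);
			freed++;
		}
	}
	free(l->items);
	l->items = NULL;
	l->nr = l->alloc = 0;
	return freed;
}
```

In this model, appending a freshly allocated string to a non-owning list means clearing frees nothing and the caller must free it (the leak pattern), while an owning list combined with the nodup append hands the allocation over exactly once.
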

-Peff

