From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Elijah Newren <newren@gmail.com>, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH 04/16] midx: use `strvec` for `keep_hashes`
Date: Tue, 31 Mar 2026 18:26:04 -0400
Message-ID: <acxJ/NfLNloCv3o+@nand.local>
In-Reply-To: <20260330230130.GD41843@coredump.intra.peff.net>

On Mon, Mar 30, 2026 at 07:01:30PM -0400, Jeff King wrote:
> On Sun, Mar 29, 2026 at 05:41:00PM -0400, Taylor Blau wrote:
>
> > -	CALLOC_ARRAY(keep_hashes, keep_hashes_nr);
> > +	strvec_init_alloc(&keep_hashes, keep_hashes_nr);
> >
> >  	if (ctx.incremental) {
> >  		FILE *chainf = fdopen_lock_file(&lk, "w");
> > @@ -1760,39 +1760,45 @@ static int write_midx_internal(struct write_midx_opts *opts)
> >  			for (i = 0; i < num_layers_before_from; i++) {
> >  				uint32_t j = num_layers_before_from - i - 1;
> >
> > -				keep_hashes[j] = xstrdup(midx_get_checksum_hex(m));
> > +				keep_hashes.v[j] = xstrdup(midx_get_checksum_hex(m));
> > +				keep_hashes.nr++;
>
> Gross, we are just manipulating the innards of the strvec ourselves?
>
> Is it really worth doing this (and adding init_alloc()) versus just:
>
>   strvec_init(&keep_hashes);
>   for (...)
> 	strvec_push(&keep_hashes, midx_get_checksum_hex(m));
>
> ? That's amortized linear-time, and it's not like the number of midx
> layers is going to be large anyway.

Yeah, this is all pretty disgusting. It's in service of what you noted
below where we need to fill in the array out-of-order, but that's really
bending strvec around MIDX-specific awkwardness, which I dislike.

> I dunno. Maybe that is hard to do. But if so, I question whether moving
> to a strvec is worth it here, since we are not treating it as an opaque
> type anymore. And it is not buying us much to use it (we get to pass one
> parameter versus two, though that is easily solved with a struct, and we
> get to use _clear() instead of our own free loop).

I think it is worth it to move to a strvec here to avoid having to
manage our own memory and pass the length around separately, but I
dislike the way that I did it in this patch.

I tried adjusting this patch to juggle the MIDX layers in a way that
would allow us to just push to the strvec. It ended up being less
awkward/difficult than I thought, so I think we should go with that and
drop the previous patch.
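
As an illustration of the push-only idea, here is a minimal,
self-contained sketch. It is not the reworked patch. It assumes the
layers are reached newest-first through a base pointer, as the reverse
indexing in the quoted hunk suggests. `struct str_vec`, `struct layer`,
and every function name below are hypothetical stand-ins, not Git's
strvec or MIDX API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strvec: growable, NULL-terminated array. */
struct str_vec {
	char **v;
	size_t nr, alloc;
};

/* Append a copy of `s`, growing the array geometrically as needed. */
static void str_vec_push(struct str_vec *sv, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	assert(copy);
	memcpy(copy, s, len);

	/* Need room for the new entry plus the trailing NULL. */
	if (sv->nr + 2 > sv->alloc) {
		sv->alloc = sv->alloc ? 2 * sv->alloc : 8;
		sv->v = realloc(sv->v, sv->alloc * sizeof(*sv->v));
		assert(sv->v);
	}
	sv->v[sv->nr++] = copy;
	sv->v[sv->nr] = NULL;
}

/* Free every string and the array itself. */
static void str_vec_clear(struct str_vec *sv)
{
	for (size_t i = 0; i < sv->nr; i++)
		free(sv->v[i]);
	free(sv->v);
	sv->v = NULL;
	sv->nr = sv->alloc = 0;
}

/* Hypothetical layer: each one points at the next-older layer. */
struct layer {
	const char *checksum_hex;
	const struct layer *base; /* NULL for the oldest layer */
};

/*
 * Visit the base before pushing the current layer, so the checksums
 * land oldest-first without ever indexing into the array by hand.
 * Recursion depth equals the number of layers, assumed to be small.
 */
static void push_layers_oldest_first(struct str_vec *out,
				     const struct layer *tip)
{
	if (!tip)
		return;
	push_layers_oldest_first(out, tip->base);
	str_vec_push(out, tip->checksum_hex);
}
```

With this shape the caller only ever pushes, so the vector can stay an
opaque type and no pre-sized allocation helper is needed.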

> >  void clear_incremental_midx_files_ext(struct odb_source *source, const char *ext,
> > -				      char **keep_hashes,
> > -				      uint32_t hashes_nr)
> > +				      const struct strvec *keep_hashes)
> >  {
> >  	struct clear_midx_data data = {
> > -		.keep = STRING_LIST_INIT_NODUP,
> > +		.keep = STRING_LIST_INIT_DUP,
> >  		.ext = ext,
> >  	};
> > -	uint32_t i;
> >
> > -	for (i = 0; i < hashes_nr; i++)
> > -		string_list_append(&data.keep,
> > -				   xstrfmt("multi-pack-index-%s.%s",
> > -					   keep_hashes[i], ext));
> > -	string_list_sort(&data.keep);
> > +	if (keep_hashes) {
> > +		struct strbuf buf = STRBUF_INIT;
> > +		for (size_t i = 0; i < keep_hashes->nr; i++) {
> > +			strbuf_reset(&buf);
> > +
> > +			strbuf_addf(&buf, "multi-pack-index-%s.%s",
> > +				    keep_hashes->v[i], ext);
> > +			string_list_append(&data.keep, buf.buf);
> > +		}
> > +
> > +		string_list_sort(&data.keep);
> > +		strbuf_release(&buf);
> > +	}
>
> This hunk was unexpected. We move from xstrfmt() to a strbuf, but does
> that have anything to do with the rest of the patch?

Yeah, this looks like a leftover from when I was writing this patch and
trying out different approaches. I didn't notice that it was still
there when I sent the series out.

> Also, the original was leaking the strings, right? The string_list
> was initialized as NODUP, but we assigned allocated xstrfmt() results to
> it. But because of the nodup, string_list_clear() won't free them.
> It should have been:
>
>   .keep = STRING_LIST_INIT_DUP,
>   [...]
>   string_list_append_nodup(&data.keep, xstrfmt(...));
>   [...]
>   string_list_clear(&data.keep, 0);
>
> in patch 2.

Good catch, that's right, though it is partially obviated by the fact
that we're moving to a strset here.
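
The ownership rule behind that leak can be shown with a small,
self-contained sketch. `struct mini_list` and its helpers are
hypothetical miniatures written for illustration, not Git's string_list.
The only behavior modeled is that clearing frees the strings when, and
only when, the list owns them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a string list with an ownership flag. */
struct mini_list {
	char **items;
	size_t nr, alloc;
	int owns_strings; /* 1 is analogous to DUP, 0 to NODUP */
};

/* Store `s` as-is, without copying; the caller hands over the pointer. */
static void mini_list_append_nodup(struct mini_list *l, char *s)
{
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 8;
		l->items = realloc(l->items, l->alloc * sizeof(*l->items));
		assert(l->items);
	}
	l->items[l->nr++] = s;
}

/*
 * Release the list. Strings are freed only if the list owns them.
 * Returns the number of strings freed, to make the difference visible.
 */
static size_t mini_list_clear(struct mini_list *l)
{
	size_t freed = 0;

	if (l->owns_strings)
		for (size_t i = 0; i < l->nr; i++, freed++)
			free(l->items[i]);
	free(l->items);
	l->items = NULL;
	l->nr = l->alloc = 0;
	return freed;
}

/* Build a freshly allocated "multi-pack-index-<hash>.<ext>" name. */
static char *format_midx_name(const char *hash, const char *ext)
{
	int len = snprintf(NULL, 0, "multi-pack-index-%s.%s", hash, ext);
	char *out;

	assert(len >= 0);
	out = malloc((size_t)len + 1);
	assert(out);
	snprintf(out, (size_t)len + 1, "multi-pack-index-%s.%s", hash, ext);
	return out;
}
```

Handing an allocated string to a list with `owns_strings = 0` means
`mini_list_clear()` frees nothing, so the caller must free it or it
leaks. With `owns_strings = 1` plus the no-copy append, the list takes
over the allocation and the clear releases it.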

Thanks,
Taylor
