From: Junio C Hamano <gitster@pobox.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Elijah Newren <newren@gmail.com>,
Patrick Steinhardt <ps@pks.im>
Subject: Re: [RFC PATCH 01/14] midx: use `string_list` for retained MIDX files
Date: Thu, 26 Feb 2026 12:29:22 -0800 [thread overview]
Message-ID: <xmqqldgf1c65.fsf@gitster.g> (raw)
In-Reply-To: <d64a799afd620363c1940d7c2e634e78ea553cb6.1771978829.git.me@ttaylorr.com> (Taylor Blau's message of "Tue, 24 Feb 2026 19:20:56 -0500")
Taylor Blau <me@ttaylorr.com> writes:
> Both `clear_midx_files_ext()` and `clear_incremental_midx_files_ext()`
> build a list of filenames to keep while pruning stale MIDX files. Today
> they hand-roll an array instead of using a `string_list`, thus requiring
> us to pass an additional length parameter, and makes lookups linear.
>
> Replace the bare array with a `string_list` which can be passed around
> as a single parameter. Though it improves lookup performance, the
> difference is likely immeasurable given how small the keep_hashes array
> typically is.
And if it the lookup performance turns out to be an issue, we can
switch to strmap or something more appropriate.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> midx.c | 56 ++++++++++++++++++++++----------------------------------
> 1 file changed, 22 insertions(+), 34 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index c1b9658240d..c5e3553e2bb 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -755,8 +755,7 @@ int midx_checksum_valid(struct multi_pack_index *m)
> }
>
> struct clear_midx_data {
> - char **keep;
> - uint32_t keep_nr;
> + struct string_list keep;
> const char *ext;
> };
>
> @@ -764,15 +763,12 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len UNUS
> const char *file_name, void *_data)
> {
> struct clear_midx_data *data = _data;
> - uint32_t i;
>
> if (!(starts_with(file_name, "multi-pack-index-") &&
> ends_with(file_name, data->ext)))
> return;
> - for (i = 0; i < data->keep_nr; i++) {
> - if (!strcmp(data->keep[i], file_name))
> - return;
> - }
> + if (string_list_has_string(&data->keep, file_name))
> + return;
> if (unlink(full_path))
> die_errno(_("failed to remove %s"), full_path);
> }
> @@ -780,48 +776,40 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len UNUS
> void clear_midx_files_ext(struct odb_source *source, const char *ext,
> const char *keep_hash)
> {
> - struct clear_midx_data data;
> - memset(&data, 0, sizeof(struct clear_midx_data));
> -
> - if (keep_hash) {
> - ALLOC_ARRAY(data.keep, 1);
> -
> - data.keep[0] = xstrfmt("multi-pack-index-%s.%s", keep_hash, ext);
> - data.keep_nr = 1;
> - }
> - data.ext = ext;
> -
> - for_each_file_in_pack_dir(source->path,
> - clear_midx_file_ext,
> - &data);
> + struct clear_midx_data data = {
> + .keep = STRING_LIST_INIT_NODUP,
> + .ext = ext,
> + };
>
> if (keep_hash)
> - free(data.keep[0]);
> - free(data.keep);
> + string_list_insert(&data.keep, xstrfmt("multi-pack-index-%s.%s",
> + keep_hash, ext));
> +
> + for_each_file_in_pack_dir(source->path, clear_midx_file_ext, &data);
> +
> + string_list_clear(&data.keep, 0);
> }
>
> void clear_incremental_midx_files_ext(struct odb_source *source, const char *ext,
> char **keep_hashes,
> uint32_t hashes_nr)
> {
> - struct clear_midx_data data;
> + struct clear_midx_data data = {
> + .keep = STRING_LIST_INIT_NODUP,
> + .ext = ext,
> + };
> uint32_t i;
>
> - memset(&data, 0, sizeof(struct clear_midx_data));
> -
> - ALLOC_ARRAY(data.keep, hashes_nr);
> for (i = 0; i < hashes_nr; i++)
> - data.keep[i] = xstrfmt("multi-pack-index-%s.%s", keep_hashes[i],
> - ext);
> - data.keep_nr = hashes_nr;
> - data.ext = ext;
> + string_list_append(&data.keep,
> + xstrfmt("multi-pack-index-%s.%s",
> + keep_hashes[i], ext));
> + string_list_sort(&data.keep);
>
> for_each_file_in_pack_subdir(source->path, "multi-pack-index.d",
> clear_midx_file_ext, &data);
>
> - for (i = 0; i < hashes_nr; i++)
> - free(data.keep[i]);
> - free(data.keep);
> + string_list_clear(&data.keep, 0);
> }
>
> void clear_midx_file(struct repository *r)
next prev parent reply other threads:[~2026-02-26 20:29 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-25 0:20 [RFC PATCH 00/14] repack: incremental MIDX/bitmap-based repacking Taylor Blau
2026-02-25 0:20 ` [RFC PATCH 06/14] repack: track the ODB source via existing_packs Taylor Blau
2026-02-25 0:21 ` Taylor Blau
2026-02-25 0:23 ` Taylor Blau
2026-02-25 0:20 ` [RFC PATCH 01/14] midx: use `string_list` for retained MIDX files Taylor Blau
2026-02-26 20:29 ` Junio C Hamano [this message]
2026-02-27 3:02 ` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 02/14] strvec: introduce `strvec_init_alloc()` Taylor Blau
2026-02-26 20:34 ` Junio C Hamano
2026-02-26 20:58 ` Junio C Hamano
2026-02-27 3:07 ` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 03/14] midx: use `strvec` for `keep_hashes` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 04/14] midx: introduce `--checksum-only` for incremental MIDX writes Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 05/14] midx: support custom `--base` " Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 07/14] midx: expose `midx_layer_contains_pack()` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 08/14] repack-midx: factor out `repack_prepare_midx_command()` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 09/14] repack-midx: extract `repack_fill_midx_stdin_packs()` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 10/14] repack-geometry: prepare for incremental MIDX repacking Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 11/14] builtin/repack.c: convert `--write-midx` to an `OPT_CALLBACK` Taylor Blau
2026-02-25 0:21 ` [RFC PATCH 12/14] repack: implement incremental MIDX repacking Taylor Blau
2026-02-25 0:22 ` [RFC PATCH 13/14] repack: introduce `--write-midx=incremental` Taylor Blau
2026-02-25 0:22 ` [RFC PATCH 14/14] repack: allow `--write-midx=incremental` without `--geometric` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqldgf1c65.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.