From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, avarab@gmail.com, gitster@pobox.com,
jonathantanmy@google.com, steadmon@google.com
Subject: [PATCH v3 0/9] repack: introduce `--write-midx`
Date: Tue, 28 Sep 2021 21:54:37 -0400 [thread overview]
Message-ID: <cover.1632880469.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1631730270.git.me@ttaylorr.com>
Here is another small reroll of my series to implement the final component of
multi-pack reachability bitamps, which is to be able to write them from `git
repack`.
This version incorporates feedback from Jonathan Tan and others at Google who
graciously provided review. A range-diff is included below, but the major
changes are:
- A comment to explain the relationship between 'ctx->m' and 'ctx->to_include'
in midx.c:add_pack_to_midx().
- A comment to explain the meaning of write_midx_file_only().
- Clean-up of stray hunks, a missing strbuf_release().
- Replacing the name `existing_packs` with `existing_nonkept_packs` to
emphasize the relationship between it and the similarly-named
`existing_kept_packs`.
Instead of depending on tb/multi-pack-bitmaps, this series now depends on the
tip of master.
Taylor Blau (9):
midx: expose `write_midx_file_only()` publicly
builtin/multi-pack-index.c: support `--stdin-packs` mode
midx: preliminary support for `--refs-snapshot`
builtin/repack.c: keep track of existing packs unconditionally
builtin/repack.c: rename variables that deal with non-kept packs
builtin/repack.c: extract showing progress to a variable
builtin/repack.c: support writing a MIDX while repacking
builtin/repack.c: make largest pack preferred
builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
Documentation/git-multi-pack-index.txt | 19 ++
Documentation/git-repack.txt | 18 +-
builtin/multi-pack-index.c | 36 +++-
builtin/repack.c | 279 ++++++++++++++++++++++---
midx.c | 110 ++++++++--
midx.h | 15 +-
pack-bitmap.c | 2 +-
pack-bitmap.h | 1 +
t/helper/test-read-midx.c | 25 ++-
t/lib-midx.sh | 8 +
t/t5319-multi-pack-index.sh | 15 ++
t/t5326-multi-pack-bitmaps.sh | 82 ++++++++
t/t7700-repack.sh | 96 +++++++++
t/t7703-repack-geometric.sh | 24 ++-
14 files changed, 674 insertions(+), 56 deletions(-)
create mode 100644 t/lib-midx.sh
Range-diff against v2:
1: 03c1b2c4d3 ! 1: b7c72d0bf5 midx: expose `write_midx_file_only()` publicly
@@ midx.c: struct write_midx_context {
static void add_pack_to_midx(const char *full_path, size_t full_path_len,
@@ midx.c: static void add_pack_to_midx(const char *full_path, size_t full_path_len,
+
+ if (ends_with(file_name, ".idx")) {
display_progress(ctx->progress, ++ctx->pack_paths_checked);
++ /*
++ * Note that at most one of ctx->m and ctx->to_include are set,
++ * so we are testing midx_contains_pack() and
++ * string_list_has_string() independently (guarded by the
++ * appropriate NULL checks).
++ *
++ * We could support passing to_include while reusing an existing
++ * MIDX, but don't currently since the reuse process drags
++ * forward all packs from an existing MIDX (without checking
++ * whether or not they appear in the to_include list).
++ *
++ * If we added support for that, these next two conditional
++ * should be performed independently (likely checking
++ * to_include before the existing MIDX).
++ */
if (ctx->m && midx_contains_pack(ctx->m, file_name))
return;
+ else if (ctx->to_include &&
@@ midx.c: static int write_midx_internal(const char *object_dir,
struct bitmap_index *bitmap_git;
int bitmap_exists;
int want_bitmap = flags & MIDX_WRITE_BITMAP;
-@@ midx.c: static int write_midx_internal(const char *object_dir,
-
- QSORT(ctx.info, ctx.nr, pack_info_compare);
-
-- if (packs_to_drop && packs_to_drop->nr) {
-+ if (ctx.m && packs_to_drop && packs_to_drop->nr) {
- int drop_index = 0;
- int missing_drops = 0;
-
@@ midx.c: int write_midx_file(const char *object_dir,
const char *preferred_pack_name,
unsigned flags)
@@ midx.h: int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pa
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
++/*
++ * Variant of write_midx_file which writes a MIDX containing only the packs
++ * specified in packs_to_include.
++ */
+int write_midx_file_only(const char *object_dir,
+ struct string_list *packs_to_include,
+ const char *preferred_pack_name,
2: 59556e5545 = 2: 986ef14f2a builtin/multi-pack-index.c: support `--stdin-packs` mode
3: 42f1ae9ede ! 3: 4e3769a4f3 midx: preliminary support for `--refs-snapshot`
@@ Commit message
commit in the MIDX reaches something that isn't.
This can happen when a multi-pack index contains some pack which refers
- to loose objects (which by definition aren't included in the multi-pack
- index).
+ to loose objects (e.g., if a pack was pushed after starting the repack
+ but before generating the MIDX which depends on an object which is
+ stored as loose in the repository, and by definition isn't included in
+ the multi-pack index).
By taking a snapshot of the references before we start repacking, we can
close that race window. In the above scenario (where we have a packed
@@ midx.c: static void bitmap_show_commit(struct commit *commit, void *_data)
+ }
+
+ fclose(f);
++ strbuf_release(&buf);
+ return 0;
+}
+
@@ midx.h: int fill_midx_entry(struct repository *r, const struct object_id *oid, s
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
-int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
+ /*
+ * Variant of write_midx_file which writes a MIDX containing only the packs
+ * specified in packs_to_include.
+ */
+int write_midx_file(const char *object_dir,
+ const char *preferred_pack_name,
+ const char *refs_snapshot,
4: c0d045a9de ! 4: 1b3dd331ca builtin/repack.c: keep track of existing packs unconditionally
@@ Commit message
## builtin/repack.c ##
@@ builtin/repack.c: static void remove_pack_on_signal(int signo)
- * have a corresponding .keep file. These packs are not to
- * be kept if we are going to pack everything into one file.
+ }
+
+ /*
+- * Adds all packs hex strings to the fname list, which do not
+- * have a corresponding .keep file. These packs are not to
+- * be kept if we are going to pack everything into one file.
++ * Adds all packs hex strings to either fname_list or fname_kept_list
++ * based on whether each pack has a corresponding .keep file or not.
++ * Packs without a .keep file are not to be kept if we are going to
++ * pack everything into one file.
*/
-static void get_non_kept_pack_filenames(struct string_list *fname_list,
- const struct string_list *extra_keep)
-: ---------- > 5: 15831a201a builtin/repack.c: rename variables that deal with non-kept packs
5: 09de03de47 = 6: 1a40161baf builtin/repack.c: extract showing progress to a variable
6: 83dfdb8b12 ! 7: 6854f0751d builtin/repack.c: support writing a MIDX while repacking
@@ Commit message
check we can make consistently when (1) telling the MIDX which packs we
want to exclude, and (2) actually unlinking the redundant packs.
+ There is also a tiny change to short-circuit the body of
+ write_midx_included_packs() when no packs remain in the case of an empty
+ repository. The MIDX code does not handle this, so avoid trying to
+ generate a MIDX covering zero packs in the first place.
+
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## Documentation/git-repack.txt ##
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
}
+static void midx_included_packs(struct string_list *include,
-+ struct string_list *existing_packs,
++ struct string_list *existing_nonkept_packs,
+ struct string_list *existing_kept_packs,
+ struct string_list *names,
+ struct pack_geometry *geometry)
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
+ string_list_insert(include, strbuf_detach(&buf, NULL));
+ }
+ } else {
-+ for_each_string_list_item(item, existing_packs) {
++ for_each_string_list_item(item, existing_nonkept_packs) {
+ if (item->util)
+ continue;
+ string_list_insert(include, xstrfmt("%s.idx", item->string));
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
+ FILE *in;
+ int ret;
+
++ if (!include->nr)
++ return 0;
++
+ cmd.in = -1;
+ cmd.git_cmd = 1;
+
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
+ if (delete_redundant && pack_everything & ALL_INTO_ONE) {
+ const int hexsz = the_hash_algo->hexsz;
+ string_list_sort(&names);
-+ for_each_string_list_item(item, &existing_packs) {
++ for_each_string_list_item(item, &existing_nonkept_packs) {
+ char *sha1;
+ size_t len = strlen(item->string);
+ if (len < hexsz)
+ continue;
+ sha1 = item->string + len - hexsz;
++ /*
++ * Mark this pack for deletion, which ensures that this
++ * pack won't be included in a MIDX (if `--write-midx`
++ * was given) and that we will actually delete this pack
++ * (if `-d` was given).
++ */
+ item->util = (void*)(intptr_t)!string_list_has_string(&names, sha1);
+ }
+ }
+
+ if (write_midx) {
+ struct string_list include = STRING_LIST_INIT_NODUP;
-+ midx_included_packs(&include, &existing_packs,
++ midx_included_packs(&include, &existing_nonkept_packs,
+ &existing_kept_packs, &names, geometry);
+
+ ret = write_midx_included_packs(&include,
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
- if (pack_everything & ALL_INTO_ONE) {
- const int hexsz = the_hash_algo->hexsz;
- string_list_sort(&names);
-- for_each_string_list_item(item, &existing_packs) {
+- for_each_string_list_item(item, &existing_nonkept_packs) {
- char *sha1;
- size_t len = strlen(item->string);
- if (len < hexsz)
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
- if (!string_list_has_string(&names, sha1))
- remove_redundant_pack(packdir, item->string);
- }
-+ for_each_string_list_item(item, &existing_packs) {
++ for_each_string_list_item(item, &existing_nonkept_packs) {
+ if (!item->util)
+ continue;
+ remove_redundant_pack(packdir, item->string);
@@ t/t7700-repack.sh: test_expect_success 'auto-bitmaps do not complain if unavaila
+'
+
test_done
+
+ ## t/t7703-repack-geometric.sh ##
+@@ t/t7703-repack-geometric.sh: test_expect_success '--geometric with no packs' '
+ (
+ cd geometric &&
+
+- git repack --geometric 2 >out &&
++ git repack --write-midx --geometric 2 >out &&
+ test_i18ngrep "Nothing new to pack" out
+ )
+ '
7: 68bc49d8ae ! 8: 3596c76daf builtin/repack.c: make largest pack preferred
@@ builtin/repack.c: static int write_midx_included_packs(struct string_list *inclu
if (ret)
return ret;
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
- midx_included_packs(&include, &existing_packs,
+ midx_included_packs(&include, &existing_nonkept_packs,
&existing_kept_packs, &names, geometry);
- ret = write_midx_included_packs(&include,
8: eb24b308ec ! 9: d99f075321 builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
+}
+
static void midx_included_packs(struct string_list *include,
- struct string_list *existing_packs,
+ struct string_list *existing_nonkept_packs,
struct string_list *existing_kept_packs,
@@ builtin/repack.c: static void midx_included_packs(struct string_list *include,
--
2.33.0.96.g73915697e6
next prev parent reply other threads:[~2021-09-29 1:55 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-11 3:32 [PATCH 0/8] repack: introduce `--write-midx` Taylor Blau
2021-09-11 3:32 ` [PATCH 1/8] midx: expose 'write_midx_file_only()' publicly Taylor Blau
2021-09-11 5:00 ` Junio C Hamano
2021-09-11 16:17 ` Taylor Blau
2021-09-11 10:07 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:21 ` Taylor Blau
2021-09-11 3:32 ` [PATCH 2/8] builtin/multi-pack-index.c: support --stdin-packs mode Taylor Blau
2021-09-11 10:05 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:25 ` Taylor Blau
2021-09-11 16:28 ` Taylor Blau
2021-09-12 2:08 ` Eric Sunshine
2021-09-12 2:21 ` Taylor Blau
2021-09-12 15:15 ` Ævar Arnfjörð Bjarmason
2021-09-12 22:30 ` Junio C Hamano
2021-09-12 22:32 ` Ævar Arnfjörð Bjarmason
2021-09-14 19:02 ` Jeff King
2021-09-14 23:48 ` Taylor Blau
2021-09-15 1:55 ` Eric Sunshine
2021-09-11 3:32 ` [PATCH 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-11 3:32 ` [PATCH 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-11 3:32 ` [PATCH 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-11 3:32 ` [PATCH 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-11 3:32 ` [PATCH 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-11 10:17 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:35 ` Taylor Blau
2021-09-11 3:32 ` [PATCH 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-11 10:27 ` Ævar Arnfjörð Bjarmason
2021-09-11 11:19 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:51 ` Taylor Blau
2021-09-14 18:55 ` Jeff King
2021-09-14 23:34 ` Taylor Blau
2021-09-14 23:56 ` Ævar Arnfjörð Bjarmason
2021-09-15 4:31 ` Taylor Blau
2021-09-11 16:49 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 0/8] repack: introduce `--write-midx` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 1/8] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-22 23:14 ` Jonathan Tan
2021-09-23 3:09 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 2/8] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-22 22:29 ` Josh Steadmon
2021-09-23 2:03 ` Taylor Blau
2021-09-22 23:11 ` Jonathan Tan
2021-09-23 2:06 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-22 22:34 ` Josh Steadmon
2021-09-23 2:08 ` Taylor Blau
2021-09-22 23:00 ` Jonathan Tan
2021-09-23 2:18 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-22 22:56 ` Jonathan Tan
2021-09-23 2:59 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-15 18:24 ` [PATCH v2 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-22 22:39 ` Jonathan Tan
2021-09-23 2:40 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-15 18:24 ` [PATCH v2 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-24 18:22 ` Jonathan Tan
2021-10-01 22:38 ` Taylor Blau
2021-09-15 19:22 ` [PATCH v2 0/8] repack: introduce `--write-midx` Junio C Hamano
2021-09-15 19:29 ` Junio C Hamano
2021-09-15 21:19 ` Taylor Blau
2021-09-16 22:16 ` Junio C Hamano
2021-09-29 1:54 ` Taylor Blau [this message]
2021-09-29 1:55 ` [PATCH v3 1/9] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-29 1:55 ` [PATCH v3 2/9] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-29 1:55 ` [PATCH v3 3/9] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-29 1:55 ` [PATCH v3 4/9] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-29 1:55 ` [PATCH v3 5/9] builtin/repack.c: rename variables that deal with non-kept packs Taylor Blau
2021-09-29 1:55 ` [PATCH v3 6/9] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-29 1:55 ` [PATCH v3 7/9] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-29 1:55 ` [PATCH v3 8/9] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-29 1:55 ` [PATCH v3 9/9] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-29 4:24 ` [PATCH v3 0/9] repack: introduce `--write-midx` Junio C Hamano
2021-10-01 20:01 ` Jonathan Tan
2021-10-01 22:40 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1632880469.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
--cc=steadmon@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.