* [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits
@ 2024-08-15 17:30 Taylor Blau
2024-08-15 17:31 ` [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data Taylor Blau
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:30 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
This series fixes a couple of small issues with pseudo-merge groups,
where it is possible to (a) select pseudo-merges that are not closed
over reachability with respect to the bitmap's pack or MIDX, and (b)
generate empty pseudo-merge commits.
The series is structured as follows:
- The first four commits refactor the pack-bitmap machinery to make
the bitmap_writer's packing_data structure available earlier at
selection time, and then remove redundant arguments from the rest of
the API as a result of that change.
- The fifth commit makes it so we select pseudo-merges even in small
repositories, making it easier to write tests in the remainder of
the series.
- The next two commits demonstrate and fix a case where we would
generate empty pseudo-merge commits (see (b) above).
- The final commit ensures that we do not generate pseudo-merge groups
which reach objects not contained in the bitmap's pack or MIDX (see
(a) above).
The bug described in (a) was noticed by Peff, and the remaining seven
commits are preparatory, and/or fixing other small issues I noticed
along the way while investigating the original impetus for this series.
Thanks in advance for your review!
Taylor Blau (8):
pack-bitmap: initialize `bitmap_writer_init()` with packing_data
pack-bitmap: drop redundant args from
`bitmap_writer_build_type_index()`
pack-bitmap: drop redundant args from `bitmap_writer_build()`
pack-bitmap: drop redundant args from `bitmap_writer_finish()`
pack-bitmap-write.c: select pseudo-merges even for small bitmaps
t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
pseudo-merge.c: do not generate empty pseudo-merge commits
pseudo-merge.c: ensure pseudo-merge groups are closed
builtin/pack-objects.c | 8 ++---
midx-write.c | 10 +++---
pack-bitmap-write.c | 37 +++++++++++-----------
pack-bitmap.h | 11 +++----
pseudo-merge.c | 13 +++++---
t/t5333-pseudo-merge-bitmaps.sh | 56 +++++++++++++++++++++++++++++++++
6 files changed, 96 insertions(+), 39 deletions(-)
base-commit: 406f326d271e0bacecdb00425422c5fa3f314930
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-17 10:31 ` Jeff King
2024-08-15 17:31 ` [PATCH 2/8] pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()` Taylor Blau
` (7 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
In order to determine its object order, the pack-bitmap machinery keeps
a 'struct packing_data' corresponding to the pack or pseudo-pack (when
writing a MIDX bitmap) being written.
The to_pack field is provided to the bitmap machinery by callers of
bitmap_writer_build() and assigned to the bitmap_writer struct at that
point.
But a subsequent commit will want to have access to that data earlier on
during commit selection. Prepare for that by adding a 'to_pack' argument
to 'bitmap_writer_init()', and initializing the field during that
function.
Subsequent commits will clean up other functions which take
now-redundant arguments (like nr_objects, which is equivalent to
pdata->objects_nr, or pdata itself).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 2 +-
midx-write.c | 2 +-
pack-bitmap-write.c | 4 +++-
pack-bitmap.h | 3 ++-
4 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f395488971..0ad533c045 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1342,7 +1342,7 @@ static void write_pack_file(void)
if (write_bitmap_index) {
bitmap_writer_init(&bitmap_writer,
- the_repository);
+ the_repository, &to_pack);
bitmap_writer_set_checksum(&bitmap_writer, hash);
bitmap_writer_build_type_index(&bitmap_writer,
&to_pack, written_list, nr_written);
diff --git a/midx-write.c b/midx-write.c
index a77ee73c68..62f507eb72 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -825,7 +825,7 @@ static int write_midx_bitmap(const char *midx_name,
for (i = 0; i < pdata->nr_objects; i++)
index[i] = &pdata->objects[i].idx;
- bitmap_writer_init(&writer, the_repository);
+ bitmap_writer_init(&writer, the_repository, pdata);
bitmap_writer_show_progress(&writer, flags & MIDX_PROGRESS);
bitmap_writer_build_type_index(&writer, pdata, index,
pdata->nr_objects);
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index bf96c80898..4a7d2d1370 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -41,13 +41,15 @@ static inline int bitmap_writer_nr_selected_commits(struct bitmap_writer *writer
return writer->selected_nr - writer->pseudo_merges_nr;
}
-void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r)
+void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r,
+ struct packing_data *pdata)
{
memset(writer, 0, sizeof(struct bitmap_writer));
if (writer->bitmaps)
BUG("bitmap writer already initialized");
writer->bitmaps = kh_init_oid_map();
writer->pseudo_merge_commits = kh_init_oid_map();
+ writer->to_pack = pdata;
string_list_init_dup(&writer->pseudo_merge_groups);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 1171e6d989..ab20d6a0b6 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -123,7 +123,8 @@ struct bitmap_writer {
unsigned char pack_checksum[GIT_MAX_RAWSZ];
};
-void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r);
+void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r,
+ struct packing_data *pdata);
void bitmap_writer_show_progress(struct bitmap_writer *writer, int show);
void bitmap_writer_set_checksum(struct bitmap_writer *writer,
const unsigned char *sha1);
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/8] pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()`
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
2024-08-15 17:31 ` [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-15 17:31 ` [PATCH 3/8] pack-bitmap: drop redundant args from `bitmap_writer_build()` Taylor Blau
` (6 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
The previous commit ensures that the bitmap_writer's "to_pack" field is
initialized early on, so the "to_pack" and "index_nr" arguments to
`bitmap_writer_build_type_index()` are redundant.
Drop them and adjust the callers accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 2 +-
midx-write.c | 3 +--
pack-bitmap-write.c | 12 +++++-------
pack-bitmap.h | 4 +---
4 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0ad533c045..c08a62718d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1345,7 +1345,7 @@ static void write_pack_file(void)
the_repository, &to_pack);
bitmap_writer_set_checksum(&bitmap_writer, hash);
bitmap_writer_build_type_index(&bitmap_writer,
- &to_pack, written_list, nr_written);
+ written_list);
}
if (cruft)
diff --git a/midx-write.c b/midx-write.c
index 62f507eb72..b3015af07f 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -827,8 +827,7 @@ static int write_midx_bitmap(const char *midx_name,
bitmap_writer_init(&writer, the_repository, pdata);
bitmap_writer_show_progress(&writer, flags & MIDX_PROGRESS);
- bitmap_writer_build_type_index(&writer, pdata, index,
- pdata->nr_objects);
+ bitmap_writer_build_type_index(&writer, index);
/*
* bitmap_writer_finish expects objects in lex order, but pack_order
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 4a7d2d1370..34cdf5f150 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -101,9 +101,7 @@ void bitmap_writer_show_progress(struct bitmap_writer *writer, int show)
* Build the initial type index for the packfile or multi-pack-index
*/
void bitmap_writer_build_type_index(struct bitmap_writer *writer,
- struct packing_data *to_pack,
- struct pack_idx_entry **index,
- uint32_t index_nr)
+ struct pack_idx_entry **index)
{
uint32_t i;
@@ -111,13 +109,13 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
writer->trees = ewah_new();
writer->blobs = ewah_new();
writer->tags = ewah_new();
- ALLOC_ARRAY(to_pack->in_pack_pos, to_pack->nr_objects);
+ ALLOC_ARRAY(writer->to_pack->in_pack_pos, writer->to_pack->nr_objects);
- for (i = 0; i < index_nr; ++i) {
+ for (i = 0; i < writer->to_pack->nr_objects; ++i) {
struct object_entry *entry = (struct object_entry *)index[i];
enum object_type real_type;
- oe_set_in_pack_pos(to_pack, entry, i);
+ oe_set_in_pack_pos(writer->to_pack, entry, i);
switch (oe_type(entry)) {
case OBJ_COMMIT:
@@ -128,7 +126,7 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
break;
default:
- real_type = oid_object_info(to_pack->repo,
+ real_type = oid_object_info(writer->to_pack->repo,
&entry->idx.oid, NULL);
break;
}
diff --git a/pack-bitmap.h b/pack-bitmap.h
index ab20d6a0b6..d2529abadc 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -129,9 +129,7 @@ void bitmap_writer_show_progress(struct bitmap_writer *writer, int show);
void bitmap_writer_set_checksum(struct bitmap_writer *writer,
const unsigned char *sha1);
void bitmap_writer_build_type_index(struct bitmap_writer *writer,
- struct packing_data *to_pack,
- struct pack_idx_entry **index,
- uint32_t index_nr);
+ struct pack_idx_entry **index);
int bitmap_writer_has_bitmapped_object_id(struct bitmap_writer *writer,
const struct object_id *oid);
void bitmap_writer_push_commit(struct bitmap_writer *writer,
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/8] pack-bitmap: drop redundant args from `bitmap_writer_build()`
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
2024-08-15 17:31 ` [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data Taylor Blau
2024-08-15 17:31 ` [PATCH 2/8] pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()` Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-15 17:31 ` [PATCH 4/8] pack-bitmap: drop redundant args from `bitmap_writer_finish()` Taylor Blau
` (5 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_build()` function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 2 +-
midx-write.c | 2 +-
pack-bitmap-write.c | 9 +++------
pack-bitmap.h | 3 +--
4 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c08a62718d..97090433a1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1367,7 +1367,7 @@ static void write_pack_file(void)
bitmap_writer_select_commits(&bitmap_writer,
indexed_commits,
indexed_commits_nr);
- if (bitmap_writer_build(&bitmap_writer, &to_pack) < 0)
+ if (bitmap_writer_build(&bitmap_writer) < 0)
die(_("failed to write bitmap index"));
bitmap_writer_finish(&bitmap_writer,
written_list, nr_written,
diff --git a/midx-write.c b/midx-write.c
index b3015af07f..1ccdf0df30 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -846,7 +846,7 @@ static int write_midx_bitmap(const char *midx_name,
index[pack_order[i]] = &pdata->objects[i].idx;
bitmap_writer_select_commits(&writer, commits, commits_nr);
- ret = bitmap_writer_build(&writer, pdata);
+ ret = bitmap_writer_build(&writer);
if (ret < 0)
goto cleanup;
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 34cdf5f150..8d7437955d 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -569,8 +569,7 @@ static void store_selected(struct bitmap_writer *writer,
kh_value(writer->bitmaps, hash_pos) = stored;
}
-int bitmap_writer_build(struct bitmap_writer *writer,
- struct packing_data *to_pack)
+int bitmap_writer_build(struct bitmap_writer *writer)
{
struct bitmap_builder bb;
size_t i;
@@ -581,17 +580,15 @@ int bitmap_writer_build(struct bitmap_writer *writer,
uint32_t *mapping;
int closed = 1; /* until proven otherwise */
- writer->to_pack = to_pack;
-
if (writer->show_progress)
writer->progress = start_progress("Building bitmaps",
writer->selected_nr);
trace2_region_enter("pack-bitmap-write", "building_bitmaps_total",
the_repository);
- old_bitmap = prepare_bitmap_git(to_pack->repo);
+ old_bitmap = prepare_bitmap_git(writer->to_pack->repo);
if (old_bitmap)
- mapping = create_bitmap_mapping(old_bitmap, to_pack);
+ mapping = create_bitmap_mapping(old_bitmap, writer->to_pack);
else
mapping = NULL;
diff --git a/pack-bitmap.h b/pack-bitmap.h
index d2529abadc..0c5b83e954 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -146,8 +146,7 @@ struct ewah_bitmap *pseudo_merge_bitmap_for_commit(struct bitmap_index *bitmap_g
void bitmap_writer_select_commits(struct bitmap_writer *writer,
struct commit **indexed_commits,
unsigned int indexed_commits_nr);
-int bitmap_writer_build(struct bitmap_writer *writer,
- struct packing_data *to_pack);
+int bitmap_writer_build(struct bitmap_writer *writer);
void bitmap_writer_finish(struct bitmap_writer *writer,
struct pack_idx_entry **index,
uint32_t index_nr,
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/8] pack-bitmap: drop redundant args from `bitmap_writer_finish()`
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
` (2 preceding siblings ...)
2024-08-15 17:31 ` [PATCH 3/8] pack-bitmap: drop redundant args from `bitmap_writer_build()` Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-15 17:31 ` [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps Taylor Blau
` (4 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_finish()` function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 2 +-
midx-write.c | 3 +--
pack-bitmap-write.c | 8 ++++----
pack-bitmap.h | 1 -
4 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 97090433a1..e23c4950ed 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1370,7 +1370,7 @@ static void write_pack_file(void)
if (bitmap_writer_build(&bitmap_writer) < 0)
die(_("failed to write bitmap index"));
bitmap_writer_finish(&bitmap_writer,
- written_list, nr_written,
+ written_list,
tmpname.buf, write_bitmap_options);
bitmap_writer_free(&bitmap_writer);
write_bitmap_index = 0;
diff --git a/midx-write.c b/midx-write.c
index 1ccdf0df30..d2d13447af 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -851,8 +851,7 @@ static int write_midx_bitmap(const char *midx_name,
goto cleanup;
bitmap_writer_set_checksum(&writer, midx_hash);
- bitmap_writer_finish(&writer, index, pdata->nr_objects, bitmap_name,
- options);
+ bitmap_writer_finish(&writer, index, bitmap_name, options);
cleanup:
free(index);
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 8d7437955d..346fb29513 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -998,7 +998,6 @@ void bitmap_writer_set_checksum(struct bitmap_writer *writer,
void bitmap_writer_finish(struct bitmap_writer *writer,
struct pack_idx_entry **index,
- uint32_t index_nr,
const char *filename,
uint16_t options)
{
@@ -1031,12 +1030,13 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
dump_bitmap(f, writer->tags);
if (options & BITMAP_OPT_LOOKUP_TABLE)
- CALLOC_ARRAY(offsets, index_nr);
+ CALLOC_ARRAY(offsets, writer->to_pack->nr_objects);
for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++) {
struct bitmapped_commit *stored = &writer->selected[i];
int commit_pos = oid_pos(&stored->commit->object.oid, index,
- index_nr, oid_access);
+ writer->to_pack->nr_objects,
+ oid_access);
if (commit_pos < 0)
BUG(_("trying to write commit not in index"));
@@ -1052,7 +1052,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
write_lookup_table(writer, f, offsets);
if (options & BITMAP_OPT_HASH_CACHE)
- write_hash_cache(f, index, index_nr);
+ write_hash_cache(f, index, writer->to_pack->nr_objects);
finalize_hashfile(f, NULL, FSYNC_COMPONENT_PACK_METADATA,
CSUM_HASH_IN_STREAM | CSUM_FSYNC | CSUM_CLOSE);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 0c5b83e954..ff0fd815b8 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -149,7 +149,6 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
int bitmap_writer_build(struct bitmap_writer *writer);
void bitmap_writer_finish(struct bitmap_writer *writer,
struct pack_idx_entry **index,
- uint32_t index_nr,
const char *filename,
uint16_t options);
void bitmap_writer_free(struct bitmap_writer *writer);
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
` (3 preceding siblings ...)
2024-08-15 17:31 ` [PATCH 4/8] pack-bitmap: drop redundant args from `bitmap_writer_finish()` Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-17 10:34 ` Jeff King
2024-08-15 17:31 ` [PATCH 6/8] t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups Taylor Blau
` (3 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
Ordinarily, the pack-bitmap machinery will select some subset of
reachable commits to receive bitmaps. But when there are fewer than 100
commits indexed in the first place, they will all receive bitmaps as a
special case.
When this happens, pseudo-merges are not generated, making it impossible
to test pseudo-merge corner cases with fewer than 100 commits.
Select pseudo-merges even for bitmaps with fewer than 100 commits to
make such testing easier. In practice, this should not make a difference
to non-testing bitmaps, as they are unlikely to be used when a
repository has so few commits to begin with.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
pack-bitmap-write.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 346fb29513..923f793cec 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -694,6 +694,10 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
if (indexed_commits_nr < 100) {
for (i = 0; i < indexed_commits_nr; ++i)
bitmap_writer_push_commit(writer, indexed_commits[i], 0);
+
+ select_pseudo_merges(writer, indexed_commits,
+ indexed_commits_nr);
+
return;
}
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 6/8] t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
` (4 preceding siblings ...)
2024-08-15 17:31 ` [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-15 17:31 ` [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits Taylor Blau
` (2 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
Demonstrate that it is possible to generate empty pseudo-merge commits
in certain cases.
In the below instance, we generate one non-empty pseudo-merge
(containing commit "base"), and one empty pseudo-merge group
(corresponding to the unstable commits within that group).
(In my testing, the pseudo-merge machinery seems to handle empty groups
just fine, but generating them is pointless as they carry no
information.)
This commit (introducing a deliberate "test_expect_failure") is split
out from the actual fix (which will appear in the following commit) to
demonstrate that the failure is correctly induced.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
t/t5333-pseudo-merge-bitmaps.sh | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index f052f395a7..0288691340 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -390,4 +390,24 @@ test_expect_success 'pseudo-merge reuse' '
)
'
+test_expect_failure 'empty pseudo-merge group' '
+ git init pseudo-merge-empty-group &&
+ (
+ cd pseudo-merge-empty-group &&
+
+ # Ensure that a pseudo-merge group with no unstable
+ # commits does not generate an empty pseudo-merge
+ # bitmap.
+ git config bitmapPseudoMerge.empty.pattern refs/ &&
+
+ test_commit base &&
+ git repack -adb &&
+
+ test-tool bitmap dump-pseudo-merges >merges &&
+ test_line_count = 1 merges &&
+
+ test 0 -eq "$(grep -c commits=0 <merges)"
+ )
+'
+
test_done
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
` (5 preceding siblings ...)
2024-08-15 17:31 ` [PATCH 6/8] t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-17 10:38 ` Jeff King
2024-08-15 17:31 ` [PATCH 8/8] pseudo-merge.c: ensure pseudo-merge groups are closed Taylor Blau
2024-08-17 10:44 ` [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Jeff King
8 siblings, 1 reply; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
The previous commit demonstrated it is possible to generate empty
pseudo-merge commits, which is not useful as such pseudo-merges carry no
information.
Ensure that we only generate non-empty groups by not pushing a new
commit onto the bitmap_writer when that commit has no parents.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
pseudo-merge.c | 11 +++++++----
t/t5333-pseudo-merge-bitmaps.sh | 2 +-
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/pseudo-merge.c b/pseudo-merge.c
index f0fde13c47..6422be979c 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -357,8 +357,10 @@ static void select_pseudo_merges_1(struct bitmap_writer *writer,
p = commit_list_append(c, p);
} while (j % group->stable_size);
- bitmap_writer_push_commit(writer, merge, 1);
- writer->pseudo_merges_nr++;
+ if (merge->parents) {
+ bitmap_writer_push_commit(writer, merge, 1);
+ writer->pseudo_merges_nr++;
+ }
}
/* make up to group->max_merges pseudo merges for unstable commits */
@@ -398,8 +400,9 @@ static void select_pseudo_merges_1(struct bitmap_writer *writer,
p = commit_list_append(c, p);
}
- bitmap_writer_push_commit(writer, merge, 1);
- writer->pseudo_merges_nr++;
+ if (merge->parents) {
+ bitmap_writer_push_commit(writer, merge, 1);
+ writer->pseudo_merges_nr++; }
if (end >= matches->unstable_nr)
break;
}
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 0288691340..aa1a7d26f1 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -390,7 +390,7 @@ test_expect_success 'pseudo-merge reuse' '
)
'
-test_expect_failure 'empty pseudo-merge group' '
+test_expect_success 'empty pseudo-merge group' '
git init pseudo-merge-empty-group &&
(
cd pseudo-merge-empty-group &&
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 8/8] pseudo-merge.c: ensure pseudo-merge groups are closed
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
` (6 preceding siblings ...)
2024-08-15 17:31 ` [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits Taylor Blau
@ 2024-08-15 17:31 ` Taylor Blau
2024-08-17 10:43 ` Jeff King
2024-08-17 10:44 ` [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Jeff King
8 siblings, 1 reply; 20+ messages in thread
From: Taylor Blau @ 2024-08-15 17:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano
When generating pseudo-merge bitmaps, it's possible that concurrent
reference updates may reveal some pseudo-merge candidates which reach
objects that are not contained in the bitmap's pack or pseudo-pack
order (in the case of MIDX bitmaps).
The latter case is relatively easy to demonstrate: if we generate a MIDX
bitmap with only half of the repository packed, then the unpacked
contents are not part of the MIDX's object order.
If we happen to select one or more commit(s) from the unpacked portion
of the repository for inclusion in a pseudo-merge, we'll get the
following message when trying to generate its bitmap:
$ git multi-pack-index write --bitmap
[...]
Selecting pseudo-merge commits: 100% (1/1), done.
warning: Failed to write bitmap index. Packfile doesn't have full closure (object ... is missing)
Building bitmaps: 50% (1/2), done.
error: could not write multi-pack bitmap
, and the attempted bitmap write will fail, leaving the repository
without a current bitmap.
Rectify this by ensuring that the commits which are pseudo-merge
candidates can only be so if they appear somewhere in the packing order.
This is sufficient, since we know that the original packing order is
closed under reachability, so if a commit appears in that list as a
potential pseudo-merge candidate, we know that everything reachable from
it also appears in the list (and thus the candidate is a good one).
Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
pseudo-merge.c | 2 ++
t/t5333-pseudo-merge-bitmaps.sh | 36 +++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+)
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 6422be979c..7ec9d4c51c 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -217,6 +217,8 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
c = lookup_commit(the_repository, oid);
if (!c)
return 0;
+ if (!packlist_find(writer->to_pack, oid))
+ return 0;
has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, oid);
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index aa1a7d26f1..1dd6284756 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -410,4 +410,40 @@ test_expect_success 'empty pseudo-merge group' '
)
'
+test_expect_success 'pseudo-merge closure' '
+ git init pseudo-merge-closure &&
+ (
+ cd pseudo-merge-closure &&
+
+ test_commit A &&
+ git repack -d &&
+
+ test_commit B &&
+
+ # Note that the contents of A is packed, but B is not. A
+ # (and the objects reachable from it) are thus visible
+ # to the MIDX, but the same is not true for B and its
+ # objects.
+ #
+ # Ensure that we do not attempt to create a pseudo-merge
+ # for B, depsite it matching the below pseudo-merge
+ # group pattern, as doing so would result in a failure
+ # to write a non-closed bitmap.
+ git config bitmapPseudoMerge.test.pattern refs/ &&
+ git config bitmapPseudoMerge.test.threshold now &&
+
+ git multi-pack-index write --bitmap &&
+
+ test-tool bitmap dump-pseudo-merges >pseudo-merges &&
+ test_line_count = 1 pseudo-merges &&
+
+ git rev-parse A >expect &&
+
+ test-tool bitmap list-commits >actual &&
+ test_cmp expect actual &&
+ test-tool bitmap dump-pseudo-merge-commits 0 >actual &&
+ test_cmp expect actual
+ )
+'
+
test_done
--
2.46.0.54.gc9a64b1d2a
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-15 17:31 ` [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data Taylor Blau
@ 2024-08-17 10:31 ` Jeff King
2024-08-29 19:00 ` Taylor Blau
0 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2024-08-17 10:31 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano
On Thu, Aug 15, 2024 at 01:31:00PM -0400, Taylor Blau wrote:
> In order to determine its object order, the pack-bitmap machinery keeps
> a 'struct packing_data' corresponding to the pack or pseudo-pack (when
> writing a MIDX bitmap) being written.
>
> The to_pack field is provided to the bitmap machinery by callers of
> bitmap_writer_build() and assigned to the bitmap_writer struct at that
> point.
>
> But a subsequent commit will want to have access to that data earlier on
> during commit selection. Prepare for that by adding a 'to_pack' argument
> to 'bitmap_writer_init()', and initializing the field during that
> function.
>
> Subsequent commits will clean up other functions which take
> now-redundant arguments (like nr_objects, which is equivalent to
> pdata->objects_nr, or pdata itself).
This (and the next few follow-on commits) seem like a good change to me.
It simplifies many of the function calls, and I think it expresses the
domain logic in the API: there is a single set of objects being mapped
to bits, and many parts of the process will rely on it.
Even the midx code, which is not generating a pack, uses a "fake"
packing_data as the way to express that (because inherently the bit
ordering is all coming from the pack-index nature). If we likewise ever
wrote code to generate bitmaps from an existing pack, it would probably
use packing_data, too. :)
-Peff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps
2024-08-15 17:31 ` [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps Taylor Blau
@ 2024-08-17 10:34 ` Jeff King
2024-08-17 16:42 ` Junio C Hamano
2024-08-29 19:01 ` Taylor Blau
0 siblings, 2 replies; 20+ messages in thread
From: Jeff King @ 2024-08-17 10:34 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano
On Thu, Aug 15, 2024 at 01:31:12PM -0400, Taylor Blau wrote:
> Ordinarily, the pack-bitmap machinery will select some subset of
> reachable commits to receive bitmaps. But when there are fewer than 100
> commits indexed in the first place, they will all receive bitmaps as a
> special case.
>
> When this happens, pseudo-merges are not generated, making it impossible
> to test pseudo-merge corner cases with fewer than 100 commits.
>
> Select pseudo-merges even for bitmaps with fewer than 100 commits to
> make such testing easier. In practice, this should not make a difference
> to non-testing bitmaps, as they are unlikely to be used when a
> repository has so few commits to begin with.
I think you could argue that if there are fewer than 100 commits in the
history that pseudo-merge bitmaps are overkill, so it does not matter
much either way. But I think being consistent with our behavior (i.e.,
generating them if asked) is important for testing and debugging.
> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 346fb29513..923f793cec 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -694,6 +694,10 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
> if (indexed_commits_nr < 100) {
> for (i = 0; i < indexed_commits_nr; ++i)
> bitmap_writer_push_commit(writer, indexed_commits[i], 0);
> +
> + select_pseudo_merges(writer, indexed_commits,
> + indexed_commits_nr);
> +
I of course just posted a series which drops the latter two arguments
from this function. It's a semantic conflict, not a textual one, so
we'll have to fix it up when the two are merged (but the compiler will
easily notice to remind us).
-Peff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits
2024-08-15 17:31 ` [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits Taylor Blau
@ 2024-08-17 10:38 ` Jeff King
2024-08-29 19:03 ` Taylor Blau
0 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2024-08-17 10:38 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano
On Thu, Aug 15, 2024 at 01:31:17PM -0400, Taylor Blau wrote:
> The previous commit demonstrated it is possible to generate empty
> pseudo-merge commits, which is not useful as such pseudo-merges carry no
> information.
>
> Ensure that we only generate non-empty groups by not pushing a new
> commit onto the bitmap_writer when that commit has no parents.
Makes sense, and the patch seems pretty straight-forward.
It's a little tempting to suggest that we keep some way of writing
these, so that we can exercise the reading side in the tests (we
_shouldn't_ see these if we're not writing them, but as an on-disk
format, we may see examples from other sources). But it is probably not
worth the complexity. If we really want to test them, we can probably
generate a single example by hand (and we can probably even wait on that
until we see something in the wild that makes it worth doing).
-Peff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 8/8] pseudo-merge.c: ensure pseudo-merge groups are closed
2024-08-15 17:31 ` [PATCH 8/8] pseudo-merge.c: ensure pseudo-merge groups are closed Taylor Blau
@ 2024-08-17 10:43 ` Jeff King
0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2024-08-17 10:43 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano
On Thu, Aug 15, 2024 at 01:31:20PM -0400, Taylor Blau wrote:
> Rectify this by ensuring that the commits which are pseudo-merge
> candidates can only be so if they appear somewhere in the packing order.
>
> This is sufficient, since we know that the original packing order is
> closed under reachability, so if a commit appears in that list as a
> potential pseudo-merge candidate, we know that everything reachable from
> it also appears in the list (and thus the candidate is a good one).
Right, good explanation.
> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index 6422be979c..7ec9d4c51c 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -217,6 +217,8 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
> c = lookup_commit(the_repository, oid);
> if (!c)
> return 0;
> + if (!packlist_find(writer->to_pack, oid))
> + return 0;
>
> has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, oid);
>
And the patch looks good. I wondered about checking the packlist before
calling lookup_commit(), but the latter is really not very expensive (it
is not reading the object, but just creating a struct).
> +test_expect_success 'pseudo-merge closure' '
> + git init pseudo-merge-closure &&
> + (
> + cd pseudo-merge-closure &&
> +
> + test_commit A &&
> + git repack -d &&
> +
> + test_commit B &&
> +
> + # Note that the contents of A is packed, but B is not. A
> + # (and the objects reachable from it) are thus visible
> + # to the MIDX, but the same is not true for B and its
> + # objects.
> + #
> + # Ensure that we do not attempt to create a pseudo-merge
> + # for B, depsite it matching the below pseudo-merge
> + # group pattern, as doing so would result in a failure
> + # to write a non-closed bitmap.
> + git config bitmapPseudoMerge.test.pattern refs/ &&
> + git config bitmapPseudoMerge.test.threshold now &&
> +
> + git multi-pack-index write --bitmap &&
OK, clever. In the real world, I think this would happen racily, because
you'd usually suck up all of the loose objects into a pack to feed into
the midx. And the problem is new objects (whether packed or not) that
are referenced after that step.
But here we just skip that step and generate the midx directly, which
lets us do it deterministically.
-Peff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
` (7 preceding siblings ...)
2024-08-15 17:31 ` [PATCH 8/8] pseudo-merge.c: ensure pseudo-merge groups are closed Taylor Blau
@ 2024-08-17 10:44 ` Jeff King
2024-08-29 19:04 ` Taylor Blau
8 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2024-08-17 10:44 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano
On Thu, Aug 15, 2024 at 01:30:56PM -0400, Taylor Blau wrote:
> This series fixes a couple of small issues with pseudo-merge groups,
> where it is possible to (a) select pseudo-merges that are not closed
> over reachability with respect to the bitmap's pack or MIDX, and (b)
> generate empty pseudo-merge commits.
Thanks, the whole series looks good to me (I left some comments, but it
is all just walking through my thought process).
-Peff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps
2024-08-17 10:34 ` Jeff King
@ 2024-08-17 16:42 ` Junio C Hamano
2024-08-29 19:01 ` Taylor Blau
1 sibling, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2024-08-17 16:42 UTC (permalink / raw)
To: Jeff King; +Cc: Taylor Blau, git
Jeff King <peff@peff.net> writes:
>> Select pseudo-merges even for bitmaps with fewer than 100 commits to
>> make such testing easier. In practice, this should not make a difference
>> to non-testing bitmaps, as they are unlikely to be used when a
>> repository has so few commits to begin with.
>
> I think you could argue that if there are fewer than 100 commits in the
> history that pseudo-merge bitmaps are overkill, so it does not matter
> much either way. But I think being consistent with our behavior (i.e.,
> generating them if asked) is important for testing and debugging.
OK. I had an opposite reaction to this change, i.e. "eh, are we
lifting the safety that protected the users from asking for a
suboptimal way to pack, only to make it easier to write this test?",
but I do buy "the user tells us to do the suboptimal thing, we do
that thing."
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-17 10:31 ` Jeff King
@ 2024-08-29 19:00 ` Taylor Blau
2024-08-29 19:36 ` Jeff King
0 siblings, 1 reply; 20+ messages in thread
From: Taylor Blau @ 2024-08-29 19:00 UTC (permalink / raw)
To: Jeff King; +Cc: git, Junio C Hamano
On Sat, Aug 17, 2024 at 06:31:55AM -0400, Jeff King wrote:
> On Thu, Aug 15, 2024 at 01:31:00PM -0400, Taylor Blau wrote:
>
> > In order to determine its object order, the pack-bitmap machinery keeps
> > a 'struct packing_data' corresponding to the pack or pseudo-pack (when
> > writing a MIDX bitmap) being written.
> >
> > The to_pack field is provided to the bitmap machinery by callers of
> > bitmap_writer_build() and assigned to the bitmap_writer struct at that
> > point.
> >
> > But a subsequent commit will want to have access to that data earlier on
> > during commit selection. Prepare for that by adding a 'to_pack' argument
> > to 'bitmap_writer_init()', and initializing the field during that
> > function.
> >
> > Subsequent commits will clean up other functions which take
> > now-redundant arguments (like nr_objects, which is equivalent to
> > pdata->objects_nr, or pdata itself).
>
> This (and the next few follow-on commits) seem like a good change to me.
> It simplifies many of the function calls, and I think it expresses the
> domain logic in the API: there is a single set of objects being mapped
> to bits, and many parts of the process will rely on it.
Thanks. Yeah, it was a little surprising to me that it wasn't already
this way, especially having worked in this area for so long. I suspect
it grew this way organically over time (though haven't actually gone
spelunking through the history to confirm).
> Even the midx code, which is not generating a pack, uses a "fake"
> packing_data as the way to express that (because inherently the bit
> ordering is all coming from the pack-index nature). If we likewise ever
> wrote code to generate bitmaps from an existing pack, it would probably
> use packing_data, too. :)
I agree for the most part, though there is a lot of weight in
packing_data that would be nice to not have to carry around. I know
within GitHub's infrastructure we sometimes OOM kill invocations of "git
multi-pack-index write --bitmap" because of the memory overhead (a lot
of which is dominated by the actual traversal and bitmap generation, but
a lot that comes from just the per-object overhead).
I've thought about alternative structures that might be a little more
memory efficient, but it's never gotten to the top of my list.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps
2024-08-17 10:34 ` Jeff King
2024-08-17 16:42 ` Junio C Hamano
@ 2024-08-29 19:01 ` Taylor Blau
1 sibling, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-29 19:01 UTC (permalink / raw)
To: Jeff King; +Cc: git, Junio C Hamano
On Sat, Aug 17, 2024 at 06:34:01AM -0400, Jeff King wrote:
> On Thu, Aug 15, 2024 at 01:31:12PM -0400, Taylor Blau wrote:
>
> > Ordinarily, the pack-bitmap machinery will select some subset of
> > reachable commits to receive bitmaps. But when there are fewer than 100
> > commits indexed in the first place, they will all receive bitmaps as a
> > special case.
> >
> > When this happens, pseudo-merges are not generated, making it impossible
> > to test pseudo-merge corner cases with fewer than 100 commits.
> >
> > Select pseudo-merges even for bitmaps with fewer than 100 commits to
> > make such testing easier. In practice, this should not make a difference
> > to non-testing bitmaps, as they are unlikely to be used when a
> > repository has so few commits to begin with.
>
> I think you could argue that if there are fewer than 100 commits in the
> history that pseudo-merge bitmaps are overkill, so it does not matter
> much either way. But I think being consistent with our behavior (i.e.,
> generating them if asked) is important for testing and debugging.
Oh, I think that argument is very fair and makes a lot of sense. The
point of this patch wasn't that having pseudo-merge bitmaps for
repositories that small (and which already have complete bitmap
coverage!) is helpful for performance.
Rather, it was to make it easier to test the pseudo-merge code path in
small repositories without having to waste CPU cycles to generate 98
extra commits when 2 would do, etc.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits
2024-08-17 10:38 ` Jeff King
@ 2024-08-29 19:03 ` Taylor Blau
0 siblings, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-29 19:03 UTC (permalink / raw)
To: Jeff King; +Cc: git, Junio C Hamano
On Sat, Aug 17, 2024 at 06:38:22AM -0400, Jeff King wrote:
> [...] If we really want to test them, we can probably generate a
> single example by hand (and we can probably even wait on that until we
> see something in the wild that makes it worth doing).
Yeah, I had a similar thought as you. I considered adding a test fixture
that has a .bitmap with a pseudo-merge bitmap containing zero parents,
but figured that it was probably overkill since this feature is still
considered experimental.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits
2024-08-17 10:44 ` [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Jeff King
@ 2024-08-29 19:04 ` Taylor Blau
0 siblings, 0 replies; 20+ messages in thread
From: Taylor Blau @ 2024-08-29 19:04 UTC (permalink / raw)
To: Jeff King; +Cc: git, Junio C Hamano
On Sat, Aug 17, 2024 at 06:44:12AM -0400, Jeff King wrote:
> On Thu, Aug 15, 2024 at 01:30:56PM -0400, Taylor Blau wrote:
>
> > This series fixes a couple of small issues with pseudo-merge groups,
> > where it is possible to (a) select pseudo-merges that are not closed
> > over reachability with respect to the bitmap's pack or MIDX, and (b)
> > generate empty pseudo-merge commits.
>
> Thanks, the whole series looks good to me (I left some comments, but it
> is all just walking through my thought process).
Thanks (as always) for a thoughtful review!
Thanks,
Taylor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-29 19:00 ` Taylor Blau
@ 2024-08-29 19:36 ` Jeff King
0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2024-08-29 19:36 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano
On Thu, Aug 29, 2024 at 03:00:21PM -0400, Taylor Blau wrote:
> > Even the midx code, which is not generating a pack, uses a "fake"
> > packing_data as the way to express that (because inherently the bit
> > ordering is all coming from the pack-index nature). If we likewise ever
> > wrote code to generate bitmaps from an existing pack, it would probably
> > use packing_data, too. :)
>
> I agree for the most part, though there is a lot of weight in
> packing_data that would be nice to not have to carry around. I know
> within GitHub's infrastructure we sometimes OOM kill invocations of "git
> multi-pack-index write --bitmap" because of the memory overhead (a lot
> of which is dominated by the actual traversal and bitmap generation, but
> a lot that comes from just the per-object overhead).
>
> I've thought about alternative structures that might be a little more
> memory efficient, but it's never gotten to the top of my list.
True. What the index and bitmap steps really want is not an array of
object_entry, but an array of pack_idx_entry (which is the first
component of an object_entry).
I wonder how feasible it would be to simply hold two arrays with
corresponding entries at each index. Many places only care about one or
the other. But for places that do care about both, especially ones that
receive a pointer to an individual object_entry, they'd need to receive
pointers to both.
I briefly looked at the compile errors that come from making such a
change. Many of them look trivial, but I think some of them get weird
(the ext_bases array is also holding object_entry structs). So maybe
worth pursuing in the long run, but not something to knock out this
afternoon.
-Peff
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-08-29 19:37 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-15 17:30 [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Taylor Blau
2024-08-15 17:31 ` [PATCH 1/8] pack-bitmap: initialize `bitmap_writer_init()` with packing_data Taylor Blau
2024-08-17 10:31 ` Jeff King
2024-08-29 19:00 ` Taylor Blau
2024-08-29 19:36 ` Jeff King
2024-08-15 17:31 ` [PATCH 2/8] pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()` Taylor Blau
2024-08-15 17:31 ` [PATCH 3/8] pack-bitmap: drop redundant args from `bitmap_writer_build()` Taylor Blau
2024-08-15 17:31 ` [PATCH 4/8] pack-bitmap: drop redundant args from `bitmap_writer_finish()` Taylor Blau
2024-08-15 17:31 ` [PATCH 5/8] pack-bitmap-write.c: select pseudo-merges even for small bitmaps Taylor Blau
2024-08-17 10:34 ` Jeff King
2024-08-17 16:42 ` Junio C Hamano
2024-08-29 19:01 ` Taylor Blau
2024-08-15 17:31 ` [PATCH 6/8] t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups Taylor Blau
2024-08-15 17:31 ` [PATCH 7/8] pseudo-merge.c: do not generate empty pseudo-merge commits Taylor Blau
2024-08-17 10:38 ` Jeff King
2024-08-29 19:03 ` Taylor Blau
2024-08-15 17:31 ` [PATCH 8/8] pseudo-merge.c: ensure pseudo-merge groups are closed Taylor Blau
2024-08-17 10:43 ` Jeff King
2024-08-17 10:44 ` [PATCH 0/8] pseudo-merge: avoid empty and non-closed pseudo-merge commits Jeff King
2024-08-29 19:04 ` Taylor Blau
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).