Git development
 help / color / mirror / Atom feed
* [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs
@ 2026-04-13 23:56 Taylor Blau
  2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
                   ` (9 more replies)
  0 siblings, 10 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

This series fixes several bugs in the pseudo-merge bitmap implementation
that caused the pseudo-merge application path to be effectively broken
during fill-in traversal.

Peff noticed that this code path was never triggered by the existing
test suite, and investigating that observation uncovered a handful of
bugs, some compounding.

The first two patches introduce test infrastructure: a 'bitmap write'
test helper that gives tests precise control over which commits receive
individual bitmaps, and a set of "test_expect_failure" tests
demonstrating each bug.

The next four patches fix the bugs in the per-commit pseudo-merge
lookup:

  - The pseudo-merge commit lookup table was sorted by OID rather than
    by bit position, causing the reader's binary search to fail.

  - The binary search in pseudo_merge_at() had its lo/hi updates
    swapped.

  - The extended pseudo-merge lookup path had three compounding bugs: a
    wrong entry-size calculation in the writer, a misinterpretation of
    extended table entries in the reader, and a silently-swallowed error
    check.

The final two patches fix issues in pseudo-merge group selection:

  - find_pseudo_merge_group_for_ref() did not parse commits before
    inspecting their dates, so all candidates had date == 0 and were
    unconditionally placed in the "stable" bucket.

  - The config validation for bitmapPseudoMerge.*.sampleRate accepted 0,
    which leads to a division by zero once the date classification is
    fixed and the unstable code path is exercised.

There is also a small fix for a regex leak when the pattern key is
overridden in config.

Thanks in advance for your review!

Taylor Blau (8):
  t/helper: add 'test-tool bitmap write' subcommand
  t5333: demonstrate various pseudo-merge bugs
  pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
  pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
  pack-bitmap: fix pseudo-merge lookup for shared commits
  pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
  pack-bitmap: reject pseudo-merge "sampleRate" of 0
  pack-bitmap: prevent pattern leak on pseudo-merge re-assignment

 pack-bitmap-write.c             |  23 +++-
 pseudo-merge.c                  |  19 ++-
 t/helper/test-bitmap.c          | 110 ++++++++++++++-
 t/t5310-pack-bitmaps.sh         |  24 ++++
 t/t5333-pseudo-merge-bitmaps.sh | 230 ++++++++++++++++++++++++++++++++
 5 files changed, 396 insertions(+), 10 deletions(-)


base-commit: 8c9303b1ffae5b745d1b0a1f98330cf7944d8db0
-- 
2.54.0.rc1.73.g8f4e0170952

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-14 19:48   ` Junio C Hamano
                     ` (2 more replies)
  2026-04-13 23:56 ` [PATCH 2/8] t5333: demonstrate various pseudo-merge bugs Taylor Blau
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

In f16eb1c091 (pseudo-merge: fix disk reads from find_pseudo_merge(),
2026-03-31), we noted that `apply_pseudo_merges_for_commit()` is never
triggered by the existing test suite, and that this bears further
investigation.

This patch is the first one to begin that investigation. The following
patches will expose and fix a variety of bugs in the implementation of
pseudo-merge bitmaps.

In order to do so, however, many of these tests require very precise
selection of which commits receive bitmaps and which do not. To date,
there isn't a standard approach to easily facilitate this. Address this
by introducing a `test-tool bitmap write` subcommand that writes a
bitmap for a given packfile, reading the set of commits which should
receive individual bitmaps from stdin like so:

    test-tool bitmap write <pack-basename> </path/to/commits.list

, where "<pack-basename>" is the filename for a specific packfile (e.g.,
"pack-abc123.pack"), and "/path/to/commits.list" is a list of commit
OIDs which will receive bitmaps.

The helper respects `bitmapPseudoMerge.*` configuration for creating
pseudo-merge bitmaps alongside the regular commit bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/helper/test-bitmap.c  | 110 +++++++++++++++++++++++++++++++++++++++-
 t/t5310-pack-bitmaps.sh |  24 +++++++++
 2 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-bitmap.c b/t/helper/test-bitmap.c
index 16a01669e41..96c0000c787 100644
--- a/t/helper/test-bitmap.c
+++ b/t/helper/test-bitmap.c
@@ -2,7 +2,10 @@
 
 #include "test-tool.h"
 #include "git-compat-util.h"
+#include "hex.h"
+#include "odb.h"
 #include "pack-bitmap.h"
+#include "pseudo-merge.h"
 #include "setup.h"
 
 static int bitmap_list_commits(void)
@@ -35,6 +38,108 @@ static int bitmap_dump_pseudo_merge_objects(uint32_t n)
 	return test_bitmap_pseudo_merge_objects(the_repository, n);
 }
 
+struct bitmap_writer_data {
+	struct packing_data packed;
+	struct pack_idx_entry **index;
+	uint32_t nr;
+};
+
+static int add_packed_object(const struct object_id *oid,
+			     struct packed_git *pack,
+			     uint32_t pos,
+			     void *_data)
+{
+	struct bitmap_writer_data *data = _data;
+	struct object_entry *entry;
+	struct object_info oi = OBJECT_INFO_INIT;
+	enum object_type type;
+
+	oi.typep = &type;
+
+	entry = packlist_alloc(&data->packed, oid);
+	entry->idx.offset = nth_packed_object_offset(pack, pos);
+	if (packed_object_info(pack, entry->idx.offset, &oi) < 0)
+		die("could not get type of object %s",
+		    oid_to_hex(oid));
+	oe_set_type(entry, type);
+	oe_set_in_pack(&data->packed, entry, pack);
+	data->index[data->nr++] = &entry->idx;
+
+	return 0;
+}
+
+static int idx_oid_cmp(const void *va, const void *vb)
+{
+	const struct pack_idx_entry *a = *(const struct pack_idx_entry **)va;
+	const struct pack_idx_entry *b = *(const struct pack_idx_entry **)vb;
+
+	return oidcmp(&a->oid, &b->oid);
+}
+
+static int bitmap_write(const char *basename)
+{
+	struct packed_git *p = NULL;
+	struct bitmap_writer_data data = { 0 };
+	struct bitmap_writer writer;
+	struct strbuf buf = STRBUF_INIT;
+
+	prepare_repo_settings(the_repository);
+	repo_for_each_pack(the_repository, p) {
+		if (!strcmp(pack_basename(p), basename))
+			break;
+	}
+
+	if (!p)
+		die("could not find pack '%s'", basename);
+
+	if (open_pack_index(p))
+		die("cannot open pack index for '%s'", p->pack_name);
+
+	prepare_packing_data(the_repository, &data.packed);
+	ALLOC_ARRAY(data.index, p->num_objects);
+
+	for_each_object_in_pack(p, add_packed_object, &data,
+				ODB_FOR_EACH_OBJECT_PACK_ORDER);
+
+	bitmap_writer_init(&writer, the_repository, &data.packed, NULL);
+	bitmap_writer_build_type_index(&writer, data.index);
+
+	while (strbuf_getline_lf(&buf, stdin) != EOF) {
+		struct object_id oid;
+		struct commit *c;
+
+		if (get_oid_hex(buf.buf, &oid))
+			die("invalid OID: %s", buf.buf);
+
+		c = lookup_commit(the_repository, &oid);
+		if (!c || repo_parse_commit(the_repository, c))
+			die("could not parse commit %s", buf.buf);
+
+		bitmap_writer_push_commit(&writer, c, false);
+	}
+
+	select_pseudo_merges(&writer);
+	if (bitmap_writer_build(&writer) < 0)
+		die("failed to build bitmaps");
+
+	bitmap_writer_set_checksum(&writer, p->hash);
+
+	QSORT(data.index, p->num_objects, idx_oid_cmp);
+
+	strbuf_reset(&buf);
+	strbuf_addstr(&buf, p->pack_name);
+	strbuf_strip_suffix(&buf, ".pack");
+	strbuf_addstr(&buf, ".bitmap");
+	bitmap_writer_finish(&writer, data.index, buf.buf, 0);
+
+	bitmap_writer_free(&writer);
+	strbuf_release(&buf);
+	free(data.index);
+	clear_packing_data(&data.packed);
+
+	return 0;
+}
+
 int cmd__bitmap(int argc, const char **argv)
 {
 	setup_git_directory();
@@ -51,13 +156,16 @@ int cmd__bitmap(int argc, const char **argv)
 		return bitmap_dump_pseudo_merge_commits(atoi(argv[2]));
 	if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-objects"))
 		return bitmap_dump_pseudo_merge_objects(atoi(argv[2]));
+	if (argc == 3 && !strcmp(argv[1], "write"))
+		return bitmap_write(argv[2]);
 
 	usage("\ttest-tool bitmap list-commits\n"
 	      "\ttest-tool bitmap list-commits-with-offset\n"
 	      "\ttest-tool bitmap dump-hashes\n"
 	      "\ttest-tool bitmap dump-pseudo-merges\n"
 	      "\ttest-tool bitmap dump-pseudo-merge-commits <n>\n"
-	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>");
+	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>\n"
+	      "\ttest-tool bitmap write <pack-basename> < <commit-list>");
 
 	return -1;
 }
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..9489e59fa55 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -648,4 +648,28 @@ test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
 	test_grep corrupted.bitmap.index stderr
 '
 
+test_expect_success 'test-tool bitmap write' '
+	git init bitmap-write-helper &&
+	test_when_finished "rm -fr bitmap-write-helper" &&
+	(
+		cd bitmap-write-helper &&
+
+		test_commit_bulk 64 &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		git rev-parse HEAD >commits &&
+		test-tool bitmap write "$(basename $pack)" <commits &&
+
+		test-tool bitmap list-commits | sort >actual &&
+		sort commits >expect &&
+		test_cmp expect actual &&
+
+		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		git rev-list --count --objects HEAD >expect &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/8] t5333: demonstrate various pseudo-merge bugs
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
  2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-19  0:25   ` Elijah Newren
  2026-04-13 23:56 ` [PATCH 3/8] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

Using the test helper introduced via the previous commit, add various
failing tests demonstrating bugs in the pseudo-merge implementation.

These are all marked as failing with one exception. The "sampleRate=0"
test describes a latent bug, which is only reachable through a code path
that is itself masked by a separate bug. A future commit will fix that
bug, and, in turn, cause the aforementioned test to fail. Accordingly,
that commit will mark the test as failing, and it will be re-marked as
passing in a separate commit which fixes the once-latent bug.

For the rest: the following commits will explain and fix the underlying
bugs in detail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5333-pseudo-merge-bitmaps.sh | 198 ++++++++++++++++++++++++++++++++
 1 file changed, 198 insertions(+)

diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 1f7a5d82ee4..20e77ab4390 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -462,4 +462,202 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
 	)
 '
 
+test_expect_failure 'apply pseudo-merges during fill-in traversal' '
+	git init pseudo-merge-fill-in-traversal &&
+	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
+	(
+		cd pseudo-merge-fill-in-traversal &&
+
+		git config bitmapPseudoMerge.test.pattern refs/tags/ &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 1 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
+	git init pseudo-merge-fill-in-multi &&
+	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
+	(
+		cd pseudo-merge-fill-in-multi &&
+
+		test_commit base &&
+		base=$(git rev-parse HEAD) &&
+
+		for side in left right
+		do
+			git checkout -B $side base &&
+
+			test_commit_bulk --id=$side 64 &&
+			git rev-list --no-object-names HEAD --not $base >in &&
+			while read oid
+			do
+				echo "create refs/group-$side/$oid $oid" || return 1
+			done <in | git update-ref --stdin || return 1
+		done &&
+
+		git checkout left &&
+		git merge right &&
+		git repack -ad &&
+
+		git config bitmapPseudoMerge.left.pattern "refs/group-left/" &&
+		git config bitmapPseudoMerge.left.maxMerges 1 &&
+		git config bitmapPseudoMerge.left.stableThreshold never &&
+
+		git config bitmapPseudoMerge.right.pattern "refs/group-right/" &&
+		git config bitmapPseudoMerge.right.maxMerges 1 &&
+		git config bitmapPseudoMerge.right.stableThreshold never &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+		git rev-parse "$base" >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 2 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'apply pseudo-merges with overlapping groups during fill-in' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-overlap" &&
+	git init pseudo-merge-fill-in-overlap &&
+	(
+		cd pseudo-merge-fill-in-overlap &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		# Use two pseudo-merge group patterns that both match
+		# refs/tags/, so every tagged commit belongs to both
+		# groups. This exercises the extended lookup table
+		# path in apply_pseudo_merges_for_commit().
+		git config bitmapPseudoMerge.all.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.all.maxMerges 1 &&
+		git config bitmapPseudoMerge.all.stableThreshold never &&
+
+		git config bitmapPseudoMerge.tags.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.tags.maxMerges 1 &&
+		git config bitmapPseudoMerge.tags.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 2 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'pseudo-merge commits are correctly classified by date' '
+	git init pseudo-merge-date-classification &&
+	test_when_finished "rm -fr pseudo-merge-date-classification" &&
+	(
+		cd pseudo-merge-date-classification &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		# Configure two pseudo-merge groups: one that only
+		# matches "stable" refs (older than one month), and one
+		# that matches all refs. With 64 freshly-created tags
+		# (all younger than one month) the stable group should
+		# have zero pseudo-merges and the catch-all group should
+		# have one.
+		#
+		# Use GIT_TEST_DATE_NOW to align "now" (and therefore
+		# "1.month.ago") with the test_tick timestamps so that
+		# the commits are within the last month.
+		#
+		# This exercises the date-based classification in
+		# find_pseudo_merge_group_for_ref(), which requires
+		# that commits are parsed before inspecting their date.
+		git config bitmapPseudoMerge.stable.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.stable.maxMerges 64 &&
+		git config bitmapPseudoMerge.stable.stableThreshold never &&
+		git config bitmapPseudoMerge.stable.threshold 1.month.ago &&
+
+		git config bitmapPseudoMerge.all.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.all.maxMerges 1 &&
+		git config bitmapPseudoMerge.all.stableThreshold never &&
+		git config bitmapPseudoMerge.all.threshold now &&
+
+		git rev-parse HEAD~63 >in &&
+		GIT_TEST_DATE_NOW=$test_tick \
+			test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges
+	)
+'
+
+test_expect_success 'sampleRate=0 does not cause division by zero' '
+	git init pseudo-merge-sample-rate-zero &&
+	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
+	(
+		cd pseudo-merge-sample-rate-zero &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		git config bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.sampleRate 0 &&
+		git config bitmapPseudoMerge.test.threshold now &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in
+	)
+'
+
 test_done
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 3/8] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
  2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
  2026-04-13 23:56 ` [PATCH 2/8] t5333: demonstrate various pseudo-merge bugs Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-13 23:56 ` [PATCH 4/8] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The pseudo-merge commit lookup table stores each commit's position in
the pack- or pseudo-pack order, and is used to perform a binary search
in order to determine which pseudo-merge(s) a given commit belongs to.

However, the table was previously sorted in lexical order (via
`oid_array_sort()`), causing the binary search to fail.

While this causes pseudo-merge bitmaps to be de-facto broken for fill-in
traversal, there are a couple of important points to keep in mind:

 * Pseudo-merge application during the initial phases of a bitmap-based
   traversal are applied via `cascade_pseudo_merges_1()`. This function
   enumerates the known pseudo-merges and determines if its parents are
   a subset of the traversal roots.

   This is a different path than the fill-in traversal, where we are
   looking for any pseudo-merges which may be satisfied after visiting
   some commit along an object walk, which involves the aforementioned
   (broken) binary search.

   As a consequence, any pseudo-merges we apply at this stage are done
   so correctly.

 * While this bug makes applying pseudo-merges during fill-in traversal
   effectively broken, it does not produce wrong results. Instead of
   applying the *wrong* pseudo-merge, we will simply fail to find
   satisfied pseudo-merges, leaving the traversal to use the existing
   fill-in routines.

Fix this by sorting the table by bit position before writing, matching
the order that the reader's binary search expects.

This does produce a change the on-disk format insofar as the actual code
now complies with the documented format (for more details, refer to:
Documentation/technical/bitmap-format.adoc). Given that this never
worked in the first place, such a change should be OK to perform.

If an out-of-tree implementation of pseudo-merges happened to generate
bitmaps that comply with the documented format, they will continue to be
read and interpreted as normal.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c             | 21 ++++++++++++++++++++-
 t/t5333-pseudo-merge-bitmaps.sh |  2 +-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 8338d7217ef..86ed6a5d78c 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -819,6 +819,20 @@ static void write_selected_commits_v1(struct bitmap_writer *writer,
 	}
 }
 
+static int pseudo_merge_commit_pos_cmp(const void *_va, const void *_vb,
+				       void *_data)
+{
+	struct bitmap_writer *writer = _data;
+	uint32_t pos_a = find_object_pos(writer, _va, NULL);
+	uint32_t pos_b = find_object_pos(writer, _vb, NULL);
+
+	if (pos_a < pos_b)
+		return -1;
+	if (pos_a > pos_b)
+		return 1;
+	return 0;
+}
+
 static void write_pseudo_merges(struct bitmap_writer *writer,
 				struct hashfile *f)
 {
@@ -876,7 +890,12 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
 		oid_array_append(&commits, &kh_key(writer->pseudo_merge_commits, i));
 	}
 
-	oid_array_sort(&commits);
+	/*
+	 * Sort the commits by their bit position so that the lookup
+	 * table can be binary searched by the reader (see
+	 * find_pseudo_merge()).
+	 */
+	QSORT_S(commits.oid, commits.nr, pseudo_merge_commit_pos_cmp, writer);
 
 	/* write lookup table (non-extended) */
 	for (i = 0; i < commits.nr; i++) {
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 20e77ab4390..dce43ed8dc6 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -462,7 +462,7 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges during fill-in traversal' '
+test_expect_success 'apply pseudo-merges during fill-in traversal' '
 	git init pseudo-merge-fill-in-traversal &&
 	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
 	(
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 4/8] pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (2 preceding siblings ...)
  2026-04-13 23:56 ` [PATCH 3/8] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-13 23:56 ` [PATCH 5/8] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The binary search in `pseudo_merge_at()` has its "lo" and "hi" updates
swapped: when the midpoint's offset is less than the target, it sets `hi
= mi` (searching left) instead of `lo = mi + 1` (searching right), and
vice versa.

This means that lookups for pseudo-merges whose offset is not near the
midpoint of the pseudo-merge table are likely to fail.

In practice, with a single pseudo-merge group this is masked because the
lone entry is always at the midpoint. With multiple groups, the inverted
comparisons cause lookups to search in the wrong direction, potentially
missing entries.

Swap the "lo" and "hi" assignments to search in the correct direction,
making it possible to apply pseudo-merges during fill-in when more than
one pseudo-merge exists in a group.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index ff18b6c3642..fb71c761792 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -559,9 +559,9 @@ static struct pseudo_merge *pseudo_merge_at(const struct pseudo_merge_map *pm,
 		if (got == want)
 			return use_pseudo_merge(pm, &pm->v[mi]);
 		else if (got < want)
-			hi = mi;
-		else
 			lo = mi + 1;
+		else
+			hi = mi;
 	}
 
 	warning(_("could not find pseudo-merge for commit %s at offset %"PRIuMAX),
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index dce43ed8dc6..5bfb5103124 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -496,7 +496,7 @@ test_expect_success 'apply pseudo-merges during fill-in traversal' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
+test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
 	git init pseudo-merge-fill-in-multi &&
 	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
 	(
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 5/8] pack-bitmap: fix pseudo-merge lookup for shared commits
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (3 preceding siblings ...)
  2026-04-13 23:56 ` [PATCH 4/8] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-13 23:56 ` [PATCH 6/8] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

When a commit appears in more than one pseudo-merge group, its entry in
the commit lookup table has the high bit set in its offset field,
indicating that the offset points to an "extended" table containing the
set of pseudo-merges for that commit.

There are three bugs in this path:

 * The `next_ext` offset in `write_pseudo_merges()` undercounts the
   per-entry size of the lookup table (8 vs. 12 bytes).

 * `nth_pseudo_merge_ext()` calls `read_pseudo_merge_commit_at()` on a
   pseudo-merge bitmap offset, misinterpreting it as a 12-byte commit
   table entry.

 * The error check after `pseudo_merge_ext_at()` in
   `apply_pseudo_merges_for_commit()` tests `< -1` instead of `< 0`,
   silently swallowing errors from `error()`.

The first bug is on the write side: each commit lookup entry contains a
4- and 8-byte unsigned value for a total of 12 bytes, but the
calculation assumes that the entry only contains 8 bytes of data. This
makes `next_ext` too small, so the extended-table offsets that get
written point into the middle of the non-extended lookup table rather
than past it. The reader then interprets non-extended lookup data as
extended entries, producing garbage.

The second bug is on the read side and is independently fatal: even with
a correctly positioned extended table, `nth_pseudo_merge_ext()` feeds
the offset it reads (which points at pseudo-merge bitmap data) to
`read_pseudo_merge_commit_at()`. That function tries to parse 12 bytes
as a `pseudo_merge_commit` struct, clobbering `merge->pseudo_merge_ofs`
with whatever happens to be at that location. The caller only needs
`pseudo_merge_ofs`, so the fix is to store the offset directly rather
than re-parsing a commit table entry. The `commit_pos` field is left
untouched, retaining the value that `find_pseudo_merge()` set earlier.

The third bug is latent. With the first two fixes applied, the extended
table is correctly written and read, so `pseudo_merge_ext_at()` does not
fail during normal operation. The `< -1` vs `< 0` distinction only
matters when the bitmap file is corrupt or truncated, in which case the
error would be silently ignored and the code would proceed with
uninitialized data.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c             | 2 +-
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 86ed6a5d78c..1c8070f99c0 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -877,7 +877,7 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
 
 	next_ext = st_add(hashfile_total(f),
 			  st_mult(kh_size(writer->pseudo_merge_commits),
-				  sizeof(uint64_t)));
+				  sizeof(uint32_t) + sizeof(uint64_t)));
 
 	table_start = hashfile_total(f);
 
diff --git a/pseudo-merge.c b/pseudo-merge.c
index fb71c761792..34e1da00b4e 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -600,7 +600,7 @@ static int nth_pseudo_merge_ext(const struct pseudo_merge_map *pm,
 		return error(_("out-of-bounds read: (%"PRIuMAX" >= %"PRIuMAX")"),
 			     (uintmax_t)ofs, (uintmax_t)pm->map_size);
 
-	read_pseudo_merge_commit_at(merge, pm->map + ofs);
+	merge->pseudo_merge_ofs = ofs;
 
 	return 0;
 }
@@ -671,7 +671,7 @@ int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm,
 		off_t ofs = merge_commit.pseudo_merge_ofs & ~((uint64_t)1<<63);
 		uint32_t i;
 
-		if (pseudo_merge_ext_at(pm, &ext, ofs) < -1) {
+		if (pseudo_merge_ext_at(pm, &ext, ofs) < 0) {
 			warning(_("could not read extended pseudo-merge table "
 				  "for commit %s"),
 				oid_to_hex(&commit->object.oid));
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 5bfb5103124..8844a3bced9 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -549,7 +549,7 @@ test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges with overlapping groups during fill-in' '
+test_expect_success 'apply pseudo-merges with overlapping groups during fill-in' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-overlap" &&
 	git init pseudo-merge-fill-in-overlap &&
 	(
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 6/8] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (4 preceding siblings ...)
  2026-04-13 23:56 ` [PATCH 5/8] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-13 23:56 ` [PATCH 7/8] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

`find_pseudo_merge_group_for_ref()` uses the commit's date to classify
it as either "stable" (older than the stable threshold) or "unstable"
(otherwise).

However, to find the relevant commit from a given OID, the function
`find_pseudo_merge_group_for_ref()` uses `lookup_commit()` which does
not parse commits.

Because an unparsed commit has its "date" set to zero, every candidate
is placed in the "stable" bucket regardless of its actual committer
timestamp. This means the `bitmapPseudoMerge.*.threshold` and
`stableThreshold` configuration options have no effect: the
stable/unstable split is always determined by comparing against zero
rather than the real commit date.

The net result is that pseudo-merge groups are partitioned by
`stableSize` instead of the intended decay-based sizing, and the
`sampleRate` knob (which only applies to the unstable path) is never
exercised.

Fix this by calling `repo_parse_commit()` after `lookup_commit()`,
bailing out of the callback if parsing fails.

The corresponding test configures two pseudo-merge groups that both
match all tags. The "stable" group uses `threshold=1.month.ago`, and the
"all" group uses `threshold=now`. The test use our custom
"GIT_TEST_DATE_NOW" environment variable by setting it to the value of
"$test_tick" to align Git's notion of "now" (and therefore
"1.month.ago") with the `test_tick` timestamps, so the commits appear to
be younger than one month: only the "all" group matches them, producing
exactly one pseudo-merge.

Without the fix every commit has `date == 0`, which satisfies `date <=
threshold` for both groups (since 0 is older than one month ago), and
the "stable" group erroneously matches as well.

Now that commits are correctly classified as "unstable", the bug
described in the test exercising the "sampleRate=0" test is reachable,
and the test is marked as failing. It will be fixed in a following
commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  2 ++
 t/t5333-pseudo-merge-bitmaps.sh | 22 ++++++++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index 34e1da00b4e..d79e5fb649a 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -236,6 +236,8 @@ static int find_pseudo_merge_group_for_ref(const struct reference *ref, void *_d
 	c = lookup_commit(the_repository, maybe_peeled);
 	if (!c)
 		return 0;
+	if (repo_parse_commit(the_repository, c))
+		return 0;
 	if (!packlist_find(writer->to_pack, maybe_peeled))
 		return 0;
 
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 8844a3bced9..63d2f64361d 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -592,32 +592,34 @@ test_expect_success 'apply pseudo-merges with overlapping groups during fill-in'
 	)
 '
 
-test_expect_failure 'pseudo-merge commits are correctly classified by date' '
+test_expect_success 'pseudo-merge commits are correctly classified by date' '
 	git init pseudo-merge-date-classification &&
 	test_when_finished "rm -fr pseudo-merge-date-classification" &&
 	(
 		cd pseudo-merge-date-classification &&
 
 		test_commit_bulk 64 &&
+
 		tag_everything &&
 		git repack -ad &&
 
 		pack="$(ls .git/objects/pack/pack-*.pack)" &&
 
 		# Configure two pseudo-merge groups: one that only
-		# matches "stable" refs (older than one month), and one
-		# that matches all refs. With 64 freshly-created tags
-		# (all younger than one month) the stable group should
-		# have zero pseudo-merges and the catch-all group should
-		# have one.
+		# matches "stable" refs (older than one month), and
+		# one that matches all refs. With 64 tags whose
+		# commits are all younger than one month, the
+		# "stable" group should have zero pseudo-merges and
+		# the "all" group should have one.
 		#
 		# Use GIT_TEST_DATE_NOW to align "now" (and therefore
 		# "1.month.ago") with the test_tick timestamps so that
 		# the commits are within the last month.
 		#
-		# This exercises the date-based classification in
-		# find_pseudo_merge_group_for_ref(), which requires
-		# that commits are parsed before inspecting their date.
+		# Without parsing the commit, its date field would
+		# be zero, causing it to satisfy date <= threshold
+		# for the "stable" group as well, and both groups
+		# would produce pseudo-merges.
 		git config bitmapPseudoMerge.stable.pattern "refs/tags/" &&
 		git config bitmapPseudoMerge.stable.maxMerges 64 &&
 		git config bitmapPseudoMerge.stable.stableThreshold never &&
@@ -637,7 +639,7 @@ test_expect_failure 'pseudo-merge commits are correctly classified by date' '
 	)
 '
 
-test_expect_success 'sampleRate=0 does not cause division by zero' '
+test_expect_failure 'sampleRate=0 does not cause division by zero' '
 	git init pseudo-merge-sample-rate-zero &&
 	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
 	(
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 7/8] pack-bitmap: reject pseudo-merge "sampleRate" of 0
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (5 preceding siblings ...)
  2026-04-13 23:56 ` [PATCH 6/8] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
@ 2026-04-13 23:56 ` Taylor Blau
  2026-04-19  0:26   ` Elijah Newren
  2026-04-13 23:57 ` [PATCH 8/8] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The "bitmapPseudoMerge.*.sampleRate" configuration controls what
fraction of unstable commits are included in each pseudo-merge group.
The config validation accepts values in the range `[0, 1]`, but a value
of exactly 0 causes a division by zero in `select_pseudo_merges_1()`:

    if (j % (uint32_t)(1.0 / group->sample_rate))

When `sample_rate` is 0, `1.0 / 0.0` produces `+inf`, and casting
infinity to `uint32_t` is undefined behavior in C. On most platforms
this yields 0, making the subsequent modulo operation (`j % 0`) a
fatal arithmetic trap.

This path was not previously reachable because an earlier bug caused
all pseudo-merge candidates to be classified as "stable" (where the
sampling rate is not used), regardless of their actual commit date. Now
that the date classification is fixed, the unstable path is exercised
and the division by zero can fire.

Fix this by changing the validation to require a strict lower bound and
thus reject 0.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index d79e5fb649a..75bed043602 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -169,8 +169,8 @@ static int pseudo_merge_config(const char *var, const char *value,
 		}
 	} else if (!strcmp(key, "samplerate")) {
 		group->sample_rate = git_config_double(var, value, ctx->kvi);
-		if (!(0 <= group->sample_rate && group->sample_rate <= 1)) {
-			warning(_("%s must be between 0 and 1, using default"), var);
+		if (!(0 < group->sample_rate && group->sample_rate <= 1)) {
+			warning(_("%s must be between 0 (exclusive) and 1, using default"), var);
 			group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE;
 		}
 	} else if (!strcmp(key, "threshold")) {
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 63d2f64361d..46e8e6a8ea1 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -639,7 +639,7 @@ test_expect_success 'pseudo-merge commits are correctly classified by date' '
 	)
 '
 
-test_expect_failure 'sampleRate=0 does not cause division by zero' '
+test_expect_success 'sampleRate=0 does not cause division by zero' '
 	git init pseudo-merge-sample-rate-zero &&
 	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
 	(
-- 
2.54.0.rc1.73.g8f4e0170952


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 8/8] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (6 preceding siblings ...)
  2026-04-13 23:56 ` [PATCH 7/8] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
@ 2026-04-13 23:57 ` Taylor Blau
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-13 23:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

When "bitmapPseudoMerge.*.pattern" appears more than once for the same
group, `pseudo_merge_config()` frees the old `regex_t *` pointer
but does not call `regfree()` on it first. This leaks whatever internal
state `regcomp()` allocated.

The final cleanup path in `pseudo_merge_group_release()` does call
`regfree()` before `free()`, so only the intermediate replacement is
affected.

Fix this by guarding the replacement with a NULL check and calling
`regfree()` before `free()` when the pointer is non-NULL.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  5 ++++-
 t/t5333-pseudo-merge-bitmaps.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index 75bed043602..22b8600d689 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -150,7 +150,10 @@ static int pseudo_merge_config(const char *var, const char *value,
 	if (!strcmp(key, "pattern")) {
 		struct strbuf re = STRBUF_INIT;
 
-		free(group->pattern);
+		if (group->pattern) {
+			regfree(group->pattern);
+			free(group->pattern);
+		}
 		if (*value != '^')
 			strbuf_addch(&re, '^');
 		strbuf_addstr(&re, value);
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 46e8e6a8ea1..34d432ce76d 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -662,4 +662,34 @@ test_expect_success 'sampleRate=0 does not cause division by zero' '
 	)
 '
 
+test_expect_success 'duplicate pseudo-merge pattern does not leak' '
+	git init pseudo-merge-dup-pattern &&
+	test_when_finished "rm -fr pseudo-merge-dup-pattern" &&
+
+	(
+		cd pseudo-merge-dup-pattern &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
+
+		# Set the same group'\''s pattern twice. The second
+		# assignment should cleanly release the compiled regex
+		# from the first without leaking.
+		git config bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config --add bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.threshold now &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		git rev-parse HEAD~63 |
+		test-tool bitmap write "$(basename $pack)" &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges
+	)
+'
+
 test_done
-- 
2.54.0.rc1.73.g8f4e0170952

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
@ 2026-04-14 19:48   ` Junio C Hamano
  2026-04-14 21:29     ` Taylor Blau
  2026-04-14 20:08   ` Junio C Hamano
  2026-04-19  0:24   ` Elijah Newren
  2 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2026-04-14 19:48 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Elijah Newren

Taylor Blau <me@ttaylorr.com> writes:

> In f16eb1c091 (pseudo-merge: fix disk reads from find_pseudo_merge(),
> 2026-03-31), we noted that `apply_pseudo_merges_for_commit()` is never
> triggered by the existing test suite, and that this bears further
> investigation.
>
> This patch is the first one to begin that investigation. The following
> patches will expose and fix a variety of bugs in the implementation of
> pseudo-merge bitmaps.
>
> In order to do so, however, many of these tests require very precise
> selection of which commits receive bitmaps and which do not. To date,
> there isn't a standard approach to easily facilitate this. Address this
> by introducing a `test-tool bitmap write` subcommand that writes a
> bitmap for a given packfile, reading the set of commits which should
> receive individual bitmaps from stdin like so:
>
>     test-tool bitmap write <pack-basename> </path/to/commits.list
>
> , where "<pack-basename>" is the filename for a specific packfile (e.g.,
> "pack-abc123.pack"), and "/path/to/commits.list" is a list of commit
> OIDs which will receive bitmaps.
>
> The helper respects `bitmapPseudoMerge.*` configuration for creating
> pseudo-merge bitmaps alongside the regular commit bitmaps.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  t/helper/test-bitmap.c  | 110 +++++++++++++++++++++++++++++++++++++++-
>  t/t5310-pack-bitmaps.sh |  24 +++++++++
>  2 files changed, 133 insertions(+), 1 deletion(-)

I haven't been paying attention at all to pseudo-merge stuff, so my
comment may be too trivial and/or misses the point, but please bear
with me.

> diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> index f693cb56691..9489e59fa55 100755
> --- a/t/t5310-pack-bitmaps.sh
> +++ b/t/t5310-pack-bitmaps.sh
> @@ -648,4 +648,28 @@ test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
>  	test_grep corrupted.bitmap.index stderr
>  '
>  
> +test_expect_success 'test-tool bitmap write' '

It is very unclear what aspect of "test-tool bitmap write" is being
tested to me.  Let me think aloud to see if I can convey my
puzzlement.

> +	git init bitmap-write-helper &&
> +	test_when_finished "rm -fr bitmap-write-helper" &&

A tangent but the above two lines may want to be swapped.  "rm -fr"
does not fail when bitmap-write-helper directory does not yet exist,
so "prepare to clear anytime it fails from now on and then create"
would be a safer order than "create, and prepare to clear anytime it
fails from now on".

> +	(
> +		cd bitmap-write-helper &&
> +
> +		test_commit_bulk 64 &&
> +		git repack -ad &&
> +
> +		pack="$(ls .git/objects/pack/pack-*.pack)" &&

So we bulk-created 64 commits, repacked into one, and then

> +		git rev-parse HEAD >commits &&

wrote the tip-commit in "commits".

> +		test-tool bitmap write "$(basename $pack)" <commits &&

And told the new tool to write a bitmap file, with only that HEAD
commit and nothing else to get bitmap.

> +		test-tool bitmap list-commits | sort >actual &&
> +		sort commits >expect &&
> +		test_cmp expect actual &&

And compare the list of bitmapped commits with the singleton HEAD
(it is puzzling to sort a single element list, though).

> +		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
> +		git rev-list --count --objects HEAD >expect &&
> +		test_cmp expect actual
> +	)
> +'
> +
>  test_done

If we look at the implementation of bitmap_write() below, the object
name for HEAD is fed to bitmap_write_push_commit() with pseudo bit
off.  After reading the list (which has only one element), we call
select_pseudo_merges().  What do we expect to happen in this call?
Since no bitmap_write_push_commit() call is made with pseudo bit on,
we will return immediately without doing anything?

After that bitmap_write_build() is called, and as this is expected
to add bitmaps to the commits fed to bitmap_write_push_commit(), we
are expecting to see bitmap given to HEAD (and nothing else)?

I said it is unclear what is being tested.  Putting it another way,
what could go wrong to cause this test to fail?  We give commit A,
B, and C to "bitmap write", and it somehow chooses other commits to
also give bitmap, which will be reported by "bitmap list-commits"
and we detect that as a failure?

Thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
  2026-04-14 19:48   ` Junio C Hamano
@ 2026-04-14 20:08   ` Junio C Hamano
  2026-04-14 21:40     ` Taylor Blau
  2026-04-19  0:24   ` Elijah Newren
  2 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2026-04-14 20:08 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Elijah Newren

Taylor Blau <me@ttaylorr.com> writes:

> +struct bitmap_writer_data {
> +	struct packing_data packed;
> +	struct pack_idx_entry **index;
> +	uint32_t nr;
> +};
> +
> +static int add_packed_object(const struct object_id *oid,
> +			     struct packed_git *pack,
> +			     uint32_t pos,
> +			     void *_data)
> +{
> +	struct bitmap_writer_data *data = _data;
> +	struct object_entry *entry;
> +	struct object_info oi = OBJECT_INFO_INIT;
> +	enum object_type type;
> +
> +	oi.typep = &type;
> +
> +	entry = packlist_alloc(&data->packed, oid);

data->packed is "packing data" that has a pointer "objects" that is
a flat array of "struct object_entry".  This array is dynamically
expanded with realloc() in packlist_alloc(), and it returns a
pointer into this data->packed->objects[] array.  We receive it in a
local variable "entry" here.

> +	entry->idx.offset = nth_packed_object_offset(pack, pos);
> +	if (packed_object_info(pack, entry->idx.offset, &oi) < 0)
> +		die("could not get type of object %s",
> +		    oid_to_hex(oid));
> +	oe_set_type(entry, type);
> +	oe_set_in_pack(&data->packed, entry, pack);

And populate the entry.

> +	data->index[data->nr++] = &entry->idx;

And then store the pointer to one of the members (actually the first
member) in that "struct object_entry" instance in that data->packed->objects[]
array we took from.

What happens when a repeated call to this function to add many
objects (those contained within the pack we are iterating over)
caused the packlist_alloc() to realloc data->packed->objects[] array
eventually?  Wouldn't it invalidate the address of &entry->idx we
are taking from before the realloc() happens?

I must be missing something?


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-14 19:48   ` Junio C Hamano
@ 2026-04-14 21:29     ` Taylor Blau
  2026-04-14 21:34       ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Taylor Blau @ 2026-04-14 21:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Elijah Newren

On Tue, Apr 14, 2026 at 12:48:49PM -0700, Junio C Hamano wrote:
> I haven't been paying attention at all to pseudo-merge stuff, so my
> comment may be too trivial and/or misses the point, but please bear
> with me.

It has been a while since I have thought about pseudo-merges since I
returned to the topic with this series, so bear with me just the same
;-).

> > +	git init bitmap-write-helper &&
> > +	test_when_finished "rm -fr bitmap-write-helper" &&
>
> A tangent but the above two lines may want to be swapped.  "rm -fr"
> does not fail when bitmap-write-helper directory does not yet exist,
> so "prepare to clear anytime it fails from now on and then create"
> would be a safer order than "create, and prepare to clear anytime it
> fails from now on".

I agree. I vaguely remember discussing this on the list with Ævar years
ago, but clearly we should register our cleanup handler before we do the
thing that needs to be cleaned up in case it fails part of the way
through.

> So we bulk-created 64 commits, repacked into one, and then
> [...]
> wrote the tip-commit in "commits".
> [...]
> And told the new tool to write a bitmap file, with only that HEAD
> commit and nothing else to get bitmap.
>
> > +		test-tool bitmap list-commits | sort >actual &&
> > +		sort commits >expect &&
> > +		test_cmp expect actual &&
>
> And compare the list of bitmapped commits with the singleton HEAD
> (it is puzzling to sort a single element list, though).

That's right.

> > +		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
> > +		git rev-list --count --objects HEAD >expect &&
> > +		test_cmp expect actual
> > +	)
> > +'
> > +
> >  test_done
>
> If we look at the implementation of bitmap_write() below, the object
> name for HEAD is fed to bitmap_write_push_commit() with pseudo bit
> off.  After reading the list (which has only one element), we call
> select_pseudo_merges().  What do we expect to happen in this call?
> Since no bitmap_write_push_commit() call is made with pseudo bit on,
> we will return immediately without doing anything?

In this particular instance, nothing should happen, and this test
isn't directly testing the pseudo-merge selection at all here. We don't
expect any pseudo-merges to be generated here, only for us to limit the
selection of which commits get bitmaps.

> After that bitmap_write_build() is called, and as this is expected
> to add bitmaps to the commits fed to bitmap_write_push_commit(), we
> are expecting to see bitmap given to HEAD (and nothing else)?

Yes.

> I said it is unclear what is being tested.  Putting it another way,
> what could go wrong to cause this test to fail?  We give commit A,
> B, and C to "bitmap write", and it somehow chooses other commits to
> also give bitmap, which will be reported by "bitmap list-commits"
> and we detect that as a failure?

That's right, and to the point of your original question, I think a
better name is warranted here, perhaps: "test-tool bitmap write limits
bitmap selection" or something.

To the question of what it's testing, it's testing only its basic
functionality of altering the selection of which commits receive
bitmaps. In that sense I view it as a smoke test of that basic
functionality more than anything else.

How about:

--- 8< ---
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 9489e59fa55..ea94735fd66 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -649,8 +649,8 @@ test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
 '

 test_expect_success 'test-tool bitmap write' '
-	git init bitmap-write-helper &&
 	test_when_finished "rm -fr bitmap-write-helper" &&
+	git init bitmap-write-helper &&
 	(
 		cd bitmap-write-helper &&

@@ -659,12 +659,12 @@ test_expect_success 'test-tool bitmap write' '

 		pack="$(ls .git/objects/pack/pack-*.pack)" &&

-		git rev-parse HEAD >commits &&
-		test-tool bitmap write "$(basename $pack)" <commits &&
+		git rev-parse HEAD >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&

-		test-tool bitmap list-commits | sort >actual &&
-		sort commits >expect &&
-		test_cmp expect actual &&
+		test-tool bitmap list-commits >bitmaps.raw &&
+		sort bitmaps.raw >bitmaps &&
+		test_cmp in bitmaps &&

 		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
 		git rev-list --count --objects HEAD >expect &&
--- >8 ---

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-14 21:29     ` Taylor Blau
@ 2026-04-14 21:34       ` Junio C Hamano
  2026-04-14 21:40         ` Taylor Blau
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2026-04-14 21:34 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Elijah Newren

Taylor Blau <me@ttaylorr.com> writes:

> That's right, and to the point of your original question, I think a
> better name is warranted here, perhaps: "test-tool bitmap write limits
> bitmap selection" or something.

Perhaps.  Or "limits" -> "forces"?  Neither verb exactly conveys
that the outcome must be exactly the same as the input specifies,
nothing added, nothing removed, so I dunno.

> To the question of what it's testing, it's testing only its basic
> functionality of altering the selection of which commits receive
> bitmaps. In that sense I view it as a smoke test of that basic
> functionality more than anything else.

OK.

Thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-14 20:08   ` Junio C Hamano
@ 2026-04-14 21:40     ` Taylor Blau
  0 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-14 21:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Elijah Newren

On Tue, Apr 14, 2026 at 01:08:39PM -0700, Junio C Hamano wrote:
> What happens when a repeated call to this function to add many
> objects (those contained within the pack we are iterating over)
> caused the packlist_alloc() to realloc data->packed->objects[] array
> eventually?  Wouldn't it invalidate the address of &entry->idx we
> are taking from before the realloc() happens?
>
> I must be missing something?

Good catch, I'm the one that is missing something here, not you. This is
definitely a use-after-realloc(), though in practice it won't bite us
because we are likely extending into an over-sized heap allocation
without actually moving the data.

I don't know why I thought we allocated the packlist with a fixed size
equal to p->num_objects ahead of time, but we don't, and this is clearly
a bug.

Will fix, and thanks again for spotting.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-14 21:34       ` Junio C Hamano
@ 2026-04-14 21:40         ` Taylor Blau
  0 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-14 21:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Elijah Newren

On Tue, Apr 14, 2026 at 02:34:17PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > That's right, and to the point of your original question, I think a
> > better name is warranted here, perhaps: "test-tool bitmap write limits
> > bitmap selection" or something.
>
> Perhaps.  Or "limits" -> "forces"?  Neither verb exactly conveys
> that the outcome must be exactly the same as the input specifies,
> nothing added, nothing removed, so I dunno.

How about "determines" or "controls"?

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
  2026-04-14 19:48   ` Junio C Hamano
  2026-04-14 20:08   ` Junio C Hamano
@ 2026-04-19  0:24   ` Elijah Newren
  2026-04-21 18:51     ` Taylor Blau
  2 siblings, 1 reply; 46+ messages in thread
From: Elijah Newren @ 2026-04-19  0:24 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King

On Mon, Apr 13, 2026 at 4:56 PM Taylor Blau <me@ttaylorr.com> wrote:
[...]
> +               bitmap_writer_push_commit(&writer, c, false);

$ git grep -h -A 1 bitmap_writer_push_commit -- '*.h'
void bitmap_writer_push_commit(struct bitmap_writer *writer,
                               struct commit *commit, unsigned pseudo_merge);

Not a big deal, but for consistency, would it make more sense to pass
0 for the third argument, or to change the function signature change
to accept bool instead of unsigned?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/8] t5333: demonstrate various pseudo-merge bugs
  2026-04-13 23:56 ` [PATCH 2/8] t5333: demonstrate various pseudo-merge bugs Taylor Blau
@ 2026-04-19  0:25   ` Elijah Newren
  0 siblings, 0 replies; 46+ messages in thread
From: Elijah Newren @ 2026-04-19  0:25 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King

On Mon, Apr 13, 2026 at 4:56 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> Using the test helper introduced via the previous commit, add various
> failing tests demonstrating bugs in the pseudo-merge implementation.
>
> These are all marked as failing with one exception. The "sampleRate=0"
> test describes a latent bug, which is only reachable through a code path
> that is itself masked by a separate bug. A future commit will fix that
> bug, and, in turn, cause the aforementioned test to fail. Accordingly,
> that commit will mark the test as failing, and it will be re-marked as
> passing in a separate commit which fixes the once-latent bug.
>
> For the rest: the following commits will explain and fix the underlying
> bugs in detail.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  t/t5333-pseudo-merge-bitmaps.sh | 198 ++++++++++++++++++++++++++++++++
>  1 file changed, 198 insertions(+)
>
> diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
> index 1f7a5d82ee4..20e77ab4390 100755
> --- a/t/t5333-pseudo-merge-bitmaps.sh
> +++ b/t/t5333-pseudo-merge-bitmaps.sh
> @@ -462,4 +462,202 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
>         )
>  '
>
> +test_expect_failure 'apply pseudo-merges during fill-in traversal' '
> +       git init pseudo-merge-fill-in-traversal &&
> +       test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&

As suggested in the first patch, test_when_finished before the git
init.  (Same issue occurs later in this file as well.)

[...]
> +               : >trace2.txt &&

The `: >trace2.txt` struck me as odd, since this file doesn't even
exist yet in this test...but thinking more, is this just defensive in
case someone adds inserts or modifies a previous test which writes to
trace2.txt?  I like that idea; somehow hadn't seen it before.

> +               GIT_TRACE2_EVENT=$PWD/trace2.txt \
> +                       git rev-list --count --objects --use-bitmap-index HEAD >actual &&

I thought this was broken without quoting $PWD, but looks like I
forgot shell quoting rules again.  Assignments don't undergo word
splitting, globbing, or brace expansion.  So, nothing to see here
either.

> +               test_pseudo_merges_satisfied 1 <trace2.txt &&
> +
> +               test_cmp expect actual
> +       )
> +'

Didn't spot anything different to comment on for the rest of the
patch, so the only substantive comment I had was on the
test_when_finished and init ordering.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 7/8] pack-bitmap: reject pseudo-merge "sampleRate" of 0
  2026-04-13 23:56 ` [PATCH 7/8] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
@ 2026-04-19  0:26   ` Elijah Newren
  0 siblings, 0 replies; 46+ messages in thread
From: Elijah Newren @ 2026-04-19  0:26 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King

On Mon, Apr 13, 2026 at 4:56 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> The "bitmapPseudoMerge.*.sampleRate" configuration controls what
> fraction of unstable commits are included in each pseudo-merge group.
> The config validation accepts values in the range `[0, 1]`, but a value
> of exactly 0 causes a division by zero in `select_pseudo_merges_1()`:
>
>     if (j % (uint32_t)(1.0 / group->sample_rate))
>
> When `sample_rate` is 0, `1.0 / 0.0` produces `+inf`, and casting
> infinity to `uint32_t` is undefined behavior in C. On most platforms
> this yields 0, making the subsequent modulo operation (`j % 0`) a
> fatal arithmetic trap.
>
> This path was not previously reachable because an earlier bug caused
> all pseudo-merge candidates to be classified as "stable" (where the
> sampling rate is not used), regardless of their actual commit date. Now
> that the date classification is fixed, the unstable path is exercised
> and the division by zero can fire.
>
> Fix this by changing the validation to require a strict lower bound and
> thus reject 0.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  pseudo-merge.c                  | 4 ++--
>  t/t5333-pseudo-merge-bitmaps.sh | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index d79e5fb649a..75bed043602 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -169,8 +169,8 @@ static int pseudo_merge_config(const char *var, const char *value,
>                 }
>         } else if (!strcmp(key, "samplerate")) {
>                 group->sample_rate = git_config_double(var, value, ctx->kvi);
> -               if (!(0 <= group->sample_rate && group->sample_rate <= 1)) {
> -                       warning(_("%s must be between 0 and 1, using default"), var);
> +               if (!(0 < group->sample_rate && group->sample_rate <= 1)) {
> +                       warning(_("%s must be between 0 (exclusive) and 1, using default"), var);

The documentation for `bitmapPseudoMerge.<name>.sampleRate` in
Documentation/config/bitmap-pseudo-merge.adoc still claims that 0 is
allowed; should that be fixed as part of this patch?

Also:

```
$ git grep -B 4 -i samplerate.*=
Documentation/gitpacking.adoc-[bitmapPseudoMerge "all"]
Documentation/gitpacking.adoc-  pattern = "refs/"
Documentation/gitpacking.adoc-  threshold = now
Documentation/gitpacking.adoc-  stableThreshold = never
Documentation/gitpacking.adoc:  sampleRate = 100
--
Documentation/gitpacking.adoc-[bitmapPseudoMerge "all"]
Documentation/gitpacking.adoc-  pattern = "refs/virtual/([0-9]+)/(heads|tags)/"
Documentation/gitpacking.adoc-  threshold = now
Documentation/gitpacking.adoc-  stableThreshold = never
Documentation/gitpacking.adoc:  sampleRate = 100
```

Should those sampleRates be fixed?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-19  0:24   ` Elijah Newren
@ 2026-04-21 18:51     ` Taylor Blau
  0 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 18:51 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Junio C Hamano, Jeff King

On Sat, Apr 18, 2026 at 05:24:04PM -0700, Elijah Newren wrote:
> On Mon, Apr 13, 2026 at 4:56 PM Taylor Blau <me@ttaylorr.com> wrote:
> [...]
> > +               bitmap_writer_push_commit(&writer, c, false);
>
> $ git grep -h -A 1 bitmap_writer_push_commit -- '*.h'
> void bitmap_writer_push_commit(struct bitmap_writer *writer,
>                                struct commit *commit, unsigned pseudo_merge);
>
> Not a big deal, but for consistency, would it make more sense to pass
> 0 for the third argument, or to change the function signature change
> to accept bool instead of unsigned?

Let's change it to pass "0" for now. I think that it's fine to clean
this up in the future, but I do not want to make a habit of changing
many "int/unsigned -> bool" signatures each time we wish to call such a
function.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (7 preceding siblings ...)
  2026-04-13 23:57 ` [PATCH 8/8] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
@ 2026-04-21 20:01 ` Taylor Blau
  2026-04-21 20:01   ` [PATCH v2 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
                     ` (9 more replies)
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
  9 siblings, 10 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

[Note to the maintainer: this series has been rebased onto the current
tip of master, which is 94f057755b7 (Git 2.54, 2026-04-19) at the time
of writing.]

This is a small reroll of my series to fix several bugs in the
pseudo-merge bitmap implementation. The main changes since last time
are:

 - Fixed a use-after-realloc bug in the test helper introduced in the
   first commit.

 - Swapped the order of initializing and cleaning up repositories in the
   new test scripts.

 - Updated bitmapPseudoMerge.<name>.sampleRate's documentation to
   describe the range as (0,1], and added a new commit fixing a broken
   example in gitpacking(7).

As usual, a range-diff is included below for convenience. The original
cover letter is as follows:

========================================================================

This series fixes several bugs in the pseudo-merge bitmap implementation
that caused the pseudo-merge application path to be effectively broken
during fill-in traversal.

Peff noticed that this code path was never triggered by the existing
test suite, and investigating that observation uncovered a handful of
bugs, some compounding.

The first two patches introduce test infrastructure: a 'bitmap write'
test helper that gives tests precise control over which commits receive
individual bitmaps, and a set of "test_expect_failure" tests
demonstrating each bug.

The next four patches fix the bugs in the per-commit pseudo-merge
lookup:

  - The pseudo-merge commit lookup table was sorted by OID rather than
    by bit position, causing the reader's binary search to fail.

  - The binary search in pseudo_merge_at() had its lo/hi updates
    swapped.

  - The extended pseudo-merge lookup path had three compounding bugs: a
    wrong entry-size calculation in the writer, a misinterpretation of
    extended table entries in the reader, and a silently-swallowed error
    check.

The final two patches fix issues in pseudo-merge group selection:

  - find_pseudo_merge_group_for_ref() did not parse commits before
    inspecting their dates, so all candidates had date == 0 and were
    unconditionally placed in the "stable" bucket.

  - The config validation for bitmapPseudoMerge.*.sampleRate accepted 0,
    which leads to a division by zero once the date classification is
    fixed and the unstable code path is exercised.

There is also a small fix for a regex leak when the pattern key is
overridden in config.

Thanks in advance for your review!

Taylor Blau (9):
  t/helper: add 'test-tool bitmap write' subcommand
  t5333: demonstrate various pseudo-merge bugs
  pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
  pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
  pack-bitmap: fix pseudo-merge lookup for shared commits
  pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
  pack-bitmap: reject pseudo-merge "sampleRate" of 0
  Documentation: fix broken `sampleRate` in gitpacking(7)
  pack-bitmap: prevent pattern leak on pseudo-merge re-assignment

 Documentation/config/bitmap-pseudo-merge.adoc |   4 +-
 Documentation/gitpacking.adoc                 |   4 +-
 pack-bitmap-write.c                           |  23 +-
 pseudo-merge.c                                |  19 +-
 t/helper/test-bitmap.c                        | 113 ++++++++-
 t/t5310-pack-bitmaps.sh                       |  24 ++
 t/t5333-pseudo-merge-bitmaps.sh               | 231 ++++++++++++++++++
 7 files changed, 404 insertions(+), 14 deletions(-)

Range-diff against v1:
 1:  d5ef6b959fd !  1:  c0df35f8ebd t/helper: add 'test-tool bitmap write' subcommand
    @@ t/helper/test-bitmap.c: static int bitmap_dump_pseudo_merge_objects(uint32_t n)
      	return test_bitmap_pseudo_merge_objects(the_repository, n);
      }
      
    -+struct bitmap_writer_data {
    -+	struct packing_data packed;
    -+	struct pack_idx_entry **index;
    -+	uint32_t nr;
    -+};
    -+
     +static int add_packed_object(const struct object_id *oid,
     +			     struct packed_git *pack,
     +			     uint32_t pos,
     +			     void *_data)
     +{
    -+	struct bitmap_writer_data *data = _data;
    ++	struct packing_data *packed = _data;
     +	struct object_entry *entry;
     +	struct object_info oi = OBJECT_INFO_INIT;
     +	enum object_type type;
     +
     +	oi.typep = &type;
     +
    -+	entry = packlist_alloc(&data->packed, oid);
    ++	entry = packlist_alloc(packed, oid);
     +	entry->idx.offset = nth_packed_object_offset(pack, pos);
     +	if (packed_object_info(pack, entry->idx.offset, &oi) < 0)
     +		die("could not get type of object %s",
     +		    oid_to_hex(oid));
     +	oe_set_type(entry, type);
    -+	oe_set_in_pack(&data->packed, entry, pack);
    -+	data->index[data->nr++] = &entry->idx;
    ++	oe_set_in_pack(packed, entry, pack);
     +
     +	return 0;
     +}
    @@ t/helper/test-bitmap.c: static int bitmap_dump_pseudo_merge_objects(uint32_t n)
     +static int bitmap_write(const char *basename)
     +{
     +	struct packed_git *p = NULL;
    -+	struct bitmap_writer_data data = { 0 };
    ++	struct packing_data packed = { 0 };
     +	struct bitmap_writer writer;
    ++	struct pack_idx_entry **index;
     +	struct strbuf buf = STRBUF_INIT;
    ++	uint32_t i;
     +
     +	prepare_repo_settings(the_repository);
     +	repo_for_each_pack(the_repository, p) {
    @@ t/helper/test-bitmap.c: static int bitmap_dump_pseudo_merge_objects(uint32_t n)
     +	if (open_pack_index(p))
     +		die("cannot open pack index for '%s'", p->pack_name);
     +
    -+	prepare_packing_data(the_repository, &data.packed);
    -+	ALLOC_ARRAY(data.index, p->num_objects);
    ++	prepare_packing_data(the_repository, &packed);
     +
    -+	for_each_object_in_pack(p, add_packed_object, &data,
    ++	for_each_object_in_pack(p, add_packed_object, &packed,
     +				ODB_FOR_EACH_OBJECT_PACK_ORDER);
     +
    -+	bitmap_writer_init(&writer, the_repository, &data.packed, NULL);
    -+	bitmap_writer_build_type_index(&writer, data.index);
    ++	/*
    ++	 * Build the index array now that data.packed.objects[] is
    ++	 * fully allocated (packlist_alloc() may have reallocated it
    ++	 * during the loop above).
    ++	 */
    ++	ALLOC_ARRAY(index, p->num_objects);
    ++	for (i = 0; i < p->num_objects; i++)
    ++		index[i] = &packed.objects[i].idx;
    ++
    ++	bitmap_writer_init(&writer, the_repository, &packed, NULL);
    ++	bitmap_writer_build_type_index(&writer, index);
     +
     +	while (strbuf_getline_lf(&buf, stdin) != EOF) {
     +		struct object_id oid;
    @@ t/helper/test-bitmap.c: static int bitmap_dump_pseudo_merge_objects(uint32_t n)
     +		if (!c || repo_parse_commit(the_repository, c))
     +			die("could not parse commit %s", buf.buf);
     +
    -+		bitmap_writer_push_commit(&writer, c, false);
    ++		bitmap_writer_push_commit(&writer, c, 0);
     +	}
     +
     +	select_pseudo_merges(&writer);
    @@ t/helper/test-bitmap.c: static int bitmap_dump_pseudo_merge_objects(uint32_t n)
     +
     +	bitmap_writer_set_checksum(&writer, p->hash);
     +
    -+	QSORT(data.index, p->num_objects, idx_oid_cmp);
    ++	QSORT(index, p->num_objects, idx_oid_cmp);
     +
     +	strbuf_reset(&buf);
     +	strbuf_addstr(&buf, p->pack_name);
     +	strbuf_strip_suffix(&buf, ".pack");
     +	strbuf_addstr(&buf, ".bitmap");
    -+	bitmap_writer_finish(&writer, data.index, buf.buf, 0);
    ++	bitmap_writer_finish(&writer, index, buf.buf, 0);
     +
     +	bitmap_writer_free(&writer);
     +	strbuf_release(&buf);
    -+	free(data.index);
    -+	clear_packing_data(&data.packed);
    ++	free(index);
    ++	clear_packing_data(&packed);
     +
     +	return 0;
     +}
    @@ t/t5310-pack-bitmaps.sh: test_expect_success 'truncated bitmap fails gracefully
      	test_grep corrupted.bitmap.index stderr
      '
      
    -+test_expect_success 'test-tool bitmap write' '
    -+	git init bitmap-write-helper &&
    ++test_expect_success 'test-tool bitmap write determines bitmap selection' '
     +	test_when_finished "rm -fr bitmap-write-helper" &&
    ++	git init bitmap-write-helper &&
     +	(
     +		cd bitmap-write-helper &&
     +
    @@ t/t5310-pack-bitmaps.sh: test_expect_success 'truncated bitmap fails gracefully
     +
     +		pack="$(ls .git/objects/pack/pack-*.pack)" &&
     +
    -+		git rev-parse HEAD >commits &&
    -+		test-tool bitmap write "$(basename $pack)" <commits &&
    ++		git rev-parse HEAD >in &&
    ++		test-tool bitmap write "$(basename $pack)" <in &&
     +
    -+		test-tool bitmap list-commits | sort >actual &&
    -+		sort commits >expect &&
    -+		test_cmp expect actual &&
    ++		test-tool bitmap list-commits >bitmaps.raw &&
    ++		sort bitmaps.raw >bitmaps &&
    ++		test_cmp in bitmaps &&
     +
     +		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
     +		git rev-list --count --objects HEAD >expect &&
 2:  f4899b668e2 !  2:  11de3343726 t5333: demonstrate various pseudo-merge bugs
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'use pseudo-merge in bounda
      '
      
     +test_expect_failure 'apply pseudo-merges during fill-in traversal' '
    -+	git init pseudo-merge-fill-in-traversal &&
     +	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
    ++	git init pseudo-merge-fill-in-traversal &&
     +	(
     +		cd pseudo-merge-fill-in-traversal &&
     +
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'use pseudo-merge in bounda
     +'
     +
     +test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
    -+	git init pseudo-merge-fill-in-multi &&
     +	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
    ++	git init pseudo-merge-fill-in-multi &&
     +	(
     +		cd pseudo-merge-fill-in-multi &&
     +
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'use pseudo-merge in bounda
     +'
     +
     +test_expect_failure 'pseudo-merge commits are correctly classified by date' '
    -+	git init pseudo-merge-date-classification &&
     +	test_when_finished "rm -fr pseudo-merge-date-classification" &&
    ++	git init pseudo-merge-date-classification &&
     +	(
     +		cd pseudo-merge-date-classification &&
     +
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'use pseudo-merge in bounda
     +'
     +
     +test_expect_success 'sampleRate=0 does not cause division by zero' '
    -+	git init pseudo-merge-sample-rate-zero &&
     +	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
    ++	git init pseudo-merge-sample-rate-zero &&
     +	(
     +		cd pseudo-merge-sample-rate-zero &&
     +
 3:  1f5835e8c62 !  3:  8d908ab415e pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'use pseudo-merge in bounda
      
     -test_expect_failure 'apply pseudo-merges during fill-in traversal' '
     +test_expect_success 'apply pseudo-merges during fill-in traversal' '
    - 	git init pseudo-merge-fill-in-traversal &&
      	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
    + 	git init pseudo-merge-fill-in-traversal &&
      	(
 4:  af9f651269d !  4:  07f70a07c20 pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'apply pseudo-merges during
      
     -test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
     +test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
    - 	git init pseudo-merge-fill-in-multi &&
      	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
    + 	git init pseudo-merge-fill-in-multi &&
    ++	git init pseudo-merge-fill-in-multi &&
      	(
    + 		cd pseudo-merge-fill-in-multi &&
    + 
 5:  01f1d6f08c6 =  5:  3ed0b39843f pack-bitmap: fix pseudo-merge lookup for shared commits
 6:  6d74c0a177a !  6:  95f847211f3 pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'apply pseudo-merges with o
      
     -test_expect_failure 'pseudo-merge commits are correctly classified by date' '
     +test_expect_success 'pseudo-merge commits are correctly classified by date' '
    - 	git init pseudo-merge-date-classification &&
      	test_when_finished "rm -fr pseudo-merge-date-classification" &&
    + 	git init pseudo-merge-date-classification &&
      	(
      		cd pseudo-merge-date-classification &&
      
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_failure 'pseudo-merge commits are c
      
     -test_expect_success 'sampleRate=0 does not cause division by zero' '
     +test_expect_failure 'sampleRate=0 does not cause division by zero' '
    - 	git init pseudo-merge-sample-rate-zero &&
      	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
    + 	git init pseudo-merge-sample-rate-zero &&
      	(
 7:  1b0f7295c21 !  7:  f8a01cfb893 pack-bitmap: reject pseudo-merge "sampleRate" of 0
    @@ Commit message
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
    + ## Documentation/config/bitmap-pseudo-merge.adoc ##
    +@@ Documentation/config/bitmap-pseudo-merge.adoc: will be updated more often than a reference pointing at an old commit.
    + bitmapPseudoMerge.<name>.sampleRate::
    + 	Determines the proportion of non-bitmapped commits (among
    + 	reference tips) which are selected for inclusion in an
    +-	unstable pseudo-merge bitmap. Must be between `0` and `1`
    +-	(inclusive). The default is `1`.
    ++	unstable pseudo-merge bitmap. Must be greater than `0` and
    ++	less than or equal to `1`. The default is `1`.
    + 
    + bitmapPseudoMerge.<name>.threshold::
    + 	Determines the minimum age of non-bitmapped commits (among
    +
      ## pseudo-merge.c ##
     @@ pseudo-merge.c: static int pseudo_merge_config(const char *var, const char *value,
      		}
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'pseudo-merge commits are c
      
     -test_expect_failure 'sampleRate=0 does not cause division by zero' '
     +test_expect_success 'sampleRate=0 does not cause division by zero' '
    - 	git init pseudo-merge-sample-rate-zero &&
      	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
    + 	git init pseudo-merge-sample-rate-zero &&
      	(
 -:  ----------- >  8:  c37156502c0 Documentation: fix broken `sampleRate` in gitpacking(7)
 8:  8f4e0170952 =  9:  b905fd5d0ae pack-bitmap: prevent pattern leak on pseudo-merge re-assignment

base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
-- 
2.54.0.9.gb905fd5d0ae

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 1/9] t/helper: add 'test-tool bitmap write' subcommand
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
@ 2026-04-21 20:01   ` Taylor Blau
  2026-04-21 20:01   ` [PATCH v2 2/9] t5333: demonstrate various pseudo-merge bugs Taylor Blau
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

In f16eb1c091 (pseudo-merge: fix disk reads from find_pseudo_merge(),
2026-03-31), we noted that `apply_pseudo_merges_for_commit()` is never
triggered by the existing test suite, and that this bears further
investigation.

This patch is the first one to begin that investigation. The following
patches will expose and fix a variety of bugs in the implementation of
pseudo-merge bitmaps.

In order to do so, however, many of these tests require very precise
selection of which commits receive bitmaps and which do not. To date,
there isn't a standard approach to easily facilitate this. Address this
by introducing a `test-tool bitmap write` subcommand that writes a
bitmap for a given packfile, reading the set of commits which should
receive individual bitmaps from stdin like so:

    test-tool bitmap write <pack-basename> </path/to/commits.list

, where "<pack-basename>" is the filename for a specific packfile (e.g.,
"pack-abc123.pack"), and "/path/to/commits.list" is a list of commit
OIDs which will receive bitmaps.

The helper respects `bitmapPseudoMerge.*` configuration for creating
pseudo-merge bitmaps alongside the regular commit bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/helper/test-bitmap.c  | 113 +++++++++++++++++++++++++++++++++++++++-
 t/t5310-pack-bitmaps.sh |  24 +++++++++
 2 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-bitmap.c b/t/helper/test-bitmap.c
index 16a01669e41..381e9b58b2c 100644
--- a/t/helper/test-bitmap.c
+++ b/t/helper/test-bitmap.c
@@ -2,7 +2,10 @@
 
 #include "test-tool.h"
 #include "git-compat-util.h"
+#include "hex.h"
+#include "odb.h"
 #include "pack-bitmap.h"
+#include "pseudo-merge.h"
 #include "setup.h"
 
 static int bitmap_list_commits(void)
@@ -35,6 +38,111 @@ static int bitmap_dump_pseudo_merge_objects(uint32_t n)
 	return test_bitmap_pseudo_merge_objects(the_repository, n);
 }
 
+static int add_packed_object(const struct object_id *oid,
+			     struct packed_git *pack,
+			     uint32_t pos,
+			     void *_data)
+{
+	struct packing_data *packed = _data;
+	struct object_entry *entry;
+	struct object_info oi = OBJECT_INFO_INIT;
+	enum object_type type;
+
+	oi.typep = &type;
+
+	entry = packlist_alloc(packed, oid);
+	entry->idx.offset = nth_packed_object_offset(pack, pos);
+	if (packed_object_info(pack, entry->idx.offset, &oi) < 0)
+		die("could not get type of object %s",
+		    oid_to_hex(oid));
+	oe_set_type(entry, type);
+	oe_set_in_pack(packed, entry, pack);
+
+	return 0;
+}
+
+static int idx_oid_cmp(const void *va, const void *vb)
+{
+	const struct pack_idx_entry *a = *(const struct pack_idx_entry **)va;
+	const struct pack_idx_entry *b = *(const struct pack_idx_entry **)vb;
+
+	return oidcmp(&a->oid, &b->oid);
+}
+
+static int bitmap_write(const char *basename)
+{
+	struct packed_git *p = NULL;
+	struct packing_data packed = { 0 };
+	struct bitmap_writer writer;
+	struct pack_idx_entry **index;
+	struct strbuf buf = STRBUF_INIT;
+	uint32_t i;
+
+	prepare_repo_settings(the_repository);
+	repo_for_each_pack(the_repository, p) {
+		if (!strcmp(pack_basename(p), basename))
+			break;
+	}
+
+	if (!p)
+		die("could not find pack '%s'", basename);
+
+	if (open_pack_index(p))
+		die("cannot open pack index for '%s'", p->pack_name);
+
+	prepare_packing_data(the_repository, &packed);
+
+	for_each_object_in_pack(p, add_packed_object, &packed,
+				ODB_FOR_EACH_OBJECT_PACK_ORDER);
+
+	/*
+	 * Build the index array now that data.packed.objects[] is
+	 * fully allocated (packlist_alloc() may have reallocated it
+	 * during the loop above).
+	 */
+	ALLOC_ARRAY(index, p->num_objects);
+	for (i = 0; i < p->num_objects; i++)
+		index[i] = &packed.objects[i].idx;
+
+	bitmap_writer_init(&writer, the_repository, &packed, NULL);
+	bitmap_writer_build_type_index(&writer, index);
+
+	while (strbuf_getline_lf(&buf, stdin) != EOF) {
+		struct object_id oid;
+		struct commit *c;
+
+		if (get_oid_hex(buf.buf, &oid))
+			die("invalid OID: %s", buf.buf);
+
+		c = lookup_commit(the_repository, &oid);
+		if (!c || repo_parse_commit(the_repository, c))
+			die("could not parse commit %s", buf.buf);
+
+		bitmap_writer_push_commit(&writer, c, 0);
+	}
+
+	select_pseudo_merges(&writer);
+	if (bitmap_writer_build(&writer) < 0)
+		die("failed to build bitmaps");
+
+	bitmap_writer_set_checksum(&writer, p->hash);
+
+	QSORT(index, p->num_objects, idx_oid_cmp);
+
+	strbuf_reset(&buf);
+	strbuf_addstr(&buf, p->pack_name);
+	strbuf_strip_suffix(&buf, ".pack");
+	strbuf_addstr(&buf, ".bitmap");
+	bitmap_writer_finish(&writer, index, buf.buf, 0);
+
+	bitmap_writer_free(&writer);
+	strbuf_release(&buf);
+	free(index);
+	clear_packing_data(&packed);
+
+	return 0;
+}
+
 int cmd__bitmap(int argc, const char **argv)
 {
 	setup_git_directory();
@@ -51,13 +159,16 @@ int cmd__bitmap(int argc, const char **argv)
 		return bitmap_dump_pseudo_merge_commits(atoi(argv[2]));
 	if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-objects"))
 		return bitmap_dump_pseudo_merge_objects(atoi(argv[2]));
+	if (argc == 3 && !strcmp(argv[1], "write"))
+		return bitmap_write(argv[2]);
 
 	usage("\ttest-tool bitmap list-commits\n"
 	      "\ttest-tool bitmap list-commits-with-offset\n"
 	      "\ttest-tool bitmap dump-hashes\n"
 	      "\ttest-tool bitmap dump-pseudo-merges\n"
 	      "\ttest-tool bitmap dump-pseudo-merge-commits <n>\n"
-	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>");
+	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>\n"
+	      "\ttest-tool bitmap write <pack-basename> < <commit-list>");
 
 	return -1;
 }
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..efeb71593bf 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -648,4 +648,28 @@ test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
 	test_grep corrupted.bitmap.index stderr
 '
 
+test_expect_success 'test-tool bitmap write determines bitmap selection' '
+	test_when_finished "rm -fr bitmap-write-helper" &&
+	git init bitmap-write-helper &&
+	(
+		cd bitmap-write-helper &&
+
+		test_commit_bulk 64 &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		git rev-parse HEAD >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test-tool bitmap list-commits >bitmaps.raw &&
+		sort bitmaps.raw >bitmaps &&
+		test_cmp in bitmaps &&
+
+		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		git rev-list --count --objects HEAD >expect &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 2/9] t5333: demonstrate various pseudo-merge bugs
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
  2026-04-21 20:01   ` [PATCH v2 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
@ 2026-04-21 20:01   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

Using the test helper introduced via the previous commit, add various
failing tests demonstrating bugs in the pseudo-merge implementation.

These are all marked as failing with one exception. The "sampleRate=0"
test describes a latent bug, which is only reachable through a code path
that is itself masked by a separate bug. A future commit will fix that
bug, and, in turn, cause the aforementioned test to fail. Accordingly,
that commit will mark the test as failing, and it will be re-marked as
passing in a separate commit which fixes the once-latent bug.

For the rest: the following commits will explain and fix the underlying
bugs in detail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5333-pseudo-merge-bitmaps.sh | 198 ++++++++++++++++++++++++++++++++
 1 file changed, 198 insertions(+)

diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 1f7a5d82ee4..0e9638c31c3 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -462,4 +462,202 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
 	)
 '
 
+test_expect_failure 'apply pseudo-merges during fill-in traversal' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
+	git init pseudo-merge-fill-in-traversal &&
+	(
+		cd pseudo-merge-fill-in-traversal &&
+
+		git config bitmapPseudoMerge.test.pattern refs/tags/ &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 1 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
+	git init pseudo-merge-fill-in-multi &&
+	(
+		cd pseudo-merge-fill-in-multi &&
+
+		test_commit base &&
+		base=$(git rev-parse HEAD) &&
+
+		for side in left right
+		do
+			git checkout -B $side base &&
+
+			test_commit_bulk --id=$side 64 &&
+			git rev-list --no-object-names HEAD --not $base >in &&
+			while read oid
+			do
+				echo "create refs/group-$side/$oid $oid" || return 1
+			done <in | git update-ref --stdin || return 1
+		done &&
+
+		git checkout left &&
+		git merge right &&
+		git repack -ad &&
+
+		git config bitmapPseudoMerge.left.pattern "refs/group-left/" &&
+		git config bitmapPseudoMerge.left.maxMerges 1 &&
+		git config bitmapPseudoMerge.left.stableThreshold never &&
+
+		git config bitmapPseudoMerge.right.pattern "refs/group-right/" &&
+		git config bitmapPseudoMerge.right.maxMerges 1 &&
+		git config bitmapPseudoMerge.right.stableThreshold never &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+		git rev-parse "$base" >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 2 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'apply pseudo-merges with overlapping groups during fill-in' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-overlap" &&
+	git init pseudo-merge-fill-in-overlap &&
+	(
+		cd pseudo-merge-fill-in-overlap &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		# Use two pseudo-merge group patterns that both match
+		# refs/tags/, so every tagged commit belongs to both
+		# groups. This exercises the extended lookup table
+		# path in apply_pseudo_merges_for_commit().
+		git config bitmapPseudoMerge.all.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.all.maxMerges 1 &&
+		git config bitmapPseudoMerge.all.stableThreshold never &&
+
+		git config bitmapPseudoMerge.tags.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.tags.maxMerges 1 &&
+		git config bitmapPseudoMerge.tags.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 2 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'pseudo-merge commits are correctly classified by date' '
+	test_when_finished "rm -fr pseudo-merge-date-classification" &&
+	git init pseudo-merge-date-classification &&
+	(
+		cd pseudo-merge-date-classification &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		# Configure two pseudo-merge groups: one that only
+		# matches "stable" refs (older than one month), and one
+		# that matches all refs. With 64 freshly-created tags
+		# (all younger than one month) the stable group should
+		# have zero pseudo-merges and the catch-all group should
+		# have one.
+		#
+		# Use GIT_TEST_DATE_NOW to align "now" (and therefore
+		# "1.month.ago") with the test_tick timestamps so that
+		# the commits are within the last month.
+		#
+		# This exercises the date-based classification in
+		# find_pseudo_merge_group_for_ref(), which requires
+		# that commits are parsed before inspecting their date.
+		git config bitmapPseudoMerge.stable.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.stable.maxMerges 64 &&
+		git config bitmapPseudoMerge.stable.stableThreshold never &&
+		git config bitmapPseudoMerge.stable.threshold 1.month.ago &&
+
+		git config bitmapPseudoMerge.all.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.all.maxMerges 1 &&
+		git config bitmapPseudoMerge.all.stableThreshold never &&
+		git config bitmapPseudoMerge.all.threshold now &&
+
+		git rev-parse HEAD~63 >in &&
+		GIT_TEST_DATE_NOW=$test_tick \
+			test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges
+	)
+'
+
+test_expect_success 'sampleRate=0 does not cause division by zero' '
+	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
+	git init pseudo-merge-sample-rate-zero &&
+	(
+		cd pseudo-merge-sample-rate-zero &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		git config bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.sampleRate 0 &&
+		git config bitmapPseudoMerge.test.threshold now &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in
+	)
+'
+
 test_done
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
  2026-04-21 20:01   ` [PATCH v2 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
  2026-04-21 20:01   ` [PATCH v2 2/9] t5333: demonstrate various pseudo-merge bugs Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The pseudo-merge commit lookup table stores each commit's position in
the pack- or pseudo-pack order, and is used to perform a binary search
in order to determine which pseudo-merge(s) a given commit belongs to.

However, the table was previously sorted in lexical order (via
`oid_array_sort()`), causing the binary search to fail.

While this causes pseudo-merge bitmaps to be de-facto broken for fill-in
traversal, there are a couple of important points to keep in mind:

 * Pseudo-merge application during the initial phases of a bitmap-based
   traversal are applied via `cascade_pseudo_merges_1()`. This function
   enumerates the known pseudo-merges and determines if its parents are
   a subset of the traversal roots.

   This is a different path than the fill-in traversal, where we are
   looking for any pseudo-merges which may be satisfied after visiting
   some commit along an object walk, which involves the aforementioned
   (broken) binary search.

   As a consequence, any pseudo-merges we apply at this stage are done
   so correctly.

 * While this bug makes applying pseudo-merges during fill-in traversal
   effectively broken, it does not produce wrong results. Instead of
   applying the *wrong* pseudo-merge, we will simply fail to find
   satisfied pseudo-merges, leaving the traversal to use the existing
   fill-in routines.

Fix this by sorting the table by bit position before writing, matching
the order that the reader's binary search expects.

This does produce a change the on-disk format insofar as the actual code
now complies with the documented format (for more details, refer to:
Documentation/technical/bitmap-format.adoc). Given that this never
worked in the first place, such a change should be OK to perform.

If an out-of-tree implementation of pseudo-merges happened to generate
bitmaps that comply with the documented format, they will continue to be
read and interpreted as normal.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c             | 21 ++++++++++++++++++++-
 t/t5333-pseudo-merge-bitmaps.sh |  2 +-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 8338d7217ef..86ed6a5d78c 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -819,6 +819,20 @@ static void write_selected_commits_v1(struct bitmap_writer *writer,
 	}
 }
 
+static int pseudo_merge_commit_pos_cmp(const void *_va, const void *_vb,
+				       void *_data)
+{
+	struct bitmap_writer *writer = _data;
+	uint32_t pos_a = find_object_pos(writer, _va, NULL);
+	uint32_t pos_b = find_object_pos(writer, _vb, NULL);
+
+	if (pos_a < pos_b)
+		return -1;
+	if (pos_a > pos_b)
+		return 1;
+	return 0;
+}
+
 static void write_pseudo_merges(struct bitmap_writer *writer,
 				struct hashfile *f)
 {
@@ -876,7 +890,12 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
 		oid_array_append(&commits, &kh_key(writer->pseudo_merge_commits, i));
 	}
 
-	oid_array_sort(&commits);
+	/*
+	 * Sort the commits by their bit position so that the lookup
+	 * table can be binary searched by the reader (see
+	 * find_pseudo_merge()).
+	 */
+	QSORT_S(commits.oid, commits.nr, pseudo_merge_commit_pos_cmp, writer);
 
 	/* write lookup table (non-extended) */
 	for (i = 0; i < commits.nr; i++) {
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 0e9638c31c3..3d7a7668121 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -462,7 +462,7 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges during fill-in traversal' '
+test_expect_success 'apply pseudo-merges during fill-in traversal' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
 	git init pseudo-merge-fill-in-traversal &&
 	(
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (2 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The binary search in `pseudo_merge_at()` has its "lo" and "hi" updates
swapped: when the midpoint's offset is less than the target, it sets `hi
= mi` (searching left) instead of `lo = mi + 1` (searching right), and
vice versa.

This means that lookups for pseudo-merges whose offset is not near the
midpoint of the pseudo-merge table are likely to fail.

In practice, with a single pseudo-merge group this is masked because the
lone entry is always at the midpoint. With multiple groups, the inverted
comparisons cause lookups to search in the wrong direction, potentially
missing entries.

Swap the "lo" and "hi" assignments to search in the correct direction,
making it possible to apply pseudo-merges during fill-in when more than
one pseudo-merge exists in a group.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index ff18b6c3642..fb71c761792 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -559,9 +559,9 @@ static struct pseudo_merge *pseudo_merge_at(const struct pseudo_merge_map *pm,
 		if (got == want)
 			return use_pseudo_merge(pm, &pm->v[mi]);
 		else if (got < want)
-			hi = mi;
-		else
 			lo = mi + 1;
+		else
+			hi = mi;
 	}
 
 	warning(_("could not find pseudo-merge for commit %s at offset %"PRIuMAX),
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 3d7a7668121..c5db6a11f6a 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -496,9 +496,10 @@ test_expect_success 'apply pseudo-merges during fill-in traversal' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
+test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
 	git init pseudo-merge-fill-in-multi &&
+	git init pseudo-merge-fill-in-multi &&
 	(
 		cd pseudo-merge-fill-in-multi &&
 
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (3 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

When a commit appears in more than one pseudo-merge group, its entry in
the commit lookup table has the high bit set in its offset field,
indicating that the offset points to an "extended" table containing the
set of pseudo-merges for that commit.

There are three bugs in this path:

 * The `next_ext` offset in `write_pseudo_merges()` undercounts the
   per-entry size of the lookup table (8 vs. 12 bytes).

 * `nth_pseudo_merge_ext()` calls `read_pseudo_merge_commit_at()` on a
   pseudo-merge bitmap offset, misinterpreting it as a 12-byte commit
   table entry.

 * The error check after `pseudo_merge_ext_at()` in
   `apply_pseudo_merges_for_commit()` tests `< -1` instead of `< 0`,
   silently swallowing errors from `error()`.

The first bug is on the write side: each commit lookup entry contains a
4- and 8-byte unsigned value for a total of 12 bytes, but the
calculation assumes that the entry only contains 8 bytes of data. This
makes `next_ext` too small, so the extended-table offsets that get
written point into the middle of the non-extended lookup table rather
than past it. The reader then interprets non-extended lookup data as
extended entries, producing garbage.

The second bug is on the read side and is independently fatal: even with
a correctly positioned extended table, `nth_pseudo_merge_ext()` feeds
the offset it reads (which points at pseudo-merge bitmap data) to
`read_pseudo_merge_commit_at()`. That function tries to parse 12 bytes
as a `pseudo_merge_commit` struct, clobbering `merge->pseudo_merge_ofs`
with whatever happens to be at that location. The caller only needs
`pseudo_merge_ofs`, so the fix is to store the offset directly rather
than re-parsing a commit table entry. The `commit_pos` field is left
untouched, retaining the value that `find_pseudo_merge()` set earlier.

The third bug is latent. With the first two fixes applied, the extended
table is correctly written and read, so `pseudo_merge_ext_at()` does not
fail during normal operation. The `< -1` vs `< 0` distinction only
matters when the bitmap file is corrupt or truncated, in which case the
error would be silently ignored and the code would proceed with
uninitialized data.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c             | 2 +-
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 86ed6a5d78c..1c8070f99c0 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -877,7 +877,7 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
 
 	next_ext = st_add(hashfile_total(f),
 			  st_mult(kh_size(writer->pseudo_merge_commits),
-				  sizeof(uint64_t)));
+				  sizeof(uint32_t) + sizeof(uint64_t)));
 
 	table_start = hashfile_total(f);
 
diff --git a/pseudo-merge.c b/pseudo-merge.c
index fb71c761792..34e1da00b4e 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -600,7 +600,7 @@ static int nth_pseudo_merge_ext(const struct pseudo_merge_map *pm,
 		return error(_("out-of-bounds read: (%"PRIuMAX" >= %"PRIuMAX")"),
 			     (uintmax_t)ofs, (uintmax_t)pm->map_size);
 
-	read_pseudo_merge_commit_at(merge, pm->map + ofs);
+	merge->pseudo_merge_ofs = ofs;
 
 	return 0;
 }
@@ -671,7 +671,7 @@ int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm,
 		off_t ofs = merge_commit.pseudo_merge_ofs & ~((uint64_t)1<<63);
 		uint32_t i;
 
-		if (pseudo_merge_ext_at(pm, &ext, ofs) < -1) {
+		if (pseudo_merge_ext_at(pm, &ext, ofs) < 0) {
 			warning(_("could not read extended pseudo-merge table "
 				  "for commit %s"),
 				oid_to_hex(&commit->object.oid));
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index c5db6a11f6a..f558e87ab21 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -550,7 +550,7 @@ test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges with overlapping groups during fill-in' '
+test_expect_success 'apply pseudo-merges with overlapping groups during fill-in' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-overlap" &&
 	git init pseudo-merge-fill-in-overlap &&
 	(
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (4 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

`find_pseudo_merge_group_for_ref()` uses the commit's date to classify
it as either "stable" (older than the stable threshold) or "unstable"
(otherwise).

However, to find the relevant commit from a given OID, the function
`find_pseudo_merge_group_for_ref()` uses `lookup_commit()` which does
not parse commits.

Because an unparsed commit has its "date" set to zero, every candidate
is placed in the "stable" bucket regardless of its actual committer
timestamp. This means the `bitmapPseudoMerge.*.threshold` and
`stableThreshold` configuration options have no effect: the
stable/unstable split is always determined by comparing against zero
rather than the real commit date.

The net result is that pseudo-merge groups are partitioned by
`stableSize` instead of the intended decay-based sizing, and the
`sampleRate` knob (which only applies to the unstable path) is never
exercised.

Fix this by calling `repo_parse_commit()` after `lookup_commit()`,
bailing out of the callback if parsing fails.

The corresponding test configures two pseudo-merge groups that both
match all tags. The "stable" group uses `threshold=1.month.ago`, and the
"all" group uses `threshold=now`. The test use our custom
"GIT_TEST_DATE_NOW" environment variable by setting it to the value of
"$test_tick" to align Git's notion of "now" (and therefore
"1.month.ago") with the `test_tick` timestamps, so the commits appear to
be younger than one month: only the "all" group matches them, producing
exactly one pseudo-merge.

Without the fix every commit has `date == 0`, which satisfies `date <=
threshold` for both groups (since 0 is older than one month ago), and
the "stable" group erroneously matches as well.

Now that commits are correctly classified as "unstable", the bug
described in the test exercising the "sampleRate=0" test is reachable,
and the test is marked as failing. It will be fixed in a following
commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  2 ++
 t/t5333-pseudo-merge-bitmaps.sh | 22 ++++++++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index 34e1da00b4e..d79e5fb649a 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -236,6 +236,8 @@ static int find_pseudo_merge_group_for_ref(const struct reference *ref, void *_d
 	c = lookup_commit(the_repository, maybe_peeled);
 	if (!c)
 		return 0;
+	if (repo_parse_commit(the_repository, c))
+		return 0;
 	if (!packlist_find(writer->to_pack, maybe_peeled))
 		return 0;
 
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index f558e87ab21..e27f9850dc4 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -593,32 +593,34 @@ test_expect_success 'apply pseudo-merges with overlapping groups during fill-in'
 	)
 '
 
-test_expect_failure 'pseudo-merge commits are correctly classified by date' '
+test_expect_success 'pseudo-merge commits are correctly classified by date' '
 	test_when_finished "rm -fr pseudo-merge-date-classification" &&
 	git init pseudo-merge-date-classification &&
 	(
 		cd pseudo-merge-date-classification &&
 
 		test_commit_bulk 64 &&
+
 		tag_everything &&
 		git repack -ad &&
 
 		pack="$(ls .git/objects/pack/pack-*.pack)" &&
 
 		# Configure two pseudo-merge groups: one that only
-		# matches "stable" refs (older than one month), and one
-		# that matches all refs. With 64 freshly-created tags
-		# (all younger than one month) the stable group should
-		# have zero pseudo-merges and the catch-all group should
-		# have one.
+		# matches "stable" refs (older than one month), and
+		# one that matches all refs. With 64 tags whose
+		# commits are all younger than one month, the
+		# "stable" group should have zero pseudo-merges and
+		# the "all" group should have one.
 		#
 		# Use GIT_TEST_DATE_NOW to align "now" (and therefore
 		# "1.month.ago") with the test_tick timestamps so that
 		# the commits are within the last month.
 		#
-		# This exercises the date-based classification in
-		# find_pseudo_merge_group_for_ref(), which requires
-		# that commits are parsed before inspecting their date.
+		# Without parsing the commit, its date field would
+		# be zero, causing it to satisfy date <= threshold
+		# for the "stable" group as well, and both groups
+		# would produce pseudo-merges.
 		git config bitmapPseudoMerge.stable.pattern "refs/tags/" &&
 		git config bitmapPseudoMerge.stable.maxMerges 64 &&
 		git config bitmapPseudoMerge.stable.stableThreshold never &&
@@ -638,7 +640,7 @@ test_expect_failure 'pseudo-merge commits are correctly classified by date' '
 	)
 '
 
-test_expect_success 'sampleRate=0 does not cause division by zero' '
+test_expect_failure 'sampleRate=0 does not cause division by zero' '
 	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
 	git init pseudo-merge-sample-rate-zero &&
 	(
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (5 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 8/9] Documentation: fix broken `sampleRate` in gitpacking(7) Taylor Blau
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The "bitmapPseudoMerge.*.sampleRate" configuration controls what
fraction of unstable commits are included in each pseudo-merge group.
The config validation accepts values in the range `[0, 1]`, but a value
of exactly 0 causes a division by zero in `select_pseudo_merges_1()`:

    if (j % (uint32_t)(1.0 / group->sample_rate))

When `sample_rate` is 0, `1.0 / 0.0` produces `+inf`, and casting
infinity to `uint32_t` is undefined behavior in C. On most platforms
this yields 0, making the subsequent modulo operation (`j % 0`) a
fatal arithmetic trap.

This path was not previously reachable because an earlier bug caused
all pseudo-merge candidates to be classified as "stable" (where the
sampling rate is not used), regardless of their actual commit date. Now
that the date classification is fixed, the unstable path is exercised
and the division by zero can fire.

Fix this by changing the validation to require a strict lower bound and
thus reject 0.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config/bitmap-pseudo-merge.adoc | 4 ++--
 pseudo-merge.c                                | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh               | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/config/bitmap-pseudo-merge.adoc b/Documentation/config/bitmap-pseudo-merge.adoc
index 1f264eca99b..6bf52c80ba7 100644
--- a/Documentation/config/bitmap-pseudo-merge.adoc
+++ b/Documentation/config/bitmap-pseudo-merge.adoc
@@ -47,8 +47,8 @@ will be updated more often than a reference pointing at an old commit.
 bitmapPseudoMerge.<name>.sampleRate::
 	Determines the proportion of non-bitmapped commits (among
 	reference tips) which are selected for inclusion in an
-	unstable pseudo-merge bitmap. Must be between `0` and `1`
-	(inclusive). The default is `1`.
+	unstable pseudo-merge bitmap. Must be greater than `0` and
+	less than or equal to `1`. The default is `1`.
 
 bitmapPseudoMerge.<name>.threshold::
 	Determines the minimum age of non-bitmapped commits (among
diff --git a/pseudo-merge.c b/pseudo-merge.c
index d79e5fb649a..75bed043602 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -169,8 +169,8 @@ static int pseudo_merge_config(const char *var, const char *value,
 		}
 	} else if (!strcmp(key, "samplerate")) {
 		group->sample_rate = git_config_double(var, value, ctx->kvi);
-		if (!(0 <= group->sample_rate && group->sample_rate <= 1)) {
-			warning(_("%s must be between 0 and 1, using default"), var);
+		if (!(0 < group->sample_rate && group->sample_rate <= 1)) {
+			warning(_("%s must be between 0 (exclusive) and 1, using default"), var);
 			group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE;
 		}
 	} else if (!strcmp(key, "threshold")) {
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index e27f9850dc4..3d0617a2e17 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -640,7 +640,7 @@ test_expect_success 'pseudo-merge commits are correctly classified by date' '
 	)
 '
 
-test_expect_failure 'sampleRate=0 does not cause division by zero' '
+test_expect_success 'sampleRate=0 does not cause division by zero' '
 	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
 	git init pseudo-merge-sample-rate-zero &&
 	(
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 8/9] Documentation: fix broken `sampleRate` in gitpacking(7)
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (6 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-21 20:02   ` [PATCH v2 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
  2026-04-22  1:37   ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Elijah Newren
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The documentation explaining some sample configurations for bitmap
pseudo-merges incorrectly uses a sample rate outside of the allowed
(0,1] range.

This dates back to faf558b23ef (pseudo-merge: implement support for
selecting pseudo-merge commits, 2024-05-23), and was likely written when
the allowable range for this configuration was the integral values
between (0,100].

Fix this to conform to the actual allowable range for this
configuration.

Noticed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/gitpacking.adoc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gitpacking.adoc b/Documentation/gitpacking.adoc
index a56596e2d1d..e6de6ec8249 100644
--- a/Documentation/gitpacking.adoc
+++ b/Documentation/gitpacking.adoc
@@ -150,7 +150,7 @@ with a configuration like so:
 	pattern = "refs/"
 	threshold = now
 	stableThreshold = never
-	sampleRate = 100
+	sampleRate = 1
 	maxMerges = 64
 ----
 
@@ -177,7 +177,7 @@ like:
 	pattern = "refs/virtual/([0-9]+)/(heads|tags)/"
 	threshold = now
 	stableThreshold = never
-	sampleRate = 100
+	sampleRate = 1
 	maxMerges = 64
 ----
 
-- 
2.54.0.9.gb905fd5d0ae


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (7 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 8/9] Documentation: fix broken `sampleRate` in gitpacking(7) Taylor Blau
@ 2026-04-21 20:02   ` Taylor Blau
  2026-04-22  1:37   ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Elijah Newren
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-04-21 20:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

When "bitmapPseudoMerge.*.pattern" appears more than once for the same
group, `pseudo_merge_config()` frees the old `regex_t *` pointer
but does not call `regfree()` on it first. This leaks whatever internal
state `regcomp()` allocated.

The final cleanup path in `pseudo_merge_group_release()` does call
`regfree()` before `free()`, so only the intermediate replacement is
affected.

Fix this by guarding the replacement with a NULL check and calling
`regfree()` before `free()` when the pointer is non-NULL.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  5 ++++-
 t/t5333-pseudo-merge-bitmaps.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index 75bed043602..22b8600d689 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -150,7 +150,10 @@ static int pseudo_merge_config(const char *var, const char *value,
 	if (!strcmp(key, "pattern")) {
 		struct strbuf re = STRBUF_INIT;
 
-		free(group->pattern);
+		if (group->pattern) {
+			regfree(group->pattern);
+			free(group->pattern);
+		}
 		if (*value != '^')
 			strbuf_addch(&re, '^');
 		strbuf_addstr(&re, value);
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 3d0617a2e17..382513ca5cc 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -663,4 +663,34 @@ test_expect_success 'sampleRate=0 does not cause division by zero' '
 	)
 '
 
+test_expect_success 'duplicate pseudo-merge pattern does not leak' '
+	git init pseudo-merge-dup-pattern &&
+	test_when_finished "rm -fr pseudo-merge-dup-pattern" &&
+
+	(
+		cd pseudo-merge-dup-pattern &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
+
+		# Set the same group'\''s pattern twice. The second
+		# assignment should cleanly release the compiled regex
+		# from the first without leaking.
+		git config bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config --add bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.threshold now &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		git rev-parse HEAD~63 |
+		test-tool bitmap write "$(basename $pack)" &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges
+	)
+'
+
 test_done
-- 
2.54.0.9.gb905fd5d0ae

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                     ` (8 preceding siblings ...)
  2026-04-21 20:02   ` [PATCH v2 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
@ 2026-04-22  1:37   ` Elijah Newren
  2026-05-11  2:53     ` Junio C Hamano
  2026-05-12  0:10     ` Taylor Blau
  9 siblings, 2 replies; 46+ messages in thread
From: Elijah Newren @ 2026-04-22  1:37 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King

On Tue, Apr 21, 2026 at 1:01 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> [Note to the maintainer: this series has been rebased onto the current
> tip of master, which is 94f057755b7 (Git 2.54, 2026-04-19) at the time
> of writing.]
>
> This is a small reroll of my series to fix several bugs in the
> pseudo-merge bitmap implementation. The main changes since last time
> are:
>
>  - Fixed a use-after-realloc bug in the test helper introduced in the
>    first commit.
>
>  - Swapped the order of initializing and cleaning up repositories in the
>    new test scripts.
>
>  - Updated bitmapPseudoMerge.<name>.sampleRate's documentation to
>    describe the range as (0,1], and added a new commit fixing a broken
>    example in gitpacking(7).

Thanks for fixing these.

> Range-diff against v1:
[...]
>  4:  af9f651269d !  4:  07f70a07c20 pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
>     @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'apply pseudo-merges during
>
>      -test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
>      +test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
>     -   git init pseudo-merge-fill-in-multi &&
>         test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
>     +   git init pseudo-merge-fill-in-multi &&

Here you fixed the order, but...

>     ++  git init pseudo-merge-fill-in-multi &&

...then you immediately run git init a second time?  I'm guessing this
was a stray edit made while trying to fix the order; could we get rid
of the duplicate?

>         (
>     +           cd pseudo-merge-fill-in-multi &&
>     +

Looks like you addressed all the feedback so far from v1.  There does
appear to be a new accidental double-init that I noted above in patch
4, but I didn't spot any other issues.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-04-22  1:37   ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Elijah Newren
@ 2026-05-11  2:53     ` Junio C Hamano
  2026-05-12  0:48       ` Taylor Blau
  2026-05-12  0:10     ` Taylor Blau
  1 sibling, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2026-05-11  2:53 UTC (permalink / raw)
  To: Taylor Blau, Elijah Newren; +Cc: git, Jeff King

Elijah Newren <newren@gmail.com> writes:

>>     +   git init pseudo-merge-fill-in-multi &&
>
> Here you fixed the order, but...
>
>>     ++  git init pseudo-merge-fill-in-multi &&
>
> ...then you immediately run git init a second time?  I'm guessing this
> was a stray edit made while trying to fix the order; could we get rid
> of the duplicate?
>
>>         (
>>     +           cd pseudo-merge-fill-in-multi &&
>>     +
>
> Looks like you addressed all the feedback so far from v1.  There does
> appear to be a new accidental double-init that I noted above in patch
> 4, but I didn't spot any other issues.

The topic went dormant after this comment, and it seems that it is
so close to the finish line otherwise?  I'll leave the topic marked
as "Expecting (hopefully minor and final) reroll" in the draft
"What's cooking" report I work from for now.

Thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-04-22  1:37   ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Elijah Newren
  2026-05-11  2:53     ` Junio C Hamano
@ 2026-05-12  0:10     ` Taylor Blau
  1 sibling, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:10 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Junio C Hamano, Jeff King

On Tue, Apr 21, 2026 at 06:37:45PM -0700, Elijah Newren wrote:
> Here you fixed the order, but...
>
> >     ++  git init pseudo-merge-fill-in-multi &&
>
> ...then you immediately run git init a second time?  I'm guessing this
> was a stray edit made while trying to fix the order; could we get rid
> of the duplicate?

Oof, good catch. I'm not sure how that snuck in there, but it's fixed on
my end.

> >         (
> >     +           cd pseudo-merge-fill-in-multi &&
> >     +
>
> Looks like you addressed all the feedback so far from v1.  There does
> appear to be a new accidental double-init that I noted above in patch
> 4, but I didn't spot any other issues.

Besides that, I found one more spot that needed some love, which is the
new "duplicate pseudo-merge pattern does not leak" test added at the
very end of the series, which had a wrong ordering, and piped the output
of 'git rev-parse' directly into a test helper.

I'll send a fixed up round out now.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
                   ` (8 preceding siblings ...)
  2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
@ 2026-05-12  0:46 ` Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
                     ` (9 more replies)
  9 siblings, 10 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:46 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

[Note to the maintainer: this series has been rebased onto the current
tip of master, which is 7760f83b597 (Merge branch
'jc/neuter-sideband-fixup', 2026-05-11) at the time of writing].

This is a (very) small reroll of my series to fix several bugs in the
pseudo-merge bitmap implementation. The only changes since last time are:

 - consistently ordering `test_when_finished "rm -fr ..."` above `git
   init` in tests, and

 - avoiding a single Git invocation on the left-hand side of a pipe in
   the final patch

Otherwise, the series is entirely unchanged save for the above and the
rebase.

As usual, a range-diff is included below for convenience. The original
cover letter is as follows:

========================================================================

This series fixes several bugs in the pseudo-merge bitmap implementation
that caused the pseudo-merge application path to be effectively broken
during fill-in traversal.

Peff noticed that this code path was never triggered by the existing
test suite, and investigating that observation uncovered a handful of
bugs, some compounding.

The first two patches introduce test infrastructure: a 'bitmap write'
test helper that gives tests precise control over which commits receive
individual bitmaps, and a set of "test_expect_failure" tests
demonstrating each bug.

The next four patches fix the bugs in the per-commit pseudo-merge
lookup:

  - The pseudo-merge commit lookup table was sorted by OID rather than
    by bit position, causing the reader's binary search to fail.

  - The binary search in pseudo_merge_at() had its lo/hi updates
    swapped.

  - The extended pseudo-merge lookup path had three compounding bugs: a
    wrong entry-size calculation in the writer, a misinterpretation of
    extended table entries in the reader, and a silently-swallowed error
    check.

The final two patches fix issues in pseudo-merge group selection:

  - find_pseudo_merge_group_for_ref() did not parse commits before
    inspecting their dates, so all candidates had date == 0 and were
    unconditionally placed in the "stable" bucket.

  - The config validation for bitmapPseudoMerge.*.sampleRate accepted 0,
    which leads to a division by zero once the date classification is
    fixed and the unstable code path is exercised.

There is also a small fix for a regex leak when the pattern key is
overridden in config.

Thanks in advance for your review!

Taylor Blau (9):
  t/helper: add 'test-tool bitmap write' subcommand
  t5333: demonstrate various pseudo-merge bugs
  pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
  pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
  pack-bitmap: fix pseudo-merge lookup for shared commits
  pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
  pack-bitmap: reject pseudo-merge "sampleRate" of 0
  Documentation: fix broken `sampleRate` in gitpacking(7)
  pack-bitmap: prevent pattern leak on pseudo-merge re-assignment

 Documentation/config/bitmap-pseudo-merge.adoc |   4 +-
 Documentation/gitpacking.adoc                 |   4 +-
 pack-bitmap-write.c                           |  23 +-
 pseudo-merge.c                                |  19 +-
 t/helper/test-bitmap.c                        | 113 ++++++++-
 t/t5310-pack-bitmaps.sh                       |  24 ++
 t/t5333-pseudo-merge-bitmaps.sh               | 229 ++++++++++++++++++
 7 files changed, 402 insertions(+), 14 deletions(-)

Range-diff against v2:
 1:  c0df35f8ebd =  1:  9c7a829cbeb t/helper: add 'test-tool bitmap write' subcommand
 2:  11de3343726 =  2:  d1ed4aadf75 t5333: demonstrate various pseudo-merge bugs
 3:  8d908ab415e =  3:  bf3a9a07e5f pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
 4:  07f70a07c20 !  4:  a1d341c92eb pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'apply pseudo-merges during
     +test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
      	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
      	git init pseudo-merge-fill-in-multi &&
    -+	git init pseudo-merge-fill-in-multi &&
      	(
    - 		cd pseudo-merge-fill-in-multi &&
    - 
 5:  3ed0b39843f =  5:  06e3410d323 pack-bitmap: fix pseudo-merge lookup for shared commits
 6:  95f847211f3 =  6:  78cf7e6d80d pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
 7:  f8a01cfb893 =  7:  4dbf6686718 pack-bitmap: reject pseudo-merge "sampleRate" of 0
 8:  c37156502c0 =  8:  46d0ee2f168 Documentation: fix broken `sampleRate` in gitpacking(7)
 9:  b905fd5d0ae !  9:  9b17dab2cf7 pack-bitmap: prevent pattern leak on pseudo-merge re-assignment
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'sampleRate=0 does not caus
      '
      
     +test_expect_success 'duplicate pseudo-merge pattern does not leak' '
    -+	git init pseudo-merge-dup-pattern &&
     +	test_when_finished "rm -fr pseudo-merge-dup-pattern" &&
    -+
    ++	git init pseudo-merge-dup-pattern &&
     +	(
     +		cd pseudo-merge-dup-pattern &&
     +
    @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'sampleRate=0 does not caus
     +		git config bitmapPseudoMerge.test.threshold now &&
     +		git config bitmapPseudoMerge.test.stableThreshold never &&
     +
    -+		git rev-parse HEAD~63 |
    -+		test-tool bitmap write "$(basename $pack)" &&
    ++		git rev-parse HEAD~63 >in &&
    ++		test-tool bitmap write "$(basename $pack)" <in &&
     +
     +		test_pseudo_merges >merges &&
     +		test_line_count = 1 merges

base-commit: 7760f83b59750c27df653c5c46d0f80e44cfe02c
-- 
2.54.0.76.g9b17dab2cf7

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v3 1/9] t/helper: add 'test-tool bitmap write' subcommand
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
@ 2026-05-12  0:46   ` Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 2/9] t5333: demonstrate various pseudo-merge bugs Taylor Blau
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:46 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

In f16eb1c091 (pseudo-merge: fix disk reads from find_pseudo_merge(),
2026-03-31), we noted that `apply_pseudo_merges_for_commit()` is never
triggered by the existing test suite, and that this bears further
investigation.

This patch is the first one to begin that investigation. The following
patches will expose and fix a variety of bugs in the implementation of
pseudo-merge bitmaps.

In order to do so, however, many of these tests require very precise
selection of which commits receive bitmaps and which do not. To date,
there isn't a standard approach to easily facilitate this. Address this
by introducing a `test-tool bitmap write` subcommand that writes a
bitmap for a given packfile, reading the set of commits which should
receive individual bitmaps from stdin like so:

    test-tool bitmap write <pack-basename> </path/to/commits.list

, where "<pack-basename>" is the filename for a specific packfile (e.g.,
"pack-abc123.pack"), and "/path/to/commits.list" is a list of commit
OIDs which will receive bitmaps.

The helper respects `bitmapPseudoMerge.*` configuration for creating
pseudo-merge bitmaps alongside the regular commit bitmaps.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/helper/test-bitmap.c  | 113 +++++++++++++++++++++++++++++++++++++++-
 t/t5310-pack-bitmaps.sh |  24 +++++++++
 2 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-bitmap.c b/t/helper/test-bitmap.c
index 16a01669e41..381e9b58b2c 100644
--- a/t/helper/test-bitmap.c
+++ b/t/helper/test-bitmap.c
@@ -2,7 +2,10 @@
 
 #include "test-tool.h"
 #include "git-compat-util.h"
+#include "hex.h"
+#include "odb.h"
 #include "pack-bitmap.h"
+#include "pseudo-merge.h"
 #include "setup.h"
 
 static int bitmap_list_commits(void)
@@ -35,6 +38,111 @@ static int bitmap_dump_pseudo_merge_objects(uint32_t n)
 	return test_bitmap_pseudo_merge_objects(the_repository, n);
 }
 
+static int add_packed_object(const struct object_id *oid,
+			     struct packed_git *pack,
+			     uint32_t pos,
+			     void *_data)
+{
+	struct packing_data *packed = _data;
+	struct object_entry *entry;
+	struct object_info oi = OBJECT_INFO_INIT;
+	enum object_type type;
+
+	oi.typep = &type;
+
+	entry = packlist_alloc(packed, oid);
+	entry->idx.offset = nth_packed_object_offset(pack, pos);
+	if (packed_object_info(pack, entry->idx.offset, &oi) < 0)
+		die("could not get type of object %s",
+		    oid_to_hex(oid));
+	oe_set_type(entry, type);
+	oe_set_in_pack(packed, entry, pack);
+
+	return 0;
+}
+
+static int idx_oid_cmp(const void *va, const void *vb)
+{
+	const struct pack_idx_entry *a = *(const struct pack_idx_entry **)va;
+	const struct pack_idx_entry *b = *(const struct pack_idx_entry **)vb;
+
+	return oidcmp(&a->oid, &b->oid);
+}
+
+static int bitmap_write(const char *basename)
+{
+	struct packed_git *p = NULL;
+	struct packing_data packed = { 0 };
+	struct bitmap_writer writer;
+	struct pack_idx_entry **index;
+	struct strbuf buf = STRBUF_INIT;
+	uint32_t i;
+
+	prepare_repo_settings(the_repository);
+	repo_for_each_pack(the_repository, p) {
+		if (!strcmp(pack_basename(p), basename))
+			break;
+	}
+
+	if (!p)
+		die("could not find pack '%s'", basename);
+
+	if (open_pack_index(p))
+		die("cannot open pack index for '%s'", p->pack_name);
+
+	prepare_packing_data(the_repository, &packed);
+
+	for_each_object_in_pack(p, add_packed_object, &packed,
+				ODB_FOR_EACH_OBJECT_PACK_ORDER);
+
+	/*
+	 * Build the index array now that data.packed.objects[] is
+	 * fully allocated (packlist_alloc() may have reallocated it
+	 * during the loop above).
+	 */
+	ALLOC_ARRAY(index, p->num_objects);
+	for (i = 0; i < p->num_objects; i++)
+		index[i] = &packed.objects[i].idx;
+
+	bitmap_writer_init(&writer, the_repository, &packed, NULL);
+	bitmap_writer_build_type_index(&writer, index);
+
+	while (strbuf_getline_lf(&buf, stdin) != EOF) {
+		struct object_id oid;
+		struct commit *c;
+
+		if (get_oid_hex(buf.buf, &oid))
+			die("invalid OID: %s", buf.buf);
+
+		c = lookup_commit(the_repository, &oid);
+		if (!c || repo_parse_commit(the_repository, c))
+			die("could not parse commit %s", buf.buf);
+
+		bitmap_writer_push_commit(&writer, c, 0);
+	}
+
+	select_pseudo_merges(&writer);
+	if (bitmap_writer_build(&writer) < 0)
+		die("failed to build bitmaps");
+
+	bitmap_writer_set_checksum(&writer, p->hash);
+
+	QSORT(index, p->num_objects, idx_oid_cmp);
+
+	strbuf_reset(&buf);
+	strbuf_addstr(&buf, p->pack_name);
+	strbuf_strip_suffix(&buf, ".pack");
+	strbuf_addstr(&buf, ".bitmap");
+	bitmap_writer_finish(&writer, index, buf.buf, 0);
+
+	bitmap_writer_free(&writer);
+	strbuf_release(&buf);
+	free(index);
+	clear_packing_data(&packed);
+
+	return 0;
+}
+
 int cmd__bitmap(int argc, const char **argv)
 {
 	setup_git_directory();
@@ -51,13 +159,16 @@ int cmd__bitmap(int argc, const char **argv)
 		return bitmap_dump_pseudo_merge_commits(atoi(argv[2]));
 	if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-objects"))
 		return bitmap_dump_pseudo_merge_objects(atoi(argv[2]));
+	if (argc == 3 && !strcmp(argv[1], "write"))
+		return bitmap_write(argv[2]);
 
 	usage("\ttest-tool bitmap list-commits\n"
 	      "\ttest-tool bitmap list-commits-with-offset\n"
 	      "\ttest-tool bitmap dump-hashes\n"
 	      "\ttest-tool bitmap dump-pseudo-merges\n"
 	      "\ttest-tool bitmap dump-pseudo-merge-commits <n>\n"
-	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>");
+	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>\n"
+	      "\ttest-tool bitmap write <pack-basename> < <commit-list>");
 
 	return -1;
 }
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..efeb71593bf 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -648,4 +648,28 @@ test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
 	test_grep corrupted.bitmap.index stderr
 '
 
+test_expect_success 'test-tool bitmap write determines bitmap selection' '
+	test_when_finished "rm -fr bitmap-write-helper" &&
+	git init bitmap-write-helper &&
+	(
+		cd bitmap-write-helper &&
+
+		test_commit_bulk 64 &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		git rev-parse HEAD >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test-tool bitmap list-commits >bitmaps.raw &&
+		sort bitmaps.raw >bitmaps &&
+		test_cmp in bitmaps &&
+
+		git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		git rev-list --count --objects HEAD >expect &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 2/9] t5333: demonstrate various pseudo-merge bugs
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
@ 2026-05-12  0:46   ` Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:46 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

Using the test helper introduced via the previous commit, add various
failing tests demonstrating bugs in the pseudo-merge implementation.

These are all marked as failing with one exception. The "sampleRate=0"
test describes a latent bug, which is only reachable through a code path
that is itself masked by a separate bug. A future commit will fix that
bug, and, in turn, cause the aforementioned test to fail. Accordingly,
that commit will mark the test as failing, and it will be re-marked as
passing in a separate commit which fixes the once-latent bug.

For the rest: the following commits will explain and fix the underlying
bugs in detail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5333-pseudo-merge-bitmaps.sh | 198 ++++++++++++++++++++++++++++++++
 1 file changed, 198 insertions(+)

diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 1f7a5d82ee4..0e9638c31c3 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -462,4 +462,202 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
 	)
 '
 
+test_expect_failure 'apply pseudo-merges during fill-in traversal' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
+	git init pseudo-merge-fill-in-traversal &&
+	(
+		cd pseudo-merge-fill-in-traversal &&
+
+		git config bitmapPseudoMerge.test.pattern refs/tags/ &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 1 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
+	git init pseudo-merge-fill-in-multi &&
+	(
+		cd pseudo-merge-fill-in-multi &&
+
+		test_commit base &&
+		base=$(git rev-parse HEAD) &&
+
+		for side in left right
+		do
+			git checkout -B $side base &&
+
+			test_commit_bulk --id=$side 64 &&
+			git rev-list --no-object-names HEAD --not $base >in &&
+			while read oid
+			do
+				echo "create refs/group-$side/$oid $oid" || return 1
+			done <in | git update-ref --stdin || return 1
+		done &&
+
+		git checkout left &&
+		git merge right &&
+		git repack -ad &&
+
+		git config bitmapPseudoMerge.left.pattern "refs/group-left/" &&
+		git config bitmapPseudoMerge.left.maxMerges 1 &&
+		git config bitmapPseudoMerge.left.stableThreshold never &&
+
+		git config bitmapPseudoMerge.right.pattern "refs/group-right/" &&
+		git config bitmapPseudoMerge.right.maxMerges 1 &&
+		git config bitmapPseudoMerge.right.stableThreshold never &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+		git rev-parse "$base" >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 2 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'apply pseudo-merges with overlapping groups during fill-in' '
+	test_when_finished "rm -fr pseudo-merge-fill-in-overlap" &&
+	git init pseudo-merge-fill-in-overlap &&
+	(
+		cd pseudo-merge-fill-in-overlap &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		# Use two pseudo-merge group patterns that both match
+		# refs/tags/, so every tagged commit belongs to both
+		# groups. This exercises the extended lookup table
+		# path in apply_pseudo_merges_for_commit().
+		git config bitmapPseudoMerge.all.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.all.maxMerges 1 &&
+		git config bitmapPseudoMerge.all.stableThreshold never &&
+
+		git config bitmapPseudoMerge.tags.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.tags.maxMerges 1 &&
+		git config bitmapPseudoMerge.tags.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_commit stale &&
+
+		git rev-list --count --objects HEAD >expect &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt \
+			git rev-list --count --objects --use-bitmap-index HEAD >actual &&
+		test_pseudo_merges_satisfied 2 <trace2.txt &&
+
+		test_cmp expect actual
+	)
+'
+
+test_expect_failure 'pseudo-merge commits are correctly classified by date' '
+	test_when_finished "rm -fr pseudo-merge-date-classification" &&
+	git init pseudo-merge-date-classification &&
+	(
+		cd pseudo-merge-date-classification &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		# Configure two pseudo-merge groups: one that only
+		# matches "stable" refs (older than one month), and one
+		# that matches all refs. With 64 freshly-created tags
+		# (all younger than one month) the stable group should
+		# have zero pseudo-merges and the catch-all group should
+		# have one.
+		#
+		# Use GIT_TEST_DATE_NOW to align "now" (and therefore
+		# "1.month.ago") with the test_tick timestamps so that
+		# the commits are within the last month.
+		#
+		# This exercises the date-based classification in
+		# find_pseudo_merge_group_for_ref(), which requires
+		# that commits are parsed before inspecting their date.
+		git config bitmapPseudoMerge.stable.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.stable.maxMerges 64 &&
+		git config bitmapPseudoMerge.stable.stableThreshold never &&
+		git config bitmapPseudoMerge.stable.threshold 1.month.ago &&
+
+		git config bitmapPseudoMerge.all.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.all.maxMerges 1 &&
+		git config bitmapPseudoMerge.all.stableThreshold never &&
+		git config bitmapPseudoMerge.all.threshold now &&
+
+		git rev-parse HEAD~63 >in &&
+		GIT_TEST_DATE_NOW=$test_tick \
+			test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges
+	)
+'
+
+test_expect_success 'sampleRate=0 does not cause division by zero' '
+	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
+	git init pseudo-merge-sample-rate-zero &&
+	(
+		cd pseudo-merge-sample-rate-zero &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack="$(ls .git/objects/pack/pack-*.pack)" &&
+
+		git config bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.sampleRate 0 &&
+		git config bitmapPseudoMerge.test.threshold now &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in
+	)
+'
+
 test_done
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 2/9] t5333: demonstrate various pseudo-merge bugs Taylor Blau
@ 2026-05-12  0:46   ` Taylor Blau
  2026-05-12  0:46   ` [PATCH v3 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:46 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The pseudo-merge commit lookup table stores each commit's position in
the pack- or pseudo-pack order, and is used to perform a binary search
in order to determine which pseudo-merge(s) a given commit belongs to.

However, the table was previously sorted in lexical order (via
`oid_array_sort()`), causing the binary search to fail.

While this causes pseudo-merge bitmaps to be de-facto broken for fill-in
traversal, there are a couple of important points to keep in mind:

 * Pseudo-merge application during the initial phases of a bitmap-based
   traversal are applied via `cascade_pseudo_merges_1()`. This function
   enumerates the known pseudo-merges and determines if its parents are
   a subset of the traversal roots.

   This is a different path than the fill-in traversal, where we are
   looking for any pseudo-merges which may be satisfied after visiting
   some commit along an object walk, which involves the aforementioned
   (broken) binary search.

   As a consequence, any pseudo-merges we apply at this stage are done
   so correctly.

 * While this bug makes applying pseudo-merges during fill-in traversal
   effectively broken, it does not produce wrong results. Instead of
   applying the *wrong* pseudo-merge, we will simply fail to find
   satisfied pseudo-merges, leaving the traversal to use the existing
   fill-in routines.

Fix this by sorting the table by bit position before writing, matching
the order that the reader's binary search expects.

This does produce a change the on-disk format insofar as the actual code
now complies with the documented format (for more details, refer to:
Documentation/technical/bitmap-format.adoc). Given that this never
worked in the first place, such a change should be OK to perform.

If an out-of-tree implementation of pseudo-merges happened to generate
bitmaps that comply with the documented format, they will continue to be
read and interpreted as normal.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c             | 21 ++++++++++++++++++++-
 t/t5333-pseudo-merge-bitmaps.sh |  2 +-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 8338d7217ef..86ed6a5d78c 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -819,6 +819,20 @@ static void write_selected_commits_v1(struct bitmap_writer *writer,
 	}
 }
 
+static int pseudo_merge_commit_pos_cmp(const void *_va, const void *_vb,
+				       void *_data)
+{
+	struct bitmap_writer *writer = _data;
+	uint32_t pos_a = find_object_pos(writer, _va, NULL);
+	uint32_t pos_b = find_object_pos(writer, _vb, NULL);
+
+	if (pos_a < pos_b)
+		return -1;
+	if (pos_a > pos_b)
+		return 1;
+	return 0;
+}
+
 static void write_pseudo_merges(struct bitmap_writer *writer,
 				struct hashfile *f)
 {
@@ -876,7 +890,12 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
 		oid_array_append(&commits, &kh_key(writer->pseudo_merge_commits, i));
 	}
 
-	oid_array_sort(&commits);
+	/*
+	 * Sort the commits by their bit position so that the lookup
+	 * table can be binary searched by the reader (see
+	 * find_pseudo_merge()).
+	 */
+	QSORT_S(commits.oid, commits.nr, pseudo_merge_commit_pos_cmp, writer);
 
 	/* write lookup table (non-extended) */
 	for (i = 0; i < commits.nr; i++) {
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 0e9638c31c3..3d7a7668121 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -462,7 +462,7 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges during fill-in traversal' '
+test_expect_success 'apply pseudo-merges during fill-in traversal' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-traversal" &&
 	git init pseudo-merge-fill-in-traversal &&
 	(
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (2 preceding siblings ...)
  2026-05-12  0:46   ` [PATCH v3 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
@ 2026-05-12  0:46   ` Taylor Blau
  2026-05-12  0:47   ` [PATCH v3 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:46 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The binary search in `pseudo_merge_at()` has its "lo" and "hi" updates
swapped: when the midpoint's offset is less than the target, it sets `hi
= mi` (searching left) instead of `lo = mi + 1` (searching right), and
vice versa.

This means that lookups for pseudo-merges whose offset is not near the
midpoint of the pseudo-merge table are likely to fail.

In practice, with a single pseudo-merge group this is masked because the
lone entry is always at the midpoint. With multiple groups, the inverted
comparisons cause lookups to search in the wrong direction, potentially
missing entries.

Swap the "lo" and "hi" assignments to search in the correct direction,
making it possible to apply pseudo-merges during fill-in when more than
one pseudo-merge exists in a group.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index ff18b6c3642..fb71c761792 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -559,9 +559,9 @@ static struct pseudo_merge *pseudo_merge_at(const struct pseudo_merge_map *pm,
 		if (got == want)
 			return use_pseudo_merge(pm, &pm->v[mi]);
 		else if (got < want)
-			hi = mi;
-		else
 			lo = mi + 1;
+		else
+			hi = mi;
 	}
 
 	warning(_("could not find pseudo-merge for commit %s at offset %"PRIuMAX),
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 3d7a7668121..5411fbf1e04 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -496,7 +496,7 @@ test_expect_success 'apply pseudo-merges during fill-in traversal' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges from multiple groups during fill-in' '
+test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
 	git init pseudo-merge-fill-in-multi &&
 	(
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (3 preceding siblings ...)
  2026-05-12  0:46   ` [PATCH v3 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
@ 2026-05-12  0:47   ` Taylor Blau
  2026-05-12  0:47   ` [PATCH v3 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

When a commit appears in more than one pseudo-merge group, its entry in
the commit lookup table has the high bit set in its offset field,
indicating that the offset points to an "extended" table containing the
set of pseudo-merges for that commit.

There are three bugs in this path:

 * The `next_ext` offset in `write_pseudo_merges()` undercounts the
   per-entry size of the lookup table (8 vs. 12 bytes).

 * `nth_pseudo_merge_ext()` calls `read_pseudo_merge_commit_at()` on a
   pseudo-merge bitmap offset, misinterpreting it as a 12-byte commit
   table entry.

 * The error check after `pseudo_merge_ext_at()` in
   `apply_pseudo_merges_for_commit()` tests `< -1` instead of `< 0`,
   silently swallowing errors from `error()`.

The first bug is on the write side: each commit lookup entry contains a
4- and 8-byte unsigned value for a total of 12 bytes, but the
calculation assumes that the entry only contains 8 bytes of data. This
makes `next_ext` too small, so the extended-table offsets that get
written point into the middle of the non-extended lookup table rather
than past it. The reader then interprets non-extended lookup data as
extended entries, producing garbage.

The second bug is on the read side and is independently fatal: even with
a correctly positioned extended table, `nth_pseudo_merge_ext()` feeds
the offset it reads (which points at pseudo-merge bitmap data) to
`read_pseudo_merge_commit_at()`. That function tries to parse 12 bytes
as a `pseudo_merge_commit` struct, clobbering `merge->pseudo_merge_ofs`
with whatever happens to be at that location. The caller only needs
`pseudo_merge_ofs`, so the fix is to store the offset directly rather
than re-parsing a commit table entry. The `commit_pos` field is left
untouched, retaining the value that `find_pseudo_merge()` set earlier.

The third bug is latent. With the first two fixes applied, the extended
table is correctly written and read, so `pseudo_merge_ext_at()` does not
fail during normal operation. The `< -1` vs `< 0` distinction only
matters when the bitmap file is corrupt or truncated, in which case the
error would be silently ignored and the code would proceed with
uninitialized data.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c             | 2 +-
 pseudo-merge.c                  | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 86ed6a5d78c..1c8070f99c0 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -877,7 +877,7 @@ static void write_pseudo_merges(struct bitmap_writer *writer,
 
 	next_ext = st_add(hashfile_total(f),
 			  st_mult(kh_size(writer->pseudo_merge_commits),
-				  sizeof(uint64_t)));
+				  sizeof(uint32_t) + sizeof(uint64_t)));
 
 	table_start = hashfile_total(f);
 
diff --git a/pseudo-merge.c b/pseudo-merge.c
index fb71c761792..34e1da00b4e 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -600,7 +600,7 @@ static int nth_pseudo_merge_ext(const struct pseudo_merge_map *pm,
 		return error(_("out-of-bounds read: (%"PRIuMAX" >= %"PRIuMAX")"),
 			     (uintmax_t)ofs, (uintmax_t)pm->map_size);
 
-	read_pseudo_merge_commit_at(merge, pm->map + ofs);
+	merge->pseudo_merge_ofs = ofs;
 
 	return 0;
 }
@@ -671,7 +671,7 @@ int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm,
 		off_t ofs = merge_commit.pseudo_merge_ofs & ~((uint64_t)1<<63);
 		uint32_t i;
 
-		if (pseudo_merge_ext_at(pm, &ext, ofs) < -1) {
+		if (pseudo_merge_ext_at(pm, &ext, ofs) < 0) {
 			warning(_("could not read extended pseudo-merge table "
 				  "for commit %s"),
 				oid_to_hex(&commit->object.oid));
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 5411fbf1e04..90459da5e63 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -549,7 +549,7 @@ test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
 	)
 '
 
-test_expect_failure 'apply pseudo-merges with overlapping groups during fill-in' '
+test_expect_success 'apply pseudo-merges with overlapping groups during fill-in' '
 	test_when_finished "rm -fr pseudo-merge-fill-in-overlap" &&
 	git init pseudo-merge-fill-in-overlap &&
 	(
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (4 preceding siblings ...)
  2026-05-12  0:47   ` [PATCH v3 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
@ 2026-05-12  0:47   ` Taylor Blau
  2026-05-12  0:47   ` [PATCH v3 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

`find_pseudo_merge_group_for_ref()` uses the commit's date to classify
it as either "stable" (older than the stable threshold) or "unstable"
(otherwise).

However, to find the relevant commit from a given OID, the function
`find_pseudo_merge_group_for_ref()` uses `lookup_commit()` which does
not parse commits.

Because an unparsed commit has its "date" set to zero, every candidate
is placed in the "stable" bucket regardless of its actual committer
timestamp. This means the `bitmapPseudoMerge.*.threshold` and
`stableThreshold` configuration options have no effect: the
stable/unstable split is always determined by comparing against zero
rather than the real commit date.

The net result is that pseudo-merge groups are partitioned by
`stableSize` instead of the intended decay-based sizing, and the
`sampleRate` knob (which only applies to the unstable path) is never
exercised.

Fix this by calling `repo_parse_commit()` after `lookup_commit()`,
bailing out of the callback if parsing fails.

The corresponding test configures two pseudo-merge groups that both
match all tags. The "stable" group uses `threshold=1.month.ago`, and the
"all" group uses `threshold=now`. The test use our custom
"GIT_TEST_DATE_NOW" environment variable by setting it to the value of
"$test_tick" to align Git's notion of "now" (and therefore
"1.month.ago") with the `test_tick` timestamps, so the commits appear to
be younger than one month: only the "all" group matches them, producing
exactly one pseudo-merge.

Without the fix every commit has `date == 0`, which satisfies `date <=
threshold` for both groups (since 0 is older than one month ago), and
the "stable" group erroneously matches as well.

Now that commits are correctly classified as "unstable", the bug
described in the test exercising the "sampleRate=0" test is reachable,
and the test is marked as failing. It will be fixed in a following
commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  2 ++
 t/t5333-pseudo-merge-bitmaps.sh | 22 ++++++++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index 34e1da00b4e..d79e5fb649a 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -236,6 +236,8 @@ static int find_pseudo_merge_group_for_ref(const struct reference *ref, void *_d
 	c = lookup_commit(the_repository, maybe_peeled);
 	if (!c)
 		return 0;
+	if (repo_parse_commit(the_repository, c))
+		return 0;
 	if (!packlist_find(writer->to_pack, maybe_peeled))
 		return 0;
 
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 90459da5e63..0032a16606b 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -592,32 +592,34 @@ test_expect_success 'apply pseudo-merges with overlapping groups during fill-in'
 	)
 '
 
-test_expect_failure 'pseudo-merge commits are correctly classified by date' '
+test_expect_success 'pseudo-merge commits are correctly classified by date' '
 	test_when_finished "rm -fr pseudo-merge-date-classification" &&
 	git init pseudo-merge-date-classification &&
 	(
 		cd pseudo-merge-date-classification &&
 
 		test_commit_bulk 64 &&
+
 		tag_everything &&
 		git repack -ad &&
 
 		pack="$(ls .git/objects/pack/pack-*.pack)" &&
 
 		# Configure two pseudo-merge groups: one that only
-		# matches "stable" refs (older than one month), and one
-		# that matches all refs. With 64 freshly-created tags
-		# (all younger than one month) the stable group should
-		# have zero pseudo-merges and the catch-all group should
-		# have one.
+		# matches "stable" refs (older than one month), and
+		# one that matches all refs. With 64 tags whose
+		# commits are all younger than one month, the
+		# "stable" group should have zero pseudo-merges and
+		# the "all" group should have one.
 		#
 		# Use GIT_TEST_DATE_NOW to align "now" (and therefore
 		# "1.month.ago") with the test_tick timestamps so that
 		# the commits are within the last month.
 		#
-		# This exercises the date-based classification in
-		# find_pseudo_merge_group_for_ref(), which requires
-		# that commits are parsed before inspecting their date.
+		# Without parsing the commit, its date field would
+		# be zero, causing it to satisfy date <= threshold
+		# for the "stable" group as well, and both groups
+		# would produce pseudo-merges.
 		git config bitmapPseudoMerge.stable.pattern "refs/tags/" &&
 		git config bitmapPseudoMerge.stable.maxMerges 64 &&
 		git config bitmapPseudoMerge.stable.stableThreshold never &&
@@ -637,7 +639,7 @@ test_expect_failure 'pseudo-merge commits are correctly classified by date' '
 	)
 '
 
-test_expect_success 'sampleRate=0 does not cause division by zero' '
+test_expect_failure 'sampleRate=0 does not cause division by zero' '
 	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
 	git init pseudo-merge-sample-rate-zero &&
 	(
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (5 preceding siblings ...)
  2026-05-12  0:47   ` [PATCH v3 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
@ 2026-05-12  0:47   ` Taylor Blau
  2026-05-12  0:47   ` [PATCH v3 8/9] Documentation: fix broken `sampleRate` in gitpacking(7) Taylor Blau
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The "bitmapPseudoMerge.*.sampleRate" configuration controls what
fraction of unstable commits are included in each pseudo-merge group.
The config validation accepts values in the range `[0, 1]`, but a value
of exactly 0 causes a division by zero in `select_pseudo_merges_1()`:

    if (j % (uint32_t)(1.0 / group->sample_rate))

When `sample_rate` is 0, `1.0 / 0.0` produces `+inf`, and casting
infinity to `uint32_t` is undefined behavior in C. On most platforms
this yields 0, making the subsequent modulo operation (`j % 0`) a
fatal arithmetic trap.

This path was not previously reachable because an earlier bug caused
all pseudo-merge candidates to be classified as "stable" (where the
sampling rate is not used), regardless of their actual commit date. Now
that the date classification is fixed, the unstable path is exercised
and the division by zero can fire.

Fix this by changing the validation to require a strict lower bound and
thus reject 0.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config/bitmap-pseudo-merge.adoc | 4 ++--
 pseudo-merge.c                                | 4 ++--
 t/t5333-pseudo-merge-bitmaps.sh               | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/config/bitmap-pseudo-merge.adoc b/Documentation/config/bitmap-pseudo-merge.adoc
index 1f264eca99b..6bf52c80ba7 100644
--- a/Documentation/config/bitmap-pseudo-merge.adoc
+++ b/Documentation/config/bitmap-pseudo-merge.adoc
@@ -47,8 +47,8 @@ will be updated more often than a reference pointing at an old commit.
 bitmapPseudoMerge.<name>.sampleRate::
 	Determines the proportion of non-bitmapped commits (among
 	reference tips) which are selected for inclusion in an
-	unstable pseudo-merge bitmap. Must be between `0` and `1`
-	(inclusive). The default is `1`.
+	unstable pseudo-merge bitmap. Must be greater than `0` and
+	less than or equal to `1`. The default is `1`.
 
 bitmapPseudoMerge.<name>.threshold::
 	Determines the minimum age of non-bitmapped commits (among
diff --git a/pseudo-merge.c b/pseudo-merge.c
index d79e5fb649a..75bed043602 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -169,8 +169,8 @@ static int pseudo_merge_config(const char *var, const char *value,
 		}
 	} else if (!strcmp(key, "samplerate")) {
 		group->sample_rate = git_config_double(var, value, ctx->kvi);
-		if (!(0 <= group->sample_rate && group->sample_rate <= 1)) {
-			warning(_("%s must be between 0 and 1, using default"), var);
+		if (!(0 < group->sample_rate && group->sample_rate <= 1)) {
+			warning(_("%s must be between 0 (exclusive) and 1, using default"), var);
 			group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE;
 		}
 	} else if (!strcmp(key, "threshold")) {
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 0032a16606b..5bfbbd4214e 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -639,7 +639,7 @@ test_expect_success 'pseudo-merge commits are correctly classified by date' '
 	)
 '
 
-test_expect_failure 'sampleRate=0 does not cause division by zero' '
+test_expect_success 'sampleRate=0 does not cause division by zero' '
 	test_when_finished "rm -fr pseudo-merge-sample-rate-zero" &&
 	git init pseudo-merge-sample-rate-zero &&
 	(
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 8/9] Documentation: fix broken `sampleRate` in gitpacking(7)
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (6 preceding siblings ...)
  2026-05-12  0:47   ` [PATCH v3 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
@ 2026-05-12  0:47   ` Taylor Blau
  2026-05-12  0:47   ` [PATCH v3 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
  2026-05-12  1:38   ` [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs Junio C Hamano
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

The documentation explaining some sample configurations for bitmap
pseudo-merges incorrectly uses a sample rate outside of the allowed
(0,1] range.

This dates back to faf558b23ef (pseudo-merge: implement support for
selecting pseudo-merge commits, 2024-05-23), and was likely written when
the allowable range for this configuration was the integral values
between (0,100].

Fix this to conform to the actual allowable range for this
configuration.

Noticed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/gitpacking.adoc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gitpacking.adoc b/Documentation/gitpacking.adoc
index a56596e2d1d..e6de6ec8249 100644
--- a/Documentation/gitpacking.adoc
+++ b/Documentation/gitpacking.adoc
@@ -150,7 +150,7 @@ with a configuration like so:
 	pattern = "refs/"
 	threshold = now
 	stableThreshold = never
-	sampleRate = 100
+	sampleRate = 1
 	maxMerges = 64
 ----
 
@@ -177,7 +177,7 @@ like:
 	pattern = "refs/virtual/([0-9]+)/(heads|tags)/"
 	threshold = now
 	stableThreshold = never
-	sampleRate = 100
+	sampleRate = 1
 	maxMerges = 64
 ----
 
-- 
2.54.0.76.g9b17dab2cf7


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (7 preceding siblings ...)
  2026-05-12  0:47   ` [PATCH v3 8/9] Documentation: fix broken `sampleRate` in gitpacking(7) Taylor Blau
@ 2026-05-12  0:47   ` Taylor Blau
  2026-05-12  1:38   ` [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs Junio C Hamano
  9 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Elijah Newren

When "bitmapPseudoMerge.*.pattern" appears more than once for the same
group, `pseudo_merge_config()` frees the old `regex_t *` pointer
but does not call `regfree()` on it first. This leaks whatever internal
state `regcomp()` allocated.

The final cleanup path in `pseudo_merge_group_release()` does call
`regfree()` before `free()`, so only the intermediate replacement is
affected.

Fix this by guarding the replacement with a NULL check and calling
`regfree()` before `free()` when the pointer is non-NULL.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  5 ++++-
 t/t5333-pseudo-merge-bitmaps.sh | 29 +++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/pseudo-merge.c b/pseudo-merge.c
index 75bed043602..22b8600d689 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -150,7 +150,10 @@ static int pseudo_merge_config(const char *var, const char *value,
 	if (!strcmp(key, "pattern")) {
 		struct strbuf re = STRBUF_INIT;
 
-		free(group->pattern);
+		if (group->pattern) {
+			regfree(group->pattern);
+			free(group->pattern);
+		}
 		if (*value != '^')
 			strbuf_addch(&re, '^');
 		strbuf_addstr(&re, value);
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 5bfbbd4214e..305d6771082 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -662,4 +662,33 @@ test_expect_success 'sampleRate=0 does not cause division by zero' '
 	)
 '
 
+test_expect_success 'duplicate pseudo-merge pattern does not leak' '
+	test_when_finished "rm -fr pseudo-merge-dup-pattern" &&
+	git init pseudo-merge-dup-pattern &&
+	(
+		cd pseudo-merge-dup-pattern &&
+
+		test_commit_bulk 64 &&
+		tag_everything &&
+		git repack -ad &&
+
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
+
+		# Set the same group'\''s pattern twice. The second
+		# assignment should cleanly release the compiled regex
+		# from the first without leaking.
+		git config bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config --add bitmapPseudoMerge.test.pattern "refs/tags/" &&
+		git config bitmapPseudoMerge.test.maxMerges 1 &&
+		git config bitmapPseudoMerge.test.threshold now &&
+		git config bitmapPseudoMerge.test.stableThreshold never &&
+
+		git rev-parse HEAD~63 >in &&
+		test-tool bitmap write "$(basename $pack)" <in &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 1 merges
+	)
+'
+
 test_done
-- 
2.54.0.76.g9b17dab2cf7

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-05-11  2:53     ` Junio C Hamano
@ 2026-05-12  0:48       ` Taylor Blau
  0 siblings, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  0:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren, git, Jeff King

On Mon, May 11, 2026 at 11:53:01AM +0900, Junio C Hamano wrote:
> The topic went dormant after this comment, and it seems that it is
> so close to the finish line otherwise?  I'll leave the topic marked
> as "Expecting (hopefully minor and final) reroll" in the draft
> "What's cooking" report I work from for now.

My apologies. I had put this aside while you were on vacation, and then
got busy with GitHub-specific topics in the interim. I just sent a new
version that addresses Elijah's comments, which should be ready for
merging down.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
                     ` (8 preceding siblings ...)
  2026-05-12  0:47   ` [PATCH v3 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
@ 2026-05-12  1:38   ` Junio C Hamano
  2026-05-12  1:46     ` Taylor Blau
  2026-05-12  1:49     ` Junio C Hamano
  9 siblings, 2 replies; 46+ messages in thread
From: Junio C Hamano @ 2026-05-12  1:38 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Elijah Newren

Taylor Blau <me@ttaylorr.com> writes:

> [Note to the maintainer: this series has been rebased onto the current
> tip of master, which is 7760f83b597 (Merge branch
> 'jc/neuter-sideband-fixup', 2026-05-11) at the time of writing].

A note like this is very much appreciated, but please also state the
reason why the rebase was necessary.  "Because the current tip of
'master' has advanced" is not a good reason.  "The previous
synthetic base was made by merging topic X and topic Y on
then-current 'master', but both have graduated" is a so-so ok
reason.  "Because the updated implementation of this series uses
facilities that appeared in recent 'master' that come from topics A
and B, which the previous iteration did not use" and "Recent updates
to 'master' brings in conflicting changes from topic C" are
excellent reasons.

> Range-diff against v2:
>  1:  c0df35f8ebd =  1:  9c7a829cbeb t/helper: add 'test-tool bitmap write' subcommand
>  2:  11de3343726 =  2:  d1ed4aadf75 t5333: demonstrate various pseudo-merge bugs
>  3:  8d908ab415e =  3:  bf3a9a07e5f pack-bitmap-write: sort pseudo-merge commit lookup table in pack order
>  4:  07f70a07c20 !  4:  a1d341c92eb pack-bitmap: fix inverted binary search in `pseudo_merge_at()`
>     @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'apply pseudo-merges during
>      +test_expect_success 'apply pseudo-merges from multiple groups during fill-in' '
>       	test_when_finished "rm -fr pseudo-merge-fill-in-multi" &&
>       	git init pseudo-merge-fill-in-multi &&
>     -+	git init pseudo-merge-fill-in-multi &&

OK.

>       	(
>     - 		cd pseudo-merge-fill-in-multi &&
>     - 
>  5:  3ed0b39843f =  5:  06e3410d323 pack-bitmap: fix pseudo-merge lookup for shared commits
>  6:  95f847211f3 =  6:  78cf7e6d80d pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()`
>  7:  f8a01cfb893 =  7:  4dbf6686718 pack-bitmap: reject pseudo-merge "sampleRate" of 0
>  8:  c37156502c0 =  8:  46d0ee2f168 Documentation: fix broken `sampleRate` in gitpacking(7)
>  9:  b905fd5d0ae !  9:  9b17dab2cf7 pack-bitmap: prevent pattern leak on pseudo-merge re-assignment
>     @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'sampleRate=0 does not caus
>       '
>       
>      +test_expect_success 'duplicate pseudo-merge pattern does not leak' '
>     -+	git init pseudo-merge-dup-pattern &&
>      +	test_when_finished "rm -fr pseudo-merge-dup-pattern" &&
>     -+
>     ++	git init pseudo-merge-dup-pattern &&
>      +	(
>      +		cd pseudo-merge-dup-pattern &&
>      +
>     @@ t/t5333-pseudo-merge-bitmaps.sh: test_expect_success 'sampleRate=0 does not caus
>      +		git config bitmapPseudoMerge.test.threshold now &&
>      +		git config bitmapPseudoMerge.test.stableThreshold never &&
>      +
>     -+		git rev-parse HEAD~63 |
>     -+		test-tool bitmap write "$(basename $pack)" &&
>     ++		git rev-parse HEAD~63 >in &&
>     ++		test-tool bitmap write "$(basename $pack)" <in &&
>      +
>      +		test_pseudo_merges >merges &&
>      +		test_line_count = 1 merges
>
> base-commit: 7760f83b59750c27df653c5c46d0f80e44cfe02c

Queued.  Thanks.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-05-12  1:38   ` [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs Junio C Hamano
@ 2026-05-12  1:46     ` Taylor Blau
  2026-05-12  1:49     ` Junio C Hamano
  1 sibling, 0 replies; 46+ messages in thread
From: Taylor Blau @ 2026-05-12  1:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Elijah Newren

On Tue, May 12, 2026 at 10:38:44AM +0900, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > [Note to the maintainer: this series has been rebased onto the current
> > tip of master, which is 7760f83b597 (Merge branch
> > 'jc/neuter-sideband-fixup', 2026-05-11) at the time of writing].
>
> A note like this is very much appreciated, but please also state the
> reason why the rebase was necessary.  "Because the current tip of
> 'master' has advanced" is not a good reason.  "The previous
> synthetic base was made by merging topic X and topic Y on
> then-current 'master', but both have graduated" is a so-so ok
> reason.  "Because the updated implementation of this series uses
> facilities that appeared in recent 'master' that come from topics A
> and B, which the previous iteration did not use" and "Recent updates
> to 'master' brings in conflicting changes from topic C" are
> excellent reasons.

I think the reason here was "bad habit that I am trying to break" ;-).

(Joking aside, I usually rebase my series locally before sending to
ensure they can still be merged in cleanly. I usually remember to toss
that rebased version aside and send the non-rebased version, but clearly
forgot to do so here. Sorry about that.)

Thanks for queueing regardless.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs
  2026-05-12  1:38   ` [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs Junio C Hamano
  2026-05-12  1:46     ` Taylor Blau
@ 2026-05-12  1:49     ` Junio C Hamano
  1 sibling, 0 replies; 46+ messages in thread
From: Junio C Hamano @ 2026-05-12  1:49 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Elijah Newren

Junio C Hamano <gitster@pobox.com> writes:

> Taylor Blau <me@ttaylorr.com> writes:
>
>> [Note to the maintainer: this series has been rebased onto the current
>> tip of master, which is 7760f83b597 (Merge branch
>> 'jc/neuter-sideband-fixup', 2026-05-11) at the time of writing].
>
> A note like this is very much appreciated, but please also state the
> reason why the rebase was necessary.  "Because the current tip of
> 'master' has advanced" is not a good reason.  "The previous
> synthetic base was made by merging topic X and topic Y on
> then-current 'master', but both have graduated" is a so-so ok
> reason.  "Because the updated implementation of this series uses
> facilities that appeared in recent 'master' that come from topics A
> and B, which the previous iteration did not use" and "Recent updates
> to 'master' brings in conflicting changes from topic C" are
> excellent reasons.

Forgot one important case.  "It turns out that this fix is important
so it was rebased to be applicable to an older maintenance release M"
would be very much appreciated as well.

Perhaps after coming up with a few more good reasons, we should
describe them in Documentation/SubmittingPatches somewhere.


^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2026-05-12  1:49 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-13 23:56 [PATCH 0/8] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
2026-04-13 23:56 ` [PATCH 1/8] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
2026-04-14 19:48   ` Junio C Hamano
2026-04-14 21:29     ` Taylor Blau
2026-04-14 21:34       ` Junio C Hamano
2026-04-14 21:40         ` Taylor Blau
2026-04-14 20:08   ` Junio C Hamano
2026-04-14 21:40     ` Taylor Blau
2026-04-19  0:24   ` Elijah Newren
2026-04-21 18:51     ` Taylor Blau
2026-04-13 23:56 ` [PATCH 2/8] t5333: demonstrate various pseudo-merge bugs Taylor Blau
2026-04-19  0:25   ` Elijah Newren
2026-04-13 23:56 ` [PATCH 3/8] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
2026-04-13 23:56 ` [PATCH 4/8] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
2026-04-13 23:56 ` [PATCH 5/8] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
2026-04-13 23:56 ` [PATCH 6/8] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
2026-04-13 23:56 ` [PATCH 7/8] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
2026-04-19  0:26   ` Elijah Newren
2026-04-13 23:57 ` [PATCH 8/8] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
2026-04-21 20:01 ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Taylor Blau
2026-04-21 20:01   ` [PATCH v2 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
2026-04-21 20:01   ` [PATCH v2 2/9] t5333: demonstrate various pseudo-merge bugs Taylor Blau
2026-04-21 20:02   ` [PATCH v2 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
2026-04-21 20:02   ` [PATCH v2 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
2026-04-21 20:02   ` [PATCH v2 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
2026-04-21 20:02   ` [PATCH v2 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
2026-04-21 20:02   ` [PATCH v2 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
2026-04-21 20:02   ` [PATCH v2 8/9] Documentation: fix broken `sampleRate` in gitpacking(7) Taylor Blau
2026-04-21 20:02   ` [PATCH v2 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
2026-04-22  1:37   ` [PATCH v2 0/9] pack-bitmap: fix various pseudo-merge bugs Elijah Newren
2026-05-11  2:53     ` Junio C Hamano
2026-05-12  0:48       ` Taylor Blau
2026-05-12  0:10     ` Taylor Blau
2026-05-12  0:46 ` [PATCH v3 " Taylor Blau
2026-05-12  0:46   ` [PATCH v3 1/9] t/helper: add 'test-tool bitmap write' subcommand Taylor Blau
2026-05-12  0:46   ` [PATCH v3 2/9] t5333: demonstrate various pseudo-merge bugs Taylor Blau
2026-05-12  0:46   ` [PATCH v3 3/9] pack-bitmap-write: sort pseudo-merge commit lookup table in pack order Taylor Blau
2026-05-12  0:46   ` [PATCH v3 4/9] pack-bitmap: fix inverted binary search in `pseudo_merge_at()` Taylor Blau
2026-05-12  0:47   ` [PATCH v3 5/9] pack-bitmap: fix pseudo-merge lookup for shared commits Taylor Blau
2026-05-12  0:47   ` [PATCH v3 6/9] pack-bitmap: parse commits in `find_pseudo_merge_group_for_ref()` Taylor Blau
2026-05-12  0:47   ` [PATCH v3 7/9] pack-bitmap: reject pseudo-merge "sampleRate" of 0 Taylor Blau
2026-05-12  0:47   ` [PATCH v3 8/9] Documentation: fix broken `sampleRate` in gitpacking(7) Taylor Blau
2026-05-12  0:47   ` [PATCH v3 9/9] pack-bitmap: prevent pattern leak on pseudo-merge re-assignment Taylor Blau
2026-05-12  1:38   ` [PATCH v3 0/9] pack-bitmap: fix various pseudo-merge bugs Junio C Hamano
2026-05-12  1:46     ` Taylor Blau
2026-05-12  1:49     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox