Git development
 help / color / mirror / Atom feed
* [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
@ 2026-05-27 23:18 Taylor Blau
  2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren

Note to the maintainer:

 * This series is based on 'ds/path-walk-filters' with Patrick's
   'ps/clang-w-glibc-2.43-and-_Generic' merged in. The former has since
   graduated. These are the three remaining patches from my earlier RFC
   after Stolee's series incorporated the filter-related pieces.

Here is a trimmed-down reroll of my series to make `--path-walk` work
with reachability bitmaps and delta-islands. This series was originally
an RFC that was a companion to Stolee's recent patches to extend
`--filter` support to `--path-walk` [1].

Since the previous round, Stolee's series has graduated and incorporated
the filter-related patches from my earlier RFC [2]. What remains are the
three patches here that implement support for reachability bitmaps and
delta-islands under `--path-walk`.

 * The first patch allows `--path-walk` to use reachability bitmaps when
   they can answer the request, falling back to path-walk enumeration
   when they cannot. It also lets bitmap writing see the same commit
   candidates that the regular traversal would have shown to the bitmap
   selector.

 * The second patch is preparatory, and factors the
   delta-islands-specific tree-depth recording from `show_object()` into
   a helper.

 * The final patch teaches the path-walk callback to perform the same
   delta-islands side effects as the regular traversal: propagating
   island marks for commits, and recording tree depths for trees. This
   gives `resolve_tree_islands()` the same input in either enumeration
   mode, so the existing island checks can be reused unchanged.

Thanks in advance for your review!

[1]: https://lore.kernel.org/git/pull.2101.git.1777731354.gitgitgadget@gmail.com/
[2]: https://lore.kernel.org/git/cover.1777853408.git.me@ttaylorr.com/

Taylor Blau (3):
  pack-objects: support reachability bitmaps with `--path-walk`
  pack-objects: extract `record_tree_depth()` helper
  pack-objects: support `--delta-islands` with `--path-walk`

 Documentation/git-pack-objects.adoc | 12 ++---
 builtin/pack-objects.c              | 68 +++++++++++++++++++++--------
 t/t5310-pack-bitmaps.sh             | 36 +++++++++++++++
 t/t5320-delta-islands.sh            | 29 ++++++++++++
 4 files changed, 122 insertions(+), 23 deletions(-)


base-commit: 45a9ecee26839cc880fdd5e704339dd3cf4ffc26
-- 
2.54.0.22.ga642305e3c9

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] pack-objects: support reachability bitmaps with `--path-walk`
  2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
@ 2026-05-27 23:18 ` Taylor Blau
  2026-05-27 23:18 ` [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper Taylor Blau
  2026-05-27 23:18 ` [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
  2 siblings, 0 replies; 4+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren

When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.

This behavior dates back to 70664d2865c (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:

    /*
     * We must disable the bitmaps because we are removing
     * the --objects / --objects-edge[-aggressive] options.
     */

In fb2c309b7d3 (pack-objects: pass --objects with --path-walk,
2026-05-02), path-walk learned to pass '--objects' again, but still
kept bitmap traversal disabled. That leaves two useful cases
unsupported:

 * A path-walk repack that writes bitmaps does not give the bitmap
   selector any commits, because path-walk reveals commits through
   `add_objects_by_path()` rather than through `show_commit()`, where
   `index_commit_for_bitmap()` is normally called.

 * An invocation like "git pack-objects --use-bitmap-index --path-walk"
   never tries an existing bitmap, even when one is available and could
   answer the request.

Fortunately for us, neither restriction is required.

 * On the writing side: teach the path-walk object callback to call
   `index_commit_for_bitmap()` for commits that it adds to the pack.
   That gives the bitmap selector the commit candidates it would have
   seen from the regular traversal.

 * For bitmap reading, keep passing '--objects' to the internal rev_list
   machinery, but stop clearing `use_bitmap_index`. If an existing
   bitmap can answer the request, use it; otherwise fall back to
   path-walk's own enumeration.

There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
bitmap traversal expects the usual non-boundary state. Work around this
by setting `revs->boundary` as late as possible within
`get_object_list_path_walk()`, after any bitmap attempt has either
succeeded or declined to answer the request.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-pack-objects.adoc |  6 +++--
 builtin/pack-objects.c              | 18 +++++++++++++--
 t/t5310-pack-bitmaps.sh             | 36 +++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 8a27aa19fd3..0adce8961a3 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,8 +402,10 @@ will be automatically changed to version `1`.
 	of filenames that cause collisions in Git's default name-hash
 	algorithm.
 +
-Incompatible with `--delta-islands`. The `--use-bitmap-index` option is
-ignored in the presence of `--path-walk`. The `--path-walk` option
+Incompatible with `--delta-islands`. When `--use-bitmap-index` is
+specified with `--path-walk`, a successful bitmap traversal is used for
+object enumeration, with path-walk remaining as the fallback traversal
+when the bitmap cannot satisfy the request. The `--path-walk` option
 supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
 `tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
 types can be combined with the `combine:<spec>+<spec>` form.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b783dc62bc9..e4dcb563b7d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4732,6 +4732,15 @@ static int add_objects_by_path(const char *path,
 			continue;
 
 		add_object_entry(oid, type, path, exclude);
+
+		if (type == OBJ_COMMIT && write_bitmap_index) {
+			struct commit *commit;
+
+			commit = lookup_commit(the_repository, oid);
+			if (!commit)
+				die(_("could not find commit %s"), oid_to_hex(oid));
+			index_commit_for_bitmap(commit);
+		}
 	}
 
 	oe_end = to_pack.nr_objects;
@@ -4764,6 +4773,13 @@ static int get_object_list_path_walk(struct rev_info *revs)
 	info.path_fn = add_objects_by_path;
 	info.path_fn_data = &processed;
 
+	/*
+	 * Path-walk needs boundary commits to discover thin-pack bases, but
+	 * bitmap traversal does not understand the boundary state. Set it
+	 * here so any prior bitmap attempt sees the usual non-boundary walk.
+	 */
+	revs->boundary = 1;
+
 	/*
 	 * Allow the --[no-]sparse option to be interesting here, if only
 	 * for testing purposes. Paths with no interesting objects will not
@@ -5195,9 +5211,7 @@ int cmd_pack_objects(int argc,
 		}
 	}
 	if (path_walk) {
-		strvec_push(&rp, "--boundary");
 		strvec_push(&rp, "--objects");
-		use_bitmap_index = 0;
 	} else if (thin) {
 		use_internal_rev_list = 1;
 		strvec_push(&rp, shallow
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..69c5da1580a 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -577,6 +577,42 @@ test_bitmap_cases
 
 sane_unset GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL
 
+test_expect_success 'path-walk repack can write and use bitmap indexes' '
+	test_when_finished "rm -rf path-walk-bitmap" &&
+	git init path-walk-bitmap &&
+	(
+		cd path-walk-bitmap &&
+		test_commit first &&
+		test_commit second &&
+		test_commit third &&
+
+		git repack -a -d -b --path-walk &&
+		git rev-list --test-bitmap --use-bitmap-index HEAD &&
+
+		git rev-parse HEAD >in &&
+
+		git rev-list --objects --no-object-names HEAD >expect.raw &&
+		sort expect.raw >expect &&
+
+		for reuse in true false
+		do
+			: >trace.txt &&
+
+			GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+			git -c pack.allowPackReuse=$reuse pack-objects \
+				--stdout --revs --path-walk --use-bitmap-index \
+				<in >out.pack &&
+			grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
+
+			git index-pack out.pack &&
+
+			list_packed_objects out.idx >actual.raw &&
+			sort actual.raw >actual &&
+			test_cmp expect actual || return 1
+		done
+	)
+'
+
 test_expect_success 'incremental repack fails when bitmaps are requested' '
 	test_commit more-1 &&
 	test_must_fail git repack -d 2>err &&
-- 
2.54.0.22.ga642305e3c9


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper
  2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
  2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
@ 2026-05-27 23:18 ` Taylor Blau
  2026-05-27 23:18 ` [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
  2 siblings, 0 replies; 4+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren

Prepare for a subsequent change that needs to record tree depths from a
second call site by factoring the delta-islands tree-depth bookkeeping
out of `show_object()` and into a helper, `record_tree_depth()`.

The helper looks up the object in `to_pack`, returns early when the
object was not added there, computes the depth from the slash count in
the supplied name, and preserves the existing max-depth-wins behavior
when a tree is reached by more than one path.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/pack-objects.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e4dcb563b7d..ec02e2b21d2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2722,6 +2722,22 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
 	pack->tree_depth[e - pack->objects] = tree_depth;
 }
 
+static void record_tree_depth(const struct object_id *oid, const char *name)
+{
+	const char *p;
+	unsigned depth;
+	struct object_entry *ent;
+
+	/* the empty string is a root tree, which is depth 0 */
+	depth = *name ? 1 : 0;
+	for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+		depth++;
+
+	ent = packlist_find(&to_pack, oid);
+	if (ent && depth > oe_tree_depth(&to_pack, ent))
+		oe_set_tree_depth(&to_pack, ent, depth);
+}
+
 /*
  * Return the size of the object without doing any delta
  * reconstruction (so non-deltas are true object sizes, but deltas
@@ -4375,20 +4391,8 @@ static void show_object(struct object *obj, const char *name,
 	add_preferred_base_object(name);
 	add_object_entry(&obj->oid, obj->type, name, 0);
 
-	if (use_delta_islands) {
-		const char *p;
-		unsigned depth;
-		struct object_entry *ent;
-
-		/* the empty string is a root tree, which is depth 0 */
-		depth = *name ? 1 : 0;
-		for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
-			depth++;
-
-		ent = packlist_find(&to_pack, &obj->oid);
-		if (ent && depth > oe_tree_depth(&to_pack, ent))
-			oe_set_tree_depth(&to_pack, ent, depth);
-	}
+	if (use_delta_islands)
+		record_tree_depth(&obj->oid, name);
 }
 
 static void show_object__ma_allow_any(struct object *obj, const char *name, void *data)
-- 
2.54.0.22.ga642305e3c9


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk`
  2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
  2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
  2026-05-27 23:18 ` [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper Taylor Blau
@ 2026-05-27 23:18 ` Taylor Blau
  2 siblings, 0 replies; 4+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren

Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.

When discussing those original patches on the list, a message from
Stolee in [1] noted the following:

    this could be remedied by [...] doing a separate walk to identify
    islands using the normal method

In a related portion of the thread, Peff explains[2]:

    The delta islands code already does its own tree walk to propagate
    the bits down (it does rely on the base walk's show_commit() to
    propagate through the commits).

    Once each object has its island bitmaps, I think however you
    choose to come up with delta candidates [...] you should be able
    to use it. It's fundamentally just answering the question of "am
    I allowed to delta between these two objects".

That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.

Recall how delta-islands are computed during a normal repack:

 - `show_commit()` calls `propagate_island_marks()` for each commit,
   which merges the commit's island bitset onto its root tree object and
   onto each of its parent commits.

 - `show_object()` for a tree records the tree's depth derived from the
   slash-separated pathname. Subsequent `resolve_tree_islands()` uses
   that depth to walk trees in increasing-depth order, propagating each
   tree's marks to its children.

 - At delta-search time, `in_same_island()` enforces that a delta
   target's island bitmap is a subset of its base's: every island that
   reaches the target must also reach the base.

Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:

 - For each commit batch, call `propagate_island_marks()` on commits,
   exactly as `show_commit()` does.

   We have to be careful about the order in which we call this function,
   and we must see a commit before its parents in order to have
   island marks to propagate.

   The path-walk batch preserves that order. Path-walk appends commits
   to its `OBJ_COMMIT` batch as they come back from the same
   `get_revision()` loop the regular traversal uses, and
   `add_objects_by_path()` iterates the batch in array order. So every
   commit reaches `propagate_island_marks()` in the same sequence that
   `show_commit()` would have seen it, and the descendant-first chain
   that the algorithm relies on is intact.

   Skip island propagation for excluded commits to match the regular
   traversal, whose `show_commit()` callback is only invoked for
   interesting commits. Boundary commits may still be present in
   path-walk's callback so they can serve as thin-pack bases, but they
   should not contribute island marks.

 - For each tree batch, record the tree's depth from the path. Use the
   `record_tree_depth()` helper from the previous commit so both
   callbacks behave identically, including the max-depth-wins behavior
   when a tree is reached via more than one path. The helper accepts
   both the `show_object()` path shape ("foo", "foo/bar") and the
   path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
   recorded from either traversal mode are directly comparable.

   This is implicit in the implementation sketch from Peff above.
   `resolve_tree_islands()` sorts trees by `oe->tree_depth` in
   increasing-depth order before propagating marks down, so that a
   parent tree's marks are finalized before its children inherit them.
   Without recording the depth at path-walk time, every
   path-walk-discovered tree would land at depth 0 in `to_pack`, the
   sort would lose its ordering, and children could inherit marks from
   parents whose own contributions had not yet been merged in.

With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.

Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.

[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-pack-objects.adoc | 14 +++++++-------
 builtin/pack-objects.c              | 22 ++++++++++++++++++----
 t/t5320-delta-islands.sh            | 29 +++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 0adce8961a3..65cd00c152f 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,13 +402,13 @@ will be automatically changed to version `1`.
 	of filenames that cause collisions in Git's default name-hash
 	algorithm.
 +
-Incompatible with `--delta-islands`. When `--use-bitmap-index` is
-specified with `--path-walk`, a successful bitmap traversal is used for
-object enumeration, with path-walk remaining as the fallback traversal
-when the bitmap cannot satisfy the request. The `--path-walk` option
-supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
-types can be combined with the `combine:<spec>+<spec>` form.
+When `--use-bitmap-index` is specified with `--path-walk`, a successful
+bitmap traversal is used for object enumeration, with path-walk
+remaining as the fallback traversal when the bitmap cannot satisfy the
+request. The `--path-walk` option supports the `--filter=<spec>` forms
+`blob:none`, `blob:limit=<n>`, `tree:0`, `object:type=<type>`, and
+`sparse:<oid>`. These supported filter types can be combined with the
+`combine:<spec>+<spec>` form.
 
 
 DELTA ISLANDS
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec02e2b21d2..f48ea7a888b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4737,13 +4737,29 @@ static int add_objects_by_path(const char *path,
 
 		add_object_entry(oid, type, path, exclude);
 
-		if (type == OBJ_COMMIT && write_bitmap_index) {
+		if (type == OBJ_COMMIT) {
 			struct commit *commit;
 
+			if (!write_bitmap_index && !use_delta_islands)
+				continue;
+
 			commit = lookup_commit(the_repository, oid);
 			if (!commit)
 				die(_("could not find commit %s"), oid_to_hex(oid));
-			index_commit_for_bitmap(commit);
+			if (write_bitmap_index)
+				index_commit_for_bitmap(commit);
+			/*
+			 * Skip island propagation for boundary commits.
+			 * The regular traversal's show_commit() is only
+			 * called for interesting commits; matching that
+			 * here keeps path-walk from doing extra work that
+			 * would only be a no-op anyway (boundary commits
+			 * are not in island_marks).
+			 */
+			if (use_delta_islands && !exclude)
+				propagate_island_marks(the_repository, commit);
+		} else if (type == OBJ_TREE && use_delta_islands) {
+			record_tree_depth(oid, path);
 		}
 	}
 
@@ -5205,8 +5221,6 @@ int cmd_pack_objects(int argc,
 		const char *option = NULL;
 		if (!path_walk_filter_compatible(&filter_options))
 			option = "--filter";
-		else if (use_delta_islands)
-			option = "--delta-islands";
 
 		if (option) {
 			warning(_("cannot use %s with %s"),
diff --git a/t/t5320-delta-islands.sh b/t/t5320-delta-islands.sh
index 2c961c70963..9b28344a0a3 100755
--- a/t/t5320-delta-islands.sh
+++ b/t/t5320-delta-islands.sh
@@ -53,6 +53,35 @@ test_expect_success 'separate islands disallows delta' '
 	! is_delta_base $two $one
 '
 
+test_expect_success 'path-walk island repack respects islands' '
+	GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-islands" \
+		git -c "pack.island=refs/heads/(.*)" repack -adfi \
+		--path-walk 2>err &&
+	test_region pack-objects path-walk trace.path-walk-islands &&
+	test_grep ! "cannot use --delta-islands with --path-walk" err &&
+	! is_delta_base $one $two &&
+	! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk island bitmap repack respects islands' '
+	GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-island-bitmap" \
+		git -c "pack.island=refs/heads/(.*)" repack -a -d -f -i -b \
+		--path-walk 2>err &&
+	test_region pack-objects path-walk trace.path-walk-island-bitmap &&
+	test_path_is_file .git/objects/pack/*.bitmap &&
+	git rev-list --test-bitmap --use-bitmap-index one &&
+	test_grep ! "cannot use --delta-islands with --path-walk" err &&
+	! is_delta_base $one $two &&
+	! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk same island allows delta' '
+	GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-same-island" \
+		git -c "pack.island=refs/heads" repack -adfi --path-walk &&
+	test_region pack-objects path-walk trace.path-walk-same-island &&
+	is_delta_base $one $two
+'
+
 test_expect_success 'same island allows delta' '
 	git -c "pack.island=refs/heads" repack -adfi &&
 	is_delta_base $one $two
-- 
2.54.0.22.ga642305e3c9

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-05-27 23:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
2026-05-27 23:18 ` [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper Taylor Blau
2026-05-27 23:18 ` [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox