* [PATCH 1/3] pack-objects: support reachability bitmaps with `--path-walk`
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
@ 2026-05-27 23:18 ` Taylor Blau
2026-05-27 23:18 ` [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper Taylor Blau
` (3 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.
This behavior dates back to 70664d2865c (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:
/*
* We must disable the bitmaps because we are removing
* the --objects / --objects-edge[-aggressive] options.
*/
In fb2c309b7d3 (pack-objects: pass --objects with --path-walk,
2026-05-02), path-walk learned to pass '--objects' again, but still
kept bitmap traversal disabled. That leaves two useful cases
unsupported:
* A path-walk repack that writes bitmaps does not give the bitmap
selector any commits, because path-walk reveals commits through
`add_objects_by_path()` rather than through `show_commit()`, where
`index_commit_for_bitmap()` is normally called.
* An invocation like "git pack-objects --use-bitmap-index --path-walk"
never tries an existing bitmap, even when one is available and could
answer the request.
Fortunately for us, neither restriction is required.
* On the writing side: teach the path-walk object callback to call
`index_commit_for_bitmap()` for commits that it adds to the pack.
That gives the bitmap selector the commit candidates it would have
seen from the regular traversal.
* For bitmap reading, keep passing '--objects' to the internal rev_list
machinery, but stop clearing `use_bitmap_index`. If an existing
bitmap can answer the request, use it; otherwise fall back to
path-walk's own enumeration.
There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
bitmap traversal expects the usual non-boundary state. Work around this
by setting `revs->boundary` as late as possible within
`get_object_list_path_walk()`, after any bitmap attempt has either
succeeded or declined to answer the request.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 6 +++--
builtin/pack-objects.c | 18 +++++++++++++--
t/t5310-pack-bitmaps.sh | 36 +++++++++++++++++++++++++++++
3 files changed, 56 insertions(+), 4 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 8a27aa19fd3..0adce8961a3 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,8 +402,10 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. The `--use-bitmap-index` option is
-ignored in the presence of `--path-walk`. The `--path-walk` option
+Incompatible with `--delta-islands`. When `--use-bitmap-index` is
+specified with `--path-walk`, a successful bitmap traversal is used for
+object enumeration, with path-walk remaining as the fallback traversal
+when the bitmap cannot satisfy the request. The `--path-walk` option
supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
types can be combined with the `combine:<spec>+<spec>` form.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b783dc62bc9..e4dcb563b7d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4732,6 +4732,15 @@ static int add_objects_by_path(const char *path,
continue;
add_object_entry(oid, type, path, exclude);
+
+ if (type == OBJ_COMMIT && write_bitmap_index) {
+ struct commit *commit;
+
+ commit = lookup_commit(the_repository, oid);
+ if (!commit)
+ die(_("could not find commit %s"), oid_to_hex(oid));
+ index_commit_for_bitmap(commit);
+ }
}
oe_end = to_pack.nr_objects;
@@ -4764,6 +4773,13 @@ static int get_object_list_path_walk(struct rev_info *revs)
info.path_fn = add_objects_by_path;
info.path_fn_data = &processed;
+ /*
+ * Path-walk needs boundary commits to discover thin-pack bases, but
+ * bitmap traversal does not understand the boundary state. Set it
+ * here so any prior bitmap attempt sees the usual non-boundary walk.
+ */
+ revs->boundary = 1;
+
/*
* Allow the --[no-]sparse option to be interesting here, if only
* for testing purposes. Paths with no interesting objects will not
@@ -5195,9 +5211,7 @@ int cmd_pack_objects(int argc,
}
}
if (path_walk) {
- strvec_push(&rp, "--boundary");
strvec_push(&rp, "--objects");
- use_bitmap_index = 0;
} else if (thin) {
use_internal_rev_list = 1;
strvec_push(&rp, shallow
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..69c5da1580a 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -577,6 +577,42 @@ test_bitmap_cases
sane_unset GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL
+test_expect_success 'path-walk repack can write and use bitmap indexes' '
+ test_when_finished "rm -rf path-walk-bitmap" &&
+ git init path-walk-bitmap &&
+ (
+ cd path-walk-bitmap &&
+ test_commit first &&
+ test_commit second &&
+ test_commit third &&
+
+ git repack -a -d -b --path-walk &&
+ git rev-list --test-bitmap --use-bitmap-index HEAD &&
+
+ git rev-parse HEAD >in &&
+
+ git rev-list --objects --no-object-names HEAD >expect.raw &&
+ sort expect.raw >expect &&
+
+ for reuse in true false
+ do
+ : >trace.txt &&
+
+ GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+ git -c pack.allowPackReuse=$reuse pack-objects \
+ --stdout --revs --path-walk --use-bitmap-index \
+ <in >out.pack &&
+ grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
+
+ git index-pack out.pack &&
+
+ list_packed_objects out.idx >actual.raw &&
+ sort actual.raw >actual &&
+ test_cmp expect actual || return 1
+ done
+ )
+'
+
test_expect_success 'incremental repack fails when bitmaps are requested' '
test_commit more-1 &&
test_must_fail git repack -d 2>err &&
--
2.54.0.22.ga642305e3c9
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
@ 2026-05-27 23:18 ` Taylor Blau
2026-05-27 23:18 ` [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
Prepare for a subsequent change that needs to record tree depths from a
second call site by factoring the delta-islands tree-depth bookkeeping
out of `show_object()` and into a helper, `record_tree_depth()`.
The helper looks up the object in `to_pack`, returns early when the
object was not added there, computes the depth from the slash count in
the supplied name, and preserves the existing max-depth-wins behavior
when a tree is reached by more than one path.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e4dcb563b7d..ec02e2b21d2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2722,6 +2722,22 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
pack->tree_depth[e - pack->objects] = tree_depth;
}
+static void record_tree_depth(const struct object_id *oid, const char *name)
+{
+ const char *p;
+ unsigned depth;
+ struct object_entry *ent;
+
+ /* the empty string is a root tree, which is depth 0 */
+ depth = *name ? 1 : 0;
+ for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+ depth++;
+
+ ent = packlist_find(&to_pack, oid);
+ if (ent && depth > oe_tree_depth(&to_pack, ent))
+ oe_set_tree_depth(&to_pack, ent, depth);
+}
+
/*
* Return the size of the object without doing any delta
* reconstruction (so non-deltas are true object sizes, but deltas
@@ -4375,20 +4391,8 @@ static void show_object(struct object *obj, const char *name,
add_preferred_base_object(name);
add_object_entry(&obj->oid, obj->type, name, 0);
- if (use_delta_islands) {
- const char *p;
- unsigned depth;
- struct object_entry *ent;
-
- /* the empty string is a root tree, which is depth 0 */
- depth = *name ? 1 : 0;
- for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
- depth++;
-
- ent = packlist_find(&to_pack, &obj->oid);
- if (ent && depth > oe_tree_depth(&to_pack, ent))
- oe_set_tree_depth(&to_pack, ent, depth);
- }
+ if (use_delta_islands)
+ record_tree_depth(&obj->oid, name);
}
static void show_object__ma_allow_any(struct object *obj, const char *name, void *data)
--
2.54.0.22.ga642305e3c9
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk`
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
2026-05-27 23:18 ` [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper Taylor Blau
@ 2026-05-27 23:18 ` Taylor Blau
2026-05-28 15:28 ` [PATCH 0/3] pack-objects: support bitmaps and delta-islands " Derrick Stolee
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
4 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-05-27 23:18 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.
When discussing those original patches on the list, a message from
Stolee in [1] noted the following:
this could be remedied by [...] doing a separate walk to identify
islands using the normal method
In a related portion of the thread, Peff explains[2]:
The delta islands code already does its own tree walk to propagate
the bits down (it does rely on the base walk's show_commit() to
propagate through the commits).
Once each object has its island bitmaps, I think however you
choose to come up with delta candidates [...] you should be able
to use it. It's fundamentally just answering the question of "am
I allowed to delta between these two objects".
That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.
Recall how delta-islands are computed during a normal repack:
- `show_commit()` calls `propagate_island_marks()` for each commit,
which merges the commit's island bitset onto its root tree object and
onto each of its parent commits.
- `show_object()` for a tree records the tree's depth derived from the
slash-separated pathname. Subsequent `resolve_tree_islands()` uses
that depth to walk trees in increasing-depth order, propagating each
tree's marks to its children.
- At delta-search time, `in_same_island()` enforces that a delta
target's island bitmap is a subset of its base's: every island that
reaches the target must also reach the base.
Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:
- For each commit batch, call `propagate_island_marks()` on commits,
exactly as `show_commit()` does.
We have to be careful about the order in which we call this function,
and we must see a commit before its parents in order to have
island marks to propagate.
The path-walk batch preserves that order. Path-walk appends commits
to its `OBJ_COMMIT` batch as they come back from the same
`get_revision()` loop the regular traversal uses, and
`add_objects_by_path()` iterates the batch in array order. So every
commit reaches `propagate_island_marks()` in the same sequence that
`show_commit()` would have seen it, and the descendant-first chain
that the algorithm relies on is intact.
Skip island propagation for excluded commits to match the regular
traversal, whose `show_commit()` callback is only invoked for
interesting commits. Boundary commits may still be present in
path-walk's callback so they can serve as thin-pack bases, but they
should not contribute island marks.
- For each tree batch, record the tree's depth from the path. Use the
`record_tree_depth()` helper from the previous commit so both
callbacks behave identically, including the max-depth-wins behavior
when a tree is reached via more than one path. The helper accepts
both the `show_object()` path shape ("foo", "foo/bar") and the
path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
recorded from either traversal mode are directly comparable.
This is implicit in the implementation sketch from Peff above.
`resolve_tree_islands()` sorts trees by `oe->tree_depth` in
increasing-depth order before propagating marks down, so that a
parent tree's marks are finalized before its children inherit them.
Without recording the depth at path-walk time, every
path-walk-discovered tree would land at depth 0 in `to_pack`, the
sort would lose its ordering, and children could inherit marks from
parents whose own contributions had not yet been merged in.
With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.
Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.
[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 14 +++++++-------
builtin/pack-objects.c | 22 ++++++++++++++++++----
t/t5320-delta-islands.sh | 29 +++++++++++++++++++++++++++++
3 files changed, 54 insertions(+), 11 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 0adce8961a3..65cd00c152f 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,13 +402,13 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. When `--use-bitmap-index` is
-specified with `--path-walk`, a successful bitmap traversal is used for
-object enumeration, with path-walk remaining as the fallback traversal
-when the bitmap cannot satisfy the request. The `--path-walk` option
-supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
-types can be combined with the `combine:<spec>+<spec>` form.
+When `--use-bitmap-index` is specified with `--path-walk`, a successful
+bitmap traversal is used for object enumeration, with path-walk
+remaining as the fallback traversal when the bitmap cannot satisfy the
+request. The `--path-walk` option supports the `--filter=<spec>` forms
+`blob:none`, `blob:limit=<n>`, `tree:0`, `object:type=<type>`, and
+`sparse:<oid>`. These supported filter types can be combined with the
+`combine:<spec>+<spec>` form.
DELTA ISLANDS
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec02e2b21d2..f48ea7a888b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4737,13 +4737,29 @@ static int add_objects_by_path(const char *path,
add_object_entry(oid, type, path, exclude);
- if (type == OBJ_COMMIT && write_bitmap_index) {
+ if (type == OBJ_COMMIT) {
struct commit *commit;
+ if (!write_bitmap_index && !use_delta_islands)
+ continue;
+
commit = lookup_commit(the_repository, oid);
if (!commit)
die(_("could not find commit %s"), oid_to_hex(oid));
- index_commit_for_bitmap(commit);
+ if (write_bitmap_index)
+ index_commit_for_bitmap(commit);
+ /*
+ * Skip island propagation for boundary commits.
+ * The regular traversal's show_commit() is only
+ * called for interesting commits; matching that
+ * here keeps path-walk from doing extra work that
+ * would only be a no-op anyway (boundary commits
+ * are not in island_marks).
+ */
+ if (use_delta_islands && !exclude)
+ propagate_island_marks(the_repository, commit);
+ } else if (type == OBJ_TREE && use_delta_islands) {
+ record_tree_depth(oid, path);
}
}
@@ -5205,8 +5221,6 @@ int cmd_pack_objects(int argc,
const char *option = NULL;
if (!path_walk_filter_compatible(&filter_options))
option = "--filter";
- else if (use_delta_islands)
- option = "--delta-islands";
if (option) {
warning(_("cannot use %s with %s"),
diff --git a/t/t5320-delta-islands.sh b/t/t5320-delta-islands.sh
index 2c961c70963..9b28344a0a3 100755
--- a/t/t5320-delta-islands.sh
+++ b/t/t5320-delta-islands.sh
@@ -53,6 +53,35 @@ test_expect_success 'separate islands disallows delta' '
! is_delta_base $two $one
'
+test_expect_success 'path-walk island repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-islands" \
+ git -c "pack.island=refs/heads/(.*)" repack -adfi \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-islands &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk island bitmap repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-island-bitmap" \
+ git -c "pack.island=refs/heads/(.*)" repack -a -d -f -i -b \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-island-bitmap &&
+ test_path_is_file .git/objects/pack/*.bitmap &&
+ git rev-list --test-bitmap --use-bitmap-index one &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk same island allows delta' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-same-island" \
+ git -c "pack.island=refs/heads" repack -adfi --path-walk &&
+ test_region pack-objects path-walk trace.path-walk-same-island &&
+ is_delta_base $one $two
+'
+
test_expect_success 'same island allows delta' '
git -c "pack.island=refs/heads" repack -adfi &&
is_delta_base $one $two
--
2.54.0.22.ga642305e3c9
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
` (2 preceding siblings ...)
2026-05-27 23:18 ` [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
@ 2026-05-28 15:28 ` Derrick Stolee
2026-05-29 17:26 ` Derrick Stolee
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
4 siblings, 1 reply; 14+ messages in thread
From: Derrick Stolee @ 2026-05-28 15:28 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Jeff King, Elijah Newren
On 5/27/26 7:18 PM, Taylor Blau wrote:
> Here is a trimmed-down reroll of my series to make `--path-walk` work
> with reachability bitmaps and delta-islands. This series was originally
> an RFC that was a companion to Stolee's recent patches to extend
> `--filter` support to `--path-walk` [1].
>
> Since the previous round, Stolee's series has graduated and incorporated
> the filter-related patches from my earlier RFC [2]. What remains are the
> three patches here that implement support for reachability bitmaps and
> delta-islands under `--path-walk`.
>
> * The first patch allows `--path-walk` to use reachability bitmaps when
> they can answer the request, falling back to path-walk enumeration
> when they cannot. It also lets bitmap writing see the same commit
> candidates that the regular traversal would have shown to the bitmap
> selector.
>
> * The second patch is preparatory, and factors the
> delta-islands-specific tree-depth recording from `show_object()` into
> a helper.
>
> * The final patch teaches the path-walk callback to perform the same
> delta-islands side effects as the regular traversal: propagating
> island marks for commits, and recording tree depths for trees. This
> gives `resolve_tree_islands()` the same input in either enumeration
> mode, so the existing island checks can be reused unchanged.
I've applied these patches locally and confirmed that each one passes the
test suite with GIT_TEST_PACK_PATH_WALK=1, which helps to confirm that
the changes are correct (all existing bitmap tests create and use the
bitmaps with --path-walk unless explicitly disabled).
Should we add GIT_TEST_PACK_PATH_WALK=1 to the test-var CI build, now
that this is going to be more commonly used?
Do you have any end-to-end performance data to demonstrate that these
changes are effective at scale? Are we still producing packfiles with the
pack-file compression and now with .bitmap files? How does this impact
the performance of a clone or fetch when using a bitmap index at read
time?
With that in mind, should we update any t/perf/ test to cover some of
these scenarios? I'm running a few with GIT_TEST_PACK_PATH_WALK=1 on
my laptop as a test, but it's taking a while. If you have stats ready
from your local testing, then that would be interesting.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
2026-05-28 15:28 ` [PATCH 0/3] pack-objects: support bitmaps and delta-islands " Derrick Stolee
@ 2026-05-29 17:26 ` Derrick Stolee
2026-05-29 20:07 ` Taylor Blau
0 siblings, 1 reply; 14+ messages in thread
From: Derrick Stolee @ 2026-05-29 17:26 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Jeff King, Elijah Newren
On 5/28/26 11:28 AM, Derrick Stolee wrote:
> On 5/27/26 7:18 PM, Taylor Blau wrote:
> Do you have any end-to-end performance data to demonstrate that these
> changes are effective at scale? Are we still producing packfiles with the
> pack-file compression and now with .bitmap files? How does this impact
> the performance of a clone or fetch when using a bitmap index at read
> time?
Here's my attempt to use our existing performance tests to analyze the
impact of this series.
Running p5311 against the base of this topic and this topic with
GIT_TEST_PACK_PATH_WALK=1, I get this output:
Test HEAD~3 HEAD
-----------------------------------------------------------------
5311.4: server (1 days) (lookup=true) 0.02 0.03 +50.0%
5311.5: size (1 days) 6.8K 124.9K +1730.9%
5311.6: client (1 days) (lookup=true) 0.02 0.01 -50.0%
5311.8: server (2 days) (lookup=true) 0.02 0.03 +50.0%
5311.9: size (2 days) 6.8K 124.9K +1730.9%
5311.10: client (2 days) (lookup=true) 0.02 0.01 -50.0%
5311.12: server (4 days) (lookup=true) 0.02 0.03 +50.0%
5311.13: size (4 days) 6.8K 124.9K +1730.9%
5311.14: client (4 days) (lookup=true) 0.02 0.01 -50.0%
5311.16: server (8 days) (lookup=true) 0.03 0.03 +0.0%
5311.17: size (8 days) 37.3K 186.0K +398.2%
5311.18: client (8 days) (lookup=true) 0.03 0.02 -33.3%
5311.20: server (16 days) (lookup=true) 0.02 0.03 +50.0%
5311.21: size (16 days) 37.3K 186.0K +398.2%
5311.22: client (16 days) (lookup=true) 0.03 0.02 -33.3%
5311.24: server (32 days) (lookup=true) 0.03 0.03 +0.0%
5311.25: size (32 days) 46.5K 197.2K +324.3%
5311.26: client (32 days) (lookup=true) 0.03 0.02 -33.3%
5311.28: server (64 days) (lookup=true) 0.24 0.16 -33.3%
5311.29: size (64 days) 1.5M 5.1M +239.8%
5311.30: client (64 days) (lookup=true) 0.42 0.35 -16.7%
5311.32: server (128 days) (lookup=true) 0.49 0.29 -40.8%
5311.33: size (128 days) 4.1M 9.8M +139.5%
5311.34: client (128 days) (lookup=true) 0.86 0.65 -24.4%
5311.38: server (1 days) (lookup=false) 0.02 0.03 +50.0%
5311.39: size (1 days) 6.8K 124.9K +1730.9%
5311.40: client (1 days) (lookup=false) 0.02 0.02 +0.0%
5311.42: server (2 days) (lookup=false) 0.02 0.03 +50.0%
5311.43: size (2 days) 6.8K 124.9K +1730.9%
5311.44: client (2 days) (lookup=false) 0.02 0.02 +0.0%
5311.46: server (4 days) (lookup=false) 0.02 0.03 +50.0%
5311.47: size (4 days) 6.8K 124.9K +1730.9%
5311.48: client (4 days) (lookup=false) 0.02 0.02 +0.0%
5311.50: server (8 days) (lookup=false) 0.02 0.03 +50.0%
5311.51: size (8 days) 37.3K 186.0K +398.2%
5311.52: client (8 days) (lookup=false) 0.03 0.02 -33.3%
5311.54: server (16 days) (lookup=false) 0.02 0.03 +50.0%
5311.55: size (16 days) 37.3K 186.0K +398.2%
5311.56: client (16 days) (lookup=false) 0.03 0.02 -33.3%
5311.58: server (32 days) (lookup=false) 0.03 0.03 +0.0%
5311.59: size (32 days) 46.5K 197.2K +324.3%
5311.60: client (32 days) (lookup=false) 0.03 0.02 -33.3%
5311.62: server (64 days) (lookup=false) 0.25 0.17 -32.0%
5311.63: size (64 days) 1.5M 5.1M +239.8%
5311.64: client (64 days) (lookup=false) 0.43 0.37 -14.0%
5311.66: server (128 days) (lookup=false) 0.50 0.29 -42.0%
5311.67: size (128 days) 4.1M 9.8M +138.6%
5311.68: client (128 days) (lookup=false) 0.87 0.67 -23.0%
It's important to realize that even with the test variable, the
path-walk logic is overriding the bitmap logic in the HEAD~3
case.
What's happening is that the path-walk mode (without bitmaps)
is computing a smaller packfile for all of these cases. Some
are significantly smaller, but only when it's a very small
pack anyway. The bitmap case is faster only for larger fetches.
I did the same test without the path-walk feature and both columns
looked the same (as expected, no change due to this series) and
the data matched the path-walk test's HEAD column pretty closely.
So this shows that adding path-walk to bitmap-focused efforts is
not a regression on any of these dimensions.
This test was for my local copy of the Git repository, including
all the forks I fetch. I hoped the results would be different
for repositories that have data shapes that struggle with
name-hash collisions, but microsoft/fluentui is an example that
I've used for path-walk repacks before and it had similar data.
Do you have a good feeling for why the path-walk feature doesn't
make a huge change in these test scenarios?
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
2026-05-29 17:26 ` Derrick Stolee
@ 2026-05-29 20:07 ` Taylor Blau
2026-05-29 21:28 ` Derrick Stolee
0 siblings, 1 reply; 14+ messages in thread
From: Taylor Blau @ 2026-05-29 20:07 UTC (permalink / raw)
To: Derrick Stolee; +Cc: git, Junio C Hamano, Jeff King, Elijah Newren
On Fri, May 29, 2026 at 01:26:33PM -0400, Derrick Stolee wrote:
> On 5/28/26 11:28 AM, Derrick Stolee wrote:
> > On 5/27/26 7:18 PM, Taylor Blau wrote:
>
> > Do you have any end-to-end performance data to demonstrate that these
> > changes are effective at scale? Are we still producing packfiles with the
> > pack-file compression and now with .bitmap files? How does this impact
> > the performance of a clone or fetch when using a bitmap index at read
> > time?
>
> Here's my attempt to use our existing performance tests to analyze the
> impact of this series.
>
> Running p5311 against the base of this topic and this topic with
> GIT_TEST_PACK_PATH_WALK=1, I get this output:
Yikes. That's not great, but see below for what I think is going on.
(As an aside, we can focus in on either lookup=true or lookup=false,
since these are just controlling whether or not the bitmap lookup table
is written. On a repository as small as git.git, this shouldn't make a
huge difference either way. I have a separate series to make this the
default and to clean up the t/perf suite accordingly, but haven't sent
it to the list yet.)
> Test HEAD~3 HEAD
> -----------------------------------------------------------------
> [...]
I think this is either the primary reason why you're not seeing an
improvement here, or at least related to it...
> Do you have a good feeling for why the path-walk feature doesn't
> make a huge change in these test scenarios?
I think the problem is that we're relying on the TEST_ variable to tell
pack-objects to generate a pack using --path-walk, but treat it as a
fallback.
I suspect that since p5311 invokes repack *without* the '--path-walk'
option, we end up in this case within cmd_pack_objects():
if (path_walk < 0) {
if (use_bitmap_index > 0 ||
!use_internal_rev_list)
path_walk = 0;
else if (the_repository->gitdir &&
the_repository->settings.pack_use_path_walk)
path_walk = 1;
else
path_walk = git_env_bool("GIT_TEST_PACK_PATH_WALK", 0);
}
, where `path_walk` is *not* set, but `use_bitmap_index` is since p5311
set the 'pack.writeBitmaps' configuration (as a separate aside, this
should really prefer the non-deprecated 'repack.writeBitmaps' variant).
So in that case, we fall back to `path_walk = 0`, and don't even bother
reading the `GIT_TEST_PACK_PATH_WALK` variable.
If I modify the perf test like so:
--- 8< ---
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 047efb995d6..1c9c99216e3 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -13,7 +13,7 @@ test_fetch_bitmaps () {
test_expect_success 'create bitmapped server repo' '
git config pack.writebitmaps true &&
git config pack.writeBitmapLookupTable '"$1"' &&
- git repack -ad
+ git repack -ad --path-walk
'
# simulate a fetch from a repository that last fetched N days ago, for
--- >8 ---
, then I can get significantly improved results when running without the
GIT_TEST_PACK_PATH_WALK variablle (here I'm truncating the
'lookup=false' case, which performs nearly identically):
Test HEAD~3 HEAD
------------------------------------------------------------------------------------
5311.4: server (1 days) (lookup=true) 2.57(2.52+0.04) 0.03(0.02+0.00) -98.8%
5311.5: size (1 days) 153.4K 153.4K +0.0%
5311.6: client (1 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) =
5311.8: server (2 days) (lookup=true) 2.60(2.55+0.04) 0.02(0.02+0.00) -99.2%
5311.9: size (2 days) 153.4K 153.4K +0.0%
5311.10: client (2 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) =
5311.12: server (4 days) (lookup=true) 2.60(2.54+0.05) 0.03(0.03+0.00) -98.8%
5311.13: size (4 days) 209.0K 209.0K +0.0%
5311.14: client (4 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0%
5311.16: server (8 days) (lookup=true) 2.58(2.53+0.04) 0.03(0.03+0.00) -98.8%
5311.17: size (8 days) 209.0K 209.0K +0.0%
5311.18: client (8 days) (lookup=true) 0.01(0.01+0.00) 0.01(0.02+0.00) +0.0%
5311.20: server (16 days) (lookup=true) 2.58(2.52+0.05) 0.03(0.03+0.00) -98.8%
5311.21: size (16 days) 209.0K 209.0K +0.0%
5311.22: client (16 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0%
5311.24: server (32 days) (lookup=true) 2.61(2.58+0.03) 0.03(0.02+0.01) -98.9%
5311.25: size (32 days) 212.9K 212.9K +0.0%
5311.26: client (32 days) (lookup=true) 0.01(0.02+0.00) 0.02(0.02+0.00) +100.0%
5311.28: server (64 days) (lookup=true) 2.72(2.79+0.06) 0.19(0.30+0.03) -93.0%
5311.29: size (64 days) 4.5M 4.5M -0.0%
5311.30: client (64 days) (lookup=true) 0.49(0.58+0.02) 0.48(0.56+0.04) -2.0%
5311.32: server (128 days) (lookup=true) 2.90(3.21+0.09) 0.35(0.70+0.04) -87.9%
5311.33: size (128 days) 9.4M 9.5M +0.4%
5311.34: client (128 days) (lookup=true) 0.98(1.27+0.08) 0.98(1.33+0.06) +0.0%
My reading here is that we get significantly smaller packs (i.e. all
'test_size' tests drop from HEAD~3 to HEAD) in the same amount of time
(i.e. that all 'test_perf' tests are roughly flat).
That lines up with my expectation here, which is that even though we're
using bitmaps at read time, that's effectively seeding the verbatim pack
reuse over a significantly smaller pack, producing a much smaller output
pack as a result.
As to whether we should modify the perf suite to test this, naturally I
think we should. Likely that looks like modifying p5313 to re-run
`test_all_with_args` with '--use-bitmap-index' after repacking with
--path-walk and generating bitmaps like so (untested):
--- 8< ---
diff --git a/t/perf/p5313-pack-objects.sh b/t/perf/p5313-pack-objects.sh
index 46a6cd32d24..663717982b1 100755
--- a/t/perf/p5313-pack-objects.sh
+++ b/t/perf/p5313-pack-objects.sh
@@ -22,6 +22,21 @@ test_expect_success 'create rev input' '
EOF
'
+test_repack_with_args () {
+ args="$@"
+ export args
+
+ test_perf "repack with $args" '
+ git repack -adf $args
+ '
+
+ test_size "repack size with $args" '
+ gitdir=$(git rev-parse --git-dir) &&
+ pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
+ test_file_size "$pack"
+ '
+}
+
test_all_with_args () {
parameter=$1
export parameter
@@ -52,23 +67,22 @@ test_all_with_args () {
test_size "shallow pack size with $parameter" '
test_file_size out
'
-
- test_perf "repack with $parameter" '
- git repack -adf $parameter
- '
-
- test_size "repack size with $parameter" '
- gitdir=$(git rev-parse --git-dir) &&
- pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
- test_file_size "$pack"
- '
}
for version in 1 2
do
- test_all_with_args --name-hash-version=$version
+ arg="--name-hash-version=$version" &&
+
+ test_all_with_args "$arg" &&
+ test_repack_with_args "$arg" || return 1
done
test_all_with_args --path-walk
+test_repack_with_args --path-walk
+
+# inverted order here: we want to test using reachability bitmaps on a
+# pack written with --path-walk
+test_repack_with_args --path-walk --write-bitmap-index
+test_all_with_args --use-bitmap-index
test_done
--- >8 ---
I don't have a strong opinion on whether or not we should include that
in this series or elsewhere.
Thanks,
Taylor
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
2026-05-29 20:07 ` Taylor Blau
@ 2026-05-29 21:28 ` Derrick Stolee
2026-05-29 22:20 ` Taylor Blau
0 siblings, 1 reply; 14+ messages in thread
From: Derrick Stolee @ 2026-05-29 21:28 UTC (permalink / raw)
To: Taylor Blau; +Cc: git, Junio C Hamano, Jeff King, Elijah Newren
On 5/29/2026 4:07 PM, Taylor Blau wrote:
> On Fri, May 29, 2026 at 01:26:33PM -0400, Derrick Stolee wrote:
>> On 5/28/26 11:28 AM, Derrick Stolee wrote:
>>> On 5/27/26 7:18 PM, Taylor Blau wrote:
>>
>>> Do you have any end-to-end performance data to demonstrate that these
>>> changes are effective at scale? Are we still producing packfiles with the
>>> pack-file compression and now with .bitmap files? How does this impact
>>> the performance of a clone or fetch when using a bitmap index at read
>>> time?
>>
>> Here's my attempt to use our existing performance tests to analyze the
>> impact of this series.
>>
>> Running p5311 against the base of this topic and this topic with
>> GIT_TEST_PACK_PATH_WALK=1, I get this output:
>
> Yikes. That's not great, but see below for what I think is going on.
>
> (As an aside, we can focus in on either lookup=true or lookup=false,
> since these are just controlling whether or not the bitmap lookup table
> is written. On a repository as small as git.git, this shouldn't make a
> huge difference either way. I have a separate series to make this the
> default and to clean up the t/perf suite accordingly, but haven't sent
> it to the list yet.)
>
>> Test HEAD~3 HEAD
>> -----------------------------------------------------------------
>> [...]
>
> I think this is either the primary reason why you're not seeing an
> improvement here, or at least related to it...
>
>> Do you have a good feeling for why the path-walk feature doesn't
>> make a huge change in these test scenarios?
>
> I think the problem is that we're relying on the TEST_ variable to tell
> pack-objects to generate a pack using --path-walk, but treat it as a
> fallback.
>
> I suspect that since p5311 invokes repack *without* the '--path-walk'
> option, we end up in this case within cmd_pack_objects():
>
> if (path_walk < 0) {
> if (use_bitmap_index > 0 ||
> !use_internal_rev_list)
> path_walk = 0;
> else if (the_repository->gitdir &&
> the_repository->settings.pack_use_path_walk)
> path_walk = 1;
> else
> path_walk = git_env_bool("GIT_TEST_PACK_PATH_WALK", 0);
> }
>
> , where `path_walk` is *not* set, but `use_bitmap_index` is since p5311
> set the 'pack.writeBitmaps' configuration (as a separate aside, this
> should really prefer the non-deprecated 'repack.writeBitmaps' variant).
I see. When `--use-bitmap-index` is specified, then the test variable is
ignored. So my tests aren't actually measuring the intended end state.
> So in that case, we fall back to `path_walk = 0`, and don't even bother
> reading the `GIT_TEST_PACK_PATH_WALK` variable.
>
> If I modify the perf test like so:
>
> --- 8< ---
> diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
> index 047efb995d6..1c9c99216e3 100755
> --- a/t/perf/p5311-pack-bitmaps-fetch.sh
> +++ b/t/perf/p5311-pack-bitmaps-fetch.sh
> @@ -13,7 +13,7 @@ test_fetch_bitmaps () {
> test_expect_success 'create bitmapped server repo' '
> git config pack.writebitmaps true &&
> git config pack.writeBitmapLookupTable '"$1"' &&
> - git repack -ad
> + git repack -ad --path-walk
> '
>
> # simulate a fetch from a repository that last fetched N days ago, for
> --- >8 ---
> , then I can get significantly improved results when running without the
> GIT_TEST_PACK_PATH_WALK variablle (here I'm truncating the
> 'lookup=false' case, which performs nearly identically):
>
> Test HEAD~3 HEAD
> ------------------------------------------------------------------------------------
> 5311.4: server (1 days) (lookup=true) 2.57(2.52+0.04) 0.03(0.02+0.00) -98.8%
> 5311.5: size (1 days) 153.4K 153.4K +0.0%
> 5311.6: client (1 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) =
> 5311.8: server (2 days) (lookup=true) 2.60(2.55+0.04) 0.02(0.02+0.00) -99.2%
> 5311.9: size (2 days) 153.4K 153.4K +0.0%
> 5311.10: client (2 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) =
> 5311.12: server (4 days) (lookup=true) 2.60(2.54+0.05) 0.03(0.03+0.00) -98.8%
> 5311.13: size (4 days) 209.0K 209.0K +0.0%
> 5311.14: client (4 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0%
> 5311.16: server (8 days) (lookup=true) 2.58(2.53+0.04) 0.03(0.03+0.00) -98.8%
> 5311.17: size (8 days) 209.0K 209.0K +0.0%
> 5311.18: client (8 days) (lookup=true) 0.01(0.01+0.00) 0.01(0.02+0.00) +0.0%
> 5311.20: server (16 days) (lookup=true) 2.58(2.52+0.05) 0.03(0.03+0.00) -98.8%
> 5311.21: size (16 days) 209.0K 209.0K +0.0%
> 5311.22: client (16 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0%
> 5311.24: server (32 days) (lookup=true) 2.61(2.58+0.03) 0.03(0.02+0.01) -98.9%
> 5311.25: size (32 days) 212.9K 212.9K +0.0%
> 5311.26: client (32 days) (lookup=true) 0.01(0.02+0.00) 0.02(0.02+0.00) +100.0%
> 5311.28: server (64 days) (lookup=true) 2.72(2.79+0.06) 0.19(0.30+0.03) -93.0%
> 5311.29: size (64 days) 4.5M 4.5M -0.0%
> 5311.30: client (64 days) (lookup=true) 0.49(0.58+0.02) 0.48(0.56+0.04) -2.0%
> 5311.32: server (128 days) (lookup=true) 2.90(3.21+0.09) 0.35(0.70+0.04) -87.9%
> 5311.33: size (128 days) 9.4M 9.5M +0.4%
> 5311.34: client (128 days) (lookup=true) 0.98(1.27+0.08) 0.98(1.33+0.06) +0.0%
>
> My reading here is that we get significantly smaller packs (i.e. all
> 'test_size' tests drop from HEAD~3 to HEAD) in the same amount of time
> (i.e. that all 'test_perf' tests are roughly flat).
The sizes don't shrink, and in one case increases by a small amount. I'm
happy to count those cases as noise from multi-threaded delta calculations
being less deterministic.
The _time_ taken to compute the packfiles is what decreases, though, which
is promising.
> That lines up with my expectation here, which is that even though we're
> using bitmaps at read time, that's effectively seeding the verbatim pack
> reuse over a significantly smaller pack, producing a much smaller output
> pack as a result.
Can you double-check this reasoning with my read of the data? The size
isn't changing, but the computation time is.
> As to whether we should modify the perf suite to test this, naturally I
> think we should. Likely that looks like modifying p5313 to re-run
> `test_all_with_args` with '--use-bitmap-index' after repacking with
> --path-walk and generating bitmaps like so (untested):
>
> --- 8< ---
> diff --git a/t/perf/p5313-pack-objects.sh b/t/perf/p5313-pack-objects.sh
> index 46a6cd32d24..663717982b1 100755
> --- a/t/perf/p5313-pack-objects.sh
> +++ b/t/perf/p5313-pack-objects.sh
> @@ -22,6 +22,21 @@ test_expect_success 'create rev input' '
> EOF
> '
>
> +test_repack_with_args () {
> + args="$@"
> + export args
> +
> + test_perf "repack with $args" '
> + git repack -adf $args
> + '
> +
> + test_size "repack size with $args" '
> + gitdir=$(git rev-parse --git-dir) &&
> + pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
> + test_file_size "$pack"
> + '
> +}
> +
I see that these tests are extracted from test_all_... below:
> test_all_with_args () {
> parameter=$1
> export parameter
> @@ -52,23 +67,22 @@ test_all_with_args () {
> test_size "shallow pack size with $parameter" '
> test_file_size out
> '
> -
> - test_perf "repack with $parameter" '
> - git repack -adf $parameter
> - '
> -
> - test_size "repack size with $parameter" '
> - gitdir=$(git rev-parse --git-dir) &&
> - pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
> - test_file_size "$pack"
> - '
> }
Because the --use-bitmap-index and --write-bitmap-index args
aren't appropriate across these different commands.
nit: the diff would be more obvious if test_repack_with_args
was defined after test_all_with_args so the hunk of existing
tests wouldn't appear in the diff.
> for version in 1 2
> do
> - test_all_with_args --name-hash-version=$version
> + arg="--name-hash-version=$version" &&
> +
> + test_all_with_args "$arg" &&
> + test_repack_with_args "$arg" || return 1
> done
>
> test_all_with_args --path-walk
> +test_repack_with_args --path-walk
> +
> +# inverted order here: we want to test using reachability bitmaps on a
> +# pack written with --path-walk
> +test_repack_with_args --path-walk --write-bitmap-index
> +test_all_with_args --use-bitmap-index
So this allows us to test all of the different modes.
> --- >8 ---
>
> I don't have a strong opinion on whether or not we should include that
> in this series or elsewhere.
I'm interested to see some results of your new p5313 test
to make sure that we are getting expected size changes for
the repack, since the p5311 tests were more focused on the
thin fetch pack (and didn't show a change in size).
For that, I'd be interested to see this test be included in
a patch for future reference, too.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
2026-05-29 21:28 ` Derrick Stolee
@ 2026-05-29 22:20 ` Taylor Blau
0 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-05-29 22:20 UTC (permalink / raw)
To: Derrick Stolee; +Cc: git, Junio C Hamano, Jeff King, Elijah Newren
On Fri, May 29, 2026 at 05:28:32PM -0400, Derrick Stolee wrote:
> > My reading here is that we get significantly smaller packs (i.e. all
> > 'test_size' tests drop from HEAD~3 to HEAD) in the same amount of time
> > (i.e. that all 'test_perf' tests are roughly flat).
>
> The sizes don't shrink, and in one case increases by a small amount. I'm
> happy to count those cases as noise from multi-threaded delta calculations
> being less deterministic.
>
> The _time_ taken to compute the packfiles is what decreases, though, which
> is promising.
>
> > That lines up with my expectation here, which is that even though we're
> > using bitmaps at read time, that's effectively seeding the verbatim pack
> > reuse over a significantly smaller pack, producing a much smaller output
> > pack as a result.
>
> Can you double-check this reasoning with my read of the data? The size
> isn't changing, but the computation time is.
Yeah, that's right, and my apologies for being in a slight rush when
sending this to you ;-).
The size staying flat makes sense, since both packs were generated with
--path-walk, we're just changing the way they're served. In HEAD~3, that
pack is generated on-the-fly and sent over to the client. At HEAD, we're
doing verbatim reuse over an already-existing pack which we got via
repacking (also generated with --path-walk).
So I think you get to the same end-result (more or less, modulo usual
delta patching via pack-reuse), but the time to get there drops
significantly since we don't have to (re)compute the pack.
> > +test_repack_with_args () {
> > + args="$@"
> > + export args
> > +
> > + test_perf "repack with $args" '
> > + git repack -adf $args
> > + '
> > +
> > + test_size "repack size with $args" '
> > + gitdir=$(git rev-parse --git-dir) &&
> > + pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
> > + test_file_size "$pack"
> > + '
> > +}
> > +
> I see that these tests are extracted from test_all_... below:
>
> [...]
>
> Because the --use-bitmap-index and --write-bitmap-index args
> aren't appropriate across these different commands.
Exactly.
> nit: the diff would be more obvious if test_repack_with_args
> was defined after test_all_with_args so the hunk of existing
> tests wouldn't appear in the diff.
Fair enough :-).
> > for version in 1 2
> > do
> > - test_all_with_args --name-hash-version=$version
> > + arg="--name-hash-version=$version" &&
> > +
> > + test_all_with_args "$arg" &&
> > + test_repack_with_args "$arg" || return 1
> > done
> >
> > test_all_with_args --path-walk
> > +test_repack_with_args --path-walk
> > +
> > +# inverted order here: we want to test using reachability bitmaps on a
> > +# pack written with --path-walk
> > +test_repack_with_args --path-walk --write-bitmap-index
> > +test_all_with_args --use-bitmap-index
>
> So this allows us to test all of the different modes.
>
> > --- >8 ---
> >
> > I don't have a strong opinion on whether or not we should include that
> > in this series or elsewhere.
>
> I'm interested to see some results of your new p5313 test
> to make sure that we are getting expected size changes for
> the repack, since the p5311 tests were more focused on the
> thin fetch pack (and didn't show a change in size).
>
> For that, I'd be interested to see this test be included in
> a patch for future reference, too.
Will do.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 0/4] pack-objects: support bitmaps and delta-islands with `--path-walk`
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
` (3 preceding siblings ...)
2026-05-28 15:28 ` [PATCH 0/3] pack-objects: support bitmaps and delta-islands " Derrick Stolee
@ 2026-06-02 22:21 ` Taylor Blau
2026-06-02 22:21 ` [PATCH v2 1/4] t/perf: drop p5311's lookup-table permutation Taylor Blau
` (3 more replies)
4 siblings, 4 replies; 14+ messages in thread
From: Taylor Blau @ 2026-06-02 22:21 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
Note to the maintainer:
* This series is based on 'ds/path-walk-filters' with Patrick's
'ps/clang-w-glibc-2.43-and-_Generic' merged in. The former has since
graduated. These are the three remaining patches from my earlier RFC
after Stolee's series incorporated the filter-related pieces.
Here is a very small reroll of my series to make `--path-walk` work with
reachability bitmaps and delta-islands.
Since the previous round, the only changes are:
* A new commit making some adjustments to p5311 to facilitate
performance testing bitmaps in repositories repacked with
'--path-walk'.
* Updates to (what is now) the second commit's message, including
performance results based on the aforementioned changes.
Outside of the above, the series is otherwise unchanged.
Thanks in advance for your review!
Taylor Blau (4):
t/perf: drop p5311's lookup-table permutation
pack-objects: support reachability bitmaps with `--path-walk`
pack-objects: extract `record_tree_depth()` helper
pack-objects: support `--delta-islands` with `--path-walk`
Documentation/git-pack-objects.adoc | 12 ++---
builtin/pack-objects.c | 68 +++++++++++++++++++++--------
t/perf/p5311-pack-bitmaps-fetch.sh | 20 +++++----
t/t5310-pack-bitmaps.sh | 36 +++++++++++++++
t/t5320-delta-islands.sh | 29 ++++++++++++
5 files changed, 134 insertions(+), 31 deletions(-)
Range-diff against v1:
-: ----------- > 1: 52d63e8910e t/perf: drop p5311's lookup-table permutation
1: 3fa8bfbfd59 ! 2: ffad584a43e pack-objects: support reachability bitmaps with `--path-walk`
@@ Commit message
bitmap can answer the request, use it; otherwise fall back to
path-walk's own enumeration.
+ As a result, we can see significantly reduced pack sizes from p5311
+ before this commit:
+
+ Test HEAD^ HEAD
+ ----------------------------------------------------------------------------------
+ 5311.38: server (1 days, --path-walk) 2.56(2.52+0.03) 0.01(0.01+0.00) -99.6%
+ 5311.39: size (1 days, --path-walk) 123.9K 123.9K +0.0%
+ 5311.40: client (1 days, --path-walk) 0.00(0.01+0.00) 0.00(0.00+0.00) =
+ 5311.42: server (2 days, --path-walk) 2.57(2.52+0.05) 0.01(0.01+0.00) -99.6%
+ 5311.43: size (2 days, --path-walk) 123.9K 123.9K +0.0%
+ 5311.44: client (2 days, --path-walk) 0.00(0.00+0.00) 0.00(0.00+0.00) =
+ 5311.46: server (4 days, --path-walk) 2.58(2.51+0.07) 0.01(0.01+0.00) -99.6%
+ 5311.47: size (4 days, --path-walk) 123.9K 123.9K +0.0%
+ 5311.48: client (4 days, --path-walk) 0.00(0.00+0.00) 0.00(0.00+0.00) =
+ 5311.50: server (8 days, --path-walk) 2.58(2.53+0.04) 0.02(0.02+0.00) -99.2%
+ 5311.51: size (8 days, --path-walk) 152.4K 152.4K +0.0%
+ 5311.52: client (8 days, --path-walk) 0.00(0.01+0.00) 0.00(0.01+0.00) =
+ 5311.54: server (16 days, --path-walk) 2.58(2.52+0.05) 0.03(0.02+0.00) -98.8%
+ 5311.55: size (16 days, --path-walk) 205.3K 205.3K +0.0%
+ 5311.56: client (16 days, --path-walk) 0.01(0.01+0.00) 0.01(0.01+0.00) +0.0%
+ 5311.58: server (32 days, --path-walk) 2.59(2.53+0.06) 0.03(0.03+0.00) -98.8%
+ 5311.59: size (32 days, --path-walk) 209.3K 209.3K +0.0%
+ 5311.60: client (32 days, --path-walk) 0.01(0.02+0.00) 0.01(0.02+0.00) +0.0%
+ 5311.62: server (64 days, --path-walk) 2.70(2.76+0.06) 0.16(0.24+0.04) -94.1%
+ 5311.63: size (64 days, --path-walk) 4.1M 4.1M +0.0%
+ 5311.64: client (64 days, --path-walk) 0.44(0.50+0.02) 0.44(0.51+0.02) +0.0%
+ 5311.66: server (128 days, --path-walk) 2.88(3.20+0.05) 0.34(0.65+0.05) -88.2%
+ 5311.67: size (128 days, --path-walk) 9.0M 9.0M -0.0%
+ 5311.68: client (128 days, --path-walk) 0.93(1.22+0.07) 0.93(1.20+0.08) +0.0%
+
+ We get the same size of output pack, but this commit allows us to do so
+ in a significantly shorter amount of time. Intuitively, we're generating
+ the same pack (hence the unchanged 'test_size' output from run to run),
+ but varying how we get there. Before this commit, pack-objects prefers
+ '--path-walk' to '--use-bitmap-index', so we generate the output pack by
+ performing a normal '--path-walk' traversal. With this commit, we are
+ operating over a *repacked* state (that itself was done with a
+ '--path-walk' traversal), but are able to perform pack-reuse on that
+ repacked state via bitmaps.
+
There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
- bitmap traversal expects the usual non-boundary state. Work around this
- by setting `revs->boundary` as late as possible within
- `get_object_list_path_walk()`, after any bitmap attempt has either
- succeeded or declined to answer the request.
+ bitmap traversal does not expect this. Work around this by setting
+ `revs->boundary` as late as possible within the '--path-walk' traversal,
+ after any bitmap attempt has either succeeded or declined to answer the
+ request.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
use_internal_rev_list = 1;
strvec_push(&rp, shallow
+ ## t/perf/p5311-pack-bitmaps-fetch.sh ##
+@@ t/perf/p5311-pack-bitmaps-fetch.sh: test_description='performance of fetches from bitmapped packs'
+ . ./perf-lib.sh
+
+ test_fetch_bitmaps () {
++ argv=$1
++ export argv
++
+ test_expect_success 'setup test directory' '
+ rm -fr * .git
+ '
+
+ test_perf_default_repo
+
+- test_expect_success 'create bitmapped server repo' '
++ test_expect_success "create bitmapped server repo ${argv:+($argv)}" '
+ git config pack.writebitmaps true &&
+- git repack -ad
++ git repack -ad $argv
+ '
+
+ # simulate a fetch from a repository that last fetched N days ago, for
+@@ t/perf/p5311-pack-bitmaps-fetch.sh: test_fetch_bitmaps () {
+ # and assume the first entry in the chain that is N days older than the current
+ # HEAD is where the HEAD would have been then.
+ for days in 1 2 4 8 16 32 64 128; do
+- title=$(printf '%10s' "($days days)")
++ title=$(printf '%10s' "($days days${argv:+, $argv})")
+ test_expect_success "setup revs from $days days ago" '
+ now=$(git log -1 --format=%ct HEAD) &&
+ then=$(($now - ($days * 86400))) &&
+@@ t/perf/p5311-pack-bitmaps-fetch.sh: test_fetch_bitmaps () {
+ done
+ }
+
+-test_fetch_bitmaps
++for argv in '' --path-walk
++do
++ test_fetch_bitmaps $argv || return 1
++done
+
+ test_done
+
## t/t5310-pack-bitmaps.sh ##
@@ t/t5310-pack-bitmaps.sh: test_bitmap_cases
2: bdae873eaab = 3: 069c50d3370 pack-objects: extract `record_tree_depth()` helper
3: a642305e3c9 = 4: ae57607b57f pack-objects: support `--delta-islands` with `--path-walk`
base-commit: 45a9ecee26839cc880fdd5e704339dd3cf4ffc26
--
2.54.0.23.gae57607b57f
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH v2 1/4] t/perf: drop p5311's lookup-table permutation
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
@ 2026-06-02 22:21 ` Taylor Blau
2026-06-02 22:21 ` [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk` Taylor Blau
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-06-02 22:21 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
p5311 measures the cost of serving a fetch from a bitmapped pack and
indexing the resulting pack on the client. Since 761416ef91d
(bitmap-lookup-table: add performance tests for lookup table,
2022-08-14), p5311 effectively runs itself twice: once with the bitmap's
lookup table extension enabled, and again with it disabled.
This comparison has served its useful purpose, as the lookup table is
almost four years old, and the de-facto default in server-side Git
deployments.
A following commit will want to test a different combination (repacking
with and without '--path-walk' instead of the lookup table). Instead of
multiplying the current test count by two again to produce four
variations of `test_fetch_bitmaps()`, drop the lookup table option to
reduce the number of perf tests we run. Retain `test_fetch_bitmaps()`
itself, since we will use this in the future for the new
parameterization.
(As an aside, a future commit outside of this series will adjust the
default value of 'pack.writeBitmapLookupTable' to "true", matching the
de-facto norm for deployments where the existence of bitmap lookup
tables is meaningful. Punt on that to a later series and instead make
the minimal change for now.)
Suggested-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
t/perf/p5311-pack-bitmaps-fetch.sh | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 047efb995d6..5bea5c64e7b 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -12,7 +12,6 @@ test_fetch_bitmaps () {
test_expect_success 'create bitmapped server repo' '
git config pack.writebitmaps true &&
- git config pack.writeBitmapLookupTable '"$1"' &&
git repack -ad
'
@@ -32,7 +31,7 @@ test_fetch_bitmaps () {
} >revs
'
- test_perf "server $title (lookup=$1)" '
+ test_perf "server $title" '
git pack-objects --stdout --revs \
--thin --delta-base-offset \
<revs >tmp.pack
@@ -42,13 +41,12 @@ test_fetch_bitmaps () {
test_file_size tmp.pack
'
- test_perf "client $title (lookup=$1)" '
+ test_perf "client $title" '
git index-pack --stdin --fix-thin <tmp.pack
'
done
}
-test_fetch_bitmaps true
-test_fetch_bitmaps false
+test_fetch_bitmaps
test_done
--
2.54.0.23.gae57607b57f
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk`
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
2026-06-02 22:21 ` [PATCH v2 1/4] t/perf: drop p5311's lookup-table permutation Taylor Blau
@ 2026-06-02 22:21 ` Taylor Blau
2026-06-02 22:21 ` [PATCH v2 3/4] pack-objects: extract `record_tree_depth()` helper Taylor Blau
2026-06-02 22:21 ` [PATCH v2 4/4] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
3 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-06-02 22:21 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.
This behavior dates back to 70664d2865c (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:
/*
* We must disable the bitmaps because we are removing
* the --objects / --objects-edge[-aggressive] options.
*/
In fb2c309b7d3 (pack-objects: pass --objects with --path-walk,
2026-05-02), path-walk learned to pass '--objects' again, but still
kept bitmap traversal disabled. That leaves two useful cases
unsupported:
* A path-walk repack that writes bitmaps does not give the bitmap
selector any commits, because path-walk reveals commits through
`add_objects_by_path()` rather than through `show_commit()`, where
`index_commit_for_bitmap()` is normally called.
* An invocation like "git pack-objects --use-bitmap-index --path-walk"
never tries an existing bitmap, even when one is available and could
answer the request.
Fortunately for us, neither restriction is required.
* On the writing side: teach the path-walk object callback to call
`index_commit_for_bitmap()` for commits that it adds to the pack.
That gives the bitmap selector the commit candidates it would have
seen from the regular traversal.
* For bitmap reading, keep passing '--objects' to the internal rev_list
machinery, but stop clearing `use_bitmap_index`. If an existing
bitmap can answer the request, use it; otherwise fall back to
path-walk's own enumeration.
As a result, we can see significantly reduced pack sizes from p5311
before this commit:
Test HEAD^ HEAD
----------------------------------------------------------------------------------
5311.38: server (1 days, --path-walk) 2.56(2.52+0.03) 0.01(0.01+0.00) -99.6%
5311.39: size (1 days, --path-walk) 123.9K 123.9K +0.0%
5311.40: client (1 days, --path-walk) 0.00(0.01+0.00) 0.00(0.00+0.00) =
5311.42: server (2 days, --path-walk) 2.57(2.52+0.05) 0.01(0.01+0.00) -99.6%
5311.43: size (2 days, --path-walk) 123.9K 123.9K +0.0%
5311.44: client (2 days, --path-walk) 0.00(0.00+0.00) 0.00(0.00+0.00) =
5311.46: server (4 days, --path-walk) 2.58(2.51+0.07) 0.01(0.01+0.00) -99.6%
5311.47: size (4 days, --path-walk) 123.9K 123.9K +0.0%
5311.48: client (4 days, --path-walk) 0.00(0.00+0.00) 0.00(0.00+0.00) =
5311.50: server (8 days, --path-walk) 2.58(2.53+0.04) 0.02(0.02+0.00) -99.2%
5311.51: size (8 days, --path-walk) 152.4K 152.4K +0.0%
5311.52: client (8 days, --path-walk) 0.00(0.01+0.00) 0.00(0.01+0.00) =
5311.54: server (16 days, --path-walk) 2.58(2.52+0.05) 0.03(0.02+0.00) -98.8%
5311.55: size (16 days, --path-walk) 205.3K 205.3K +0.0%
5311.56: client (16 days, --path-walk) 0.01(0.01+0.00) 0.01(0.01+0.00) +0.0%
5311.58: server (32 days, --path-walk) 2.59(2.53+0.06) 0.03(0.03+0.00) -98.8%
5311.59: size (32 days, --path-walk) 209.3K 209.3K +0.0%
5311.60: client (32 days, --path-walk) 0.01(0.02+0.00) 0.01(0.02+0.00) +0.0%
5311.62: server (64 days, --path-walk) 2.70(2.76+0.06) 0.16(0.24+0.04) -94.1%
5311.63: size (64 days, --path-walk) 4.1M 4.1M +0.0%
5311.64: client (64 days, --path-walk) 0.44(0.50+0.02) 0.44(0.51+0.02) +0.0%
5311.66: server (128 days, --path-walk) 2.88(3.20+0.05) 0.34(0.65+0.05) -88.2%
5311.67: size (128 days, --path-walk) 9.0M 9.0M -0.0%
5311.68: client (128 days, --path-walk) 0.93(1.22+0.07) 0.93(1.20+0.08) +0.0%
We get the same size of output pack, but this commit allows us to do so
in a significantly shorter amount of time. Intuitively, we're generating
the same pack (hence the unchanged 'test_size' output from run to run),
but varying how we get there. Before this commit, pack-objects prefers
'--path-walk' to '--use-bitmap-index', so we generate the output pack by
performing a normal '--path-walk' traversal. With this commit, we are
operating over a *repacked* state (that itself was done with a
'--path-walk' traversal), but are able to perform pack-reuse on that
repacked state via bitmaps.
There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
bitmap traversal does not expect this. Work around this by setting
`revs->boundary` as late as possible within the '--path-walk' traversal,
after any bitmap attempt has either succeeded or declined to answer the
request.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 6 +++--
builtin/pack-objects.c | 18 +++++++++++++--
t/perf/p5311-pack-bitmaps-fetch.sh | 14 +++++++----
t/t5310-pack-bitmaps.sh | 36 +++++++++++++++++++++++++++++
4 files changed, 66 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 8a27aa19fd3..0adce8961a3 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,8 +402,10 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. The `--use-bitmap-index` option is
-ignored in the presence of `--path-walk`. The `--path-walk` option
+Incompatible with `--delta-islands`. When `--use-bitmap-index` is
+specified with `--path-walk`, a successful bitmap traversal is used for
+object enumeration, with path-walk remaining as the fallback traversal
+when the bitmap cannot satisfy the request. The `--path-walk` option
supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
types can be combined with the `combine:<spec>+<spec>` form.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b783dc62bc9..e4dcb563b7d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4732,6 +4732,15 @@ static int add_objects_by_path(const char *path,
continue;
add_object_entry(oid, type, path, exclude);
+
+ if (type == OBJ_COMMIT && write_bitmap_index) {
+ struct commit *commit;
+
+ commit = lookup_commit(the_repository, oid);
+ if (!commit)
+ die(_("could not find commit %s"), oid_to_hex(oid));
+ index_commit_for_bitmap(commit);
+ }
}
oe_end = to_pack.nr_objects;
@@ -4764,6 +4773,13 @@ static int get_object_list_path_walk(struct rev_info *revs)
info.path_fn = add_objects_by_path;
info.path_fn_data = &processed;
+ /*
+ * Path-walk needs boundary commits to discover thin-pack bases, but
+ * bitmap traversal does not understand the boundary state. Set it
+ * here so any prior bitmap attempt sees the usual non-boundary walk.
+ */
+ revs->boundary = 1;
+
/*
* Allow the --[no-]sparse option to be interesting here, if only
* for testing purposes. Paths with no interesting objects will not
@@ -5195,9 +5211,7 @@ int cmd_pack_objects(int argc,
}
}
if (path_walk) {
- strvec_push(&rp, "--boundary");
strvec_push(&rp, "--objects");
- use_bitmap_index = 0;
} else if (thin) {
use_internal_rev_list = 1;
strvec_push(&rp, shallow
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 5bea5c64e7b..1b115d921a1 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -4,15 +4,18 @@ test_description='performance of fetches from bitmapped packs'
. ./perf-lib.sh
test_fetch_bitmaps () {
+ argv=$1
+ export argv
+
test_expect_success 'setup test directory' '
rm -fr * .git
'
test_perf_default_repo
- test_expect_success 'create bitmapped server repo' '
+ test_expect_success "create bitmapped server repo ${argv:+($argv)}" '
git config pack.writebitmaps true &&
- git repack -ad
+ git repack -ad $argv
'
# simulate a fetch from a repository that last fetched N days ago, for
@@ -20,7 +23,7 @@ test_fetch_bitmaps () {
# and assume the first entry in the chain that is N days older than the current
# HEAD is where the HEAD would have been then.
for days in 1 2 4 8 16 32 64 128; do
- title=$(printf '%10s' "($days days)")
+ title=$(printf '%10s' "($days days${argv:+, $argv})")
test_expect_success "setup revs from $days days ago" '
now=$(git log -1 --format=%ct HEAD) &&
then=$(($now - ($days * 86400))) &&
@@ -47,6 +50,9 @@ test_fetch_bitmaps () {
done
}
-test_fetch_bitmaps
+for argv in '' --path-walk
+do
+ test_fetch_bitmaps $argv || return 1
+done
test_done
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..69c5da1580a 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -577,6 +577,42 @@ test_bitmap_cases
sane_unset GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL
+test_expect_success 'path-walk repack can write and use bitmap indexes' '
+ test_when_finished "rm -rf path-walk-bitmap" &&
+ git init path-walk-bitmap &&
+ (
+ cd path-walk-bitmap &&
+ test_commit first &&
+ test_commit second &&
+ test_commit third &&
+
+ git repack -a -d -b --path-walk &&
+ git rev-list --test-bitmap --use-bitmap-index HEAD &&
+
+ git rev-parse HEAD >in &&
+
+ git rev-list --objects --no-object-names HEAD >expect.raw &&
+ sort expect.raw >expect &&
+
+ for reuse in true false
+ do
+ : >trace.txt &&
+
+ GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+ git -c pack.allowPackReuse=$reuse pack-objects \
+ --stdout --revs --path-walk --use-bitmap-index \
+ <in >out.pack &&
+ grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
+
+ git index-pack out.pack &&
+
+ list_packed_objects out.idx >actual.raw &&
+ sort actual.raw >actual &&
+ test_cmp expect actual || return 1
+ done
+ )
+'
+
test_expect_success 'incremental repack fails when bitmaps are requested' '
test_commit more-1 &&
test_must_fail git repack -d 2>err &&
--
2.54.0.23.gae57607b57f
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH v2 3/4] pack-objects: extract `record_tree_depth()` helper
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
2026-06-02 22:21 ` [PATCH v2 1/4] t/perf: drop p5311's lookup-table permutation Taylor Blau
2026-06-02 22:21 ` [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk` Taylor Blau
@ 2026-06-02 22:21 ` Taylor Blau
2026-06-02 22:21 ` [PATCH v2 4/4] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
3 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-06-02 22:21 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
Prepare for a subsequent change that needs to record tree depths from a
second call site by factoring the delta-islands tree-depth bookkeeping
out of `show_object()` and into a helper, `record_tree_depth()`.
The helper looks up the object in `to_pack`, returns early when the
object was not added there, computes the depth from the slash count in
the supplied name, and preserves the existing max-depth-wins behavior
when a tree is reached by more than one path.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e4dcb563b7d..ec02e2b21d2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2722,6 +2722,22 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
pack->tree_depth[e - pack->objects] = tree_depth;
}
+static void record_tree_depth(const struct object_id *oid, const char *name)
+{
+ const char *p;
+ unsigned depth;
+ struct object_entry *ent;
+
+ /* the empty string is a root tree, which is depth 0 */
+ depth = *name ? 1 : 0;
+ for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+ depth++;
+
+ ent = packlist_find(&to_pack, oid);
+ if (ent && depth > oe_tree_depth(&to_pack, ent))
+ oe_set_tree_depth(&to_pack, ent, depth);
+}
+
/*
* Return the size of the object without doing any delta
* reconstruction (so non-deltas are true object sizes, but deltas
@@ -4375,20 +4391,8 @@ static void show_object(struct object *obj, const char *name,
add_preferred_base_object(name);
add_object_entry(&obj->oid, obj->type, name, 0);
- if (use_delta_islands) {
- const char *p;
- unsigned depth;
- struct object_entry *ent;
-
- /* the empty string is a root tree, which is depth 0 */
- depth = *name ? 1 : 0;
- for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
- depth++;
-
- ent = packlist_find(&to_pack, &obj->oid);
- if (ent && depth > oe_tree_depth(&to_pack, ent))
- oe_set_tree_depth(&to_pack, ent, depth);
- }
+ if (use_delta_islands)
+ record_tree_depth(&obj->oid, name);
}
static void show_object__ma_allow_any(struct object *obj, const char *name, void *data)
--
2.54.0.23.gae57607b57f
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH v2 4/4] pack-objects: support `--delta-islands` with `--path-walk`
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
` (2 preceding siblings ...)
2026-06-02 22:21 ` [PATCH v2 3/4] pack-objects: extract `record_tree_depth()` helper Taylor Blau
@ 2026-06-02 22:21 ` Taylor Blau
3 siblings, 0 replies; 14+ messages in thread
From: Taylor Blau @ 2026-06-02 22:21 UTC (permalink / raw)
To: git; +Cc: Derrick Stolee, Junio C Hamano, Jeff King, Elijah Newren
Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.
When discussing those original patches on the list, a message from
Stolee in [1] noted the following:
this could be remedied by [...] doing a separate walk to identify
islands using the normal method
In a related portion of the thread, Peff explains[2]:
The delta islands code already does its own tree walk to propagate
the bits down (it does rely on the base walk's show_commit() to
propagate through the commits).
Once each object has its island bitmaps, I think however you
choose to come up with delta candidates [...] you should be able
to use it. It's fundamentally just answering the question of "am
I allowed to delta between these two objects".
That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.
Recall how delta-islands are computed during a normal repack:
- `show_commit()` calls `propagate_island_marks()` for each commit,
which merges the commit's island bitset onto its root tree object and
onto each of its parent commits.
- `show_object()` for a tree records the tree's depth derived from the
slash-separated pathname. Subsequent `resolve_tree_islands()` uses
that depth to walk trees in increasing-depth order, propagating each
tree's marks to its children.
- At delta-search time, `in_same_island()` enforces that a delta
target's island bitmap is a subset of its base's: every island that
reaches the target must also reach the base.
Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:
- For each commit batch, call `propagate_island_marks()` on commits,
exactly as `show_commit()` does.
We have to be careful about the order in which we call this function,
and we must see a commit before its parents in order to have
island marks to propagate.
The path-walk batch preserves that order. Path-walk appends commits
to its `OBJ_COMMIT` batch as they come back from the same
`get_revision()` loop the regular traversal uses, and
`add_objects_by_path()` iterates the batch in array order. So every
commit reaches `propagate_island_marks()` in the same sequence that
`show_commit()` would have seen it, and the descendant-first chain
that the algorithm relies on is intact.
Skip island propagation for excluded commits to match the regular
traversal, whose `show_commit()` callback is only invoked for
interesting commits. Boundary commits may still be present in
path-walk's callback so they can serve as thin-pack bases, but they
should not contribute island marks.
- For each tree batch, record the tree's depth from the path. Use the
`record_tree_depth()` helper from the previous commit so both
callbacks behave identically, including the max-depth-wins behavior
when a tree is reached via more than one path. The helper accepts
both the `show_object()` path shape ("foo", "foo/bar") and the
path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
recorded from either traversal mode are directly comparable.
This is implicit in the implementation sketch from Peff above.
`resolve_tree_islands()` sorts trees by `oe->tree_depth` in
increasing-depth order before propagating marks down, so that a
parent tree's marks are finalized before its children inherit them.
Without recording the depth at path-walk time, every
path-walk-discovered tree would land at depth 0 in `to_pack`, the
sort would lose its ordering, and children could inherit marks from
parents whose own contributions had not yet been merged in.
With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.
Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.
[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 14 +++++++-------
builtin/pack-objects.c | 22 ++++++++++++++++++----
t/t5320-delta-islands.sh | 29 +++++++++++++++++++++++++++++
3 files changed, 54 insertions(+), 11 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 0adce8961a3..65cd00c152f 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,13 +402,13 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. When `--use-bitmap-index` is
-specified with `--path-walk`, a successful bitmap traversal is used for
-object enumeration, with path-walk remaining as the fallback traversal
-when the bitmap cannot satisfy the request. The `--path-walk` option
-supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
-types can be combined with the `combine:<spec>+<spec>` form.
+When `--use-bitmap-index` is specified with `--path-walk`, a successful
+bitmap traversal is used for object enumeration, with path-walk
+remaining as the fallback traversal when the bitmap cannot satisfy the
+request. The `--path-walk` option supports the `--filter=<spec>` forms
+`blob:none`, `blob:limit=<n>`, `tree:0`, `object:type=<type>`, and
+`sparse:<oid>`. These supported filter types can be combined with the
+`combine:<spec>+<spec>` form.
DELTA ISLANDS
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec02e2b21d2..f48ea7a888b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4737,13 +4737,29 @@ static int add_objects_by_path(const char *path,
add_object_entry(oid, type, path, exclude);
- if (type == OBJ_COMMIT && write_bitmap_index) {
+ if (type == OBJ_COMMIT) {
struct commit *commit;
+ if (!write_bitmap_index && !use_delta_islands)
+ continue;
+
commit = lookup_commit(the_repository, oid);
if (!commit)
die(_("could not find commit %s"), oid_to_hex(oid));
- index_commit_for_bitmap(commit);
+ if (write_bitmap_index)
+ index_commit_for_bitmap(commit);
+ /*
+ * Skip island propagation for boundary commits.
+ * The regular traversal's show_commit() is only
+ * called for interesting commits; matching that
+ * here keeps path-walk from doing extra work that
+ * would only be a no-op anyway (boundary commits
+ * are not in island_marks).
+ */
+ if (use_delta_islands && !exclude)
+ propagate_island_marks(the_repository, commit);
+ } else if (type == OBJ_TREE && use_delta_islands) {
+ record_tree_depth(oid, path);
}
}
@@ -5205,8 +5221,6 @@ int cmd_pack_objects(int argc,
const char *option = NULL;
if (!path_walk_filter_compatible(&filter_options))
option = "--filter";
- else if (use_delta_islands)
- option = "--delta-islands";
if (option) {
warning(_("cannot use %s with %s"),
diff --git a/t/t5320-delta-islands.sh b/t/t5320-delta-islands.sh
index 2c961c70963..9b28344a0a3 100755
--- a/t/t5320-delta-islands.sh
+++ b/t/t5320-delta-islands.sh
@@ -53,6 +53,35 @@ test_expect_success 'separate islands disallows delta' '
! is_delta_base $two $one
'
+test_expect_success 'path-walk island repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-islands" \
+ git -c "pack.island=refs/heads/(.*)" repack -adfi \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-islands &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk island bitmap repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-island-bitmap" \
+ git -c "pack.island=refs/heads/(.*)" repack -a -d -f -i -b \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-island-bitmap &&
+ test_path_is_file .git/objects/pack/*.bitmap &&
+ git rev-list --test-bitmap --use-bitmap-index one &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk same island allows delta' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-same-island" \
+ git -c "pack.island=refs/heads" repack -adfi --path-walk &&
+ test_region pack-objects path-walk trace.path-walk-same-island &&
+ is_delta_base $one $two
+'
+
test_expect_success 'same island allows delta' '
git -c "pack.island=refs/heads" repack -adfi &&
is_delta_base $one $two
--
2.54.0.23.gae57607b57f
^ permalink raw reply related [flat|nested] 14+ messages in thread