* [RFC PATCH 1/7] pack-objects: update `--path-walk`'s existing incompatibilities
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 12:22 ` Derrick Stolee
2026-05-04 0:11 ` [RFC PATCH 2/7] path-walk: support `tree:0` filter Taylor Blau
` (6 subsequent siblings)
7 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
The documentation in git-pack-objects(1) claims that `--path-walk` is
incompatible with `-shallow`. However, commit c178b02e29f (pack-objects:
allow --shallow and --path-walk, 2025-05-16) resolves this
incompatibility, leaving the documentation stale.
Likewise, this documentation claims that `--filter` is incompatible, but
`blob:none`, `blob:limit=<n>`, and `sparse:oid=<blob>` already work via
path-walk.
List the supported `--filter` forms explicitly and note that other forms
fall back to the regular object traversal. Also remove the
incompatibility notice with `--shallow`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index b78175fbe1b..8dea8259787 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,9 +402,11 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`, `--shallow`, or `--filter`. The
-`--use-bitmap-index` option will be ignored in the presence of
-`--path-walk.`
+Incompatible with `--delta-islands`. Path-walk supports the
+`--filter=<spec>` forms `blob:none`, `blob:limit=<n>`, and
+`sparse:oid=<blob>`. Other filter forms fall back to the regular object
+traversal. The `--use-bitmap-index` option will be ignored in the
+presence of `--path-walk`.
DELTA ISLANDS
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [RFC PATCH 1/7] pack-objects: update `--path-walk`'s existing incompatibilities
2026-05-04 0:11 ` [RFC PATCH 1/7] pack-objects: update `--path-walk`'s existing incompatibilities Taylor Blau
@ 2026-05-04 12:22 ` Derrick Stolee
0 siblings, 0 replies; 13+ messages in thread
From: Derrick Stolee @ 2026-05-04 12:22 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Jeff King, Elijah Newren
On 5/3/2026 8:11 PM, Taylor Blau wrote:
> The documentation in git-pack-objects(1) claims that `--path-walk` is
> incompatible with `-shallow`. However, commit c178b02e29f (pack-objects:
> allow --shallow and --path-walk, 2025-05-16) resolves this
> incompatibility, leaving the documentation stale.
>
> Likewise, this documentation claims that `--filter` is incompatible, but
> `blob:none`, `blob:limit=<n>`, and `sparse:oid=<blob>` already work via
> path-walk.
>
> List the supported `--filter` forms explicitly and note that other forms
> fall back to the regular object traversal. Also remove the
> incompatibility notice with `--shallow`.
Thanks for pointing out that I didn't update the docs in my series.
I should incorporate the appropriate language for these changes in my
patches and give you co-authorship. That will also help avoid using
commit references that are impermanent until the series lands.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 2/7] path-walk: support `tree:0` filter
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
2026-05-04 0:11 ` [RFC PATCH 1/7] pack-objects: update `--path-walk`'s existing incompatibilities Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 12:30 ` Derrick Stolee
2026-05-04 21:55 ` Kristoffer Haugsbakk
2026-05-04 0:11 ` [RFC PATCH 3/7] path-walk: support `object:type` filter Taylor Blau
` (5 subsequent siblings)
7 siblings, 2 replies; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
The `tree:0` object filter omits all trees and blobs from the result,
keeping only commits and tags. Consequently, this filter type should
has a fairly straightforward integration with path-walk, as the decision
to include an object depends only on its type and does not depend on any
path-sensitive state.
Mapping it onto `path_walk_info` is direct: set `info->trees = 0` and
`info->blobs = 0` in `prepare_filters()` when the `LOFC_TREE_DEPTH`
choice is requested with depth zero. The existing code already plumbs
those flags through the rest of the walk:
- 'walk_objects_by_path()' sets `revs->blob_objects = info->blobs` and
`revs->tree_objects = info->trees` before `prepare_revision_walk()`,
so the revision walk doesn't try to enumerate trees or blobs itself.
- The commit-walk loop short-circuits the root-tree fetch with
"if (!info->trees && !info->blobs) continue;", so we never even
look up the root tree, let alone descend into it.
- `setup_pending_objects()` skips pending trees and blobs based on
the same flags.
This means the path-walk doesn't allocate or expand any tree structures
at all under `tree:0`, which matches the intended behavior of the
filter.
Non-zero tree-depth filters are not supported. Those depend on the depth
at which a tree is visited, which is a path-walk concept the filter
machinery doesn't currently share with the path-walk API. Reject them in
`prepare_filters()` with a helpful error and let pack-objects fall back
to the regular traversal, the same way it already does for unsupported
filters.
Add coverage in t6601 for both `--all` and a single-branch case to
confirm that no trees or blobs are emitted, and a separate test that
`tree:1` is rejected with the expected error message. Place the new
tests before "setup sparse filter blob" so they run on the original set
of refs, before the orphan branch that the sparse-tree tests create.
Update Documentation/git-pack-objects.adoc to drop --filter from
the unconditional incompatibility list and call out the supported
subset (which already includes the filters added by Stolee's
earlier patches: blob:none, blob:limit, and sparse:oid).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 10 +++----
path-walk.c | 13 +++++++++
t/t6601-path-walk.sh | 45 +++++++++++++++++++++++++++++
3 files changed, 63 insertions(+), 5 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 8dea8259787..cfb5bc0ae16 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,11 +402,11 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. Path-walk supports the
-`--filter=<spec>` forms `blob:none`, `blob:limit=<n>`, and
-`sparse:oid=<blob>`. Other filter forms fall back to the regular object
-traversal. The `--use-bitmap-index` option will be ignored in the
-presence of `--path-walk`.
+Incompatible with `--delta-islands`. Path-walk supports
+the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
+`sparse:oid=<blob>`, and `tree:0`. Other filter forms fall back to the
+regular object traversal. The `--use-bitmap-index` option will be
+ignored in the presence of `--path-walk`.
DELTA ISLANDS
diff --git a/path-walk.c b/path-walk.c
index 700617ee2fe..36a1e5b967a 100644
--- a/path-walk.c
+++ b/path-walk.c
@@ -564,6 +564,19 @@ static int prepare_filters(struct path_walk_info *info,
}
return 1;
+ case LOFC_TREE_DEPTH:
+ if (options->tree_exclude_depth) {
+ error(_("tree:%lu filter not supported by the path-walk API"),
+ options->tree_exclude_depth);
+ return 0;
+ }
+ if (info) {
+ info->trees = 0;
+ info->blobs = 0;
+ list_objects_filter_release(options);
+ }
+ return 1;
+
case LOFC_SPARSE_OID:
if (info) {
struct object_id sparse_oid;
diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh
index 520269dfc65..72e09211e63 100755
--- a/t/t6601-path-walk.sh
+++ b/t/t6601-path-walk.sh
@@ -590,6 +590,51 @@ test_expect_success 'all, blob:limit=3 filter' '
test_cmp_sorted expect out
'
+test_expect_success 'all, tree:0 filter' '
+ test-tool path-walk --filter=tree:0 -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:commit::$(git rev-parse topic)
+ 0:commit::$(git rev-parse base)
+ 0:commit::$(git rev-parse base~1)
+ 0:commit::$(git rev-parse base~2)
+ 1:tag:/tags:$(git rev-parse refs/tags/first)
+ 1:tag:/tags:$(git rev-parse refs/tags/second.1)
+ 1:tag:/tags:$(git rev-parse refs/tags/second.2)
+ 1:tag:/tags:$(git rev-parse refs/tags/third)
+ 1:tag:/tags:$(git rev-parse refs/tags/fourth)
+ 1:tag:/tags:$(git rev-parse refs/tags/tree-tag)
+ 1:tag:/tags:$(git rev-parse refs/tags/blob-tag)
+ blobs:0
+ commits:4
+ tags:7
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'topic only, tree:0 filter' '
+ test-tool path-walk --filter=tree:0 -- topic >out &&
+
+ cat >expect <<-EOF &&
+ 0:commit::$(git rev-parse topic)
+ 0:commit::$(git rev-parse base~1)
+ 0:commit::$(git rev-parse base~2)
+ blobs:0
+ commits:3
+ tags:0
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'tree:1 filter is rejected' '
+ test_must_fail test-tool path-walk --filter=tree:1 -- --all 2>err &&
+ test_grep "tree:1 filter not supported by the path-walk API" err
+'
+
test_expect_success 'setup sparse filter blob' '
# Cone-mode patterns: include root, exclude all dirs, include left/
cat >patterns <<-\EOF &&
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [RFC PATCH 2/7] path-walk: support `tree:0` filter
2026-05-04 0:11 ` [RFC PATCH 2/7] path-walk: support `tree:0` filter Taylor Blau
@ 2026-05-04 12:30 ` Derrick Stolee
2026-05-04 21:55 ` Kristoffer Haugsbakk
1 sibling, 0 replies; 13+ messages in thread
From: Derrick Stolee @ 2026-05-04 12:30 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Jeff King, Elijah Newren
On 5/3/2026 8:11 PM, Taylor Blau wrote:
> The `tree:0` object filter omits all trees and blobs from the result,
> keeping only commits and tags. Consequently, this filter type should
> has a fairly straightforward integration with path-walk, as the decision
> to include an object depends only on its type and does not depend on any
> path-sensitive state.
I agree that the implementation here is straight-forward. It's something
where I could easily see wanting to disable the path-walk API because it
is no longer contributing much value, but perhaps the caller wants a
consistent callback that provides all commits and tags in different
chunks.
> Non-zero tree-depth filters are not supported. Those depend on the depth
> at which a tree is visited, which is a path-walk concept the filter
> machinery doesn't currently share with the path-walk API. Reject them in
> `prepare_filters()` with a helpful error and let pack-objects fall back
> to the regular traversal, the same way it already does for unsupported
> filters.
I think that this could be remedied with some tweaks to the internal
methods and data within the path-walk API to track a depth. This could
be handled later, if there was enough demand for nonzero tree-depth.
The diff itself looks good.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 2/7] path-walk: support `tree:0` filter
2026-05-04 0:11 ` [RFC PATCH 2/7] path-walk: support `tree:0` filter Taylor Blau
2026-05-04 12:30 ` Derrick Stolee
@ 2026-05-04 21:55 ` Kristoffer Haugsbakk
1 sibling, 0 replies; 13+ messages in thread
From: Kristoffer Haugsbakk @ 2026-05-04 21:55 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
On Mon, May 4, 2026, at 02:11, Taylor Blau wrote:
> The `tree:0` object filter omits all trees and blobs from the result,
> keeping only commits and tags. Consequently, this filter type should
> has a fairly straightforward integration with path-walk, as the decision
s/has a/have a/
> to include an object depends only on its type and does not depend on any
> path-sensitive state.
>
>[snip]
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 3/7] path-walk: support `object:type` filter
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
2026-05-04 0:11 ` [RFC PATCH 1/7] pack-objects: update `--path-walk`'s existing incompatibilities Taylor Blau
2026-05-04 0:11 ` [RFC PATCH 2/7] path-walk: support `tree:0` filter Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 12:32 ` Derrick Stolee
2026-05-04 0:11 ` [RFC PATCH 4/7] path-walk: support `combine` filter Taylor Blau
` (4 subsequent siblings)
7 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
The `object:type` filter accepts only objects of a single type; it is
the second member of the object-info-only filter family that bitmap
traversal already supports.
Like `blob:none` and `tree:0`, it can be evaluated with nothing more
than the object's type, which is exactly the granularity path-walk's
existing info->{commits,trees,blobs,tags} flags already control.
Map `LOFC_OBJECT_TYPE` in `prepare_filters()` by AND-ing each flag
against the filtered type. A single `object:type=X` filter
applied to the default info (all flags = 1) leaves `info->X = 1` and
all the others 0, which is what we want.
Using an AND rather than straight assignment prepares us for a
subsequent change to implement combined object filters.
The path-walk machinery is mostly already wired for the per-type
distinction:
- `walk_path()` calls `path_fn` for a batch only when the corresponding
`info->X` flag is set, so unwanted types are silently not reported.
- `add_tree_entries()` skips tree entries of type `OBJ_BLOB` when
`info->blobs` is unset, so we don't even allocate paths for them.
- The commit-walk loop short-circuits the root-tree fetch when
`!info->trees && !info->blobs`, so commit-only filters don't descend
into trees at all.
But there are a couple of side effects of the "trees off, blobs on" case
that need fixing:
1. 'setup_pending_objects()' previously skipped pending trees as soon
as `info->trees` was zero. For 'object:type=blob' the call site
needs those pending trees: a lightweight tag pointing to a tree, or
an annotated tag whose peeled target is a tree, can both reach
blobs that are otherwise unreachable from any commit's root tree.
Loosen the gate to "if (!info->trees && !info->blobs) continue" and
similarly retrieve the root_tree_list whenever either trees or
blobs are wanted.
2. The revision machinery's `handle_commit()` drops pending trees when
`revs->tree_objects` is zero (see the 'OBJ_TREE' handler in
revision.c), so by the time path-walk sees the pending list
after `prepare_revision_walk()` the tree-bearing pendings would
already be gone. Fix this by setting
revs->tree_objects = info->trees || info->blobs
so pending trees survive `prepare_revision_walk()` whenever we
need to walk into them. Path-walk still resets tree_objects to
zero immediately after `prepare_revision_walk()` returns, so the
rev-walk itself never enumerates trees redundantly with
path-walk's own descent.
Add coverage in t6601 for each of the four `object:type` values. The
'object:type=blob' test in particular asserts that file2 and child/file
(both reachable only through tag-pointed trees) show up in the output,
exercising the pending-tree fix.
Update Documentation/git-pack-objects.adoc to add object:type to
the list of supported --filter forms.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 7 ++-
path-walk.c | 23 +++++++-
t/t6601-path-walk.sh | 86 +++++++++++++++++++++++++++++
3 files changed, 110 insertions(+), 6 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index cfb5bc0ae16..22c782611d2 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -404,9 +404,10 @@ will be automatically changed to version `1`.
+
Incompatible with `--delta-islands`. Path-walk supports
the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`sparse:oid=<blob>`, and `tree:0`. Other filter forms fall back to the
-regular object traversal. The `--use-bitmap-index` option will be
-ignored in the presence of `--path-walk`.
+`sparse:oid=<blob>`, `tree:0`, and `object:type=<type>`. Other filter
+forms fall back to the regular object traversal. The
+`--use-bitmap-index` option will be ignored in the presence of
+`--path-walk`.
DELTA ISLANDS
diff --git a/path-walk.c b/path-walk.c
index 36a1e5b967a..b9902abbb75 100644
--- a/path-walk.c
+++ b/path-walk.c
@@ -430,7 +430,7 @@ static int setup_pending_objects(struct path_walk_info *info,
CALLOC_ARRAY(tags, 1);
if (info->blobs)
CALLOC_ARRAY(tagged_blobs, 1);
- if (info->trees)
+ if (info->trees || info->blobs)
root_tree_list = strmap_get(&ctx->paths_to_lists, root_path);
/*
@@ -475,7 +475,7 @@ static int setup_pending_objects(struct path_walk_info *info,
switch (obj->type) {
case OBJ_TREE:
- if (!info->trees)
+ if (!info->trees && !info->blobs)
continue;
if (pending->path) {
char *path = *pending->path ? xstrfmt("%s/", pending->path)
@@ -577,6 +577,16 @@ static int prepare_filters(struct path_walk_info *info,
}
return 1;
+ case LOFC_OBJECT_TYPE:
+ if (info) {
+ info->commits &= options->object_type == OBJ_COMMIT;
+ info->tags &= options->object_type == OBJ_TAG;
+ info->trees &= options->object_type == OBJ_TREE;
+ info->blobs &= options->object_type == OBJ_BLOB;
+ list_objects_filter_release(options);
+ }
+ return 1;
+
case LOFC_SPARSE_OID:
if (info) {
struct object_id sparse_oid;
@@ -683,9 +693,16 @@ int walk_objects_by_path(struct path_walk_info *info)
/*
* Set these values before preparing the walk to catch
* lightweight tags pointing to non-commits and indexed objects.
+ *
+ * Keep tree_objects set whenever blobs are wanted: blobs may
+ * be reachable through trees that show up as pending objects
+ * (e.g., via lightweight tags pointing to trees, or annotated
+ * tags whose peeled target is a tree). Without tree_objects,
+ * prepare_revision_walk() would discard those pending trees
+ * and we would never descend into them.
*/
info->revs->blob_objects = info->blobs;
- info->revs->tree_objects = info->trees;
+ info->revs->tree_objects = info->trees || info->blobs;
if (prepare_revision_walk(info->revs))
die(_("failed to setup revision walk"));
diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh
index 72e09211e63..13016e62ab1 100755
--- a/t/t6601-path-walk.sh
+++ b/t/t6601-path-walk.sh
@@ -635,6 +635,92 @@ test_expect_success 'tree:1 filter is rejected' '
test_grep "tree:1 filter not supported by the path-walk API" err
'
+test_expect_success 'all, object:type=commit filter' '
+ test-tool path-walk --filter=object:type=commit -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:commit::$(git rev-parse topic)
+ 0:commit::$(git rev-parse base)
+ 0:commit::$(git rev-parse base~1)
+ 0:commit::$(git rev-parse base~2)
+ blobs:0
+ commits:4
+ tags:0
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'all, object:type=tag filter' '
+ test-tool path-walk --filter=object:type=tag -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:tag:/tags:$(git rev-parse refs/tags/first)
+ 0:tag:/tags:$(git rev-parse refs/tags/second.1)
+ 0:tag:/tags:$(git rev-parse refs/tags/second.2)
+ 0:tag:/tags:$(git rev-parse refs/tags/third)
+ 0:tag:/tags:$(git rev-parse refs/tags/fourth)
+ 0:tag:/tags:$(git rev-parse refs/tags/tree-tag)
+ 0:tag:/tags:$(git rev-parse refs/tags/blob-tag)
+ blobs:0
+ commits:0
+ tags:7
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'all, object:type=tree filter' '
+ test-tool path-walk --filter=object:type=tree -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:tree::$(git rev-parse topic^{tree})
+ 0:tree::$(git rev-parse base^{tree})
+ 0:tree::$(git rev-parse base~1^{tree})
+ 0:tree::$(git rev-parse base~2^{tree})
+ 0:tree::$(git rev-parse refs/tags/tree-tag^{})
+ 0:tree::$(git rev-parse refs/tags/tree-tag2^{})
+ 1:tree:a/:$(git rev-parse base:a)
+ 2:tree:child/:$(git rev-parse refs/tags/tree-tag:child)
+ 3:tree:left/:$(git rev-parse base:left)
+ 3:tree:left/:$(git rev-parse base~2:left)
+ 4:tree:right/:$(git rev-parse topic:right)
+ 4:tree:right/:$(git rev-parse base~1:right)
+ 4:tree:right/:$(git rev-parse base~2:right)
+ blobs:0
+ commits:0
+ tags:0
+ trees:13
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'all, object:type=blob filter' '
+ test-tool path-walk --filter=object:type=blob -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:blob:/tagged-blobs:$(git rev-parse refs/tags/blob-tag^{})
+ 0:blob:/tagged-blobs:$(git rev-parse refs/tags/blob-tag2^{})
+ 1:blob:a:$(git rev-parse base~2:a)
+ 2:blob:file2:$(git rev-parse refs/tags/tree-tag2^{}:file2)
+ 3:blob:child/file:$(git rev-parse refs/tags/tree-tag:child/file)
+ 4:blob:left/b:$(git rev-parse base:left/b)
+ 4:blob:left/b:$(git rev-parse base~2:left/b)
+ 5:blob:right/c:$(git rev-parse base~2:right/c)
+ 5:blob:right/c:$(git rev-parse topic:right/c)
+ 6:blob:right/d:$(git rev-parse base~1:right/d)
+ blobs:10
+ commits:0
+ tags:0
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
test_expect_success 'setup sparse filter blob' '
# Cone-mode patterns: include root, exclude all dirs, include left/
cat >patterns <<-\EOF &&
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [RFC PATCH 3/7] path-walk: support `object:type` filter
2026-05-04 0:11 ` [RFC PATCH 3/7] path-walk: support `object:type` filter Taylor Blau
@ 2026-05-04 12:32 ` Derrick Stolee
0 siblings, 0 replies; 13+ messages in thread
From: Derrick Stolee @ 2026-05-04 12:32 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Jeff King, Elijah Newren
On 5/3/2026 8:11 PM, Taylor Blau wrote:
> The `object:type` filter accepts only objects of a single type; it is
> the second member of the object-info-only filter family that bitmap
> traversal already supports.
...
> But there are a couple of side effects of the "trees off, blobs on" case
> that need fixing:
>
> 1. 'setup_pending_objects()' previously skipped pending trees as soon
> as `info->trees` was zero. For 'object:type=blob' the call site
> needs those pending trees: a lightweight tag pointing to a tree, or
> an annotated tag whose peeled target is a tree, can both reach
> blobs that are otherwise unreachable from any commit's root tree.
> Loosen the gate to "if (!info->trees && !info->blobs) continue" and
> similarly retrieve the root_tree_list whenever either trees or
> blobs are wanted.
>
> 2. The revision machinery's `handle_commit()` drops pending trees when
> `revs->tree_objects` is zero (see the 'OBJ_TREE' handler in
> revision.c), so by the time path-walk sees the pending list
> after `prepare_revision_walk()` the tree-bearing pendings would
> already be gone. Fix this by setting
>
> revs->tree_objects = info->trees || info->blobs
>
> so pending trees survive `prepare_revision_walk()` whenever we
> need to walk into them. Path-walk still resets tree_objects to
> zero immediately after `prepare_revision_walk()` returns, so the
> rev-walk itself never enumerates trees redundantly with
> path-walk's own descent.
Both of these changes are very valuable bug fixes for the path-walk API!
Thanks for catching the distinction here where we should still be
walking trees in order to find the blobs we want.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 4/7] path-walk: support `combine` filter
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
` (2 preceding siblings ...)
2026-05-04 0:11 ` [RFC PATCH 3/7] path-walk: support `object:type` filter Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 0:11 ` [RFC PATCH 5/7] pack-objects: support reachability bitmaps with `--path-walk` Taylor Blau
` (3 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
The `combine` filter takes the intersection of its children, that is:
objects are shown only when all child filters would admit the object.
The preceding patches added support for many individual filter types.
Enable users to compose these filters by implementing support for the
`combine` filter type.
Mapping intersection onto path_walk_info works because every supported
child filter is a monotonic restriction:
- `blob:none`, `tree:0` unconditionally clear `info->blobs` and (for
`tree:0`) `info->trees`; clearing an already-cleared flag is a
no-op.
- `object:type=X` is now expressed as an AND of each type flag with the
filtered type, so applying multiple such filters only refines the
existing set rather than overwrites it.
- `blob:limit=N` has to compose too: the intersection of "size < L1"
and "size < L2" is "size < min(L1, L2)".
Update the `LOFC_BLOB_LIMIT` handler to take the running minimum when
`info->blob_limit` is already set, so a combined filter with, e.g.,
both "blob:limit=10" and "blob:limit=5" produces a limit of 5
regardless of ordering.
- `sparse:oid` is left unchanged. A `combine` filter that includes a
`sparse:oid` is allowed at most once, since the existing handler
refuses to overwrite `info->pl`. Two `sparse:oid` filters in a single
`combine` would be unusual and are rejected with a warning, matching
the standalone `sparse:oid` behavior.
Implementation-wise, the existing `prepare_filters()` called
`list_objects_filter_release()` inside each case branch. That works fine
for top-level filters, but `combine` filters need to recurse over its
child filters without releasing each one in turn (since the parent's
release iterates the sub array). Split `prepare_filters()` into a
recursive helper that performs only the mutation, plus a thin wrapper
that calls the helper and then releases the top-level filter once.
The `LOFC_COMBINE` case in the helper just walks `sub_nr` and recurses;
child filters are released by the wrapper's single
`list_objects_filter_release()` call on the parent (which itself
recursively releases each sub-filter, the same way it always has).
If any sub-filter is unsupported (e.g. "tree:1", "sparse:<path>", or a
not-yet-supported choice), the recursion bubbles a failure up and the
existing pack-objects/backfill fallback paths kick in.
Add coverage in t6601:
- "combine:blob:none+tree:0" collapses to "tree:0"
- "combine:object:type=blob+blob:limit=3" yields only the blobs
smaller than three bytes
- "combine:object:type=blob+object:type=tree" intersects to empty
- "combine:tree:1+blob:none" reports the "tree:1" error.
Update Documentation/git-pack-objects.adoc to add combine to the
list of supported --filter forms.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 8 ++--
path-walk.c | 31 +++++++++-----
t/t6601-path-walk.sh | 65 +++++++++++++++++++++++++++++
3 files changed, 90 insertions(+), 14 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 22c782611d2..6c7bbff5be5 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -404,10 +404,10 @@ will be automatically changed to version `1`.
+
Incompatible with `--delta-islands`. Path-walk supports
the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`sparse:oid=<blob>`, `tree:0`, and `object:type=<type>`. Other filter
-forms fall back to the regular object traversal. The
-`--use-bitmap-index` option will be ignored in the presence of
-`--path-walk`.
+`sparse:oid=<blob>`, `tree:0`, `object:type=<type>`, and `combine:`
+over any of those. Other filter forms fall back to the regular object
+traversal. The `--use-bitmap-index` option will be ignored in the
+presence of `--path-walk`.
DELTA ISLANDS
diff --git a/path-walk.c b/path-walk.c
index b9902abbb75..6d66da3dc3b 100644
--- a/path-walk.c
+++ b/path-walk.c
@@ -539,28 +539,26 @@ static int setup_pending_objects(struct path_walk_info *info,
return 0;
}
-static int prepare_filters(struct path_walk_info *info,
- struct list_objects_filter_options *options)
+static int prepare_filters_one(struct path_walk_info *info,
+ struct list_objects_filter_options *options)
{
switch (options->choice) {
case LOFC_DISABLED:
return 1;
case LOFC_BLOB_NONE:
- if (info) {
+ if (info)
info->blobs = 0;
- list_objects_filter_release(options);
- }
return 1;
case LOFC_BLOB_LIMIT:
if (info) {
if (!options->blob_limit_value) {
info->blobs = 0;
- } else {
+ } else if (!info->blob_limit ||
+ options->blob_limit_value < info->blob_limit) {
info->blob_limit = options->blob_limit_value;
}
- list_objects_filter_release(options);
}
return 1;
@@ -573,7 +571,6 @@ static int prepare_filters(struct path_walk_info *info,
if (info) {
info->trees = 0;
info->blobs = 0;
- list_objects_filter_release(options);
}
return 1;
@@ -583,7 +580,6 @@ static int prepare_filters(struct path_walk_info *info,
info->tags &= options->object_type == OBJ_TAG;
info->trees &= options->object_type == OBJ_TREE;
info->blobs &= options->object_type == OBJ_BLOB;
- list_objects_filter_release(options);
}
return 1;
@@ -624,8 +620,13 @@ static int prepare_filters(struct path_walk_info *info,
warning(_("sparse filter is not cone-mode compatible"));
return 0;
}
+ }
+ return 1;
- list_objects_filter_release(options);
+ case LOFC_COMBINE:
+ for (size_t i = 0; i < options->sub_nr; i++) {
+ if (!prepare_filters_one(info, &options->sub[i]))
+ return 0;
}
return 1;
@@ -636,6 +637,16 @@ static int prepare_filters(struct path_walk_info *info,
}
}
+static int prepare_filters(struct path_walk_info *info,
+ struct list_objects_filter_options *options)
+{
+ if (!prepare_filters_one(info, options))
+ return 0;
+ if (info)
+ list_objects_filter_release(options);
+ return 1;
+}
+
int path_walk_filter_compatible(struct list_objects_filter_options *options)
{
return prepare_filters(NULL, options);
diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh
index 13016e62ab1..a7d5f0de4ec 100755
--- a/t/t6601-path-walk.sh
+++ b/t/t6601-path-walk.sh
@@ -721,6 +721,71 @@ test_expect_success 'all, object:type=blob filter' '
test_cmp_sorted expect out
'
+test_expect_success 'all, combine:blob:none+tree:0 filter' '
+ test-tool path-walk \
+ --filter=combine:blob:none+tree:0 -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:commit::$(git rev-parse topic)
+ 0:commit::$(git rev-parse base)
+ 0:commit::$(git rev-parse base~1)
+ 0:commit::$(git rev-parse base~2)
+ 1:tag:/tags:$(git rev-parse refs/tags/first)
+ 1:tag:/tags:$(git rev-parse refs/tags/second.1)
+ 1:tag:/tags:$(git rev-parse refs/tags/second.2)
+ 1:tag:/tags:$(git rev-parse refs/tags/third)
+ 1:tag:/tags:$(git rev-parse refs/tags/fourth)
+ 1:tag:/tags:$(git rev-parse refs/tags/tree-tag)
+ 1:tag:/tags:$(git rev-parse refs/tags/blob-tag)
+ blobs:0
+ commits:4
+ tags:7
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'all, combine:object:type=blob+blob:limit=3 filter' '
+ test-tool path-walk \
+ --filter=combine:object:type=blob+blob:limit=3 \
+ -- --all >out &&
+
+ cat >expect <<-EOF &&
+ 0:blob:a:$(git rev-parse base~2:a)
+ 1:blob:left/b:$(git rev-parse base~2:left/b)
+ 2:blob:right/c:$(git rev-parse base~2:right/c)
+ 3:blob:right/d:$(git rev-parse base~1:right/d)
+ blobs:4
+ commits:0
+ tags:0
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'all, combine of disjoint object:types is empty' '
+ test-tool path-walk \
+ --filter=combine:object:type=blob+object:type=tree \
+ -- --all >out &&
+
+ cat >expect <<-EOF &&
+ blobs:0
+ commits:0
+ tags:0
+ trees:0
+ EOF
+
+ test_cmp_sorted expect out
+'
+
+test_expect_success 'combine: rejects unsupported subfilters' '
+ test_must_fail test-tool path-walk \
+ --filter=combine:tree:1+blob:none -- --all 2>err &&
+ test_grep "tree:1 filter not supported by the path-walk API" err
+'
+
test_expect_success 'setup sparse filter blob' '
# Cone-mode patterns: include root, exclude all dirs, include left/
cat >patterns <<-\EOF &&
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* [RFC PATCH 5/7] pack-objects: support reachability bitmaps with `--path-walk`
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
` (3 preceding siblings ...)
2026-05-04 0:11 ` [RFC PATCH 4/7] path-walk: support `combine` filter Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 0:11 ` [RFC PATCH 6/7] pack-objects: extract `record_tree_depth()` helper Taylor Blau
` (2 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.
This behavior dates back to 70664d2865c (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:
/*
* We must disable the bitmaps because we are removing
* the --objects / --objects-edge[-aggressive] options.
*/
In fb2c309b7d3 (pack-objects: pass --objects with --path-walk,
2026-05-02), we adjusted this behavior to also pass "--objects", but
still disable use of reachability bitmaps.
Fortunately, disabling reachability bitmaps is not strictly necessary.
Consider a couple of pack-objects use-cases: one during repacking, when
we would ordinarily generate reachability bitmaps, and another for
serving fetches and clones, when we would ordinarily read existing
bitmaps:
- When attempting to generate reachability bitmaps, we would fail to do
so since path-walk reveals objects through the
`add_objects_by_path()` callback rather than, e.g., `show_commit()`,
so the bitmap selector's `index_commit_for_bitmap()` was never called
for any commit.
The selection routine then had no candidates and bitmap writing was
effectively a no-op.
- On the bitmap-reading side, an invocation like "git pack-objects
--use-bitmap-index --path-walk" never even tried to consult a bitmap,
even when one was sitting on disk that could have answered the
request.
Neither restriction is required. They are discussed in turn:
- For bitmap-writing, all we need is for `index_commit_for_bitmap()` to
see each commit that path-walk visits. The path-walk callback already
groups commits into a single batch keyed by `OBJ_COMMIT`, so invoking
`index_commit_for_bitmap()` from there gives the bitmap selection
routine the same input it would have gotten from `show_commit()` in
the regular traversal.
The candidate set of commits is identical, though the ordering
differs. Bitmap selection is sensitive to commit ordering, but
commits are visited in the same order as we see them from
`get_revision()` so bitmap selection should be identical with or
without `--path-walk`.
- For bitmap-reading, all we need is for `revs->tree_objects` (and so
on for blobs and tags) to be set, otherwise bitmap traversal would
only emit commit objects.
In commit fb2c309b7d3, those flags are set via passing "--objects",
so bitmap traversal under "--path-walk" packs everything just like
any other "--use-bitmap-index" invocation.
If an existing reachability bitmap is unable to satisfy the request (no
bitmap on disk, haves not in the bitmapped pack, etc.) we fall through
to path-walk's own enumeration, just as the regular traversal falls back
when a bitmap is unavailable.
In other words: "--path-walk --use-bitmap-index" uses reachability
bitmaps when available, and otherwise enumerates via path-walk.
The regression in t5310 deserves a word about pack-reuse. With
pack-reuse enabled (the default), the output pack copies whole regions
of the existing bitmapped pack before `traverse_bitmap_commit_list()`
even runs, so a naive test would happily pass even if, say,
`revs->blob_objects` is set to 0. We test both with and without
pack-reuse enabled. When pack-reuse is disabled, pack-objects must
enumerate the resulting bitmap without copying any existing on-disk in
the rev_info setup that pack-reuse would otherwise paper over.
Update Documentation/git-pack-objects.adoc to drop the "ignored"
claim about --use-bitmap-index in favor of describing the new
fallback chain.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 6 +++--
builtin/pack-objects.c | 10 +++++++-
t/t5310-pack-bitmaps.sh | 36 +++++++++++++++++++++++++++++
3 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 6c7bbff5be5..60e594c7bc4 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -406,8 +406,10 @@ Incompatible with `--delta-islands`. Path-walk supports
the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
`sparse:oid=<blob>`, `tree:0`, `object:type=<type>`, and `combine:`
over any of those. Other filter forms fall back to the regular object
-traversal. The `--use-bitmap-index` option will be ignored in the
-presence of `--path-walk`.
+traversal. When `--use-bitmap-index` is specified with `--path-walk`, a
+successful bitmap traversal is used for object enumeration, with
+path-walk remaining as the fallback traversal when the bitmap cannot
+satisfy the request.
DELTA ISLANDS
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ba00d8148ab..1a5f1afd32e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4732,6 +4732,15 @@ static int add_objects_by_path(const char *path,
continue;
add_object_entry(oid, type, path, exclude);
+
+ if (type == OBJ_COMMIT && write_bitmap_index) {
+ struct commit *commit;
+
+ commit = lookup_commit(the_repository, oid);
+ if (!commit)
+ die(_("could not find commit %s"), oid_to_hex(oid));
+ index_commit_for_bitmap(commit);
+ }
}
oe_end = to_pack.nr_objects;
@@ -5193,7 +5202,6 @@ int cmd_pack_objects(int argc,
if (path_walk) {
strvec_push(&rp, "--boundary");
strvec_push(&rp, "--objects");
- use_bitmap_index = 0;
} else if (thin) {
use_internal_rev_list = 1;
strvec_push(&rp, shallow
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f693cb56691..69c5da1580a 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -577,6 +577,42 @@ test_bitmap_cases
sane_unset GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL
+test_expect_success 'path-walk repack can write and use bitmap indexes' '
+ test_when_finished "rm -rf path-walk-bitmap" &&
+ git init path-walk-bitmap &&
+ (
+ cd path-walk-bitmap &&
+ test_commit first &&
+ test_commit second &&
+ test_commit third &&
+
+ git repack -a -d -b --path-walk &&
+ git rev-list --test-bitmap --use-bitmap-index HEAD &&
+
+ git rev-parse HEAD >in &&
+
+ git rev-list --objects --no-object-names HEAD >expect.raw &&
+ sort expect.raw >expect &&
+
+ for reuse in true false
+ do
+ : >trace.txt &&
+
+ GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+ git -c pack.allowPackReuse=$reuse pack-objects \
+ --stdout --revs --path-walk --use-bitmap-index \
+ <in >out.pack &&
+ grep "\"category\":\"bitmap\",\"key\":\"bitmap/hits\"" trace.txt &&
+
+ git index-pack out.pack &&
+
+ list_packed_objects out.idx >actual.raw &&
+ sort actual.raw >actual &&
+ test_cmp expect actual || return 1
+ done
+ )
+'
+
test_expect_success 'incremental repack fails when bitmaps are requested' '
test_commit more-1 &&
test_must_fail git repack -d 2>err &&
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* [RFC PATCH 6/7] pack-objects: extract `record_tree_depth()` helper
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
` (4 preceding siblings ...)
2026-05-04 0:11 ` [RFC PATCH 5/7] pack-objects: support reachability bitmaps with `--path-walk` Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 0:11 ` [RFC PATCH 7/7] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
2026-05-04 12:13 ` [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Derrick Stolee
7 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
Prepare for a future change that needs to record tree depths from a
second call site by factoring out the delta islands-specific portion of
`show_object()` out into a helper, `record_tree_depth()`.
`record_tree_depth()` takes a tree OID along with the path that
`show_object()` received, and computes the directory depth from the
slash count in the path.
While we're in the area, make a few minor clean-ups:
- Gate the call on `obj->type == OBJ_TREE`, as we only care to compute
the depth for tree objects. The sole caller of `oe_tree_depth()`
resides in `delta-islands.c::resolve_tree_islands()`, and only calls
`oe_tree_depth()` behind `oe_type(...) == OBJ_TREE`.
- Defer computing the depth for an object until we know it is in the
`to_pack` list.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
builtin/pack-objects.c | 34 ++++++++++++++++++++--------------
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1a5f1afd32e..842d1fcac29 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2722,6 +2722,24 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
pack->tree_depth[e - pack->objects] = tree_depth;
}
+static void record_tree_depth(const struct object_id *oid, const char *name)
+{
+ const char *p;
+ unsigned depth;
+ struct object_entry *ent = packlist_find(&to_pack, oid);
+
+ if (!ent)
+ return;
+
+ /* the empty string is a root tree, which is depth 0 */
+ depth = *name ? 1 : 0;
+ for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+ depth++;
+
+ if (depth > oe_tree_depth(&to_pack, ent))
+ oe_set_tree_depth(&to_pack, ent, depth);
+}
+
/*
* Return the size of the object without doing any delta
* reconstruction (so non-deltas are true object sizes, but deltas
@@ -4375,20 +4393,8 @@ static void show_object(struct object *obj, const char *name,
add_preferred_base_object(name);
add_object_entry(&obj->oid, obj->type, name, 0);
- if (use_delta_islands) {
- const char *p;
- unsigned depth;
- struct object_entry *ent;
-
- /* the empty string is a root tree, which is depth 0 */
- depth = *name ? 1 : 0;
- for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
- depth++;
-
- ent = packlist_find(&to_pack, &obj->oid);
- if (ent && depth > oe_tree_depth(&to_pack, ent))
- oe_set_tree_depth(&to_pack, ent, depth);
- }
+ if (use_delta_islands && obj->type == OBJ_TREE)
+ record_tree_depth(&obj->oid, name);
}
static void show_object__ma_allow_any(struct object *obj, const char *name, void *data)
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* [RFC PATCH 7/7] pack-objects: support `--delta-islands` with `--path-walk`
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
` (5 preceding siblings ...)
2026-05-04 0:11 ` [RFC PATCH 6/7] pack-objects: extract `record_tree_depth()` helper Taylor Blau
@ 2026-05-04 0:11 ` Taylor Blau
2026-05-04 12:13 ` [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Derrick Stolee
7 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2026-05-04 0:11 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Derrick Stolee, Jeff King, Elijah Newren
Since the inception of `--path-walk`, this option has a documented
incompatibility with `--delta-islands`.
When discussing those original patches on the list, a message from
Stolee in [1] noted the following:
this could be remedied by [...] doing a separate walk to identify
islands using the normal method
In a related portion of the thread, Peff explains[2]:
The delta islands code already does its own tree walk to propagate
the bits down (it does rely on the base walk's show_commit() to
propagate through the commits).
Once each object has its island bitmaps, I think however you
choose to come up with delta candidates [...] you should be able
to use it. It's fundamentally just answering the question of "am
I allowed to delta between these two objects".
That is similar to what this patch does, and it turns out the cheaper
option (do the side-effects inside the path-walk callback rather than
via a second walk) is sufficient.
Recall how delta-islands are computed during a normal repack:
- `show_commit()` calls `propagate_island_marks()` for each commit,
which merges the commit's island bitset onto its root tree object and
onto each of its parent commits.
- `show_object()` for a tree records the tree's depth derived from the
slash-separated pathname. Subsequent `resolve_tree_islands()` uses
that depth to walk trees in increasing-depth order, propagating each
tree's marks to its children.
- At delta-search time, `in_same_island()` enforces that a delta
target's island bitmap is a subset of its base's: every island
that reaches the target must also reach the base.
Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack', but until now did not perform any of the
island-related side effects. Two things are needed:
- For each commit batch, call `propagate_island_marks()` on the commit,
exactly as show_commit() does.
Order matters here. `mark_remote_island_1()` only seeds marks on
tip commits, so a non-tip commit has marks in the `island_marks` map
only after some descendant has already had `propagate_island_marks()`
run on it. If we see a commit before its descendants, its
`island_marks` entry would still be empty, the call would be a no-op,
and that commit's root tree would never receive any marks at all.
As a consequence, `resolve_tree_islands()` would later look up the
tree, find nothing, and propagate nothing. The traversal must visit
children before parents.
The path-walk batch preserves that order mechanically. Path-walk
appends commits to its `OBJ_COMMIT` batch as they come back from the
same `get_revision()` loop the regular traversal uses, and
`add_objects_by_path()` iterates the batch in array order. So every
commit reaches `propagate_island_marks()` in the same sequence that
`show_commit()` would have seen it, and the descendant-first chain
that the algorithm relies on is intact.
Skip the call for boundary commits to match `show_commit()`, which is
only invoked for interesting commits (this call is a no-op anyway for
boundary commits since they are not in 'island_marks', but matching
`show_commit()` exactly keeps the two enumeration modes tidy).
- For each tree batch, record the tree's depth from the path. Use the
`record_tree_depth()` helper from the previous commit so both
callbacks behave identically, including the "max-depth-wins" behavior
when a tree is reached via more than one path. The helper accepts
both the show_object() path shape ("foo", "foo/bar") and the
path-walk shape with a trailing '/' ("foo/", "foo/bar/"), so depths
recorded from either traversal mode are directly comparable.
This is implicit in the implementation sketch from Peff above.
`resolve_tree_islands()` sorts trees by `oe->tree_depth` (ascending)
before propagating marks down, so that a parent tree's marks are
finalized before its children inherit them. Without recording the
depth at path-walk time, every path-walk-discovered tree would land
at depth 0 in `to_pack`, the sort would lose its ordering, and
children could inherit marks from parents whose own contributions had
not yet been merged in.
With those two pieces in place, `resolve_tree_islands()` receives
identical input to a normal traversal, so the existing correctness
argument carries over verbatim: depth-ordered processing guarantees that
a parent tree's marks are propagated to a child only after the parent
itself has been finalized, and the "is-this-a-subset" check at delta
time is the same regardless of how the marks got there.
Coverage in t5320 exercises both repack flavors (with and without '-b'),
confirms that cross-island deltas remain forbidden, and that
intra-island deltas are still allowed.
[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/git-pack-objects.adoc | 15 +++++++--------
builtin/pack-objects.c | 22 ++++++++++++++++++----
t/t5320-delta-islands.sh | 29 +++++++++++++++++++++++++++++
3 files changed, 54 insertions(+), 12 deletions(-)
diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
index 60e594c7bc4..aa7a9721203 100644
--- a/Documentation/git-pack-objects.adoc
+++ b/Documentation/git-pack-objects.adoc
@@ -402,14 +402,13 @@ will be automatically changed to version `1`.
of filenames that cause collisions in Git's default name-hash
algorithm.
+
-Incompatible with `--delta-islands`. Path-walk supports
-the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
-`sparse:oid=<blob>`, `tree:0`, `object:type=<type>`, and `combine:`
-over any of those. Other filter forms fall back to the regular object
-traversal. When `--use-bitmap-index` is specified with `--path-walk`, a
-successful bitmap traversal is used for object enumeration, with
-path-walk remaining as the fallback traversal when the bitmap cannot
-satisfy the request.
+Path-walk supports the `--filter=<spec>` forms `blob:none`,
+`blob:limit=<n>`, `sparse:oid=<blob>`, `tree:0`, `object:type=<type>`,
+and `combine:` over any of those. Other filter forms fall back to the
+regular object traversal. When `--use-bitmap-index` is specified with
+`--path-walk`, a successful bitmap traversal is used for object
+enumeration, with path-walk remaining as the fallback traversal when
+the bitmap cannot satisfy the request.
DELTA ISLANDS
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 842d1fcac29..d79366db3de 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4739,13 +4739,29 @@ static int add_objects_by_path(const char *path,
add_object_entry(oid, type, path, exclude);
- if (type == OBJ_COMMIT && write_bitmap_index) {
+ if (type == OBJ_COMMIT) {
struct commit *commit;
+ if (!write_bitmap_index && !use_delta_islands)
+ continue;
+
commit = lookup_commit(the_repository, oid);
if (!commit)
die(_("could not find commit %s"), oid_to_hex(oid));
- index_commit_for_bitmap(commit);
+ if (write_bitmap_index)
+ index_commit_for_bitmap(commit);
+ /*
+ * Skip island propagation for boundary commits.
+ * The regular traversal's show_commit() is only
+ * called for interesting commits; matching that
+ * here keeps path-walk from doing extra work that
+ * would only be a no-op anyway (boundary commits
+ * are not in island_marks).
+ */
+ if (use_delta_islands && !exclude)
+ propagate_island_marks(the_repository, commit);
+ } else if (type == OBJ_TREE && use_delta_islands) {
+ record_tree_depth(oid, path);
}
}
@@ -5196,8 +5212,6 @@ int cmd_pack_objects(int argc,
const char *option = NULL;
if (!path_walk_filter_compatible(&filter_options))
option = "--filter";
- else if (use_delta_islands)
- option = "--delta-islands";
if (option) {
warning(_("cannot use %s with %s"),
diff --git a/t/t5320-delta-islands.sh b/t/t5320-delta-islands.sh
index 2c961c70963..9b28344a0a3 100755
--- a/t/t5320-delta-islands.sh
+++ b/t/t5320-delta-islands.sh
@@ -53,6 +53,35 @@ test_expect_success 'separate islands disallows delta' '
! is_delta_base $two $one
'
+test_expect_success 'path-walk island repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-islands" \
+ git -c "pack.island=refs/heads/(.*)" repack -adfi \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-islands &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk island bitmap repack respects islands' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-island-bitmap" \
+ git -c "pack.island=refs/heads/(.*)" repack -a -d -f -i -b \
+ --path-walk 2>err &&
+ test_region pack-objects path-walk trace.path-walk-island-bitmap &&
+ test_path_is_file .git/objects/pack/*.bitmap &&
+ git rev-list --test-bitmap --use-bitmap-index one &&
+ test_grep ! "cannot use --delta-islands with --path-walk" err &&
+ ! is_delta_base $one $two &&
+ ! is_delta_base $two $one
+'
+
+test_expect_success 'path-walk same island allows delta' '
+ GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-same-island" \
+ git -c "pack.island=refs/heads" repack -adfi --path-walk &&
+ test_region pack-objects path-walk trace.path-walk-same-island &&
+ is_delta_base $one $two
+'
+
test_expect_success 'same island allows delta' '
git -c "pack.island=refs/heads" repack -adfi &&
is_delta_base $one $two
--
2.54.0.4.g6aa0d38a4ec
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities
2026-05-04 0:11 [RFC PATCH 0/7] pack-bitmap: resolve various `--path-walk` incompatibilities Taylor Blau
` (6 preceding siblings ...)
2026-05-04 0:11 ` [RFC PATCH 7/7] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
@ 2026-05-04 12:13 ` Derrick Stolee
7 siblings, 0 replies; 13+ messages in thread
From: Derrick Stolee @ 2026-05-04 12:13 UTC (permalink / raw)
To: Taylor Blau, git; +Cc: Junio C Hamano, Jeff King, Elijah Newren
On 5/3/2026 8:11 PM, Taylor Blau wrote:
> (Note to the maintainer, this is built on top of 'ds/path-walk-filters').
>
> Between other tasks, I have been working on trying to integrate
> `--path-walk` within GitHub's infrastructure. In order to do this,
> `--path-walk` must work with features that GitHub depends on, such as
> reachability bitmaps and delta-islands (along with filters, shallow,
> etc., though more on that below).
>
> I had been sitting on these patches for a few days in my fork before
> Stolee sent his series in [1] which resolves incompatibilities between
> the `--path-walk` option and various filter types. Since I figured that
> others are working in this area I wanted to send a reworked version of
> my series for a couple of reasons:
>
> 1. Since reviewers are already looking at this area as a consequence of
> Stolee's series, this topic should be slightly easier to review
> while the area is fresh.
I agree that we should review both series together. It's been a while
since the original path-walk API series, so it may require some refresh
of all its nuances.
> 2. In case Stolee (or others) are working on resolving the
> incompatibility between `--path-walk` and either delta-islands or
> reachability bitmaps, this series can either combine with those (if
> any) or serve as inspiration (if others are in the process of
> writing such series).
I was _not_ working on bitmap compatibility, but I'm grateful to see it!
> When writing this originally, I had borrowed the same filter-application
> mechanism from bitmaps, which supports trivial filters (e.g., blob:none,
> tree:0, and combinations therein). Stolee's series is a strict
> improvement on that approach supporting sparse:<oid> filters as well, so
> I reworked my filtering-related patches based on that.
You have some new filters that I had not considered, so they are welcome
additions. If you don't mind, I could add them into my series, as they
may be more appropriate grouped with the other filter changes.
> The patches surrounding bitmaps and delta-islands are largely
> unchanged from when I had originally written them:
>
> * Supporting bitmaps with `--path-walk` is mostly straightforward, and
> boils down to ensuring that the path-walk-specific object callback
> indexes any commit(s) it sees for bitmapping.
>
> * Supporting delta-islands with `--path-walk` required a bit more
> surgery, and involves propagating island marks for commits in the
> path-walk-specific callback, as well as recording tree depth
> information in the same spot.
>
> I'm submitting these patches as an RFC, since (a) I haven't thought
> deeply about the approach taken here and could very well be on the wrong
> track, and (b) in case Stolee or others want to combine forces here
> and/or coordinate around each other.
I'll definitely take a very close read of these patches, as there are
some interesting interactions here.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 13+ messages in thread