From: Junio C Hamano <gitster@pobox.com>
To: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH 5/5] path-walk: support wildcard pathspecs for blob filtering
Date: Tue, 17 Mar 2026 15:19:00 -0700
Message-ID: <xmqqms06hzfv.fsf@gitster.g>
In-Reply-To: <beb1c92554c76907315a4d1a7983226d2bf5a828.1773707361.git.gitgitgadget@gmail.com> (Derrick Stolee via GitGitGadget's message of "Tue, 17 Mar 2026 00:29:21 +0000")
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Derrick Stolee <stolee@gmail.com>
>
> Previously, walk_objects_by_path() silently ignored pathspecs containing
> wildcards or magic by clearing them. This caused all blobs to be
> downloaded regardless of the given pathspec. Wildcard pathspecs like
> "d/file.*.txt" are useful for narrowing which blobs to process (e.g.,
> during 'git backfill').
>
> Support wildcard pathspecs by making three changes:
>
> 1. Add an 'exact_pathspecs' flag to path_walk_context. When the
> pathspec has no wildcards or magic, set this flag and use the
> existing fast-path prefix matching in add_tree_entries(). When
> wildcards are present, skip that block since prefix matching
> cannot handle glob patterns.
>
> 2. Disable revision-level commit pruning (revs->prune = 0) for
> wildcard pathspecs. The revision walk uses the pathspec to filter
> commits via TREESAME detection. For exact prefix pathspecs this
> works well, but wildcard pathspecs may fail to match through
> TREESAME because fnmatch with WM_PATHNAME does not cross directory
> boundaries. Disabling pruning ensures all commits are visited and
> their trees are available for the path-walk to filter.
Hmph, I wonder how significant an impact it has on performance
that we have to disable pruning here. With the bog-standard tree
traversal, wouldn't tree_entry_interesting() already be capable of
doing this, even with fnmatch / WM_PATHNAME?
> 3. Add a match_pathspec() check in walk_path() to filter out blobs
> whose full path does not match the pathspec. This provides the
> actual blob-level filtering for wildcard pathspecs.
>
> Signed-off-by: Derrick Stolee <stolee@gmail.com>
> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The latter person cannot sign DCO or vouch for the origin of what
they have written in this patch, can they?
> ---
> path-walk.c | 22 ++++++++++++++--------
> t/t5620-backfill.sh | 7 +++----
> 2 files changed, 17 insertions(+), 12 deletions(-)
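
For readers following the pathname-matching point above, here is a minimal
sketch of the two matching modes the commit message describes: literal
prefix matching for pathspecs without wildcards, and glob matching where
'*' does not cross a '/'. It uses POSIX fnmatch() with FNM_PATHNAME as a
stand-in for Git's wildmatch with WM_PATHNAME. The helpers has_glob() and
path_selected() are hypothetical names made up for this illustration; they
are not taken from path-walk.c and do not handle pathspec magic.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the pattern contain glob metacharacters? */
static int has_glob(const char *pattern)
{
	return strpbrk(pattern, "*?[\\") != NULL;
}

/*
 * Hypothetical helper: is this blob path selected by the pattern?
 *
 * Glob patterns are matched against the full path with FNM_PATHNAME,
 * so '*' and '?' never match a '/'.  Literal patterns match either
 * the exact path or a leading directory of it.
 */
static int path_selected(const char *pattern, const char *path)
{
	size_t len;

	assert(pattern && path);

	if (has_glob(pattern))
		return fnmatch(pattern, path, FNM_PATHNAME) == 0;

	len = strlen(pattern);
	if (strncmp(pattern, path, len) != 0)
		return 0;

	/* Exact match, or the pattern names a leading directory. */
	return path[len] == '\0' || path[len] == '/' ||
	       (len > 0 && pattern[len - 1] == '/');
}
```

Under these assumptions, path_selected("d/file.*.txt", "d/file.1.txt") is
true, while the same pattern rejects "d/sub/file.1.txt" and the pattern
"*.txt" rejects "d/file.1.txt", because the wildcard stops at the
directory separator. The literal pattern "d" selects "d/file.1.txt" but
not "dir/a.txt".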