* [PATCH] ref-filter: restore prefix-scoped iteration
@ 2026-06-05 16:43 Tamir Duberstein
2026-06-08 21:36 ` Junio C Hamano
0 siblings, 1 reply; 3+ messages in thread
From: Tamir Duberstein @ 2026-06-05 16:43 UTC (permalink / raw)
To: git
Cc: Karthik Nayak, Patrick Steinhardt, Junio C Hamano, Victoria Dye,
ZheNing Hu, Tamir Duberstein
Commit dabecb9db2 (for-each-ref: introduce a '--start-after' option,
2025-07-15) changed single-kind branch, remote-tracking branch, and tag
enumeration in do_filter_refs() from constructing an iterator with the
namespace prefix to constructing an unscoped iterator and applying the
prefix with ref_iterator_seek().
Before that change, refs_for_each_fullref_in() passed the namespace
prefix during iterator construction. That helper has since been
replaced by refs_for_each_ref_ext().
The files backend primes its loose-ref cache for the construction
prefix before it opens packed refs. An empty construction prefix
therefore reads every loose ref, and a later seek cannot undo that I/O.
Consequently, git branch, git branch --remotes, and git tag scale with
unrelated loose refs.
Patrick Steinhardt observed during review that iterator construction
and seeking accepted similar strings but assigned them different state
semantics. Junio C Hamano then pointed out that no current command can
combine start_after with this single-kind path, but future branch or
tag support would need to keep the namespace while moving the cursor.
Keep the existing start_after path unchanged. The iterator API cannot
currently seek to one string while retaining another as its prefix:
an unflagged seek clears the prefix, while REF_ITERATOR_SEEK_SET_PREFIX
replaces it with the seek string.
For the commands affected by this regression, which do not set
start_after, pass the namespace prefix during iterator construction so
that loose refs are scoped before the packed-refs snapshot is opened.
This fixes the current regression without deleting the ref-filter state
discussed during review or changing its dormant behavior.
Add REFFILES-gated performance cases with one branch, one
remote-tracking branch, one tag, and 10,000 unrelated loose refs. The
benchmarks were run with:
GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS=-j8 \
t/perf/run a89346e34a . -- p6300-for-each-ref.sh
The following are the best of five runs, with each run invoking the
command ten times. Times are elapsed seconds with user and system CPU
seconds in parentheses:
a89346e34a this commit
branch 2.74(0.13+2.56) 0.11(0.04+0.04)
branch --remotes 2.81(0.13+2.62) 0.12(0.04+0.04)
tag 3.01(0.14+2.82) 0.11(0.04+0.04)
Both revisions used the default -O2 build flags and a config.mak
containing only "NO_REGEX = NeedsStartEnd". They were built with Apple
clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
with a 16-core Apple M4 Max (12 performance and four efficiency cores)
and 128 GB RAM.
Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
Assisted-by: Codex gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
The series is based on a89346e34a (maint) because the regression has
been present in released versions since Git 2.51.0.
---
ref-filter.c | 30 +++++++++++++++++++++---------
t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 59 insertions(+), 10 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 1da4c0e60d..2388a57b39 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3315,19 +3315,31 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
prefix = "refs/tags/";
if (prefix) {
- struct ref_iterator *iter;
+ if (filter->start_after) {
+ struct ref_iterator *iter;
- iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
- "", NULL, 0, 0);
+ iter = refs_ref_iterator_begin(
+ get_main_ref_store(the_repository), "", NULL, 0,
+ 0);
- if (filter->start_after)
ret = start_ref_iterator_after(iter, filter->start_after);
- else
- ret = ref_iterator_seek(iter, prefix,
- REF_ITERATOR_SEEK_SET_PREFIX);
+ if (!ret)
+ ret = do_for_each_ref_iterator(iter, fn,
+ cb_data);
+ } else {
+ /*
+ * Pass the prefix during construction because the files
+ * backend primes loose refs before a later seek can
+ * narrow the iterator.
+ */
+ struct refs_for_each_ref_options opts = {
+ .prefix = prefix,
+ };
- if (!ret)
- ret = do_for_each_ref_iterator(iter, fn, cb_data);
+ ret = refs_for_each_ref_ext(
+ get_main_ref_store(the_repository), fn, cb_data,
+ &opts);
+ }
} else if (filter->kind & FILTER_REFS_REGULAR) {
ret = for_each_fullref_in_pattern(filter, fn, cb_data);
}
diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
index fa7289c752..ed9c1c6a19 100755
--- a/t/perf/p6300-for-each-ref.sh
+++ b/t/perf/p6300-for-each-ref.sh
@@ -1,6 +1,6 @@
#!/bin/sh
-test_description='performance of for-each-ref'
+test_description='performance of ref-filter users'
. ./perf-lib.sh
test_perf_fresh_repo
@@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
'
run_tests "packed"
+test_expect_success REFFILES 'setup many unrelated loose refs' '
+ git init scoped &&
+ test_commit -C scoped --no-tag base &&
+ test_seq $ref_count_per_type |
+ sed "s,.*,update refs/custom/unrelated_& HEAD," |
+ git -C scoped update-ref --stdin &&
+ git -C scoped update-ref refs/remotes/origin/main HEAD &&
+ git -C scoped update-ref refs/tags/only HEAD
+'
+
+test_perf "branch (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "branch --remotes (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --remotes --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "tag (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git tag --format='%(refname)' >/dev/null
+ done
+ )
+"
+
test_done
---
base-commit: a89346e34a937f001e5d397ee62224e3e9852040
change-id: 20260605-fix-git-branch-regression-9e4236f18091
Best regards,
--
Tamir Duberstein <tamird@gmail.com>
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] ref-filter: restore prefix-scoped iteration
2026-06-05 16:43 [PATCH] ref-filter: restore prefix-scoped iteration Tamir Duberstein
@ 2026-06-08 21:36 ` Junio C Hamano
2026-06-08 22:39 ` Tamir Duberstein
0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2026-06-08 21:36 UTC (permalink / raw)
To: Tamir Duberstein
Cc: git, Karthik Nayak, Patrick Steinhardt, Victoria Dye, ZheNing Hu
Tamir Duberstein <tamird@gmail.com> writes:
> diff --git a/ref-filter.c b/ref-filter.c
> index 1da4c0e60d..2388a57b39 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -3315,19 +3315,31 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
> prefix = "refs/tags/";
>
> if (prefix) {
Below, adding an extra call to get_main_ref_store(the_repository)
makes one line unnecessarily split and harder to read. How about
doing
struct ref_store *store = get_main_ref_store(the_repository);
upfront here, and then use that to replace these two calls of
get_main_ref_store(the_repository)?
> + if (filter->start_after) {
> + struct ref_iterator *iter;
>
> + iter = refs_ref_iterator_begin(
> + get_main_ref_store(the_repository), "", NULL, 0,
> + 0);
>
> ret = start_ref_iterator_after(iter, filter->start_after);
> + if (!ret)
> + ret = do_for_each_ref_iterator(iter, fn,
> + cb_data);
> + } else {
> + /*
> + * Pass the prefix during construction because the files
> + * backend primes loose refs before a later seek can
> + * narrow the iterator.
> + */
> + struct refs_for_each_ref_options opts = {
> + .prefix = prefix,
> + };
>
> + ret = refs_for_each_ref_ext(
> + get_main_ref_store(the_repository), fn, cb_data,
> + &opts);
> + }
> } else if (filter->kind & FILTER_REFS_REGULAR) {
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] ref-filter: restore prefix-scoped iteration
2026-06-08 21:36 ` Junio C Hamano
@ 2026-06-08 22:39 ` Tamir Duberstein
0 siblings, 0 replies; 3+ messages in thread
From: Tamir Duberstein @ 2026-06-08 22:39 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Karthik Nayak, Patrick Steinhardt, Victoria Dye, ZheNing Hu
On Mon, Jun 8, 2026 at 2:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Tamir Duberstein <tamird@gmail.com> writes:
>
> > diff --git a/ref-filter.c b/ref-filter.c
> > index 1da4c0e60d..2388a57b39 100644
> > --- a/ref-filter.c
> > +++ b/ref-filter.c
> > @@ -3315,19 +3315,31 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
> > prefix = "refs/tags/";
> >
> > if (prefix) {
>
> Below, adding an extra call to get_main_ref_store(the_repository)
> makes one line unnecessarily split and harder to read. How about
> doing
>
> struct ref_store *store = get_main_ref_store(the_repository);
>
> upfront here, and then use that to replace these two calls of
> get_main_ref_store(the_repository)?
Yep, done in v2.
Thanks for the review!
By the way, how long should I wait before sending new versions of my
patches? I have 4 outstanding at the moment.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-06-08 22:40 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-05 16:43 [PATCH] ref-filter: restore prefix-scoped iteration Tamir Duberstein
2026-06-08 21:36 ` Junio C Hamano
2026-06-08 22:39 ` Tamir Duberstein
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox