From: Tamir Duberstein <tamird@gmail.com>
To: git@vger.kernel.org
Cc: Karthik Nayak <karthik.188@gmail.com>,
Patrick Steinhardt <ps@pks.im>,
Junio C Hamano <gitster@pobox.com>,
Victoria Dye <vdye@github.com>,
ZheNing Hu <adlternative@gmail.com>,
Tamir Duberstein <tamird@gmail.com>
Subject: [PATCH v2] ref-filter: restore prefix-scoped iteration
Date: Mon, 08 Jun 2026 19:34:57 -0700 [thread overview]
Message-ID: <20260608-fix-git-branch-regression-v2-1-fd82075a8520@gmail.com> (raw)
Commit dabecb9db2 (for-each-ref: introduce a '--start-after' option,
2025-07-15) changed single-kind branch, remote-tracking branch, and tag
enumeration in do_filter_refs() from constructing an iterator with the
namespace prefix to constructing an unscoped iterator and applying the
prefix with ref_iterator_seek().
Before that change, refs_for_each_fullref_in() passed the namespace
prefix during iterator construction. That helper has since been
replaced by refs_for_each_ref_ext().
The files backend primes its loose-ref cache for the construction
prefix before it opens packed refs. An empty construction prefix
therefore reads every loose ref, and a later seek cannot undo that I/O.
Consequently, git branch, git branch --remotes, and git tag scale with
unrelated loose refs.
Patrick Steinhardt observed during review that iterator construction
and seeking accepted similar strings but assigned them different state
semantics. Junio C Hamano then pointed out that no current command can
combine start_after with this single-kind path, but future branch or
tag support would need to keep the namespace while moving the cursor.
Keep the existing start_after path unchanged. The iterator API cannot
currently seek to one string while retaining another as its prefix:
an unflagged seek clears the prefix, while REF_ITERATOR_SEEK_SET_PREFIX
replaces it with the seek string.
For the commands affected by this regression, which do not set
start_after, pass the namespace prefix during iterator construction so
that loose refs are scoped before the packed-refs snapshot is opened.
This fixes the current regression without deleting the ref-filter state
discussed during review or changing its dormant behavior.
Add REFFILES-gated performance cases with one branch, one
remote-tracking branch, one tag, and 10,000 unrelated loose refs. The
benchmarks were run with:
GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS=-j8 \
t/perf/run a89346e34a . -- p6300-for-each-ref.sh
The following are the best of five runs, with each run invoking the
command ten times. Times are elapsed seconds with user and system CPU
seconds in parentheses:
a89346e34a this commit
branch 2.74(0.13+2.56) 0.11(0.04+0.04)
branch --remotes 2.81(0.13+2.62) 0.12(0.04+0.04)
tag 3.01(0.14+2.82) 0.11(0.04+0.04)
Both revisions used the default -O2 build flags and a config.mak
containing only "NO_REGEX = NeedsStartEnd". They were built with Apple
clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
with a 16-core Apple M4 Max (12 performance and four efficiency cores)
and 128 GB RAM.
Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
Assisted-by: Codex gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
The series is based on a89346e34a (maint) because the regression has
been present in released versions since Git 2.51.0.
---
Changes in v2:
- Extract local variable `store`.
- Link to v1: https://patch.msgid.link/20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com
---
ref-filter.c | 28 +++++++++++++++++++---------
t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 57 insertions(+), 10 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 1da4c0e60d..5cbc007d64 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3315,19 +3315,29 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
prefix = "refs/tags/";
if (prefix) {
- struct ref_iterator *iter;
+ struct ref_store *store = get_main_ref_store(the_repository);
- iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
- "", NULL, 0, 0);
+ if (filter->start_after) {
+ struct ref_iterator *iter;
+
+ iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
- if (filter->start_after)
ret = start_ref_iterator_after(iter, filter->start_after);
- else
- ret = ref_iterator_seek(iter, prefix,
- REF_ITERATOR_SEEK_SET_PREFIX);
+ if (!ret)
+ ret = do_for_each_ref_iterator(iter, fn,
+ cb_data);
+ } else {
+ /*
+ * Pass the prefix during construction because the files
+ * backend primes loose refs before a later seek can
+ * narrow the iterator.
+ */
+ struct refs_for_each_ref_options opts = {
+ .prefix = prefix,
+ };
- if (!ret)
- ret = do_for_each_ref_iterator(iter, fn, cb_data);
+ ret = refs_for_each_ref_ext(store, fn, cb_data, &opts);
+ }
} else if (filter->kind & FILTER_REFS_REGULAR) {
ret = for_each_fullref_in_pattern(filter, fn, cb_data);
}
diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
index fa7289c752..ed9c1c6a19 100755
--- a/t/perf/p6300-for-each-ref.sh
+++ b/t/perf/p6300-for-each-ref.sh
@@ -1,6 +1,6 @@
#!/bin/sh
-test_description='performance of for-each-ref'
+test_description='performance of ref-filter users'
. ./perf-lib.sh
test_perf_fresh_repo
@@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
'
run_tests "packed"
+test_expect_success REFFILES 'setup many unrelated loose refs' '
+ git init scoped &&
+ test_commit -C scoped --no-tag base &&
+ test_seq $ref_count_per_type |
+ sed "s,.*,update refs/custom/unrelated_& HEAD," |
+ git -C scoped update-ref --stdin &&
+ git -C scoped update-ref refs/remotes/origin/main HEAD &&
+ git -C scoped update-ref refs/tags/only HEAD
+'
+
+test_perf "branch (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "branch --remotes (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --remotes --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "tag (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git tag --format='%(refname)' >/dev/null
+ done
+ )
+"
+
test_done
---
base-commit: a89346e34a937f001e5d397ee62224e3e9852040
change-id: 20260605-fix-git-branch-regression-9e4236f18091
Best regards,
--
Tamir Duberstein <tamird@gmail.com>
next reply other threads:[~2026-06-09 2:35 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-09 2:34 Tamir Duberstein [this message]
2026-06-10 10:50 ` [PATCH v2] ref-filter: restore prefix-scoped iteration Karthik Nayak
2026-06-10 12:25 ` Tamir Duberstein
2026-06-10 12:29 ` [PATCH v3] " Tamir Duberstein
2026-06-12 11:48 ` Patrick Steinhardt
2026-06-12 21:24 ` Tamir Duberstein
2026-06-12 21:27 ` [PATCH v4] " Tamir Duberstein
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260608-fix-git-branch-regression-v2-1-fd82075a8520@gmail.com \
--to=tamird@gmail.com \
--cc=adlternative@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
--cc=ps@pks.im \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox