* [PATCH v2] ref-filter: restore prefix-scoped iteration
@ 2026-06-09 2:34 Tamir Duberstein
2026-06-10 10:50 ` Karthik Nayak
2026-06-10 12:29 ` [PATCH v3] " Tamir Duberstein
0 siblings, 2 replies; 5+ messages in thread
From: Tamir Duberstein @ 2026-06-09 2:34 UTC (permalink / raw)
To: git
Cc: Karthik Nayak, Patrick Steinhardt, Junio C Hamano, Victoria Dye,
ZheNing Hu, Tamir Duberstein
Commit dabecb9db2 (for-each-ref: introduce a '--start-after' option,
2025-07-15) changed single-kind branch, remote-tracking branch, and tag
enumeration in do_filter_refs() from constructing an iterator with the
namespace prefix to constructing an unscoped iterator and applying the
prefix with ref_iterator_seek().
Before that change, refs_for_each_fullref_in() passed the namespace
prefix during iterator construction. That helper has since been
replaced by refs_for_each_ref_ext().
The files backend primes its loose-ref cache for the construction
prefix before it opens packed refs. An empty construction prefix
therefore reads every loose ref, and a later seek cannot undo that I/O.
Consequently, git branch, git branch --remotes, and git tag scale with
unrelated loose refs.
Patrick Steinhardt observed during review that iterator construction
and seeking accepted similar strings but assigned them different state
semantics. Junio C Hamano then pointed out that no current command can
combine start_after with this single-kind path, but future branch or
tag support would need to keep the namespace while moving the cursor.
Keep the existing start_after path unchanged. The iterator API cannot
currently seek to one string while retaining another as its prefix:
an unflagged seek clears the prefix, while REF_ITERATOR_SEEK_SET_PREFIX
replaces it with the seek string.
For the commands affected by this regression, which do not set
start_after, pass the namespace prefix during iterator construction so
that loose refs are scoped before the packed-refs snapshot is opened.
This fixes the current regression without deleting the ref-filter state
discussed during review or changing its dormant behavior.
Add REFFILES-gated performance cases with one branch, one
remote-tracking branch, one tag, and 10,000 unrelated loose refs. The
benchmarks were run with:
GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS=-j8 \
t/perf/run a89346e34a . -- p6300-for-each-ref.sh
The following are the best of five runs, with each run invoking the
command ten times. Times are elapsed seconds with user and system CPU
seconds in parentheses:
a89346e34a this commit
branch 2.74(0.13+2.56) 0.11(0.04+0.04)
branch --remotes 2.81(0.13+2.62) 0.12(0.04+0.04)
tag 3.01(0.14+2.82) 0.11(0.04+0.04)
Both revisions used the default -O2 build flags and a config.mak
containing only "NO_REGEX = NeedsStartEnd". They were built with Apple
clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
with a 16-core Apple M4 Max (12 performance and four efficiency cores)
and 128 GB RAM.
Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
Assisted-by: Codex gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
The series is based on a89346e34a (maint) because the regression has
been present in released versions since Git 2.51.0.
---
Changes in v2:
- Extract local variable `store`.
- Link to v1: https://patch.msgid.link/20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com
---
ref-filter.c | 28 +++++++++++++++++++---------
t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 57 insertions(+), 10 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 1da4c0e60d..5cbc007d64 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3315,19 +3315,29 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
prefix = "refs/tags/";
if (prefix) {
- struct ref_iterator *iter;
+ struct ref_store *store = get_main_ref_store(the_repository);
- iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
- "", NULL, 0, 0);
+ if (filter->start_after) {
+ struct ref_iterator *iter;
+
+ iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
- if (filter->start_after)
ret = start_ref_iterator_after(iter, filter->start_after);
- else
- ret = ref_iterator_seek(iter, prefix,
- REF_ITERATOR_SEEK_SET_PREFIX);
+ if (!ret)
+ ret = do_for_each_ref_iterator(iter, fn,
+ cb_data);
+ } else {
+ /*
+ * Pass the prefix during construction because the files
+ * backend primes loose refs before a later seek can
+ * narrow the iterator.
+ */
+ struct refs_for_each_ref_options opts = {
+ .prefix = prefix,
+ };
- if (!ret)
- ret = do_for_each_ref_iterator(iter, fn, cb_data);
+ ret = refs_for_each_ref_ext(store, fn, cb_data, &opts);
+ }
} else if (filter->kind & FILTER_REFS_REGULAR) {
ret = for_each_fullref_in_pattern(filter, fn, cb_data);
}
diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
index fa7289c752..ed9c1c6a19 100755
--- a/t/perf/p6300-for-each-ref.sh
+++ b/t/perf/p6300-for-each-ref.sh
@@ -1,6 +1,6 @@
#!/bin/sh
-test_description='performance of for-each-ref'
+test_description='performance of ref-filter users'
. ./perf-lib.sh
test_perf_fresh_repo
@@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
'
run_tests "packed"
+test_expect_success REFFILES 'setup many unrelated loose refs' '
+ git init scoped &&
+ test_commit -C scoped --no-tag base &&
+ test_seq $ref_count_per_type |
+ sed "s,.*,update refs/custom/unrelated_& HEAD," |
+ git -C scoped update-ref --stdin &&
+ git -C scoped update-ref refs/remotes/origin/main HEAD &&
+ git -C scoped update-ref refs/tags/only HEAD
+'
+
+test_perf "branch (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "branch --remotes (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --remotes --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "tag (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git tag --format='%(refname)' >/dev/null
+ done
+ )
+"
+
test_done
---
base-commit: a89346e34a937f001e5d397ee62224e3e9852040
change-id: 20260605-fix-git-branch-regression-9e4236f18091
Best regards,
--
Tamir Duberstein <tamird@gmail.com>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] ref-filter: restore prefix-scoped iteration
2026-06-09 2:34 [PATCH v2] ref-filter: restore prefix-scoped iteration Tamir Duberstein
@ 2026-06-10 10:50 ` Karthik Nayak
2026-06-10 12:25 ` Tamir Duberstein
2026-06-10 12:29 ` [PATCH v3] " Tamir Duberstein
1 sibling, 1 reply; 5+ messages in thread
From: Karthik Nayak @ 2026-06-10 10:50 UTC (permalink / raw)
To: Tamir Duberstein, git
Cc: Patrick Steinhardt, Junio C Hamano, Victoria Dye, ZheNing Hu
[-- Attachment #1: Type: text/plain, Size: 8571 bytes --]
Tamir Duberstein <tamird@gmail.com> writes:
> Commit dabecb9db2 (for-each-ref: introduce a '--start-after' option,
> 2025-07-15) changed single-kind branch, remote-tracking branch, and tag
> enumeration in do_filter_refs() from constructing an iterator with the
> namespace prefix to constructing an unscoped iterator and applying the
> prefix with ref_iterator_seek().
>
> Before that change, refs_for_each_fullref_in() passed the namespace
> prefix during iterator construction. That helper has since been
> replaced by refs_for_each_ref_ext().
>
> The files backend primes its loose-ref cache for the construction
> prefix before it opens packed refs. An empty construction prefix
> therefore reads every loose ref, and a later seek cannot undo that I/O.
> Consequently, git branch, git branch --remotes, and git tag scale with
> unrelated loose refs.
>
And this is the crux of the issue. Currently we do
- refs_ref_iterator_begin()
- ref_iterator_seek()
And between the two `cache_ref_iterator_set_prefix()` is already called
which caches all the loose refs. This is the IO intensive operation this
patch tries to avoid.
I think it would be worthwhile to add this information in the commit
message.
>
> Patrick Steinhardt observed during review that iterator construction
> and seeking accepted similar strings but assigned them different state
> semantics. Junio C Hamano then pointed out that no current command can
> combine start_after with this single-kind path, but future branch or
> tag support would need to keep the namespace while moving the cursor.
>
> Keep the existing start_after path unchanged. The iterator API cannot
> currently seek to one string while retaining another as its prefix:
> an unflagged seek clears the prefix, while REF_ITERATOR_SEEK_SET_PREFIX
> replaces it with the seek string.
>
> For the commands affected by this regression, which do not set
> start_after, pass the namespace prefix during iterator construction so
> that loose refs are scoped before the packed-refs snapshot is opened.
> This fixes the current regression without deleting the ref-filter state
> discussed during review or changing its dormant behavior.
>
> Add REFFILES-gated performance cases with one branch, one
> remote-tracking branch, one tag, and 10,000 unrelated loose refs. The
> benchmarks were run with:
>
> GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS=-j8 \
> t/perf/run a89346e34a . -- p6300-for-each-ref.sh
>
> The following are the best of five runs, with each run invoking the
> command ten times. Times are elapsed seconds with user and system CPU
> seconds in parentheses:
>
> a89346e34a this commit
> branch 2.74(0.13+2.56) 0.11(0.04+0.04)
> branch --remotes 2.81(0.13+2.62) 0.12(0.04+0.04)
> tag 3.01(0.14+2.82) 0.11(0.04+0.04)
>
> Both revisions used the default -O2 build flags and a config.mak
> containing only "NO_REGEX = NeedsStartEnd". They were built with Apple
> clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
> with a 16-core Apple M4 Max (12 performance and four efficiency cores)
> and 128 GB RAM.
>
> Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
> Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
> Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
> Assisted-by: Codex gpt-5.5
> Signed-off-by: Tamir Duberstein <tamird@gmail.com>
> ---
> The series is based on a89346e34a (maint) because the regression has
> been present in released versions since Git 2.51.0.
> ---
> Changes in v2:
> - Extract local variable `store`.
> - Link to v1: https://patch.msgid.link/20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com
> ---
> ref-filter.c | 28 +++++++++++++++++++---------
> t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 57 insertions(+), 10 deletions(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 1da4c0e60d..5cbc007d64 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -3315,19 +3315,29 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
> prefix = "refs/tags/";
>
> if (prefix) {
> - struct ref_iterator *iter;
> + struct ref_store *store = get_main_ref_store(the_repository);
>
> - iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
> - "", NULL, 0, 0);
> + if (filter->start_after) {
> + struct ref_iterator *iter;
> +
> + iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
>
> - if (filter->start_after)
> ret = start_ref_iterator_after(iter, filter->start_after);
> - else
> - ret = ref_iterator_seek(iter, prefix,
> - REF_ITERATOR_SEEK_SET_PREFIX);
> + if (!ret)
> + ret = do_for_each_ref_iterator(iter, fn,
> + cb_data);
> + } else {
> + /*
> + * Pass the prefix during construction because the files
> + * backend primes loose refs before a later seek can
> + * narrow the iterator.
> + */
> + struct refs_for_each_ref_options opts = {
> + .prefix = prefix,
> + };
>
> - if (!ret)
> - ret = do_for_each_ref_iterator(iter, fn, cb_data);
> + ret = refs_for_each_ref_ext(store, fn, cb_data, &opts);
> + }
This would work, as now we separate out the regular path to use
`do_for_each_ref_iterator()` instead.
But this causes a bit of confusion, why do we need to use
`do_for_each_ref_iterator()` and why not simply provide the prefix to
`refs_ref_iterator_begin()`, like before?
On top of master, the below diff seems to fix the issue and works with
the benchmarks provided in this patch. (I haven't tested it with out
test suite though).
modified ref-filter.c
@@ -3316,15 +3316,16 @@ static int do_filter_refs(struct ref_filter
*filter, unsigned int type, refs_for
if (prefix) {
struct ref_iterator *iter;
+ struct ref_store *store;
- iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
- "", NULL, 0, 0);
+ store = get_main_ref_store(the_repository);
- if (filter->start_after)
+ if (filter->start_after) {
+ iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
ret = start_ref_iterator_after(iter, filter->start_after);
- else
- ret = ref_iterator_seek(iter, prefix,
- REF_ITERATOR_SEEK_SET_PREFIX);
+ } else {
+ iter = refs_ref_iterator_begin(store, prefix, NULL, 0, 0);
+ }
if (!ret)
ret = do_for_each_ref_iterator(iter, fn, cb_data);
I would say something like this would make more sense, since it still
keeps the current structure without introducing a new command.
> } else if (filter->kind & FILTER_REFS_REGULAR) {
> ret = for_each_fullref_in_pattern(filter, fn, cb_data);
> }
> diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
> index fa7289c752..ed9c1c6a19 100755
> --- a/t/perf/p6300-for-each-ref.sh
> +++ b/t/perf/p6300-for-each-ref.sh
> @@ -1,6 +1,6 @@
> #!/bin/sh
>
> -test_description='performance of for-each-ref'
> +test_description='performance of ref-filter users'
> . ./perf-lib.sh
>
> test_perf_fresh_repo
> @@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
> '
> run_tests "packed"
>
> +test_expect_success REFFILES 'setup many unrelated loose refs' '
> + git init scoped &&
> + test_commit -C scoped --no-tag base &&
> + test_seq $ref_count_per_type |
> + sed "s,.*,update refs/custom/unrelated_& HEAD," |
> + git -C scoped update-ref --stdin &&
> + git -C scoped update-ref refs/remotes/origin/main HEAD &&
> + git -C scoped update-ref refs/tags/only HEAD
> +'
> +
> +test_perf "branch (many unrelated loose refs)" --prereq REFFILES "
> + (
> + cd scoped &&
> + for i in \$(test_seq $test_iteration_count); do
> + git branch --format='%(refname)' >/dev/null
> + done
> + )
> +"
> +
> +test_perf "branch --remotes (many unrelated loose refs)" --prereq REFFILES "
> + (
> + cd scoped &&
> + for i in \$(test_seq $test_iteration_count); do
> + git branch --remotes --format='%(refname)' >/dev/null
> + done
> + )
> +"
> +
> +test_perf "tag (many unrelated loose refs)" --prereq REFFILES "
> + (
> + cd scoped &&
> + for i in \$(test_seq $test_iteration_count); do
> + git tag --format='%(refname)' >/dev/null
> + done
> + )
> +"
> +
> test_done
>
> ---
> base-commit: a89346e34a937f001e5d397ee62224e3e9852040
> change-id: 20260605-fix-git-branch-regression-9e4236f18091
>
> Best regards,
> --
> Tamir Duberstein <tamird@gmail.com>
Thanks for the patch, this is indeed a regression we must fix and the
benchmarks are a clear indication of it.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] ref-filter: restore prefix-scoped iteration
2026-06-10 10:50 ` Karthik Nayak
@ 2026-06-10 12:25 ` Tamir Duberstein
0 siblings, 0 replies; 5+ messages in thread
From: Tamir Duberstein @ 2026-06-10 12:25 UTC (permalink / raw)
To: Karthik Nayak
Cc: git, Patrick Steinhardt, Junio C Hamano, Victoria Dye, ZheNing Hu
On Wed, Jun 10, 2026 at 3:50 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Tamir Duberstein <tamird@gmail.com> writes:
>
> > Commit dabecb9db2 (for-each-ref: introduce a '--start-after' option,
> > 2025-07-15) changed single-kind branch, remote-tracking branch, and tag
> > enumeration in do_filter_refs() from constructing an iterator with the
> > namespace prefix to constructing an unscoped iterator and applying the
> > prefix with ref_iterator_seek().
> >
> > Before that change, refs_for_each_fullref_in() passed the namespace
> > prefix during iterator construction. That helper has since been
> > replaced by refs_for_each_ref_ext().
> >
> > The files backend primes its loose-ref cache for the construction
> > prefix before it opens packed refs. An empty construction prefix
> > therefore reads every loose ref, and a later seek cannot undo that I/O.
> > Consequently, git branch, git branch --remotes, and git tag scale with
> > unrelated loose refs.
> >
>
> And this is the crux of the issue. Currently we do
>
> - refs_ref_iterator_begin()
> - ref_iterator_seek()
>
> And between the two `cache_ref_iterator_set_prefix()` is already called
> which caches all the loose refs. This is the IO intensive operation this
> patch tries to avoid.
>
> I think it would be worthwhile to add this information in the commit
> message.
Agreed. I will explain that `cache_ref_iterator_set_prefix()` primes
the loose-ref cache during iterator construction, before the later
seek can narrow it.
>
> >
> > Patrick Steinhardt observed during review that iterator construction
> > and seeking accepted similar strings but assigned them different state
> > semantics. Junio C Hamano then pointed out that no current command can
> > combine start_after with this single-kind path, but future branch or
> > tag support would need to keep the namespace while moving the cursor.
> >
> > Keep the existing start_after path unchanged. The iterator API cannot
> > currently seek to one string while retaining another as its prefix:
> > an unflagged seek clears the prefix, while REF_ITERATOR_SEEK_SET_PREFIX
> > replaces it with the seek string.
> >
> > For the commands affected by this regression, which do not set
> > start_after, pass the namespace prefix during iterator construction so
> > that loose refs are scoped before the packed-refs snapshot is opened.
> > This fixes the current regression without deleting the ref-filter state
> > discussed during review or changing its dormant behavior.
> >
> > Add REFFILES-gated performance cases with one branch, one
> > remote-tracking branch, one tag, and 10,000 unrelated loose refs. The
> > benchmarks were run with:
> >
> > GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS=-j8 \
> > t/perf/run a89346e34a . -- p6300-for-each-ref.sh
> >
> > The following are the best of five runs, with each run invoking the
> > command ten times. Times are elapsed seconds with user and system CPU
> > seconds in parentheses:
> >
> > a89346e34a this commit
> > branch 2.74(0.13+2.56) 0.11(0.04+0.04)
> > branch --remotes 2.81(0.13+2.62) 0.12(0.04+0.04)
> > tag 3.01(0.14+2.82) 0.11(0.04+0.04)
> >
> > Both revisions used the default -O2 build flags and a config.mak
> > containing only "NO_REGEX = NeedsStartEnd". They were built with Apple
> > clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
> > with a 16-core Apple M4 Max (12 performance and four efficiency cores)
> > and 128 GB RAM.
> >
> > Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
> > Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
> > Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
> > Assisted-by: Codex gpt-5.5
> > Signed-off-by: Tamir Duberstein <tamird@gmail.com>
> > ---
> > The series is based on a89346e34a (maint) because the regression has
> > been present in released versions since Git 2.51.0.
> > ---
> > Changes in v2:
> > - Extract local variable `store`.
> > - Link to v1: https://patch.msgid.link/20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com
> > ---
> > ref-filter.c | 28 +++++++++++++++++++---------
> > t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
> > 2 files changed, 57 insertions(+), 10 deletions(-)
> >
> > diff --git a/ref-filter.c b/ref-filter.c
> > index 1da4c0e60d..5cbc007d64 100644
> > --- a/ref-filter.c
> > +++ b/ref-filter.c
> > @@ -3315,19 +3315,29 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
> > prefix = "refs/tags/";
> >
> > if (prefix) {
> > - struct ref_iterator *iter;
> > + struct ref_store *store = get_main_ref_store(the_repository);
> >
> > - iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
> > - "", NULL, 0, 0);
> > + if (filter->start_after) {
> > + struct ref_iterator *iter;
> > +
> > + iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
> >
> > - if (filter->start_after)
> > ret = start_ref_iterator_after(iter, filter->start_after);
> > - else
> > - ret = ref_iterator_seek(iter, prefix,
> > - REF_ITERATOR_SEEK_SET_PREFIX);
> > + if (!ret)
> > + ret = do_for_each_ref_iterator(iter, fn,
> > + cb_data);
> > + } else {
> > + /*
> > + * Pass the prefix during construction because the files
> > + * backend primes loose refs before a later seek can
> > + * narrow the iterator.
> > + */
> > + struct refs_for_each_ref_options opts = {
> > + .prefix = prefix,
> > + };
> >
> > - if (!ret)
> > - ret = do_for_each_ref_iterator(iter, fn, cb_data);
> > + ret = refs_for_each_ref_ext(store, fn, cb_data, &opts);
> > + }
>
> This would work, as now we separate out the regular path to use
> `do_for_each_ref_iterator()` instead.
>
> But this causes a bit of confusion, why do we need to use
> `do_for_each_ref_iterator()` and why not simply provide the prefix to
> `refs_ref_iterator_begin()`, like before?
We do not. Your version is simpler and preserves the existing iterator
flow. I have adopted it for v3. Thanks!
> [...]
>
> Thanks for the patch, this is indeed a regression we must fix and the
> benchmarks are a clear indication of it.
Thank you! I'll try not to break threading on the next roll.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v3] ref-filter: restore prefix-scoped iteration
2026-06-09 2:34 [PATCH v2] ref-filter: restore prefix-scoped iteration Tamir Duberstein
2026-06-10 10:50 ` Karthik Nayak
@ 2026-06-10 12:29 ` Tamir Duberstein
2026-06-12 11:48 ` Patrick Steinhardt
1 sibling, 1 reply; 5+ messages in thread
From: Tamir Duberstein @ 2026-06-10 12:29 UTC (permalink / raw)
To: git
Cc: Karthik Nayak, Patrick Steinhardt, Junio C Hamano, Victoria Dye,
ZheNing Hu, Tamir Duberstein
dabecb9db2 (for-each-ref: introduce a '--start-after' option,
2025-07-15) changed branch, remote-tracking branch, and tag enumeration
from constructing an iterator with the namespace prefix to constructing
an unscoped iterator and seeking to the prefix.
The files backend constructs its loose-ref iterator with cache priming
enabled. cache_ref_iterator_begin() immediately applies the construction
prefix through cache_ref_iterator_set_prefix(), reading loose refs
beneath it before packed refs are opened. An empty prefix therefore
reads every loose ref, and a later seek cannot undo that I/O.
For these single-kind filters, construct the iterator with the namespace
prefix when start_after is not set. Keep the existing unscoped
construction for start_after, whose seek position may differ from the
namespace prefix.
With 10,000 unrelated loose refs, the p6300 tests improve as follows:
before after
branch 2.74 s 0.11 s
branch --remotes 2.81 s 0.12 s
tag 3.01 s 0.11 s
Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
Link: https://lore.kernel.org/r/CAOLa=ZRHKNNymXGk31YgECjUmF9nZ8GsPUdQb7aKBH5DKMz7=w@mail.gmail.com
Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
Suggested-by: Karthik Nayak <karthik.188@gmail.com>
Assisted-by: Codex gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
The series is based on a89346e34a (maint) because the regression has
been present in released versions since Git 2.51.0.
---
Changes in v3:
- Construct the iterator directly with the namespace prefix.
- Explain when the files backend primes its loose-ref cache.
- Condense the commit message and performance results.
- Link to v2: https://patch.msgid.link/20260608-fix-git-branch-regression-v2-1-fd82075a8520@gmail.com
Changes in v2:
- Extract local variable `store`.
- Link to v1: https://patch.msgid.link/20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com
---
ref-filter.c | 13 ++++++-------
t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
2 files changed, 44 insertions(+), 8 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 1da4c0e60d..9b04e3af85 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3316,15 +3316,14 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
if (prefix) {
struct ref_iterator *iter;
+ struct ref_store *store = get_main_ref_store(the_repository);
- iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
- "", NULL, 0, 0);
-
- if (filter->start_after)
+ if (filter->start_after) {
+ iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
ret = start_ref_iterator_after(iter, filter->start_after);
- else
- ret = ref_iterator_seek(iter, prefix,
- REF_ITERATOR_SEEK_SET_PREFIX);
+ } else {
+ iter = refs_ref_iterator_begin(store, prefix, NULL, 0, 0);
+ }
if (!ret)
ret = do_for_each_ref_iterator(iter, fn, cb_data);
diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
index fa7289c752..ed9c1c6a19 100755
--- a/t/perf/p6300-for-each-ref.sh
+++ b/t/perf/p6300-for-each-ref.sh
@@ -1,6 +1,6 @@
#!/bin/sh
-test_description='performance of for-each-ref'
+test_description='performance of ref-filter users'
. ./perf-lib.sh
test_perf_fresh_repo
@@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
'
run_tests "packed"
+test_expect_success REFFILES 'setup many unrelated loose refs' '
+ git init scoped &&
+ test_commit -C scoped --no-tag base &&
+ test_seq $ref_count_per_type |
+ sed "s,.*,update refs/custom/unrelated_& HEAD," |
+ git -C scoped update-ref --stdin &&
+ git -C scoped update-ref refs/remotes/origin/main HEAD &&
+ git -C scoped update-ref refs/tags/only HEAD
+'
+
+test_perf "branch (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "branch --remotes (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git branch --remotes --format='%(refname)' >/dev/null
+ done
+ )
+"
+
+test_perf "tag (many unrelated loose refs)" --prereq REFFILES "
+ (
+ cd scoped &&
+ for i in \$(test_seq $test_iteration_count); do
+ git tag --format='%(refname)' >/dev/null
+ done
+ )
+"
+
test_done
---
base-commit: a89346e34a937f001e5d397ee62224e3e9852040
change-id: 20260605-fix-git-branch-regression-9e4236f18091
Best regards,
--
Tamir Duberstein <tamird@gmail.com>
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3] ref-filter: restore prefix-scoped iteration
2026-06-10 12:29 ` [PATCH v3] " Tamir Duberstein
@ 2026-06-12 11:48 ` Patrick Steinhardt
0 siblings, 0 replies; 5+ messages in thread
From: Patrick Steinhardt @ 2026-06-12 11:48 UTC (permalink / raw)
To: Tamir Duberstein
Cc: git, Karthik Nayak, Junio C Hamano, Victoria Dye, ZheNing Hu
On Wed, Jun 10, 2026 at 05:29:49AM -0700, Tamir Duberstein wrote:
> dabecb9db2 (for-each-ref: introduce a '--start-after' option,
> 2025-07-15) changed branch, remote-tracking branch, and tag enumeration
> from constructing an iterator with the namespace prefix to constructing
> an unscoped iterator and seeking to the prefix.
>
> The files backend constructs its loose-ref iterator with cache priming
> enabled. cache_ref_iterator_begin() immediately applies the construction
> prefix through cache_ref_iterator_set_prefix(), reading loose refs
> beneath it before packed refs are opened. An empty prefix therefore
> reads every loose ref, and a later seek cannot undo that I/O.
>
> For these single-kind filters, construct the iterator with the namespace
> prefix when start_after is not set. Keep the existing unscoped
> construction for start_after, whose seek position may differ from the
> namespace prefix.
>
> With 10,000 unrelated loose refs, the p6300 tests improve as follows:
>
> before after
> branch 2.74 s 0.11 s
> branch --remotes 2.81 s 0.12 s
> tag 3.01 s 0.11 s
>
> Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
> Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
> Link: https://lore.kernel.org/r/CAOLa=ZRHKNNymXGk31YgECjUmF9nZ8GsPUdQb7aKBH5DKMz7=w@mail.gmail.com
I honestly have no idea what you want to say with these links, as they
seem to just link to random reviews mails when the above mentioned
commit was reviewed. In general, we typically try to embed references
like this into the explanation, like:
In [1], we discussed... and this is relevant because of ...
[1]: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
Just dropping the links as-is without much of an explanation isn't
helpful.
> diff --git a/ref-filter.c b/ref-filter.c
> index 1da4c0e60d..9b04e3af85 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -3316,15 +3316,14 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
>
> if (prefix) {
> struct ref_iterator *iter;
> + struct ref_store *store = get_main_ref_store(the_repository);
>
> - iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
> - "", NULL, 0, 0);
> -
> - if (filter->start_after)
> + if (filter->start_after) {
> + iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
> ret = start_ref_iterator_after(iter, filter->start_after);
> - else
> - ret = ref_iterator_seek(iter, prefix,
> - REF_ITERATOR_SEEK_SET_PREFIX);
> + } else {
> + iter = refs_ref_iterator_begin(store, prefix, NULL, 0, 0);
> + }
>
> if (!ret)
> ret = do_for_each_ref_iterator(iter, fn, cb_data);
The patch itself seems sensible to me.
> diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
> index fa7289c752..ed9c1c6a19 100755
> --- a/t/perf/p6300-for-each-ref.sh
> +++ b/t/perf/p6300-for-each-ref.sh
> @@ -1,6 +1,6 @@
> #!/bin/sh
>
> -test_description='performance of for-each-ref'
> +test_description='performance of ref-filter users'
> . ./perf-lib.sh
>
> test_perf_fresh_repo
> @@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
> '
> run_tests "packed"
>
> +test_expect_success REFFILES 'setup many unrelated loose refs' '
> + git init scoped &&
> + test_commit -C scoped --no-tag base &&
> + test_seq $ref_count_per_type |
> + sed "s,.*,update refs/custom/unrelated_& HEAD," |
> + git -C scoped update-ref --stdin &&
> + git -C scoped update-ref refs/remotes/origin/main HEAD &&
> + git -C scoped update-ref refs/tags/only HEAD
> +'
I've already called this out before on other patches, but the REFFILES
prerequisite just doesn't make any sense here as this test logic is
generic.
Patrick
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-12 11:48 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09 2:34 [PATCH v2] ref-filter: restore prefix-scoped iteration Tamir Duberstein
2026-06-10 10:50 ` Karthik Nayak
2026-06-10 12:25 ` Tamir Duberstein
2026-06-10 12:29 ` [PATCH v3] " Tamir Duberstein
2026-06-12 11:48 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox