Git development
* Re: [PATCH v2] ls-files: filter pathspec before lstat
From: Tamir Duberstein @ 2026-06-09  3:38 UTC
  To: Junio C Hamano; +Cc: git, René Scharfe, Patrick Steinhardt, Jeff King
In-Reply-To: <xmqqv7bstmw8.fsf@gitster.g>

On Mon, Jun 8, 2026 at 8:26 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Tamir Duberstein <tamird@gmail.com> writes:
>
> > show_files() checks whether each index entry is deleted or modified
> > before show_ce() applies the pathspec. prune_index() avoids most of this
> > work for pathspecs with a common directory prefix, but a top-level name
> > or leading wildcard leaves every entry to be checked.
> > ...
>
> Please make sure that your v2 is sent as a response to v1; otherwise
> readers lose sight of the previous iteration.
>
> > Changes in v2:
> > - Restrict early matching to one pathspec, avoiding the regression Jeff
> >   demonstrated with many pathspecs.
> > - Add all-matching and many-pathspec performance results.
> > - Drop the Assisted-by trailer.
> > - Link to v1: https://patch.msgid.link/20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com
>
> Forcing readers to follow such a link is *not* a replacement for that.
>
> Instead, please make sure each piece of your e-mail identifies where
> it fits in the discussion thread by pointing at the message of the
> previous round with its In-Reply-To: header.
>
> Thanks.

Apologies, I used b4, which follows kernel rules. I'll follow this
guidance in the future.


* Re: [PATCH v2] ls-files: filter pathspec before lstat
From: Junio C Hamano @ 2026-06-09  3:26 UTC
  To: Tamir Duberstein; +Cc: git, René Scharfe, Patrick Steinhardt, Jeff King
In-Reply-To: <20260608-ls-files-pathspec-lstat-v2-1-fb734b28422e@gmail.com>

Tamir Duberstein <tamird@gmail.com> writes:

> show_files() checks whether each index entry is deleted or modified
> before show_ce() applies the pathspec. prune_index() avoids most of this
> work for pathspecs with a common directory prefix, but a top-level name
> or leading wildcard leaves every entry to be checked.
> ...

Please make sure that your v2 is sent as a response to v1; otherwise
readers lose sight of the previous iteration.

> Changes in v2:
> - Restrict early matching to one pathspec, avoiding the regression Jeff
>   demonstrated with many pathspecs.
> - Add all-matching and many-pathspec performance results.
> - Drop the Assisted-by trailer.
> - Link to v1: https://patch.msgid.link/20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com

Forcing readers to follow such a link is *not* a replacement for that.

Instead, please make sure each piece of your e-mail identifies where
it fits in the discussion thread by pointing at the message of the
previous round with its In-Reply-To: header.

Thanks.


* [GSoC] [Blog] week 2: Improving the new git repo command
From: K Jayatheerth @ 2026-06-09  3:10 UTC
  To: GIT Mailing-list, Justin Tobler, Lucas Seiki Oshiro
In-Reply-To: <CA+rGoLee083Whzi3b9CP3Hxrq_cz58enN67ZQq5r0koczKeU1A@mail.gmail.com>

Hey everyone,

My Week 2 GSoC blog is live!
https://jayatheerth.com/blogs/gsoc/week-2-feedback1

Feel free to give it a read and share any feedback ; )

Regards,
- K Jayatheerth


* Re: [GSoC PATCH v2 1/4] path: introduce format_path() for centralized path formatting
From: K Jayatheerth @ 2026-06-09  2:47 UTC
  To: Lucas Seiki Oshiro
  Cc: git, a3205153416, gitster, jltobler, kumarayushjha123,
	phillip.wood, sandals
In-Reply-To: <22E79E77-BCC3-4622-BD39-F4ED7DDA9511@gmail.com>

>
> Nitpick: the documentation is clear to me, but maybe the function name
> "format" and the parameter name "buf" can mislead the user to think
> that it only formats the path without appending to the existing string
> in `buf`. My suggestion is to rename them to something like
> `append_formatted_path` and `dest`, respectively.
>

Ok, that's a good point!
I will add this in the next series!


>
> > +test_repo_info_path () {
> > + field_name=$1
> > + expect_absolute_eval=$2
> > + expect_relative=$3
> > + env_prefix=$4
>
> This helper function needs documentation.
>

Alright, I will add that.

> > + test_expect_success "query individual key: path.$field_name.absolute${env_prefix:+ ($env_prefix)}" '
>
> This pollutes the output. What about changing it to something like:
>
>         test_expect_success "absolute: $label" '...'
>         test_expect_success "relative: $label" '...'
>
> with a custom label?
>

Ahh, interesting.
I agree, I will look into this!

> > +
> > +test_expect_success 'setup test repository layout for path fields' '
> > + git init test-repo &&
> > + mkdir -p test-repo/sub
> > +'
>
> The helper function `test_repo_info_path` is relying too much on the
> existence of the `test-repo`. I think it would be better to add a new
> parameter `repo_name` (or similar) because
>
> 1. You could move this creation to the helper function and
>    you won't need to place the test after that creation
>
> 2. You could use a different repository for each
>    (test_repo_info_path call, path format) pair. Currently, if more than
>    one test fails, its result is overwritten and the `expect` and
>    `actual` files from the trash directory will be those of the last
>    broken test.
>
> 3. You won't need to use the hacky 'echo "$(cd .. && pwd)"'
>
> This applies my suggestions (feel free to use, adapt or discard it):
>

Thanks!
That is helpful.

Regards,
- K Jayatheerth


* [PATCH v2] ls-files: filter pathspec before lstat
From: Tamir Duberstein @ 2026-06-09  2:37 UTC
  To: git
  Cc: René Scharfe, Patrick Steinhardt, Junio C Hamano, Jeff King,
	Tamir Duberstein

show_files() checks whether each index entry is deleted or modified
before show_ce() applies the pathspec. prune_index() avoids most of this
work for pathspecs with a common directory prefix, but a top-level name
or leading wildcard leaves every entry to be checked.

For a single pathspec, match it before lstat() in the deleted and
modified modes. Keep the later match in show_ce() so --error-unmatch is
satisfied only by entries that are actually shown.

match_pathspec() is linear in the number of pathspec items. Applying it
early for every item can therefore multiply the work for commands with
many pathspecs, especially when lstat() shows that no entries are
modified. Restrict the early check to one pathspec. Callers with
multiple pathspecs retain the existing lstat()-first order.

On a repository with 859,211 index entries, a 19,931,862-byte index,
and 25,303,439 packed objects occupying 21.13 GiB, I exported $parent
and $this as the paths of binaries built from the parent and from this
commit, then ran:

    hyperfine --warmup 0 --runs 3 \
        --command-name parent \
        '$parent -c core.fsmonitor=false ls-files --deleted -- README.md' \
        --command-name 'this commit' \
        '$this -c core.fsmonitor=false ls-files --deleted -- README.md'

The results were:

             parent       this commit
  elapsed    60.742 s     1.061 s
  user        1.117 s     0.963 s
  system     10.740 s     0.042 s

For an all-matching pathspec, I used a checkout with 859,940 index
entries and ran:

    hyperfine --warmup 0 --runs 3 \
        --command-name parent \
        '$parent -c core.fsmonitor=false ls-files --deleted -- "*"' \
        --command-name 'this commit' \
        '$this -c core.fsmonitor=false ls-files --deleted -- "*"'

I repeated the benchmark with the commands reversed. The results were:

                            parent       this commit
  parent first    elapsed   56.807 s     64.618 s
                  user       1.256 s      1.270 s
                  system    10.633 s     11.068 s
  patched first   elapsed   63.361 s     64.316 s
                  user       1.238 s      1.280 s
                  system    10.296 s     11.864 s

The patched user-time means were 14 ms and 42 ms higher in the two
orderings. Elapsed time changed by several seconds when the order was
reversed, so those results do not show a stable wall-time ordering.

Jeff King pointed out that a preliminary match for each of many literal
pathspecs can be much more expensive. On a generated repository with
10,000 clean files, I recorded the paths with "git ls-files >paths".
With $v1 exported as the path of a binary built from the implementation
sent in v1, I ran:

    hyperfine --warmup 2 --runs 10 \
        --command-name parent \
        '$parent ls-files -m -- $(cat paths) >/dev/null' \
        --command-name 'this commit' \
        '$this ls-files -m -- $(cat paths) >/dev/null'

I replaced $this with $v1 in a second invocation. The wall-clock means
and standard deviations were:

                         mean          standard deviation
  parent, final run     110.1 ms              4.1 ms
  this commit           104.9 ms              2.2 ms
  parent, v1 run        112.5 ms              6.6 ms
  unguarded v1          494.1 ms             17.2 ms

The guarded result matches the parent within the observed variation,
while avoiding the regression in v1.

All three revisions were built with -O3, -mcpu=native, and ThinLTO
using Apple clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro
(Mac16,6) with a 16-core Apple M4 Max (12 performance and four
efficiency cores) and 128 GB RAM.

Link: https://lore.kernel.org/r/20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
A selective pathspec should let ls-files --deleted and --modified avoid
statting entries that cannot be shown. Match a single pathspec before
accessing the worktree, while preserving the existing lstat-first order
for multiple pathspecs whose matching cost grows linearly.
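
As a rough illustration of that ordering (not the patch itself), here is
a minimal C sketch. It assumes a hypothetical entries_statted() helper
and uses POSIX fnmatch() as a stand-in for match_pathspec(); the lstat()
call is only counted, never performed.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical stand-in for match_pathspec(): cost is linear in npat. */
static int matches_any(const char *path, const char *const *pats, size_t npat)
{
	size_t i;

	for (i = 0; i < npat; i++)
		if (!fnmatch(pats[i], path, 0))
			return 1;
	return 0;
}

/*
 * Count the entries that would reach lstat(). The early match is applied
 * only when there is exactly one pattern; with more patterns every entry
 * is "statted" first, as in the pre-existing order.
 */
size_t entries_statted(const char *const *paths, size_t npaths,
		       const char *const *pats, size_t npat)
{
	size_t i, statted = 0;

	assert(paths && pats);
	for (i = 0; i < npaths; i++) {
		if (npat == 1 && !matches_any(paths[i], pats, npat))
			continue; /* filtered before touching the worktree */
		statted++; /* the real code would call lstat() here */
	}
	return statted;
}
```

With three paths and the single pattern "README.md", only one entry
reaches the counted lstat(); with two patterns all three do, because the
sketch then keeps the stat-first order.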
---
Changes in v2:
- Restrict early matching to one pathspec, avoiding the regression Jeff
  demonstrated with many pathspecs.
- Add all-matching and many-pathspec performance results.
- Drop the Assisted-by trailer.
- Link to v1: https://patch.msgid.link/20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com
---
 builtin/ls-files.c                  | 11 +++++++++++
 t/meson.build                       |  1 +
 t/perf/p3010-ls-files.sh            | 31 +++++++++++++++++++++++++++++++
 t/t3010-ls-files-killed-modified.sh | 18 ++++++++++++++++++
 4 files changed, 61 insertions(+)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1a22b41b9..8d7158652b 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -450,6 +450,17 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
+		/*
+		 * match_pathspec() is linear in pathspec.nr, so prefilter only
+		 * the single-pathspec case. Only entries shown by show_ce()
+		 * satisfy --error-unmatch.
+		 */
+		if (pathspec.nr == 1 &&
+		    !match_pathspec(repo->index, &pathspec, fullname.buf,
+				    fullname.len, max_prefix_len, NULL,
+				    S_ISDIR(ce->ce_mode) ||
+				    S_ISGITLINK(ce->ce_mode)))
+			continue;
 		stat_err = lstat(fullname.buf, &st);
 		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
 			error_errno("cannot lstat '%s'", fullname.buf);
diff --git a/t/meson.build b/t/meson.build
index 2af8d01279..ee8086e6ef 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -1140,6 +1140,7 @@ benchmarks = [
   'perf/p1500-graph-walks.sh',
   'perf/p1501-rev-parse-oneline.sh',
   'perf/p2000-sparse-operations.sh',
+  'perf/p3010-ls-files.sh',
   'perf/p3400-rebase.sh',
   'perf/p3404-rebase-interactive.sh',
   'perf/p4000-diff-algorithms.sh',
diff --git a/t/perf/p3010-ls-files.sh b/t/perf/p3010-ls-files.sh
new file mode 100755
index 0000000000..ae14449432
--- /dev/null
+++ b/t/perf/p3010-ls-files.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+
+test_description='Tests ls-files worktree performance'
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_expect_success 'select a zero-prefix pathspec' '
+	tracked_file=$(git ls-files | sed -n 1p) &&
+	test -n "$tracked_file" &&
+	pathspec="?${tracked_file#?}" &&
+	test_export pathspec
+'
+
+test_perf 'ls-files --deleted with pathspec' '
+	git -c core.fsmonitor=false ls-files --deleted \
+		-- "$pathspec" >/dev/null
+'
+
+test_perf 'ls-files --deleted with all-matching pathspec' '
+	git -c core.fsmonitor=false ls-files --deleted -- "*" >/dev/null
+'
+
+test_perf 'ls-files --modified with pathspec' '
+	git -c core.fsmonitor=false ls-files --modified \
+		-- "$pathspec" >/dev/null
+'
+
+test_done
diff --git a/t/t3010-ls-files-killed-modified.sh b/t/t3010-ls-files-killed-modified.sh
index 7af4532cd1..6e38e10219 100755
--- a/t/t3010-ls-files-killed-modified.sh
+++ b/t/t3010-ls-files-killed-modified.sh
@@ -124,4 +124,22 @@ test_expect_success 'validate git ls-files -m output.' '
 	test_cmp .expected .output
 '
 
+test_expect_success 'worktree modes honor wildcard pathspecs' '
+	cat >.expected <<-\EOF &&
+	path2/file2
+	path3/file3
+	EOF
+	git ls-files --deleted -- "path?/file?" >.output &&
+	test_cmp .expected .output &&
+
+	cat >.expected <<-\EOF &&
+	path7
+	path8
+	EOF
+	git ls-files --modified --error-unmatch -- "path[78]" >.output &&
+	test_cmp .expected .output &&
+
+	test_must_fail git ls-files --modified --error-unmatch -- path10
+'
+
 test_done

---
base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
change-id: 20260607-ls-files-pathspec-lstat-885125a5d644

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>



* [PATCH v2 2/2] ref-filter: memoize --contains with generations
From: Tamir Duberstein @ 2026-06-09  2:36 UTC
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Tamir Duberstein
In-Reply-To: <20260608-ref-filter-memoized-contains-v2-0-e72720344a7c@gmail.com>

git branch and git for-each-ref call repo_is_descendant_of() for
each candidate selected by --contains or --no-contains. Each call
starts a new graph walk, so refs with shared history repeatedly
traverse the same commits.

ffc4b8012d (tag: speed up --contains calculation, 2011-06-11)
introduced a depth-first walk for git tag that caches positive and
negative answers across candidates. ee2bd06b0f (ref-filter: implement
'--contains' option, 2015-07-07) preserved both implementations when
ref-filter learned --contains.

The memoized walk is not always faster. Without generation numbers,
a negative check can walk to the root even when the breadth-first
merge-base walk finds a nearby divergence. With generation numbers,
the depth-first walk can stop below the oldest target while still
reusing answers across candidates.
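
To make the interaction between memoization and the generation cutoff
concrete, here is a small self-contained C sketch on a toy graph. The
struct node type, its fields, and contains() are hypothetical names for
illustration only; they are not Git's data structures, and the sketch
uses recursion where a real walk would use an explicit stack.

```c
#include <assert.h>
#include <stddef.h>

enum answer { UNKNOWN = 0, NO, YES };

/* Toy commit; every name here is hypothetical. */
struct node {
	unsigned generation;     /* one more than the largest parent generation */
	struct node *parents[2]; /* unused slots are NULL */
	enum answer cached;      /* memoized answer, valid for one fixed target */
	unsigned visits;         /* counts uncached evaluations, for the demo */
};

/*
 * Depth-first check of whether target is c itself or one of its
 * ancestors. Answers are cached on the nodes, so later candidates that
 * share history reuse them instead of walking again.
 */
enum answer contains(struct node *c, const struct node *target)
{
	enum answer result = NO;
	size_t i;

	assert(c && target);
	if (c->cached != UNKNOWN)
		return c->cached;
	c->visits++;
	if (c == target)
		return c->cached = YES;
	if (c->generation < target->generation)
		return c->cached = NO; /* too old to reach the target */
	for (i = 0; i < 2 && c->parents[i]; i++) {
		if (contains(c->parents[i], target) == YES) {
			result = YES;
			break;
		}
	}
	c->cached = result;
	return result;
}
```

On a linear history a <- b <- c <- d with target c, asking about b stops
at b because its generation is below the target's, so a is never
visited; asking about d twice evaluates d only once.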

Keep the existing memoized selection for git tag. Select it for other
ref-filter callers when generation numbers are enabled, and retain
the breadth-first walk otherwise.

When generation numbers are unavailable, repo_is_descendant_of() can
return -1 if ancestry cannot be read. The ref-filter Boolean interface
treated that error as a match. Check it and exit instead. The memoized
path already dies on the same parse failure, so both selected paths now
fail rather than return a result.

Add p1500 cases for up to 8,192 packed refs along one first-parent
history and for sibling refs near the tip with generation numbers
forced off.

On a checkout with 62,174 remote-tracking refs and generation numbers
enabled, I ran:

    hyperfine --warmup 0 --runs 3 \
        --command-name parent \
        '"$parent" branch -r --contains c78ae85f3ce7e >/dev/null' \
        --command-name this-commit \
        '"$this" branch -r --contains c78ae85f3ce7e >/dev/null'

The results were:

             parent       this commit
  elapsed    104.365 s     467.7 ms
  user        93.702 s     220.2 ms
  system       0.723 s     182.7 ms

The wall-time standard deviations were 11.356 seconds and 133.8
milliseconds, respectively. Separate runs without redirection produced
the same output with SHA-256
2466f6e2b72aa16b1a2126eddb81c8a1b2764ee251204ac034c191a925aa896f.

Both revisions were built with the default -O2 flags using Apple
clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
with a 16-core Apple M4 Max (12 performance and four efficiency
cores) and 128 GB RAM.

Link: https://lore.kernel.org/git/1445163904-24611-1-git-send-email-Karthik.188@gmail.com/
Link: https://lore.kernel.org/r/20230324191009.GA536967@coredump.intra.peff.net
Link: https://lore.kernel.org/git/20260527070510.3510836-1-krka@spotify.com/
Link: https://lore.kernel.org/r/20260608223430.GA340696@coredump.intra.peff.net
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
 commit-reach.c                 | 13 +++++++++--
 commit-reach.h                 |  7 ++++++
 t/perf/p1500-graph-walks.sh    | 49 +++++++++++++++++++++++++++++++++++++++++-
 t/t6301-for-each-ref-errors.sh | 22 +++++++++++++++++++
 4 files changed, 88 insertions(+), 3 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 65b618959b..83a48004ef 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -821,9 +821,18 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 int commit_contains(struct ref_filter *filter, struct commit *commit,
 		    struct commit_list *list, struct contains_cache *cache)
 {
-	if (filter->with_commit_tag_algo)
+	int result;
+
+	if (!list)
+		return 1;
+	if (filter->with_commit_tag_algo ||
+	    generation_numbers_enabled(the_repository))
 		return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
-	return repo_is_descendant_of(the_repository, commit, list);
+
+	result = repo_is_descendant_of(the_repository, commit, list);
+	if (result < 0)
+		exit(128);
+	return result;
 }
 
 int can_all_from_reach_with_flag(struct object_array *from,
diff --git a/commit-reach.h b/commit-reach.h
index f908d305b1..da6796a354 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -79,6 +79,13 @@ enum contains_result {
 
 define_commit_slab(contains_cache, enum contains_result);
 
+/*
+ * Return whether "commit" is a descendant of any commit in "list". An empty
+ * list matches.
+ *
+ * The memoized traversal records answers in "cache" for one fixed "list".
+ * Clear it before changing the list.
+ */
 int commit_contains(struct ref_filter *filter, struct commit *commit,
 		    struct commit_list *list, struct contains_cache *cache);
 
diff --git a/t/perf/p1500-graph-walks.sh b/t/perf/p1500-graph-walks.sh
index 5b23ce5db9..99b54e274b 100755
--- a/t/perf/p1500-graph-walks.sh
+++ b/t/perf/p1500-graph-walks.sh
@@ -32,12 +32,47 @@ test_expect_success 'setup' '
 		echo "X:$line" >>test-tool-tags || return 1
 	done &&
 
-	commit=$(git commit-tree $(git rev-parse HEAD^{tree})) &&
+	git rev-list --first-parent --max-count=8192 HEAD >contains-commits &&
+	test_file_not_empty contains-commits &&
+	git update-ref refs/contains-perf-base "$(tail -n 1 contains-commits)" &&
+	awk "{
+		printf \"update refs/contains-perf/%04d %s\\n\", NR, \$1
+	}" contains-commits |
+		git update-ref --stdin &&
+	git pack-refs --include "refs/contains-perf/*" &&
+
+	tree=$(git rev-parse HEAD^{tree}) &&
+	base=$(git rev-parse HEAD) &&
+	target=$(echo target | git commit-tree "$tree" -p "$base") &&
+	git update-ref refs/contains-diverged/target "$target" &&
+	for i in $(test_seq 1 4)
+	do
+		commit=$(echo candidate-$i |
+			git commit-tree "$tree" -p "$base") &&
+		git update-ref refs/contains-diverged/candidate-$i "$commit" ||
+		return 1
+	done &&
+
+	commit=$(git commit-tree "$tree") &&
 	git update-ref refs/heads/disjoint-base $commit &&
 
 	git commit-graph write --reachable
 '
 
+test_expect_success 'verify contains results' '
+	git for-each-ref --contains=refs/contains-perf-base \
+		refs/contains-perf/ >actual &&
+	test_line_count = $(wc -l <contains-commits) actual &&
+
+	echo refs/contains-diverged/target >expect &&
+	GIT_TEST_COMMIT_GRAPH=0 \
+		git -c core.commitGraph=false for-each-ref \
+			--format="%(refname)" \
+			--contains=refs/contains-diverged/target \
+			refs/contains-diverged/ >actual &&
+	test_cmp expect actual
+'
+
 test_perf 'ahead-behind counts: git for-each-ref' '
 	git for-each-ref --format="%(ahead-behind:HEAD)" --stdin <refs
 '
@@ -62,6 +97,18 @@ test_perf 'contains: git tag --merged' '
 	xargs git tag --merged=HEAD <tags
 '
 
+test_perf 'contains: git for-each-ref --contains' '
+	git for-each-ref --contains=refs/contains-perf-base \
+		refs/contains-perf/ >/dev/null
+'
+
+test_perf 'contains without generations: divergent refs' '
+	GIT_TEST_COMMIT_GRAPH=0 \
+		git -c core.commitGraph=false for-each-ref \
+			--contains=refs/contains-diverged/target \
+			refs/contains-diverged/ >/dev/null
+'
+
 test_perf 'is-base check: test-tool reach (refs)' '
 	test-tool reach get_branch_base_for_tip <test-tool-refs
 '
diff --git a/t/t6301-for-each-ref-errors.sh b/t/t6301-for-each-ref-errors.sh
index e06feb06e9..72b27c8be3 100755
--- a/t/t6301-for-each-ref-errors.sh
+++ b/t/t6301-for-each-ref-errors.sh
@@ -52,6 +52,28 @@ test_expect_success 'Missing objects are reported correctly' '
 	test_must_be_empty brief-err
 '
 
+test_expect_success 'missing ancestors are reported by contains filters' '
+	test_when_finished "git update-ref -d refs/heads/missing-parent" &&
+	{
+		echo "tree $(git rev-parse HEAD^{tree})" &&
+		echo "parent $MISSING" &&
+		git cat-file commit HEAD |
+			sed -n -e "/^author /p" -e "/^committer /p" &&
+		echo &&
+		echo "missing parent"
+	} >commit &&
+	broken=$(git hash-object -t commit -w commit) &&
+	git update-ref refs/heads/missing-parent "$broken" &&
+	for option in --contains --no-contains
+	do
+		test_must_fail git for-each-ref "$option=HEAD" \
+			refs/heads/missing-parent >out 2>err &&
+		test_must_be_empty out &&
+		test_grep "parse commit $MISSING" err ||
+		return 1
+	done
+'
+
 test_expect_success 'ahead-behind requires an argument' '
 	test_must_fail git for-each-ref \
 		--format="%(ahead-behind)" 2>err &&

-- 
2.54.0.501.g0fb508de08



* [PATCH v2 1/2] commit-reach: handle cycles in contains walk
From: Tamir Duberstein @ 2026-06-09  2:36 UTC
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Tamir Duberstein
In-Reply-To: <20260608-ref-filter-memoized-contains-v2-0-e72720344a7c@gmail.com>

git tag --contains uses a memoized traversal that assumes commit
ancestry is acyclic. Replacement refs can violate that assumption,
causing the traversal to revisit a commit already on its stack
indefinitely.

Mark commits while they are active. If the traversal encounters an
active commit, discard the cache because it cannot distinguish answers
produced by the interrupted walk. Then fall back to the cycle-safe
reachability walk for that candidate.
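
The idea can be sketched on a toy graph as follows. This is an
illustration under assumptions, not the code in this patch: struct node,
walk(), reachable(), and contains() are hypothetical names, the cache is
reset by iterating over an explicit array of all nodes, and recursion
replaces the explicit stack.

```c
#include <assert.h>
#include <stddef.h>

enum state { UNKNOWN = 0, NO, YES, IN_PROGRESS };

/* Toy commit; every name here is hypothetical. */
struct node {
	struct node *parents[2]; /* unused slots are NULL */
	enum state state;        /* memoized answer or "on the stack" mark */
	int seen;                /* visited flag for the fallback walk */
};

/* Memoized DFS; returns IN_PROGRESS when it meets a node on its own stack. */
static enum state walk(struct node *c, const struct node *target)
{
	enum state result = NO;
	size_t i;

	if (c == target)
		return c->state = YES;
	if (c->state != UNKNOWN)
		return c->state; /* cached answer, or a cycle */
	c->state = IN_PROGRESS;
	for (i = 0; i < 2 && c->parents[i]; i++) {
		enum state r = walk(c->parents[i], target);

		if (r == IN_PROGRESS)
			return IN_PROGRESS;
		if (r == YES) {
			result = YES;
			break;
		}
	}
	return c->state = result;
}

/* Cycle-safe fallback: a plain walk that never revisits a node. */
static int reachable(struct node *c, const struct node *target)
{
	size_t i;

	if (c == target)
		return 1;
	if (c->seen)
		return 0;
	c->seen = 1;
	for (i = 0; i < 2 && c->parents[i]; i++)
		if (reachable(c->parents[i], target))
			return 1;
	return 0;
}

int contains(struct node *c, const struct node *target,
	     struct node **all, size_t nr)
{
	enum state r = walk(c, target);
	size_t i;

	assert(all);
	if (r != IN_PROGRESS)
		return r == YES;
	/* Partial answers are unreliable after a cycle: drop them all. */
	for (i = 0; i < nr; i++) {
		all[i]->state = UNKNOWN;
		all[i]->seen = 0;
	}
	c->state = reachable(c, target) ? YES : NO;
	return c->state == YES;
}
```

Without the IN_PROGRESS mark, walk() would recurse forever on a graph
whose parent links form a loop; with it, the loop is detected on the
first revisit and only that candidate pays for the slower walk.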

Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
 commit-reach.c | 30 ++++++++++++++++++++++++++----
 commit-reach.h |  3 ++-
 t/t7004-tag.sh | 21 +++++++++++++++++++++
 3 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..65b618959b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -708,7 +708,8 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
 
 /*
  * Test whether the candidate is contained in the list.
- * Do not recurse to find out, though, but return -1 if inconclusive.
+ * Do not recurse to find out, though, but return CONTAINS_UNKNOWN if
+ * inconclusive.
  */
 static enum contains_result contains_test(struct commit *candidate,
 					  const struct commit_list *want,
@@ -744,7 +745,7 @@ static void push_to_contains_stack(struct commit *candidate, struct contains_sta
 }
 
 static enum contains_result contains_tag_algo(struct commit *candidate,
-					      const struct commit_list *want,
+					      struct commit_list *want,
 					      struct contains_cache *cache)
 {
 	struct contains_stack contains_stack = { 0, 0, NULL };
@@ -765,6 +766,7 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 	if (result != CONTAINS_UNKNOWN)
 		return result;
 
+	*contains_cache_at(cache, candidate) = CONTAINS_IN_PROGRESS;
 	push_to_contains_stack(candidate, &contains_stack);
 	while (contains_stack.nr) {
 		struct contains_stack_entry *entry = &contains_stack.contains_stack[contains_stack.nr - 1];
@@ -776,8 +778,8 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 			contains_stack.nr--;
 		}
 		/*
-		 * If we just popped the stack, parents->item has been marked,
-		 * therefore contains_test will return a meaningful yes/no.
+		 * A parent may have just been popped and marked, or may still
+		 * be active when replacement refs create a cycle.
 		 */
 		else switch (contains_test(parents->item, want, cache, cutoff)) {
 		case CONTAINS_YES:
@@ -787,13 +789,33 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 		case CONTAINS_NO:
 			entry->parents = parents->next;
 			break;
+		case CONTAINS_IN_PROGRESS:
+			/*
+			 * Partial negative answers are not safe across a cycle.
+			 * Discard them and use the cycle-safe reachability walk.
+			 */
+			goto cycle;
 		case CONTAINS_UNKNOWN:
+			*contains_cache_at(cache, parents->item) =
+				CONTAINS_IN_PROGRESS;
 			push_to_contains_stack(parents->item, &contains_stack);
 			break;
 		}
 	}
 	free(contains_stack.contains_stack);
 	return contains_test(candidate, want, cache, cutoff);
+
+cycle:
+	free(contains_stack.contains_stack);
+	clear_contains_cache(cache);
+	init_contains_cache(cache);
+
+	result = repo_is_descendant_of(the_repository, candidate, want);
+	if (result < 0)
+		exit(128);
+	*contains_cache_at(cache, candidate) =
+		result ? CONTAINS_YES : CONTAINS_NO;
+	return result ? CONTAINS_YES : CONTAINS_NO;
 }
 
 int commit_contains(struct ref_filter *filter, struct commit *commit,
diff --git a/commit-reach.h b/commit-reach.h
index 3f3a563d8a..f908d305b1 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -73,7 +73,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
 enum contains_result {
 	CONTAINS_UNKNOWN = 0,
 	CONTAINS_NO,
-	CONTAINS_YES
+	CONTAINS_YES,
+	CONTAINS_IN_PROGRESS
 };
 
 define_commit_slab(contains_cache, enum contains_result);
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index d918005dd9..1ed91bb66e 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1611,6 +1611,27 @@ test_expect_success 'checking that first commit is in all tags (hash)' '
 	test_cmp expected actual
 '
 
+test_expect_success 'tag --contains handles cyclic replacement histories' '
+	first=$(git rev-parse HEAD~2) &&
+	second=$(git rev-parse HEAD~) &&
+	third=$(git rev-parse HEAD) &&
+	test_when_finished "
+		git replace -d $first
+		git replace -d $third
+		git tag -d cycle-a cycle-b
+	" &&
+	git tag cycle-a "$first" &&
+	git tag cycle-b "$third" &&
+	git replace --graft "$first" "$third" "$second" &&
+	git replace --graft "$third" "$first" &&
+	cat >expected <<-\EOF &&
+	cycle-a
+	cycle-b
+	EOF
+	git tag --contains="$second" --list "cycle-*" >actual &&
+	test_cmp expected actual
+'
+
 # other ways of specifying the commit
 test_expect_success 'checking that first commit is in all tags (tag)' '
 	cat >expected <<-\EOF &&

-- 
2.54.0.501.g0fb508de08



* [PATCH v2 0/2] Reuse --contains traversal results
From: Tamir Duberstein @ 2026-06-09  2:36 UTC
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Tamir Duberstein

The memoized traversal used by git tag avoids repeating graph walks for
refs with shared history. Extend it to the other ref-filter users after
making the existing traversal safe for cycles introduced by replacement
refs.

Signed-off-by: Tamir Duberstein <tamird@gmail.com>

---
Changes in v2:
- Split cycle handling into a preparatory patch.
- Exercise cycle handling through the existing git tag path.
- Move perf result verification out of setup.
- Link to v1: https://patch.msgid.link/20260607-ref-filter-memoized-contains-v1-1-a1972dde9c76@gmail.com

---
Tamir Duberstein (2):
      commit-reach: handle cycles in contains walk
      ref-filter: memoize --contains with generations

 commit-reach.c                 | 43 ++++++++++++++++++++++++++++++------
 commit-reach.h                 | 10 ++++++++-
 t/perf/p1500-graph-walks.sh    | 49 +++++++++++++++++++++++++++++++++++++++++-
 t/t6301-for-each-ref-errors.sh | 22 +++++++++++++++++++
 t/t7004-tag.sh                 | 21 ++++++++++++++++++
 5 files changed, 137 insertions(+), 8 deletions(-)
---
base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
change-id: 20260607-ref-filter-memoized-contains-7cb6b3bccad1

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>



* [PATCH v2] ref-filter: restore prefix-scoped iteration
From: Tamir Duberstein @ 2026-06-09  2:34 UTC
  To: git
  Cc: Karthik Nayak, Patrick Steinhardt, Junio C Hamano, Victoria Dye,
	ZheNing Hu, Tamir Duberstein

Commit dabecb9db2 (for-each-ref: introduce a '--start-after' option,
2025-07-15) changed single-kind branch, remote-tracking branch, and tag
enumeration in do_filter_refs() from constructing an iterator with the
namespace prefix to constructing an unscoped iterator and applying the
prefix with ref_iterator_seek().

Before that change, refs_for_each_fullref_in() passed the namespace
prefix during iterator construction. That helper has since been
replaced by refs_for_each_ref_ext().

The files backend primes its loose-ref cache for the construction
prefix before it opens packed refs. An empty construction prefix
therefore reads every loose ref, and a later seek cannot undo that I/O.
Consequently, git branch, git branch --remotes, and git tag scale with
unrelated loose refs.

Patrick Steinhardt observed during review that iterator construction
and seeking accepted similar strings but assigned them different state
semantics. Junio C Hamano then pointed out that no current command can
combine start_after with this single-kind path, but future branch or
tag support would need to keep the namespace while moving the cursor.

Keep the existing start_after path unchanged. The iterator API cannot
currently seek to one string while retaining another as its prefix:
an unflagged seek clears the prefix, while REF_ITERATOR_SEEK_SET_PREFIX
replaces it with the seek string.
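
The limitation can be modeled with a toy iterator over a sorted array of
ref names. This is only a sketch of the semantics described above: the
toy_iter type, toy_seek(), toy_next(), and TOY_SEEK_SET_PREFIX are
hypothetical names and do not reproduce the real ref iterator API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SEEK_SET_PREFIX 1u

/* Toy iterator; every name here is hypothetical. */
struct toy_iter {
	const char *const *refs; /* ref names in sorted order */
	size_t nr, pos;
	const char *prefix;      /* NULL means unscoped */
};

/*
 * Move the cursor to the first ref not ordered before "to". As in the
 * behavior described above, a plain seek clears the prefix and a
 * flagged seek replaces the prefix with the seek string itself.
 */
void toy_seek(struct toy_iter *it, const char *to, unsigned flags)
{
	assert(it && to);
	it->prefix = (flags & TOY_SEEK_SET_PREFIX) ? to : NULL;
	for (it->pos = 0; it->pos < it->nr; it->pos++)
		if (strcmp(it->refs[it->pos], to) >= 0)
			break;
}

/* Return the next ref, or NULL at the end of the array or of the prefix. */
const char *toy_next(struct toy_iter *it)
{
	const char *name;

	if (it->pos >= it->nr)
		return NULL;
	name = it->refs[it->pos];
	if (it->prefix && strncmp(name, it->prefix, strlen(it->prefix)))
		return NULL; /* sorted input: nothing later can match */
	it->pos++;
	return name;
}
```

In this model, seeking to "refs/heads/b" without the flag goes on to
yield "refs/tags/v1", while seeking with the flag narrows the prefix to
"refs/heads/b" itself. Neither call expresses "start at refs/heads/b but
stay inside refs/heads/".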

For the commands affected by this regression, which do not set
start_after, pass the namespace prefix during iterator construction so
that loose refs are scoped before the packed-refs snapshot is opened.
This fixes the current regression without deleting the ref-filter state
discussed during review or changing its dormant behavior.

Add REFFILES-gated performance cases with one branch, one
remote-tracking branch, one tag, and 10,000 unrelated loose refs. The
benchmarks were run with:

    GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS=-j8 \
        t/perf/run a89346e34a . -- p6300-for-each-ref.sh

The following are the best of five runs, with each run invoking the
command ten times. Times are elapsed seconds with user and system CPU
seconds in parentheses:

                                  a89346e34a       this commit
  branch                       2.74(0.13+2.56)   0.11(0.04+0.04)
  branch --remotes             2.81(0.13+2.62)   0.12(0.04+0.04)
  tag                          3.01(0.14+2.82)   0.11(0.04+0.04)

Both revisions used the default -O2 build flags and a config.mak
containing only "NO_REGEX = NeedsStartEnd". They were built with Apple
clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
with a 16-core Apple M4 Max (12 performance and four efficiency cores)
and 128 GB RAM.

Link: https://lore.kernel.org/git/aGZidwwlToWThkn8@pks.im/
Link: https://lore.kernel.org/git/xmqqikjq7s16.fsf@gitster.g/
Fixes: dabecb9db2b2 ("for-each-ref: introduce a '--start-after' option")
Assisted-by: Codex gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
The series is based on a89346e34a (maint) because the regression has
been present in released versions since Git 2.51.0.
---
Changes in v2:
- Extract local variable `store`.
- Link to v1: https://patch.msgid.link/20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com
---
 ref-filter.c                 | 28 +++++++++++++++++++---------
 t/perf/p6300-for-each-ref.sh | 39 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 57 insertions(+), 10 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 1da4c0e60d..5cbc007d64 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3315,19 +3315,29 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, refs_for
 		prefix = "refs/tags/";
 
 	if (prefix) {
-		struct ref_iterator *iter;
+		struct ref_store *store = get_main_ref_store(the_repository);
 
-		iter = refs_ref_iterator_begin(get_main_ref_store(the_repository),
-					       "", NULL, 0, 0);
+		if (filter->start_after) {
+			struct ref_iterator *iter;
+
+			iter = refs_ref_iterator_begin(store, "", NULL, 0, 0);
 
-		if (filter->start_after)
 			ret = start_ref_iterator_after(iter, filter->start_after);
-		else
-			ret = ref_iterator_seek(iter, prefix,
-						REF_ITERATOR_SEEK_SET_PREFIX);
+			if (!ret)
+				ret = do_for_each_ref_iterator(iter, fn,
+							       cb_data);
+		} else {
+			/*
+			 * Pass the prefix during construction because the files
+			 * backend primes loose refs before a later seek can
+			 * narrow the iterator.
+			 */
+			struct refs_for_each_ref_options opts = {
+				.prefix = prefix,
+			};
 
-		if (!ret)
-			ret = do_for_each_ref_iterator(iter, fn, cb_data);
+			ret = refs_for_each_ref_ext(store, fn, cb_data, &opts);
+		}
 	} else if (filter->kind & FILTER_REFS_REGULAR) {
 		ret = for_each_fullref_in_pattern(filter, fn, cb_data);
 	}
diff --git a/t/perf/p6300-for-each-ref.sh b/t/perf/p6300-for-each-ref.sh
index fa7289c752..ed9c1c6a19 100755
--- a/t/perf/p6300-for-each-ref.sh
+++ b/t/perf/p6300-for-each-ref.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-test_description='performance of for-each-ref'
+test_description='performance of ref-filter users'
 . ./perf-lib.sh
 
 test_perf_fresh_repo
@@ -84,4 +84,41 @@ test_expect_success 'pack refs' '
 '
 run_tests "packed"
 
+test_expect_success REFFILES 'setup many unrelated loose refs' '
+	git init scoped &&
+	test_commit -C scoped --no-tag base &&
+	test_seq $ref_count_per_type |
+		sed "s,.*,update refs/custom/unrelated_& HEAD," |
+		git -C scoped update-ref --stdin &&
+	git -C scoped update-ref refs/remotes/origin/main HEAD &&
+	git -C scoped update-ref refs/tags/only HEAD
+'
+
+test_perf "branch (many unrelated loose refs)" --prereq REFFILES "
+	(
+		cd scoped &&
+		for i in \$(test_seq $test_iteration_count); do
+			git branch --format='%(refname)' >/dev/null
+		done
+	)
+"
+
+test_perf "branch --remotes (many unrelated loose refs)" --prereq REFFILES "
+	(
+		cd scoped &&
+		for i in \$(test_seq $test_iteration_count); do
+			git branch --remotes --format='%(refname)' >/dev/null
+		done
+	)
+"
+
+test_perf "tag (many unrelated loose refs)" --prereq REFFILES "
+	(
+		cd scoped &&
+		for i in \$(test_seq $test_iteration_count); do
+			git tag --format='%(refname)' >/dev/null
+		done
+	)
+"
+
 test_done

---
base-commit: a89346e34a937f001e5d397ee62224e3e9852040
change-id: 20260605-fix-git-branch-regression-9e4236f18091

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>



* [PATCH v2] describe: limit default ref iteration to tags
From: Tamir Duberstein @ 2026-06-09  2:32 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Patrick Steinhardt, Tamir Duberstein

Unless --all is given, get_name() rejects every ref outside refs/tags/.
The rejection happens only after the ref backend has enumerated the ref,
so repositories with many other refs spend most of a simple describe
invocation visiting refs which cannot affect its result.

Commit 8a5a1884e9 (Avoid accessing non-tag refs in git-describe unless
--all is requested, 2008-02-24) moved this rejection before object
lookup, but left iteration unscoped. Pass the existing refs/tags/
restriction to the iterator unless --all is given so the backend can
avoid unrelated refs.
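
The effect of scoping the iteration can be sketched in isolation. This
is a minimal, hypothetical model and not Git's ref backend: a sorted
array of names stands in for the ref store, and the function names
visit_all_then_filter() and visit_prefix_only() are invented for
illustration. The first mimics rejecting non-tag refs only after they
have been enumerated; the second seeks to the prefix and stops at the
first name outside it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: a strcmp()-sorted array stands in for the ref store. */

static int has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Enumerate every ref and reject the non-matching ones afterwards. */
size_t visit_all_then_filter(const char **refs, size_t nr,
			     const char *prefix, size_t *matched)
{
	size_t visited = 0;

	*matched = 0;
	for (size_t i = 0; i < nr; i++) {
		visited++;
		if (has_prefix(refs[i], prefix))
			(*matched)++;
	}
	return visited;
}

/*
 * Binary-search to the first name that is not smaller than the prefix,
 * then walk only while names still carry the prefix. In a sorted array
 * all names sharing a prefix are contiguous, so nothing is missed.
 */
size_t visit_prefix_only(const char **refs, size_t nr,
			 const char *prefix, size_t *matched)
{
	size_t lo = 0, hi = nr, visited = 0;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(refs[mid], prefix) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}

	*matched = 0;
	for (size_t i = lo; i < nr && has_prefix(refs[i], prefix); i++) {
		visited++;
		(*matched)++;
	}
	return visited;
}
```

Both functions report the same matches, but the second returns a
visited count that depends on the number of tags rather than on the
total number of refs.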

The benchmark checkout had 120,532 refs, of which 330 were tags. With
`$repo` naming the checkout, `$commit` an exactly tagged commit, and
`$parent` and `$this` the two binaries, I ran:

    hyperfine --warmup 3 --runs 15 \
        --command-name parent \
        '$parent -C $repo describe --exact-match $commit' \
        --command-name 'this commit' \
        '$this -C $repo describe --exact-match $commit'

The results were:

    Benchmark 1: parent
      Time (mean ± σ):     171.7 ms ±  18.5 ms    [User: 23.9 ms, System: 133.6 ms]
      Range (min … max):   142.3 ms … 198.3 ms    15 runs

    Benchmark 2: this commit
      Time (mean ± σ):       9.9 ms ±   1.1 ms    [User: 3.3 ms, System: 4.7 ms]
      Range (min … max):     8.8 ms …  13.1 ms    15 runs

    Summary
      this commit ran
       17.35 ± 2.63 times faster than parent

Both revisions were built with -O3, -mcpu=native, and ThinLTO using
Apple clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro
(Mac16,6) with a 16-core Apple M4 Max (12 performance and four
efficiency cores) and 128 GB RAM.

Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
Changes in v2:
- Exercise the performance test with both ref backends.
- Keep the ref count local to its setup test.
- Report native hyperfine output for an exact-tag lookup.
- Link to v1: https://patch.msgid.link/20260607-describe-tag-ref-scope-v1-1-653d232b86b5@gmail.com
---
 builtin/describe.c       |  3 +++
 t/perf/p6100-describe.sh | 15 +++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/builtin/describe.c b/builtin/describe.c
index 1c47d7c0b7..3532c8ff22 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -740,6 +740,9 @@ int cmd_describe(int argc,
 		return ret;
 	}
 
+	if (!all)
+		for_each_ref_opts.prefix = "refs/tags/";
+
 	hashmap_init(&names, commit_name_neq, NULL, 0);
 	refs_for_each_ref_ext(get_main_ref_store(the_repository),
 			      get_name, NULL, &for_each_ref_opts);
diff --git a/t/perf/p6100-describe.sh b/t/perf/p6100-describe.sh
index 069f91ce49..ed9f1abe18 100755
--- a/t/perf/p6100-describe.sh
+++ b/t/perf/p6100-describe.sh
@@ -27,4 +27,19 @@ test_perf 'describe HEAD with one tag' '
 	git describe --match=new HEAD
 '
 
+test_expect_success 'set up many unrelated refs' '
+	ref_count=10000 &&
+	git tag -m tip tip HEAD &&
+	for i in $(test_seq $ref_count)
+	do
+		printf "create refs/heads/describe-perf/%05d HEAD\n" $i ||
+		return 1
+	done >instructions &&
+	git update-ref --stdin <instructions
+'
+
+test_perf 'describe exact tag with many unrelated refs' '
+	git describe --exact-match HEAD
+'
+
 test_done

---
base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
change-id: 20260607-describe-tag-ref-scope-7d00ae140a58

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>



* Re: [GSoC PATCH v2 0/4] teach git repo info to handle path keys
From: K Jayatheerth @ 2026-06-09  2:30 UTC (permalink / raw)
  To: Lucas Seiki Oshiro
  Cc: git, a3205153416, gitster, jltobler, kumarayushjha123,
	phillip.wood, sandals
In-Reply-To: <A67C8C8B-2600-41D2-9E61-0923BFDDD06B@gmail.com>

>
> I prefer `.(absolute|relative)` at the end. `path.gitdir.relative`
> means that we have a collection of paths, in those collections we
> have gitdir that can be relative or absolute, and we want the
> relative. `path.relative.gitdir` means that we have a collection
> of relative paths and from those we're picking gitdir. The first
> feels more natural.
>

Yes, I believe the same.


> PS: this is a nitpick, but it would be really helpful if you provide
> a range-diff in the cover letter. Check the usage of `--range-diff`
> in git-format-patch documentation (this flag also works for
> git-send-email). Or, if you prefer, you can generate it by running
> `git range-diff` and copying the output.

Alright, I will add that as well in the next series.

Thank you!


* Re: [GSoC PATCH v2 1/4] path: introduce format_path() for centralized path formatting
From: K Jayatheerth @ 2026-06-09  2:27 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: git, a3205153416, Junio C Hamano, Justin Tobler, kumarayushjha123,
	Lucas Seiki Oshiro, Phillip Wood, brian m. carlson
In-Reply-To: <bd9bc9aa-60b6-4e5d-9ce1-bf38b6032309@app.fastmail.com>

> > the localized fallback mechanics specific to `rev-parse`.
>
> This looks very well explained to my naive eyes.

Thank you!


> Nitpick. You are supposed to add your `Signed-off-by` at the end. You
> are saying with that line that you are signing off on the changes and
> the commit message, including the trailers (mentors) you’ve decided to
> add. Imagine if the maintainer applies this patch and fixes a typo and
> the commit becomes:
>
>     Mentored-by: Justin Tobler <jltobler@gmail.com>
>     Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
>     Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
>     [jc: typo fix]
>     Signed-off-by: Junio ...
>
> The chain of custody is then very clear.
>
> > ---
> >[snip]

OK, I understand. Thanks for taking the time to explain that.

I will change it in the next patch series.


* Re: [PATCH v3 5/8] t7810: turn MB_REGEX check into a lazy prereq
From: Jeff King @ 2026-06-09  1:22 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-5-7d766ed481e4@pks.im>

On Thu, Jun 04, 2026 at 12:07:35PM +0200, Patrick Steinhardt wrote:

> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 1b195bee59..d61c4a4d73 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -18,8 +18,9 @@ test_invalid_grep_expression() {
>  	'
>  }
>  
> -LC_ALL=en_US.UTF-8 test-tool regex '^.$' '¿' &&
> -  test_set_prereq MB_REGEX
> +test_lazy_prereq MB_REGEX '
> +	LC_ALL=en_US.UTF-8 test-tool regex "^.$" "¿"
> +'

Oh good. Since the error was coming from the shell, I was worried that
the use of LC_ALL inside the test snippets was somehow causing output to
leak to stderr. But it was just (yet another) case where we ran a tool
outside of a snippet, which is easy to fix.

This also allays my fears that the tests might have been misbehaving on
those other platforms. They are (and were) caught by this prereq and not
run at all, which is the right thing.

Thanks for fixing.

-Peff


* What's cooking in git.git (Jun 2026, #03)
From: Junio C Hamano @ 2026-06-09  0:56 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking in my tree.  Commits
prefixed with '+' are in 'next' (being in 'next' is a sign that a
topic is stable enough to be used and is a candidate to be in a
future release).  Commits prefixed with '-' are only in 'seen', and
aren't considered "accepted" at all and may be annotated with a URL
to a message that raises issues but they are by no means exhaustive.
A topic without enough support may be discarded after a long period
of no activity (of course they can be resubmitted when new interests
arise).

Copies of the source code to Git live in many repositories, and the
following is a list of the ones I push into or their mirrors.  Some
repositories have only a subset of branches.

With maint, master, next, seen, todo:

	git://git.kernel.org/pub/scm/git/git.git/
	git://repo.or.cz/alt-git.git/
	https://kernel.googlesource.com/pub/scm/git/git/
	https://github.com/git/git/
	https://gitlab.com/git-scm/git/

With all the integration branches and topics broken out:

	https://github.com/gitster/git/

Even though the preformatted documentation in HTML and man format
are not sources, they are published in these repositories for
convenience (replace "htmldocs" with "manpages" for the manual
pages):

	git://git.kernel.org/pub/scm/git/git-htmldocs.git/
	https://github.com/gitster/git-htmldocs.git/

Release tarballs are available at:

	https://www.kernel.org/pub/software/scm/git/

--------------------------------------------------
[Graduated to 'master']

* aj/stash-patch-optimize-temporary-index (2026-05-22) 1 commit
  (merged to 'next' on 2026-05-31 at d1b1dd94f5)
 + stash: reuse cached index entries in --patch temporary index

 "git stash -p" has been optimized by reusing cached index
 entries in its temporary index, avoiding unnecessary lstat()
 calls on unchanged files.
 cf. <xmqqse7m6deh.fsf@gitster.g>
 source: <pull.2306.v2.git.git.1779491545531.gitgitgadget@gmail.com>


* ar/receive-pack-worktree-env (2026-05-25) 1 commit
  (merged to 'next' on 2026-05-27 at 9c246d1969)
 + receive-pack: fix updateInstead with core.worktree

 The GIT_WORK_TREE variable prepared to invoke the push-to-checkout
 hook was leaking into the environment even when there was no hook
 used and broke the default push-to-deploy (i.e., let "git checkout"
 update the working tree only when the working tree is clean).
 source: <20260525162311.66240-2-hi@alyssa.is>


* ds/restore-sparse-index (2026-05-26) 2 commits
  (merged to 'next' on 2026-05-31 at e85a961bc7)
 + restore: avoid sparse index expansion
 + t1092: test 'git restore' with sparse index

 'git restore --staged' has been optimized to avoid unnecessarily expanding
 the sparse index when operating on paths within the sparse checkout
 definition, by handling sparse directory entries at the tree level.
 source: <pull.2121.v2.git.1779827195.gitgitgadget@gmail.com>


* ja/doc-synopsis-style-again (2026-05-25) 6 commits
  (merged to 'next' on 2026-05-31 at cc4fe82d87)
 + doc: convert git-imap-send synopsis and options to new style
 + doc: convert git-apply synopsis and options to new style
 + doc: convert git-am synopsis and options to new style
 + doc: convert git-grep synopsis and options to new style
 + doc: git bisect: clarify the usage of the synopsis vs actual command
 + doc: convert git-bisect to synopsis style

 A batch of documentation pages has been updated to use the modern
 synopsis style.
 cf. <pull.2117.v2.git.1779704908.gitgitgadget@gmail.com>
 source: <pull.2117.v2.git.1779704908.gitgitgadget@gmail.com>


* kh/free-commit-list (2026-05-28) 2 commits
  (merged to 'next' on 2026-05-31 at 154f83b192)
 + commit: remove deprecated functions
 + *: replace deprecated free_commit_list

 Code clean-up.
 source: <V2_CV_commit.h_remove_deprecated.732@msgid.xyz>


* kk/commit-reach-optim (2026-05-25) 3 commits
  (merged to 'next' on 2026-05-31 at eeb8d0c207)
 + commit-reach: replace queue_has_nonstale() scan with O(1) tracking
 + commit-reach: deduplicate queue entries in paint_down_to_common
 + object.h: fix stale entries in object flag allocation table

 The check for non-stale commits in the priority queue used by
 `paint_down_to_common` and `ahead_behind` has been optimized by
 replacing an O(N) scan with an O(1) counter, yielding performance
 improvements in repositories with wide histories.
 cf. <xmqqzf1ncded.fsf@gitster.g>
 source: <pull.2124.v2.git.1779719286.gitgitgadget@gmail.com>
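
 The idea of replacing a scan with a counter can be sketched as
 follows. This is a minimal illustration under stated assumptions, not
 the code in commit-reach.c: the struct and function names are
 hypothetical, a plain stack stands in for the priority queue because
 ordering does not matter for the bookkeeping, and each item is assumed
 to be queued at most once.

```c
#include <stddef.h>

#define STALE 1u
#define QUEUE_CAP 64

/* Hypothetical stand-ins for a commit and the queue that holds it. */
struct item {
	unsigned flags;
	int in_queue;
};

struct tracked_queue {
	struct item *items[QUEUE_CAP];
	size_t nr;
	size_t nonstale; /* queued items that do not carry STALE */
};

/* Returns 0 on success, -1 if the queue is full or the item is queued. */
int tq_put(struct tracked_queue *q, struct item *it)
{
	if (q->nr == QUEUE_CAP || it->in_queue)
		return -1;
	q->items[q->nr++] = it;
	it->in_queue = 1;
	if (!(it->flags & STALE))
		q->nonstale++;
	return 0;
}

struct item *tq_get(struct tracked_queue *q)
{
	struct item *it;

	if (!q->nr)
		return NULL;
	it = q->items[--q->nr];
	it->in_queue = 0;
	if (!(it->flags & STALE))
		q->nonstale--;
	return it;
}

/* Marking must go through this helper so the counter stays in sync. */
void tq_mark_stale(struct tracked_queue *q, struct item *it)
{
	if (it->flags & STALE)
		return;
	it->flags |= STALE;
	if (it->in_queue)
		q->nonstale--;
}

/* The O(N) check: look at every queued item. */
int tq_has_nonstale_scan(const struct tracked_queue *q)
{
	for (size_t i = 0; i < q->nr; i++)
		if (!(q->items[i]->flags & STALE))
			return 1;
	return 0;
}

/* The O(1) check: consult the counter. */
int tq_has_nonstale(const struct tracked_queue *q)
{
	return q->nonstale > 0;
}
```

 The counter is only correct if every transition (enqueue, dequeue,
 and becoming stale while queued) updates it, which is why the sketch
 funnels flag changes through a single helper.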

--------------------------------------------------
[New Topics]

* ps/odb-source-packed (2026-06-04) 17 commits
 - odb/source-packed: drop pointer to "files" parent source
 - midx: refactor interfaces to work on "packed" source
 - odb/source-packed: stub out remaining functions
 - odb/source-packed: wire up `freshen_object()` callback
 - odb/source-packed: wire up `find_abbrev_len()` callback
 - odb/source-packed: wire up `count_objects()` callback
 - odb/source-packed: wire up `for_each_object()` callback
 - odb/source-packed: wire up `read_object_stream()` callback
 - odb/source-packed: wire up `read_object_info()` callback
 - packfile: use higher-level interface to implement `has_object_pack()`
 - odb/source-packed: wire up `reprepare()` callback
 - odb/source-packed: wire up `close()` callback
 - odb/source-packed: start converting to a proper `struct odb_source`
 - odb/source-packed: store pointer to "files" instead of generic source
 - packfile: move packed source into "odb/" subsystem
 - packfile: rename `struct packfile_store` to `odb_source_packed`
 - Merge branch 'ps/odb-source-loose' into ps/odb-source-packed
 (this branch uses ps/odb-source-loose.)

 The packed object source has been refactored into a proper struct
 odb_source.

 Comments?
 source: <20260604-pks-odb-source-packed-v1-0-2e7ab31b4b5c@pks.im>


* ps/transport-helper-tsan-fix (2026-06-04) 1 commit
 - transport-helper: fix TSAN race in transfer_debug()

 The TSAN race in transfer_debug() within transport-helper.c has been
 resolved by initializing the debug flag early in
 bidirectional_transfer_loop() before spawning worker threads, allowing
 the removal of a TSAN suppression.

 Comments?
 source: <20260604132327.277693-3-pushkarkumarsingh1970@gmail.com>


* ta/typofixes (2026-06-04) 1 commit
 - docs: fix typos

 Typofixes.

 Comments?
 source: <20260604131457.19215-1-taahol@utu.fi>


* js/win-kill-child-more-gently (2026-06-04) 2 commits
 - mingw: really handle SIGINT
 - mingw: kill child processes in a gentler way

 A more advanced emulation of kill() used in Git for Windows has been
 upstreamed to alleviate symptoms such as left-behind .lock files and
 child processes that get no chance to clean up after themselves when
 they are killed.

 Comments?
 source: <pull.2130.git.1780590261.gitgitgadget@gmail.com>


* dl/posix-unused-warning-clang (2026-06-08) 2 commits
 - compat/posix.h: simplify GIT_GNUC_PREREQ() comparison
 - compat/posix.h: enable UNUSED warning messages for Clang

 The UNUSED macro in 'compat/posix.h' has been updated to use a
 newly introduced GIT_CLANG_PREREQ macro for compiler version
 checks, and the existing GIT_GNUC_PREREQ macro has been modernized
 to use explicit major/minor comparisons rather than bit-shifting.

 Comments?
 source: <20260608124419.38905-1-dominik.loidolt@univie.ac.at>
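
 The two comparison styles mentioned above can be contrasted with a
 pair of hypothetical macros. The names VER_AT_LEAST_SHIFT and
 VER_AT_LEAST_EXPLICIT are invented for this sketch and are not the
 macros defined in compat/posix.h; the current version is passed in
 as arguments so the behavior does not depend on the compiler used.

```c
/*
 * Bit-shifting style: pack major and minor into one integer and compare.
 * Assumes the minor version is smaller than 65536.
 */
#define VER_AT_LEAST_SHIFT(cur_maj, cur_min, maj, min) \
	((((cur_maj) << 16) + (cur_min)) >= (((maj) << 16) + (min)))

/*
 * Explicit style: a newer major always wins; on an equal major the
 * minor decides.
 */
#define VER_AT_LEAST_EXPLICIT(cur_maj, cur_min, maj, min) \
	((cur_maj) > (maj) || \
	 ((cur_maj) == (maj) && (cur_min) >= (min)))
```

 Both forms agree for ordinary version numbers; the explicit form
 states the ordering directly and does not rely on the minor version
 fitting below the shifted major.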


* lo/doc-format-patch-subject-prefix (2026-06-04) 1 commit
 - Documentation: remove redundant 'instead' in --subject-prefix

 Wording used in "format-patch --subject-prefix" documentation
 has been improved.

 Will merge to 'next'.
 source: <20260604163510.36687-2-lucasseikioshiro@gmail.com>


* am/doc-tech-hash-typofix (2026-06-05) 1 commit
 - doc: fix typo in GIT_ALTERNATE_OBJECT_DIRECTORIES

 Typofix.

 Will merge to 'next'.
 cf. <aiZo9FqsdKrhz0gA@pks.im>
 source: <20260605172643.8796-1-amonakov@ispras.ru>


* td/ref-filter-restore-prefix-iteration (2026-06-05) 1 commit
 - ref-filter: restore prefix-scoped iteration

 Commands that list branches and tags (like git branch and git tag)
 have been optimized to pass the namespace prefix when initializing
 their ref iterator, avoiding a loose-ref scaling regression in
 repositories with many unrelated loose references.

 Comments?
 source: <20260605-fix-git-branch-regression-v1-1-02f40ad40929@gmail.com>


* ty/move-protect-hfs-ntfs (2026-06-06) 1 commit
 - environment.c: move 'protect_hfs' and 'protect_ntfs' into 'repo_config_values'

 The global configuration variables protect_hfs and protect_ntfs have
 been migrated into struct repo_config_values to tie them to
 per-repository configuration state.

 Comments?
 source: <20260606143412.15443-1-cat@malon.dev>


* ds/config-no-includes (2026-06-08) 3 commits
 - git: add --no-includes top-level option
 - config: add GIT_CONFIG_INCLUDES
 - git-config.adoc: fix paragraph break

 Two new mechanisms, the GIT_CONFIG_INCLUDES environment variable and
 the top-level --no-includes command-line option, have been introduced
 to ignore configuration include directives.

 Comments?
 source: <pull.2139.git.1780927027.gitgitgadget@gmail.com>


* ps/cat-file-remote-object-info (2026-06-08) 12 commits
 - cat-file: make remote-object-info allow-list dynamic
 - cat-file: validate remote atoms with allow_list
 - cat-file: add remote-object-info to batch-command
 - transport: add client support for object-info
 - serve: advertise object-info feature
 - fetch-pack: move fetch initialization
 - connect: refactor packet writing
 - fetch-pack: move function to connect.c
 - t1006: split test utility functions into new "lib-cat-file.sh"
 - cat-file: add declaration of variable i inside its for loop
 - git-compat-util: add strtoul_ul() with error handling
 - transport-helper: fix memory leak of helper on disconnect

 The `remote-object-info` command has been added to `git cat-file
 --batch-command`, allowing clients to request object metadata
 (currently size) from a remote server via protocol v2 without
 downloading the entire object.

 The client dynamically filters format placeholders based on
 server-advertised capabilities and safely returns empty strings for
 inapplicable or unsupported fields.

 Comments?
 source: <20260608-ps-eric-work-rebase-v12-0-5338b766e658@gmail.com>

--------------------------------------------------
[Stalled]

* jd/unpack-trees-wo-the-repository (2026-03-31) 2 commits
 - unpack-trees: use repository from index instead of global
 - unpack-trees: use repository from index instead of global

 A handful of inappropriate uses of the_repository have been
 rewritten to use the right repository structure instance in the
 unpack-trees.c codepath.

 Waiting for response(s) to review comment(s) for too long, consider discarding.
 cf. <xmqqldf7y95a.fsf@gitster.g>
 source: <pull.2258.v2.git.git.1774971267.gitgitgadget@gmail.com>


* cs/subtree-split-recursion (2026-03-05) 3 commits
 - contrib/subtree: reduce recursion during split
 - contrib/subtree: functionalize split traversal
 - contrib/subtree: reduce function side-effects

 When processing large history graphs on Debian or Ubuntu, "git
 subtree" can die with a "recursion depth reached" error.

 Waiting for response(s) to review comment(s) for too long, consider discarding.
 cf. <xmqqv7c13o5l.fsf@gitster.g>
 source: <20260305-cs-subtree-split-recursion-v2-0-7266be870ba9@howdoi.land>

--------------------------------------------------
[Cooking]

* ap/http-redirect-wwwauth-fix (2026-06-02) 1 commit
 - http: preserve wwwauth_headers across redirects

 When cURL follows a redirect, the WWW-Authenticate headers from the
 redirect target were lost because credential_from_url() cleared the
 credential state. This has been fixed by preserving the collected
 headers across the redirect update.

 Expecting a reroll.
 cf. <5144a29d-a53f-4446-beff-e1f549345bf9@nvidia.com>
 source: <20260602161150.1527493-1-aplattner@nvidia.com>


* ps/doc-recommend-b4 (2026-06-07) 3 commits
 - b4: introduce configuration for the Git project
 - MyFirstContribution: recommend the use of b4
 - MyFirstContribution: recommend shallow threading of cover letters

 Project-specific configuration for b4 has been introduced, and the
 documentation has been updated to recommend using it as a
 streamlined method for submitting patches.

 Comments?
 source: <20260608-pks-b4-v3-0-f5e497d10c56@pks.im>


* kk/streaming-walk-pqueue (2026-05-27) 3 commits
 - revision: use priority queue for non-limited streaming walks
 - revision: introduce rev_walk_mode to clarify get_revision_1()
 - pack-objects: call release_revisions() after cruft traversal

 Streaming revision walks have been optimized by using a priority queue
 for date-sorting commits, speeding up walks in repositories with many
 merges.

 Will merge to 'next'?
 source: <pull.2127.git.1779897003.gitgitgadget@gmail.com>


* kk/wildmatch-windows-ls-files-prereq (2026-05-28) 1 commit
  (merged to 'next' on 2026-06-04 at 6dc748aa63)
 + t3070: skip ls-files tests with backslash patterns on Windows

 In t3070-wildmatch, "via ls-files" test variants with patterns
 containing backslash escapes are now skipped on Windows, avoiding 36
 test failures caused by pathspec separator conversion.

 Will merge to 'master'.
 cf. <xmqqecivjn7k.fsf@gitster.g>
 source: <pull.2128.git.1779958849319.gitgitgadget@gmail.com>


* sn/rebase-update-refs-symrefs (2026-06-03) 1 commit
 - rebase: skip branch symref aliases

 "git rebase --update-refs" has been taught to resolve local branch
 symrefs to their referents before queuing updates. This correctly
 skips aliases of the current branch and avoids duplicate updates for
 underlying real branches, fixing failures when branch aliases (like a
 default branch rename) are present.

 Waiting for response(s) to review comment(s).
 cf. <f982c386-e329-4ab0-b695-e540bcb9de3d@gmail.com>
 source: <pull.2126.v2.git.1780482436865.gitgitgadget@gmail.com>


* lp/http-fetch-pack-index-leak-fix (2026-06-01) 2 commits
  (merged to 'next' on 2026-06-04 at f4090b5068)
 + http: fix memory leak in fetch_and_setup_pack_index()
 + http: cleanup function fetch_and_setup_pack_index()

 A memory leak in `fetch_and_setup_pack_index()` when verification of
 the downloaded pack index fails has been plugged. Also an obsolete
 `unlink()` call on parse failure has been cleaned up.

 Will merge to 'master'.
 cf. <20260529053659.GC1099450@coredump.intra.peff.net>
 source: <cover.1780321770.git.lorenzo.pegorari2002@gmail.com>


* jk/describe-contains-all-match-fix (2026-06-01) 1 commit
 - describe: fix --exclude, --match with --contains and --all

 The 'git describe --contains --all' command has been fixed to
 properly honor the '--match' and '--exclude' options by passing
 them down to 'git name-rev' with the appropriate reference
 prefixes.

 Will merge to 'next'?
 source: <20260601233727.43558-1-jacob.e.keller@intel.com>


* wy/docs-typofixes (2026-05-29) 1 commit
 - docs: fix typos and grammar

 Various typos, grammatical errors, and duplicated words in both
 documentation and code comments have been corrected.

 Waiting for response(s) to review comment(s).
 cf. <xmqq8q8x3nox.fsf@gitster.g>
 source: <7b502e20e9495cd4720496bd6738a1fbeb453410.1780041658.git.wy@wyuan.org>


* ab/index-pack-retain-child-bases (2026-06-02) 1 commit
 - index-pack: retain child bases in delta cache

 "git index-pack" has been optimized by retaining child bases in the
 delta cache instead of immediately freeing them, letting the existing
 cache limit policy decide eviction.

 Waiting for response(s) to review comment(s).
 cf. <c4a32a6f-70bf-4ff4-abbf-d6e301246b5b@gmail.com>
 source: <pull.2131.v3.git.1780445118653.gitgitgadget@gmail.com>


* hn/macos-linker-warning (2026-06-02) 1 commit
  (merged to 'next' on 2026-06-04 at db2ca164c4)
 + config.mak.uname: avoid macOS linker warning on Xcode 16.3+

 A linker warning on macOS when building with Xcode 16.3 or newer has
 been avoided by passing -fno-common to the compiler when a
 sufficiently new linker is detected.

 Will merge to 'master'.
 source: <pull.2313.v3.git.git.1780385878555.gitgitgadget@gmail.com>


* mm/diff-process-hunks (2026-05-29) 6 commits
 . blame: consult diff process for no-hunk detection
 . diff: bypass diff process with --no-ext-diff and in format-patch
 . diff: add long-running diff process via diff.<driver>.process
 . sub-process: separate process lifecycle from hashmap management
 . userdiff: add diff.<driver>.process config
 . xdiff: support external hunks via xpparam_t

 A new `diff.<driver>.process` configuration has been introduced to
 allow a long-running external process to act as a hunk provider. This
 allows external tools to control which lines Git considers changed
 while leaving all output formatting (word diff, color, blame, etc.) to
 Git's standard pipeline.

 Expecting a reroll.
 cf. <CAC2QwmKNA6wv-jG07fgJj7Xj2J+dzzWEiqV5Q+8HJpjA_GtkFw@mail.gmail.com>
 source: <pull.2120.v3.git.1780087700.gitgitgadget@gmail.com>


* tb/pack-path-walk-bitmap-delta-islands (2026-06-02) 5 commits
 - pack-objects: support `--delta-islands` with `--path-walk`
 - pack-objects: extract `record_tree_depth()` helper
 - pack-objects: support reachability bitmaps with `--path-walk`
 - t/perf: drop p5311's lookup-table permutation
 - Merge branch 'ds/path-walk-filters' into tb/pack-path-walk-bitmap-delta-islands

 The pack-objects command now supports using reachability bitmaps and
 delta-islands concurrently with the `--path-walk` option, allowing
 faster packaging by falling back to path-walk when bitmaps cannot
 fully satisfy the request.

 Comments?
 source: <cover.1780438896.git.me@ttaylorr.com>


* ty/migrate-trust-executable-bit (2026-05-30) 4 commits
 - read-cache: pass 'istate' to stat/mode helper functions
 - environment: move 'trust_executable_bit' into repo_config_values
 - read-cache: move 'ce_mode_from_stat()' to 'read-cache.c'
 - read-cache: remove redundant extern declarations

 The 'trust_executable_bit' (coming from 'core.filemode'
 configuration) has been migrated into 'repo_config_values' to tie it
 to a specific repository instance.

 Waiting for response(s) to review comment(s).
 cf. <CAP8UFD1GJ=caPh-M97KLCfB1ZKtpomzosYN0uYBOnay+G23GcA@mail.gmail.com>
 cf. <CAP8UFD20yij=1ZEYnR74DoCJ3g=b39yOsUxZecYuuf7nFGaKyA@mail.gmail.com>
 source: <20260530160520.77859-1-cat@malon.dev>


* ak/typofixes (2026-05-31) 1 commit
 - doc: fix typos via codespell

 Typofixes.

 Will merge to 'next'.
 cf. <3398ef40-1547-4324-2cfc-97b9e2b24854@gmx.de>
 cf. <xmqq8q8p1ese.fsf@gitster.g>
 source: <20260531184428.55905-1-algonell@gmail.com>
 source: <20260506101631.18127-1-algonell@gmail.com>
 source: <3398ef40-1547-4324-2cfc-97b9e2b24854@gmx.de>


* kk/prio-queue-cascade-sift (2026-06-01) 1 commit
 - prio-queue: use cascade-down for faster extract-min

 prio_queue_get() has been optimized by using a cascade-down approach
 (promoting the smaller child at each level and sifting up the last
 element from the leaf vacancy), which halves the number of comparisons
 per extract-min operation in the common case.

 Expecting a reroll.
 cf. <CAL71e4Ob-B5MJ5DPY+_tzpj6nyrbQ5WutxED2T93SWJV6kJGPA@mail.gmail.com>
 source: <pull.2132.v2.git.1780301856444.gitgitgadget@gmail.com>
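
 The cascade-down approach described above can be sketched on a plain
 binary min-heap of integers. This is an illustration under stated
 assumptions, not the prio-queue.c implementation: struct int_heap,
 heap_put() and heap_get() are hypothetical names, the capacity is
 fixed, and heap_get() expects a non-empty heap.

```c
#include <stddef.h>

#define HEAP_CAP 64

/* Hypothetical fixed-capacity binary min-heap of ints. */
struct int_heap {
	int data[HEAP_CAP];
	size_t nr;
};

/* Returns 0 on success, -1 if the heap is full. */
int heap_put(struct int_heap *h, int value)
{
	size_t i;

	if (h->nr == HEAP_CAP)
		return -1;
	/* Sift up: move larger parents down until the value fits. */
	for (i = h->nr++; i; i = (i - 1) / 2) {
		size_t parent = (i - 1) / 2;

		if (h->data[parent] <= value)
			break;
		h->data[i] = h->data[parent];
	}
	h->data[i] = value;
	return 0;
}

/* Extract the minimum; the caller must ensure h->nr > 0. */
int heap_get(struct int_heap *h)
{
	int min = h->data[0];
	int last = h->data[--h->nr];
	size_t hole = 0, child;

	if (!h->nr)
		return min;

	/*
	 * Cascade down: promote the smaller child into the hole at each
	 * level without comparing against the displaced last element.
	 */
	while ((child = 2 * hole + 1) < h->nr) {
		if (child + 1 < h->nr && h->data[child + 1] < h->data[child])
			child++;
		h->data[hole] = h->data[child];
		hole = child;
	}

	/* Sift the last element up from the leaf vacancy. */
	while (hole) {
		size_t parent = (hole - 1) / 2;

		if (h->data[parent] <= last)
			break;
		h->data[hole] = h->data[parent];
		hole = parent;
	}
	h->data[hole] = last;
	return min;
}
```

 A conventional sift-down compares the moved element against both
 children at every level. Here the descent needs only the comparison
 between siblings, and the final sift-up is usually short because the
 last element tends to belong near the bottom of the heap.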


* mm/subprocess-handshake-fix (2026-06-01) 1 commit
 - sub-process: use gentle handshake to avoid die() on startup failure

 The subprocess handshake during startup has been made gentler by using
 packet_read_line_gently() instead of packet_read_line() to prevent the
 parent Git process from dying abruptly when a configured subprocess
 (e.g., a clean/smudge filter) fails to start.

 Will merge to 'next'?
 source: <pull.2133.v2.git.1780348848489.gitgitgadget@gmail.com>


* jk/repo-info-path-keys (2026-06-05) 4 commits
 - repo: add path.commondir with absolute and relative suffix formatting
 - repo: add path.gitdir with absolute and relative suffix formatting
 - rev-parse: use format_path for path formatting
 - path: introduce format_path() for centralized path formatting

 The "git repo info" command has been taught new keys to output both
 absolute and relative paths for "gitdir" and "commondir", supported by
 a new path-formatting helper extracted from "git rev-parse".

 Comments?
 source: <20260605163012.181089-1-jayatheerthkulkarni2005@gmail.com>


* ps/history-drop (2026-06-08) 9 commits
 - builtin/history: implement "drop" subcommand
 - builtin/history: split handling of ref updates into two phases
 - reset: stop assuming that the caller passes in a clean index
 - reset: allow the caller to specify the current HEAD object
 - reset: introduce ability to skip reference updates
 - reset: introduce dry-run mode
 - reset: modernize flags passed to `reset_head()`
 - reset: drop `USE_THE_REPOSITORY_VARIABLE`
 - read-cache: split out function to drop unmerged entries to stage 0

 The experimental "git history" command has been taught a new "drop"
 subcommand to remove a commit and replay its descendants onto its
 parent.

 Comments?
 source: <20260608-b4-pks-history-drop-v3-0-84ca8e43e937@pks.im>


* ls/doc-raw-timestamp-prefix (2026-06-02) 1 commit
 - doc: document and test `@` prefix for raw timestamps

 Documentation and tests have been added to clarify that Git's internal
 raw timestamp format requires a `@` prefix for values less than
 100,000,000 to prevent ambiguity with other formats like YYYYMMDD.

 Will merge to 'next'.
 cf. <xmqqmrxdxq1r.fsf@gitster.g>
 source: <20260602081924.673763-2-dev@luna.gl>


* jk/setup-gitfile-diag-fix (2026-06-01) 1 commit
 - read_gitfile_gently(): return non-repo path on error

 A regression in the error diagnosis code for invalid .git files has
 been fixed, avoiding a potential NULL-pointer crash when reporting
 that a .git file does not point to a valid repository.

 Expecting a reroll?
 cf. <ah6WEtk2pXyViEQA@pks.im>
 source: <20260602061159.GA693928@coredump.intra.peff.net>


* jc/submitting-patches-cover-letter (2026-06-02) 2 commits
 - SubmittingPatches: describe cover letter
 - SubmittingPatches: separate typofixes section

 Guidelines on how to write a cover letter for a multi-patch series
 have been added to SubmittingPatches, which also got a new marker
 to separate the section for typofixes.

 Will merge to 'next'.
 cf. <c54f3571-ff7b-4caa-b75d-a739ed87ec9d@gmail.com>
 cf. <aiEgUdnL8dkszKFn@pks.im>
 source: <20260602144304.3341000-1-gitster@pobox.com>


* ps/t7527-fix-tap-output (2026-06-04) 8 commits
 - t: let prove fail when parsing invalid TAP output
 - t/lib-git-p4: silence output when killing p4d and its watchdog
 - t/test-lib: silence EBUSY errors on Windows during test cleanup
 - t7810: turn MB_REGEX check into a lazy prereq
 - t7527: fix broken TAP output
 - ci: unify Linux images across GitLab and GitHub
 - gitlab-ci: add missing Linux jobs
 - gitlab-ci: rearrange Linux jobs to match GitHub's order

 A recent regression in t7527 that broke TAP output has been fixed,
 some other test noise that also broke TAP output has been silenced,
 and 'prove' is now configured to fail on invalid TAP output to
 prevent future regressions.

 Comments?
 source: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>


* ob/more-repo-config-values (2026-06-02) 8 commits
 - environment: move "warn_on_object_refname_ambiguity" into `struct repo_config_values`
 - environment: move "sparse_expect_files_outside_of_patterns" into `struct repo_config_values`
 - environment: move "core_sparse_checkout_cone" into `struct repo_config_values`
 - environment: move "precomposed_unicode" into `struct repo_config_values`
 - environment: move "pack_compression_level" into `struct repo_config_values`
 - environment: move `zlib_compression_level` into `struct repo_config_values`
 - environment: move "check_stat" into `struct repo_config_values`
 - environment: move "trust_ctime" into `struct repo_config_values`

 Many core configuration variables have been migrated from global
 variables into 'repo_config_values' to tie them to a specific
 repository instance, avoiding cross-repository state leakage.

 Will merge to 'next'.
 source: <20260602170921.35869-1-belkid98@gmail.com>


* kh/doc-trailers (2026-04-13) 9 commits
 - doc: interpret-trailers: document comment line treatment
 - doc: interpret-trailers: commit to “trailer block” term
 - doc: interpret-trailers: add key format example
 - doc: interpret-trailers: explain key format
 - doc: interpret-trailers: explain the format after the intro
 - doc: interpret-trailers: not just for commit messages
 - doc: interpret-trailers: use “metadata” in Name as well
 - doc: interpret-trailers: replace “lines” with “metadata”
 - doc: interpret-trailers: stop fixating on RFC 822

 Documentation updates.

 Expecting a reroll.
 cf. <5508ee49-2f78-4c3a-accf-a2350666bfb8@app.fastmail.com>
 source: <V2_CV_doc_int-tr_key_format.613@msgid.xyz>


* za/completion-hide-dotfiles (2026-05-26) 1 commit
 - completion: hide dotfiles for selected path completion

 The path completion for commands like `git rm` and `git mv` is being
 updated to hide dotfiles by default, unless the user explicitly starts
 the path with a dot, matching standard shell-completion behavior.

 Comments?
 cf. <xmqqqzmxlep3.fsf@gitster.g>
 source: <pull.2311.v2.git.git.1779808987825.gitgitgadget@gmail.com>


* ib/doc-push-default-simple (2026-05-25) 1 commit
  (merged to 'next' on 2026-06-02 at 5c1ff2a769)
 + doc: clarify push.default=simple behavior

 The documentation for `push.default = simple` has been clarified to
 better explain its behavior, making it clear that it pushes the
 current branch to a same-named branch on the remote, and detailing
 the upstream requirements for centralized workflows.

 Will merge to 'master'.
 cf. <pull.2115.v2.git.1779767888508.gitgitgadget@gmail.com>
 source: <pull.2115.v2.git.1779767888508.gitgitgadget@gmail.com>


* jc/doc-monitor-ghci (2026-05-24) 1 commit
  (merged to 'next' on 2026-06-02 at 46fb5fe1c2)
 + SubmittingPatches: proactively monitor GHCI pages

 Encourage original authors to monitor the CI status.

 Will merge to 'master'.
 source: <xmqq1pf0gpp3.fsf@gitster.g>


* ec/commit-fixup-options (2026-05-26) 2 commits
 - commit: allow -c/-C for all kinds of --fixup
 - commit: allow -m/-F for all kinds of --fixup

 The -m/-F/-c/-C options to supply commit log message from outside the
 editor are now supported for all "git commit --fixup" variations.

 Comments?
 source: <cover.1779792311.git.erik@cervined.in>


* gh/jump-auto-mode (2026-05-21) 1 commit
  (merged to 'next' on 2026-06-02 at f70dd05c9c)
 + git-jump: pick a mode automatically when invoked without arguments

 The 'git-jump' command (in contrib/) has been taught to automatically
 pick a mode (merge, diff, or ws) when invoked without arguments.

 Will merge to 'master'.
 cf. <20260522052821.GC861761@coredump.intra.peff.net>
 source: <pull.2108.v3.git.1779371110195.gitgitgadget@gmail.com>


* ps/odb-source-loose (2026-06-01) 19 commits
  (merged to 'next' on 2026-06-04 at 660909ad66)
 + odb/source-loose: drop pointer to the "files" source
 + odb/source-loose: stub out remaining callbacks
 + odb/source-loose: wire up `write_object_stream()` callback
 + object-file: refactor writing objects to use loose source
 + odb/source-loose: wire up `write_object()` callback
 + loose: refactor object map to operate on `struct odb_source_loose`
 + odb/source-loose: wire up `freshen_object()` callback
 + odb/source-loose: drop `odb_source_loose_has_object()`
 + odb/source-loose: wire up `count_objects()` callback
 + odb/source-loose: wire up `find_abbrev_len()` callback
 + odb/source-loose: wire up `for_each_object()` callback
 + odb/source-loose: wire up `read_object_stream()` callback
 + odb/source-loose: wire up `read_object_info()` callback
 + odb/source-loose: wire up `close()` callback
 + odb/source-loose: wire up `reprepare()` callback
 + odb/source-loose: start converting to a proper `struct odb_source`
 + odb/source-loose: store pointer to "files" instead of generic source
 + odb/source-loose: move loose source into "odb/" subsystem
 + Merge branch 'ps/odb-in-memory' into ps/odb-source-loose
 (this branch is used by ps/odb-source-packed.)

 The loose object source has been refactored into a proper `struct
 odb_source`.

 Will merge to 'master'.
 source: <20260601-b4-pks-odb-source-loose-v2-0-90ff159430af@pks.im>


* ps/setup-centralize-odb-creation (2026-06-04) 9 commits
 - setup: construct object database in `apply_repository_format()`
 - repository: stop reading loose object map twice on repo init
 - setup: stop initializing object database without repository
 - setup: stop creating the object database in `setup_git_env()`
 - repository: stop initializing the object database in `repo_set_gitdir()`
 - setup: deduplicate logic to apply repository format
 - setup: drop `setup_git_env()`
 - t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY
 - Merge branch 'ps/setup-wo-the-repository' into ps/setup-centralize-odb-creation

 The setup logic to discover and configure repositories has been
 refactored, and the initialization of the object database has been
 centralized.

 Will merge to 'next'.
 cf. <CAOLa=ZQwVbLsOcajaxQwtkTPm=4St7EiGEEyL6_B0o3Tt1v1pw@mail.gmail.com>
 source: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>


* kh/doc-replay-config (2026-06-05) 4 commits
 - doc: replay: move “default” to the right-hand side
 - doc: replay: use a nested description list
 - doc: replay: improve config description
 - doc: link to config for git-replay(1)

 Doc update for "git replay" to actually refer to its configuration
 variables.

 Comments?
 source: <V3_CV_doc_replay_config.780@msgid.xyz>


* tb/bitmap-build-performance (2026-05-27) 9 commits
  (merged to 'next' on 2026-06-02 at d1a84a996a)
 + pack-bitmap: build pseudo-merge bitmaps after regular bitmaps
 + pack-bitmap: remember pseudo-merge parents
 + pack-bitmap: sort bitmaps before XORing
 + pack-bitmap: cache object positions during fill
 + pack-bitmap: consolidate `find_object_pos()` success path
 + pack-bitmap: reuse stored selected bitmaps
 + pack-bitmap: check subtree bits before recursing
 + pack-bitmap: pass object position to `fill_bitmap_tree()`
 + Merge branch 'tb/pseudo-merge-bugfixes' into tb/bitmap-build-performance

 Reachability bitmap generation has been significantly optimized. By
 reordering tree traversal, caching object positions, and refining how
 pseudo-merge bitmaps are constructed, the performance of "git repack
 --write-midx-bitmaps" is improved, especially for large repositories
 and when using pseudo-merges.

 Will merge to 'master'.
 cf. <20260529083439.GD1106035@coredump.intra.peff.net>
 source: <cover.1779911733.git.me@ttaylorr.com>


* hn/status-pull-advice-qualified (2026-05-21) 1 commit
 - remote: qualify "git pull" advice for non-upstream compareBranches

 Advice shown by "git status" when the local branch is behind or has
 diverged from its push branch has been updated to suggest "git pull
 <remote> <branch>".

 Comments?
 source: <pull.2301.v4.git.git.1779372367317.gitgitgadget@gmail.com>


* rs/strbuf-add-uint (2026-05-12) 4 commits
  (merged to 'next' on 2026-06-02 at f5be02d8ec)
 + ls-tree: use strbuf_add_uint()
 + ls-files: use strbuf_add_uint()
 + cat-file: use strbuf_add_uint()
 + strbuf: add strbuf_add_uint()

 Appending a decimal integer with strbuf_addf("%u") is a common
 pattern; such callers have been optimized by using a custom formatter.

 Will merge to 'master'.
 cf. <20260512184619.GD70851@coredump.intra.peff.net>
 source: <20260512115603.80780-1-l.s.r@web.de>


* mm/doc-word-diff (2026-05-28) 1 commit
  (merged to 'next' on 2026-06-04 at 9fa723ec63)
 + doc: clarify that --word-diff operates on line-level hunks

 The documentation for "--word-diff" has been extended with a bit of
 implementation detail of where these different words come from.

 Will merge to 'master'.
 source: <pull.2113.v2.git.1779996106005.gitgitgadget@gmail.com>


* rs/strbuf-add-oid-hex (2026-05-13) 1 commit
  (merged to 'next' on 2026-06-02 at 4876f95de0)
 + hex: add and use strbuf_add_oid_hex()

 Formatting object name in full hexadecimal form has been optimized
 by using a new strbuf_add_oid_hex() helper function.

 Will merge to 'master'.
 cf. <20260513160155.GA103037@coredump.intra.peff.net>
 source: <183aa0fd-d455-4ec9-9c42-d511fac8b3e4@web.de>


* hn/config-typo-advice (2026-06-02) 2 commits
 - config: improve diagnostic for "set" with missing value
 - config: add git_config_key_is_valid() for quiet validation

 "git config foo.bar=baz" is not likely to be a request to read the
 value of such a variable with '=' in its name; rather it is plausible
 that the user meant "git config set foo.bar baz".  Give advice when
 giving an error message.

 Will merge to 'next'.
 cf. <xmqq1penqfg2.fsf@gitster.g>
 source: <pull.2302.v6.git.git.1780425808.gitgitgadget@gmail.com>


* jt/config-lock-timeout (2026-05-17) 1 commit
 - config: retry acquiring config.lock, configurable via core.configLockTimeout

 Configuration file locking now retries for a short period, avoiding
 failures when multiple processes attempt to update the configuration
 simultaneously.

 Waiting for response(s) to review comment(s).
 cf. <agrIrGwSMFlKTx9x@pks.im>
 source: <20260517132111.1014901-1-joerg@thalheim.io>


* hn/branch-prune-merged (2026-06-05) 6 commits
 . branch: add --dry-run for --prune-merged
 . branch: add branch.<name>.pruneMerged opt-out
 . branch: add --prune-merged <branch>
 . branch: prepare delete_branches for a bulk caller
 . branch: let delete_branches warn instead of error on bulk refusal
 . branch: add --forked filter for --list mode

 "git branch" command learned "--prune-merged" option to remove
 local branches that have already been merged to the remote-tracking
 branches they track.

 Breaks tests.
 source: <pull.2285.v13.git.git.1780684553.gitgitgadget@gmail.com>


* st/daemon-sockaddr-fixes (2026-05-27) 3 commits
  (merged to 'next' on 2026-06-04 at 17684e6158)
 + daemon: guard NULL REMOTE_PORT in execute() logging
 + daemon: fix IPv6 address truncation in ip2str()
 + daemon: fix IPv6 address corruption in lookup_hostname()

 Correct use of sockaddr API in "git daemon".

 Will merge to 'master'.
 source: <pull.2300.v3.git.git.1779937016.gitgitgadget@gmail.com>


* cc/promisor-auto-config-url-more (2026-05-27) 8 commits
 - doc: promisor: improve acceptFromServer entry
 - promisor-remote: auto-configure unknown remotes
 - promisor-remote: trust known remotes matching acceptFromServerUrl
 - promisor-remote: introduce promisor.acceptFromServerUrl
 - promisor-remote: add 'local_name' to 'struct promisor_info'
 - urlmatch: add url_normalize_pattern() helper
 - urlmatch: change 'allow_globs' arg to bool
 - t5710: simplify 'mkdir X' followed by 'git -C X init'

 The handling of promisor-remote protocol capability has been
 loosened to allow the other side to add to the list of promisor
 remotes via the promisor.acceptFromServerURL configuration
 variable.

 Comments?
 source: <20260527140820.1438165-1-christian.couder@gmail.com>


* hn/checkout-track-fetch (2026-05-23) 2 commits
 - checkout: extend --track with a "fetch" mode to refresh start-point
 - branch: expose helpers for finding the remote owning a tracking ref

 "git checkout --track=..." learned to optionally fetch the branch
 from the remote the new branch will work with.

 Comments?
 source: <pull.2281.v13.git.git.1779565714.gitgitgadget@gmail.com>


* mf/revision-max-count-oldest (2026-05-18) 1 commit
 - revision.c: implement --max-count-oldest

 "git rev-list" (and "git log" family of commands) learned a new "--max-count-oldest"
 that picks oldest N commits in the range instead of the usual newest.

 Will merge to 'next'.
 source: <8210d60832b9a58aa4d71fc3790e44d8989564ce.1779152064.git.mroik@delayed.space>


* mm/line-log-cleanup (2026-05-28) 3 commits
  (merged to 'next' on 2026-06-04 at 02f8bea278)
 + line-log: allow non-patch diff formats with -L
 + line-log: integrate -L output with the standard log-tree pipeline
 + revision: move -L setup before output_format-to-diff derivation

 The `git log -L` implementation has been refactored to use the
 standard diff output pipeline, enabling pickaxe and diff-filter to
 work as expected. Additionally, metadata-only diff formats like
 --raw and --name-only are now supported with -L.

 Will merge to 'master'.
 cf. <B59BA5B1-184D-48A8-8BAD-11EB6F8EB50C@gmail.com>
 source: <pull.2094.v3.git.1780001267.gitgitgadget@gmail.com>


* en/ort-harden-against-corrupt-trees (2026-04-20) 5 commits
 - cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts
 - merge-ort: abort merge when trees have duplicate entries
 - merge-ort: free diff pairs queue in clear_or_reinit_internal_opts()
 - merge-ort: drop unnecessary show_all_errors from collect_merge_info()
 - merge-ort: propagate callback errors from traverse_trees_wrapper()

 "ort" merge backend handles merging corrupt trees better by
 aborting when it should.

 Waiting for response(s) to review comment(s).
 cf. <xmqqldcy4f07.fsf@gitster.g>
 source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>


* pw/status-rebase-todo (2026-05-01) 2 commits
 - status: improve rebase todo list parsing
 - sequencer: factor out parsing of todo commands

 The display of the rebase todo list in "git status" has been
 improved to correctly abbreviate object IDs for more commands and
 avoid misinterpreting refs as object IDs.

 Waiting for response(s) to review comment(s).
 cf. <xmqqbjdwcsno.fsf@gitster.g>
 source: <cover.1777648598.git.phillip.wood@dunelm.org.uk>


* cl/conditional-config-on-worktree-path (2026-05-24) 2 commits
 - config: add "worktree" and "worktree/i" includeIf conditions
 - config: refactor include_by_gitdir() into include_by_path()

 The [includeIf "condition"] conditional inclusion facility for
 configuration files has learned to use the location of worktree
 in its condition.

 Waiting for response(s) to review comment(s).
 cf. <xmqq8q97et9b.fsf@gitster.g>
 source: <20260525-includeif-worktree-v5-0-1efe525d025a@black-desk.cn>


* ps/shift-root-in-graph (2026-04-27) 1 commit
 - graph: add indentation for commits preceded by a parentless commit

 In a history with more than one root commit, "git log --graph
 --oneline" stuffed an unrelated commit immediately below a root
 commit, which has been corrected by making the spot below a root
 unavailable.

 Expecting a reroll.
 cf. <CAN5EUNQoKRqt3FGLmzRGpPU1nO5jCAogP8Wm9gBZXuPbMNbQAw@mail.gmail.com>
 source: <20260427102838.44867-2-pabloosabaterr@gmail.com>


* th/promisor-quiet-per-repo (2026-04-06) 1 commit
  (merged to 'next' on 2026-06-02 at 02a749d7fe)
 + promisor-remote: fix promisor.quiet to use the correct repository

 The "promisor.quiet" configuration variable was not used from
 relevant submodules when commands like "grep --recurse-submodules"
 triggered a lazy fetch, which has been corrected.

 Will merge to 'master'.
 cf. <c87f1f12-d0cc-4150-8f43-4dc9cc1fe24f@malon.dev>
 source: <20260406183041.783800-1-vikingtc4@gmail.com>


* ua/push-remote-group (2026-05-03) 3 commits
  (merged to 'next' on 2026-06-02 at ba5d6aebaa)
 + push: support pushing to a remote group
 + remote: move remote group resolution to remote.c
 + remote: fix sign-compare warnings in push_cas_option

 "git push" learned to take a "remote group" name to push to, which
 causes pushes to multiple places, just like "git fetch" would do.

 Will merge to 'master'.
 cf. <20260518182721.155070-1-usmanakinyemi202@gmail.com>
 source: <20260503153402.1333220-1-usmanakinyemi202@gmail.com>


* js/parseopt-subcommand-autocorrection (2026-04-27) 11 commits
 - SQUASH???
 - doc: document autocorrect API
 - parseopt: add tests for subcommand autocorrection
 - parseopt: enable subcommand autocorrection for git-remote and git-notes
 - parseopt: autocorrect mistyped subcommands
 - autocorrect: provide config resolution API
 - autocorrect: rename AUTOCORRECT_SHOW to AUTOCORRECT_HINT
 - autocorrect: use mode and delay instead of magic numbers
 - help: move tty check for autocorrection to autocorrect.c
 - help: make autocorrect handling reusable
 - parseopt: extract subcommand handling from parse_options_step()

 The parse-options library learned to auto-correct misspelled
 subcommand names.

 Expecting a reroll.
 cf. <SY0P300MB0801E50FCB7EB2F45CD15208CE042@SY0P300MB0801.AUSP300.PROD.OUTLOOK.COM>
 source: <SY0P300MB0801677A2A1E0FD38D06A841CE2A2@SY0P300MB0801.AUSP300.PROD.OUTLOOK.COM>


* jc/neuter-sideband-post-3.0 (2026-03-05) 2 commits
 - sideband: delay sanitizing by default to Git v3.0
 - Merge branch 'jc/neuter-sideband-fixup' into jc/neuter-sideband-post-3.0

 The final step, split from earlier attempt by Dscho, to loosen the
 sideband restriction for now and tighten later at Git v3.0 boundary.

 On hold to help the base topic with wider exposure.
 (this branch uses jc/neuter-sideband-fixup.)
 source: <20260305233452.3727126-8-gitster@pobox.com>

--------------------------------------------------
[Discarded]

* kk/fetch-store-ref-optimization (2026-05-24) 1 commit
 - fetch: pass transport to post-fetch connectivity check

 When fetching from a transport that provides a self-contained pack,
 pass the transport pointer to the post-fetch `check_connected()` call
 to optimize connectivity check.

 Retracted.
 cf. <CAL71e4MrVqC1=AR6x0_8S=8kVqPdDkhgCZRb4etFsxTzd6s_8Q@mail.gmail.com>
 source: <pull.2123.git.1779625693328.gitgitgadget@gmail.com>


* lp/repack-propagate-promisor-debugging-info (2026-04-18) 6 commits
 - repack-promisor: add missing headers
 - t7703: test for promisor file content after geometric repack
 - t7700: test for promisor file content after repack
 - repack-promisor: preserve content of promisor files after repack
 - repack-promisor add helper to fill promisor file after repack
 - pack-write: add explanation to promisor file content

 When fetching objects into a lazily cloned repository, .promisor
 files are created with information meant to help debugging.  "git
 repack" has been taught to carry this information forward to
 packfiles that are newly created.

 Retracted.
 cf. <agx_GPfBKpkSc3Gx@lorenzo-VM>
 source: <cover.1776384902.git.lorenzo.pegorari2002@gmail.com>

^ permalink raw reply

* Re: trailers: --only-trailers normalizes URLs to trailers
From: Jeff King @ 2026-06-09  0:43 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git
In-Reply-To: <ae4a32e7-bacb-4c88-b2a0-5aeaff60b904@app.fastmail.com>

On Thu, Jun 04, 2026 at 11:27:51PM +0200, Kristoffer Haugsbakk wrote:

> The following is a bug that follows straightforwardly from the documented
> or discussed behavior. In that sense it is not a bug. But it is a bug in
> the sense that it makes things inconvenient and violates a design goal.

Yeah, though if you'll allow me to nitpick your subject a moment: I
don't think --only-trailers is really the culprit here. It demonstrates
the problem because it normalizes the "trailer" it found. But the loose
trailer matching is the more fundamental issue. For example:

git interpret-trailers --trailer=foo=bar <<\EOF
subject

body

http://example.com
EOF

will stick the new "foo: bar" trailer right up against the (now-broken)
"http:" trailer, when it should come in its own stanza. It would do that
if you added a line "other" at the end, since that tells us that "http:"
can't be a trailer.

> > What's different between what you expected and what actually happened?
> 
> In an ideal world to have some special-casing of URLs so that they are
> not detected as trailers. Does anyone realistically want trailers like
> this?:
> 
>     file: //...
>     http: //...
>     https: //...

I could even see those as trailers, if somebody really wanted to allow
arbitrary values that might just happen to start with "//". But without
the whitespace after the colon, it is quite questionable.

> Just special-casing `https` would go a long way.

Agreed, though I think a more general rule would work: "://" (with no
whitespace after the colon) is not a valid separator. Something like this:

diff --git a/trailer.c b/trailer.c
index 6d8ec7fa8d..342ed81c78 100644
--- a/trailer.c
+++ b/trailer.c
@@ -635,8 +635,12 @@ static ssize_t find_separator(const char *line, const char *separators)
 	int whitespace_found = 0;
 	const char *c;
 	for (c = line; *c; c++) {
-		if (strchr(separators, *c))
+		if (strchr(separators, *c)) {
+			/* special case to avoid accidental URL matches */
+			if (*c == ':' && c[1] == '/' && c[2] == '/')
+				return -1;
 			return c - line;
+		}
 		if (!whitespace_found && (isalnum(*c) || *c == '-'))
 			continue;
 		if (c != line && (*c == ' ' || *c == '\t')) {
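
To see the effect of the special case in isolation, here is a minimal
standalone sketch of the same loop. It is not the code in trailer.c: the
name find_separator_sketch() and the "long" return type are choices made
for this illustration, and the loop body is a simplified reading of the
function the patch touches.

```c
#include <ctype.h>
#include <string.h>

/*
 * Simplified stand-in for find_separator(), including the URL special
 * case from the patch above. Returns the offset of the separator in
 * "line", or -1 if the line has no usable separator.
 */
long find_separator_sketch(const char *line, const char *separators)
{
	int whitespace_found = 0;
	const char *c;

	for (c = line; *c; c++) {
		if (strchr(separators, *c)) {
			/* "://" right after the key looks like a URL, not a trailer */
			if (*c == ':' && c[1] == '/' && c[2] == '/')
				return -1;
			return (long)(c - line);
		}
		/* key characters are only allowed before any whitespace */
		if (!whitespace_found && (isalnum((unsigned char)*c) || *c == '-'))
			continue;
		if (c != line && (*c == ' ' || *c == '\t')) {
			whitespace_found = 1;
			continue;
		}
		break;
	}
	return -1;
}
```

With ":" as the only separator, "http://example.com" is rejected, while
"http: //example.com" (whitespace after the colon) would still be taken
as a trailer with the key "http".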

-Peff

^ permalink raw reply related

* Re: [PATCH v2] transport-helper: fix TSAN race in transfer_debug()
From: Jeff King @ 2026-06-09  0:28 UTC (permalink / raw)
  To: Pushkar Singh; +Cc: git, gitster
In-Reply-To: <20260604132327.277693-3-pushkarkumarsingh1970@gmail.com>

On Thu, Jun 04, 2026 at 01:23:29PM +0000, Pushkar Singh wrote:

> Currently, transfer_debug() lazily initializes a static variable based
> on GIT_TRANSLOOP_DEBUG. Since the function may be called from multiple
> worker threads, this initialization is racy and is therefore suppressed
> in .tsan-suppressions.
> 
> Initialize the variable in bidirectional_transfer_loop() before any
> worker threads or processes are created. This patch removes the race and
> allows dropping the corresponding TSAN suppression.

OK. I was surprised that this code would use threads at all, but I guess
it all comes from 419f37db4d (Add bidirectional_transfer_loop(),
2010-10-12).

It feels like this could probably be implemented without threads by
using poll(), but that's out of scope for this patch. (I also thought
that run-command's pump_io() might help, but I think it only pumps
to/from in-memory buffers, not between descriptors).

> Changes since v1:
> - Treat negative values as disabled by using transfer_debug_enabled <= 0

I saw Junio's comment on v1, but I wonder if quietly ignoring negative
values is correct. Isn't it a BUG() if the value is negative when we get
here? It means we're ignoring the value of $GIT_TRANSLOOP_DEBUG, which
might have actually been "1".

(Yet another aside: this really ought to use git_env_bool, though that
is a user-facing change).

>  static void transfer_debug(const char *fmt, ...)
> [...]
> -	if (debug_enabled < 0)
> -		debug_enabled = getenv("GIT_TRANSLOOP_DEBUG") ? 1 : 0;
> -	if (!debug_enabled)
> +	if (transfer_debug_enabled <= 0)
>  		return;

So I feel like this ought to be:

  if (transfer_debug_enabled < 0)
	BUG("somebody forgot to check GIT_TRANSLOOP_DEBUG!");
  if (!transfer_debug_enabled)
	return;

> @@ -1648,6 +1640,9 @@ int bidirectional_transfer_loop(int input, int output)
>  {
>  	struct bidirectional_transfer_state state;
>  
> +	if (transfer_debug_enabled < 0)
> +		transfer_debug_enabled = getenv("GIT_TRANSLOOP_DEBUG") ? 1 : 0;
> +

And then we're pretty confident that the BUG() does not trigger because
all of the users of the flag will run through this function.

For the same reason, we can be pretty confident that your existing code
would be fine in practice, too, of course. ;) But if we do hit this
case, I think a BUG() is the right thing.
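
Put together, the pattern looks roughly like the standalone sketch
below. It only illustrates the idea: bug_sketch() stands in for Git's
BUG(), and init_transfer_debug() and transfer_debug_is_enabled() are
hypothetical names, not functions in transport-helper.c.

```c
#include <stdio.h>
#include <stdlib.h>

/* -1 means "not initialized yet"; 0 and 1 are the real values. */
static int transfer_debug_enabled = -1;

/* Stand-in for BUG(): report a programming error and abort. */
static void bug_sketch(const char *msg)
{
	fprintf(stderr, "BUG: %s\n", msg);
	abort();
}

/* Call once, before any worker thread or process is started. */
void init_transfer_debug(void)
{
	if (transfer_debug_enabled < 0)
		transfer_debug_enabled = getenv("GIT_TRANSLOOP_DEBUG") ? 1 : 0;
}

/* Workers only read the flag; an uninitialized flag is a bug, not "off". */
int transfer_debug_is_enabled(void)
{
	if (transfer_debug_enabled < 0)
		bug_sketch("somebody forgot to check GIT_TRANSLOOP_DEBUG!");
	return transfer_debug_enabled;
}
```

The point of aborting is that a negative value can never be quietly
treated as "disabled" when the environment variable was actually set.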

-Peff

^ permalink raw reply

* Re: [PATCH] ls-files: filter pathspec before lstat
From: Tamir Duberstein @ 2026-06-09  0:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, René Scharfe, Patrick Steinhardt
In-Reply-To: <20260608232516.GA357822@coredump.intra.peff.net>

On Mon, Jun 8, 2026 at 4:25 PM Jeff King <peff@peff.net> wrote:
>
> On Mon, Jun 08, 2026 at 07:03:15PM -0400, Jeff King wrote:
>
> > > Adding an extra early `match_pathspec()` check before making slow
> > > system calls like `lstat()` makes sense, especially when most of the
> > > index entries need to be skipped.  But if most of them would match,
> > > then we would end up doing the same match_pathspec() calls twice for
> > > each path, and run lstat() anyway, so you may also be able to
> > > construct a perf test that demonstrates a case where this approach
> > > is not a clear win (or even degradation), perhaps?
> >
> > The patchspec matching is linear in the number of pathspecs, so it's
> > easy to get quadratic-ish results by just asking about:
> >
> >   git ls-files -- $(git ls-files)
> >
> > So that probably provides an easy regression demonstration for this
> > patch.
>
> Ah, yeah, it is easy to demonstrate. Making a repo of size $n like this:
>
>   n=10000
>   git init
>   for i in $(seq $n); do
>     echo $i >file$i
>   done
>   git add .
>   git commit -m foo
>
> If we then run:
>
>   time git ls-files -- $(git ls-files) >/dev/null
>
> then n=1000 takes ~15ms for me, but n=10000 takes ~800ms. So that shows
> the slowdown of the existing pathspec code as the number of pathspecs
> grows.
>
> With this patch, starting with n=10000 and adding in "-m" (which
> triggers the code in this patch), like:
>
>   time git ls-files -m -- $(git ls-files) >/dev/null
>
> the time goes from ~15ms (without the patch) to ~800ms with it. Which
> makes sense. Nothing is modified, so the current code which puts the
> lstat() check first eliminates each entry before we even consider
> pathspecs. So it doesn't hit the slow case at all.
>
> But after the patch, we do a preliminary pathspec match and
> pay the cost.
>
> So it really is a question of how many items are actually modified, the
> cost of lstat(), and the cost of pathspec matching (which varies with
> the size of the pathspec).
>
> But like I said, this is kind of a silly case. If it actually starts to
> matter in the real world, I think it may be more productive to make the
> pathspec code scale better.

Yeah, agreed. Still, it exposed an easy-to-avoid downside in this patch,
so I limited the early match to a single pathspec in v2.

With 10,000 clean files, hyperfine measured 112.5 ms ± 6.6 ms for the
parent and 494.1 ms ± 17.2 ms for v1. With the restriction, the patched
version took 104.9 ms ± 2.2 ms against 110.1 ms ± 4.1 ms for the parent.
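
For intuition only, the trade-off can be written down as a toy cost
model. This is not code from the patch: the struct and function names
are made up, the pathspec count is a worst-case upper bound (every
pathspec tried for every entry that gets matched), and the second match
done later in show_ce() is ignored.

```c
/* Rough operation counts for one "git ls-files -m -- <pathspecs>" run. */
struct ls_files_cost {
	long lstat_calls;
	long pathspec_checks;	/* upper bound, see the note above */
};

/* Parent: lstat() every entry, match pathspecs only for modified ones. */
struct ls_files_cost cost_lstat_first(long entries, long pathspecs,
				      long modified)
{
	struct ls_files_cost c;

	c.lstat_calls = entries;
	c.pathspec_checks = modified * pathspecs;
	return c;
}

/* v1: match every entry first, lstat() only the entries that match. */
struct ls_files_cost cost_match_first(long entries, long pathspecs,
				      long matching)
{
	struct ls_files_cost c;

	c.lstat_calls = matching;
	c.pathspec_checks = entries * pathspecs;
	return c;
}

/* v2: use the early match only when there is a single pathspec. */
struct ls_files_cost cost_v2(long entries, long pathspecs,
			     long matching, long modified)
{
	if (pathspecs == 1)
		return cost_match_first(entries, pathspecs, matching);
	return cost_lstat_first(entries, pathspecs, modified);
}
```

In this model, 10,000 clean entries queried with 10,000 pathspecs cost
no pathspec checks at all with lstat() first, and up to 10,000 * 10,000
with the unconditional early match.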

Thanks for pointing it out!

^ permalink raw reply

* Re: git-diff in a worktree is an order of magnitude slower?
From: Jeff King @ 2026-06-09  0:11 UTC (permalink / raw)
  To: D. Ben Knoble; +Cc: Git
In-Reply-To: <CALnO6CADMJSixqYvL1Yo8qKX5rWhKQ+2OoSEuPUh-yoeK9TseQ@mail.gmail.com>

On Mon, Jun 08, 2026 at 07:36:45PM -0400, D. Ben Knoble wrote:

> I'd like to report and offer to help fix what I view as a serious performance
> bug:
> 
>     "git diff --no-ext-diff --quiet" performs about ~10x slower in a secondary
>     worktree than in the main worktree.

Hmm, I get the opposite effect: it is much faster in the worktree!

I did:

  git clone /path/to/linux.git
  git -C linux worktree add --detach ../wt
  hyperfine -L dir linux,wt 'git -C {dir} diff'

which yielded:

  Benchmark 1: git -C linux diff
    Time (mean ± σ):     188.9 ms ±   2.5 ms    [User: 166.4 ms, System: 130.7 ms]
    Range (min … max):   185.5 ms … 194.8 ms    16 runs
  
  Benchmark 2: git -C wt diff
    Time (mean ± σ):      20.0 ms ±   1.5 ms    [User: 23.4 ms, System: 103.5 ms]
    Range (min … max):    17.2 ms …  24.6 ms    132 runs
  
  Summary
    git -C wt diff ran
      9.43 ± 0.71 times faster than git -C linux diff

Running:

  perf record -g git -C wt --no-pager diff
  perf record -g git -C linux --no-pager diff
  perf diff

implies that the slow case is spending a lot more time computing sha1s.
Which implies that the entries are stat dirty. And indeed, if I run:

  git -C linux update-index --refresh

now they both take ~20ms.

I wonder if it's just a racy-git problem? Many files are written in the
same second as the index, so they end up with the same mtimes, and we
have to err on the side of checking the contents.

See Documentation/technical/racy-git.adoc for a larger discussion.

So it is not really about worktrees at all, but just "bad luck" in
generating that initial index (that goes away next time you actually
make an index update that rewrites the whole thing).

I'd have thought USE_NSEC was the default these days, but looks like it
isn't? Try building with that and I'll bet it goes away entirely.
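
As a rough illustration of why sub-second timestamps help, here is a
simplified sketch of the "racily clean" test. It is not Git's actual
code; struct ts_sketch and is_racy_sketch() are names invented for this
example, and the use_nsec flag stands in for a build with USE_NSEC.

```c
/* A timestamp with optional sub-second precision. */
struct ts_sketch {
	long sec;
	long nsec;
};

/*
 * Returns 1 when the file's mtime is not strictly older than the time
 * the index was written. In that case matching stat data cannot prove
 * the file is unchanged, and its contents have to be compared.
 */
int is_racy_sketch(struct ts_sketch index_written,
		   struct ts_sketch file_mtime, int use_nsec)
{
	if (file_mtime.sec > index_written.sec)
		return 1;
	if (file_mtime.sec < index_written.sec)
		return 0;
	/* Same second: only nanoseconds can tell the two apart. */
	return use_nsec ? file_mtime.nsec >= index_written.nsec : 1;
}
```

A file written in the same second as the index, but a few hundred
nanoseconds earlier, counts as racy with whole-second timestamps and as
clean once nanoseconds are compared.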

> PS I almost CC'd Peff and Patrick, whose names stood out in "git
> shortlog builtin/{worktree,diff}* object-file* | sort -t\( -k2 -g",
> but decided they'd be their own best judge of whether they can
> understand what's going on? :)

You might be interested in "git shortlog -ns". :)

-Peff

^ permalink raw reply

* Re: [PATCH] ref-filter: reuse --contains traversal results
From: Tamir Duberstein @ 2026-06-08 23:56 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Junio C Hamano, Victoria Dye, Derrick Stolee,
	Elijah Newren, Kristofer Karlsson
In-Reply-To: <20260608235214.GC358144@coredump.intra.peff.net>

On Mon, Jun 8, 2026 at 4:52 PM Jeff King <peff@peff.net> wrote:
>
> On Mon, Jun 08, 2026 at 07:35:57PM -0400, Tamir Duberstein wrote:
>
> > > So I think a better rule here is to tweak the selection in
> > > commit_contains() to select the depth-first algorithm when we have
> > > generation numbers enabled. There's a patch in an old thread, which was
> > > revived a week or two ago by Kristofer (cc'd):
> > >
> > >   https://lore.kernel.org/git/20260527070510.3510836-1-krka@spotify.com/
> >
> > Very good catch, thank you. I reproduced the regression with a
> > 100,000-commit history and generation numbers disabled. The parent
> > took 13.0 ms, the unconditional depth-first version took 238.4 ms, and
> > the generation-aware version took 9.1 ms.
> >
> > I didn't find a patch in that thread, so I will reroll using the
> > memoized walk for tags or when generation numbers are enabled, while
> > retaining the breadth-first walk otherwise. If someone else would
> > prefer to send that patch, that is fine by me as well.
>
> It's just this:
>
> diff --git a/commit-reach.c b/commit-reach.c
> index 9b3ea46d6f..cdea0030b8 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -799,7 +799,8 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
>  int commit_contains(struct ref_filter *filter, struct commit *commit,
>                     struct commit_list *list, struct contains_cache *cache)
>  {
> -       if (filter->with_commit_tag_algo)
> +       if (filter->with_commit_tag_algo ||
> +           generation_numbers_enabled(the_repository))
>                 return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
>         return repo_is_descendant_of(the_repository, commit, list);
>  }
>
> from:
>
>   https://lore.kernel.org/git/20230324191009.GA536967@coredump.intra.peff.net/
>
> But I won't be surprised if you recreated the identical patch yourself. ;)

Yep, that's what happened!

>
> -Peff

Thanks again for all the reviews, v2 of all the patches coming shortly.

^ permalink raw reply

* Re: [PATCH v13 2/6] branch: let delete_branches warn instead of error on bulk refusal
From: Junio C Hamano @ 2026-06-08 23:56 UTC (permalink / raw)
  To: Harald Nordgren via GitGitGadget
  Cc: git, Kristoffer Haugsbakk, Johannes Sixt, Phillip Wood,
	Harald Nordgren
In-Reply-To: <a7672713f67d6a44992c0f0cf989770c7e9ca38b.1780684553.git.gitgitgadget@gmail.com>

"Harald Nordgren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Harald Nordgren <haraldnordgren@gmail.com>
>
> Add a warn-only mode to delete_branches() and check_branch_commit()
> so a bulk caller can report branches that are not fully merged as a
> short warning and carry on, rather than erroring with the longer
> "use 'git branch -D'" advice that the plain "git branch -d" path
> emits. Existing callers are unaffected.
>
> Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
> ---
>  builtin/branch.c | 50 ++++++++++++++++++++++++++++++++----------------
>  1 file changed, 34 insertions(+), 16 deletions(-)

This breaks t5404, t5514, and t5505, which contradicts
"Existing callers are unaffected".

What's going on?  It is troubling that the breakage happens without
even getting merged with other topics in-flight, which means that
the environment you are developing in and testing on and the
environment that I apply patches on, integrate and test (something
based on Debian testing) are somehow behaving differently.

"cd t && sh t5404-*.sh -i -v" ends like so:

expecting success of 5404.7 'already deleted tracking branches ignored':
        git branch -d -r origin/b3 &&
        git push origin :b3 >output 2>&1 &&
        ! grep "^error: " output

error: the branch 'origin/b3' is not fully merged
hint: If you are sure you want to delete it, run 'git branch -D origin/b3'
hint: Disable this message with "git config set advice.forceDeleteBranch false"
not ok 7 - already deleted tracking branches ignored
#
#               git branch -d -r origin/b3 &&
#               git push origin :b3 >output 2>&1 &&
#               ! grep "^error: " output
#
1..7

but it may be possible that earlier steps are behaving differently
with the patches applied.  I didn't dig further but I think the CI
in the recent past has been affected by the same breakage.


^ permalink raw reply

* Re: [PATCH 0/6] Support hashing objects larger than 4GB on Windows
From: Junio C Hamano @ 2026-06-08 23:56 UTC (permalink / raw)
  To: Philip Oakley
  Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin
In-Reply-To: <4e3430a1-e8ee-47de-b6f0-25abafe3c45b@iee.email>

Philip Oakley <philipoakley@iee.email> writes:

> On 04/06/2026 18:15, Johannes Schindelin via GitGitGadget wrote:
>> Philip Oakley has contributed these patches ~4.5 years ago, and they have
>> been carried in Git for Windows ever since.
>> 
>> Now that there are already other patch series flying around that try to
>> address various aspects about >4GB objects (which aren't handled well by Git
>> until it stops forcing unsigned long to do size_t's job), it seems a good
>> time to upstream these patches, too, at long last.
>
> Yay. I approve this message ;-)

While I very much appreciate the effort to switch to size_t where
appropriate (and the places we historically used ulong for size of
in-core memory region are the most appropriate places), such an old
series clashes with in-flight topics big time.  Can we get an update
on a more recent base?

No need to rush, as I'll be slowly processing the backlog to catch
up with the list traffic for a few days.

Thanks.

^ permalink raw reply

* Re: [PATCH] ref-filter: reuse --contains traversal results
From: Jeff King @ 2026-06-08 23:52 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, Karthik Nayak, Junio C Hamano, Victoria Dye, Derrick Stolee,
	Elijah Newren, Kristofer Karlsson
In-Reply-To: <CAJ-ks9ng3Obv8jydYiBD4kxmTSZCJX8xNb0YihNeSW8_8WL5Ew@mail.gmail.com>

On Mon, Jun 08, 2026 at 07:35:57PM -0400, Tamir Duberstein wrote:

> > So I think a better rule here is to tweak the selection in
> > commit_contains() to select the depth-first algorithm when we have
> > generation numbers enabled. There's a patch in an old thread, which was
> > revived a week or two ago by Kristofer (cc'd):
> >
> >   https://lore.kernel.org/git/20260527070510.3510836-1-krka@spotify.com/
> 
> Very good catch, thank you. I reproduced the regression with a
> 100,000-commit history and generation numbers disabled. The parent
> took 13.0 ms, the unconditional depth-first version took 238.4 ms, and
> the generation-aware version took 9.1 ms.
> 
> I didn't find a patch in that thread, so I will reroll using the
> memoized walk for tags or when generation numbers are enabled, while
> retaining the breadth-first walk otherwise. If someone else would
> prefer to send that patch, that is fine by me as well.

It's just this:

diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..cdea0030b8 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -799,7 +799,8 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 int commit_contains(struct ref_filter *filter, struct commit *commit,
 		    struct commit_list *list, struct contains_cache *cache)
 {
-	if (filter->with_commit_tag_algo)
+	if (filter->with_commit_tag_algo ||
+	    generation_numbers_enabled(the_repository))
 		return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
 	return repo_is_descendant_of(the_repository, commit, list);
 }

from:

  https://lore.kernel.org/git/20230324191009.GA536967@coredump.intra.peff.net/

But I won't be surprised if you recreated the identical patch yourself. ;)
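
For anyone following along, the idea behind the depth-first walk can be
sketched roughly like this. It is a toy model, not the code in
commit-reach.c: "struct toy_commit" and "toy_contains()" are invented
for illustration, the walk is recursive for brevity, and the cache is
only valid for a single "want".

```c
#include <stddef.h>

enum toy_result { TOY_UNKNOWN = 0, TOY_NO, TOY_YES };

/* Toy commit for illustration only; not Git's struct commit. */
struct toy_commit {
	unsigned generation;           /* 1 + max generation of the parents */
	struct toy_commit *parents[2];
	size_t nr_parents;
	enum toy_result cached;        /* memoized answer for one "want" */
};

/* Does "c" contain "want", i.e. is "want" equal to or an ancestor of "c"? */
static enum toy_result toy_contains(struct toy_commit *c,
				    const struct toy_commit *want)
{
	size_t i;

	if (c == want)
		return TOY_YES;
	if (c->cached != TOY_UNKNOWN)
		return c->cached; /* reuse the result of an earlier walk */
	/*
	 * Ancestors always have a strictly smaller generation number,
	 * so nothing at or below want's generation can contain it.
	 */
	if (c->generation <= want->generation)
		return c->cached = TOY_NO;
	for (i = 0; i < c->nr_parents; i++)
		if (toy_contains(c->parents[i], want) == TOY_YES)
			return c->cached = TOY_YES;
	return c->cached = TOY_NO;
}
```

The generation cutoff is what keeps the depth-first walk from running
all the way down to the root, which is why the selection above depends
on generation numbers being enabled.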

-Peff

^ permalink raw reply related

* Re: followRemoteHEAD management question
From: Jeff King @ 2026-06-08 23:49 UTC (permalink / raw)
  To: Matt Hunter; +Cc: git, Bence Ferdinandy
In-Reply-To: <DJ19CI50W6UH.17QLIBNTXBWXU@lfurio.us>

On Fri, Jun 05, 2026 at 12:31:30PM -0400, Matt Hunter wrote:

> In the past, I've preferred to run 'git remote set-head <name> -d' when
> setting up a new repository, since I generally have an awareness of what
> the remote default branch is, and I don't like seeing them in branch
> listings or git-log annotations.  They are especially noisy to me if I
> have multiple remotes.  It's possible this config is ill-advised - I
> would love to be educated if so...

No, it's perfectly reasonable. Being able to refer to "origin" to mean
"origin/HEAD" is sometimes handy, but if you don't use it, there's no
reason to set up the symref in the first place.

> However, since b7f7d16562c3 (fetch: add configuration for set_head
> behaviour), these changes are undone by every 'git fetch'.
> 
> The topic mentioned above (merged in a1f34d595503) adds a new
> configuration key 'remote.<name>.followRemoteHEAD'.  I'm assuming that
> the intended use for followRemoteHEAD is really only in local /
> per-repository config, since trying to apply it to my personal
> .gitconfig has some odd behavior.

I think this is a gap in the new feature's implementation. It added
per-remote config, but there is no global config to fall back to (e.g.,
the way that remote.*.prune falls back to fetch.prune). There should be
a fetch.followRemoteHEAD option (or perhaps remote.followRemoteHEAD).
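
The fallback I have in mind would look something like this sketch. It
models the prune case with a made-up helper, "effective_prune()", where
-1 means "not configured"; a followRemoteHEAD version would follow the
same shape, but nothing like it exists yet.

```c
/*
 * Hypothetical helper: pick the effective prune setting.
 * Each argument is -1 when unset, otherwise 0 (false) or 1 (true).
 */
static int effective_prune(int remote_prune, int fetch_prune)
{
	if (remote_prune >= 0)
		return remote_prune; /* remote.<name>.prune wins */
	if (fetch_prune >= 0)
		return fetch_prune;  /* fall back to fetch.prune */
	return 0;                    /* neither is set: do not prune */
}
```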

> The <name> in the key template does not accept a wildcard, so I must
> list out each of the common remote names I use across different
> repositories.  Since many of my repos don't actually have remotes
> established for all of these names, they pick up a kind of half-baked
> definition for each of them as git performs its config parsing.  For
> instance, a name will appear under 'git remote -v', but it won't
> have any actual properties configured.

Yes, this is a common problem with the remote-config namespace. Defining
_any_ key makes the remote "exist", even without a defined url, but that
isn't usually the intent.  But we can't distinguish that from the case
where you really do want to define a remote without a url (in which case
the url is the name of the remote).

> Is there another solution in place I've missed?  If not, would there be
> any opposition to a new key like 'remote.followRemoteHEAD' which serves
> to provide a default value for any remote that doesn't have its own
> 'remote.<name>.followRemoteHEAD' key?
> 
> I've started scouting out changes to make for such a patch.  It's not
> ready yet, but I figured I would throw this question out in case an easy
> answer can save the effort.

I think you are on the right track. I can see arguments for or against
putting it in fetch.* or remote.*, so you'll have to pick one. ;)

-Peff

^ permalink raw reply

* Re: Mirror repositories for submodules
From: Jeff King @ 2026-06-08 23:41 UTC (permalink / raw)
  To: Simon Richter; +Cc: Junio C Hamano, Benson Muite, git
In-Reply-To: <fa075b7a-96f6-4fd9-ae94-30ddf323f759@hogyros.de>

On Thu, Jun 04, 2026 at 06:27:31PM +0900, Simon Richter wrote:

> Hi,
> 
> On 6/4/26 3:16 PM, Jeff King wrote:
> 
> > Here's a thought experiment. What if you put the UUID into a URL, like:
> >    repoid://123456789.git
> 
> Yes, that's the idea, except I would want to use a relative URL, like
> 
>     ../123456789.git
> 
> This could solve the "naive cloning" problem, because it creates an
> expectation that the submodules can be found on the same server, or in a
> nearby path.

I see. I forgot that we allowed relative submodule URLs.

> > Now, all of that said, do we still need uuids at all? If the canonical
> > submodule name is https://github.com/git/git.git, then anybody can just
> > rewrite that locally in the same way using url.*.insteadOf config.
> 
> Yes, but we'd then need a mechanism for a server to indicate "for cloning,
> you should use these 'insteadOf' settings", which is a massive can of worms
> from a security standpoint.
> 
> I also don't think these canonical URLs can ever be stable if they refer to
> infrastructure that is not under the control of the maintainer -- it would
> tie the project identity to the hosting provider, and increase the inertia
> to overcome for moves (such as the current exodus from github and gitlab
> towards codeberg).

From your description I was assuming the cloner had to always specify
insteadOf (which they find out about "somehow").

If they're not, then your choice of canonical URL is effectively trading
off some cases for others. In the scenario you care about, you assume
that the submodules are hosted relative to the superproject, so clients
can usually get what they need without further config. The server
operator and the superproject repo coordinate on the names.

But in many decentralized cases, there's no URL or administrative
relationship between the superproject and the submodules. They might
happen to be on the same server, but even that falls down if the
superproject is mirrored elsewhere. So using some canonical name which
works in practice _now_ is usually the best we can do.

> The common goal is that a naive clone should get submodules from a local
> server, ideally without us having to write some tool to make an initial
> checkout, enumerate submodules, create insteadOf settings, clone first layer
> of submodules, enumerate second layer, ...

You shouldn't need to do the recursive enumeration if you set up the
insteadOf ahead of time. You don't know which insteadOf settings you'll
want, but you can feed the whole possible mapping. How you get that
mapping is unspecified, but if you are mirroring the submodules already
on your local infrastructure, then whatever process does that can also
output the mapping.
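
As a sketch of what "output the mapping" could mean, a mirroring tool
might emit one config stanza per repository. The helper name
"format_insteadof()" and the mirror URL in the usage note are
assumptions for illustration, and the sketch assumes the URLs contain
no characters that need quoting in a config file.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a url.<mirror>.insteadOf stanza into buf,
 * so that URLs starting with "canonical" get rewritten to "mirror".
 * Returns what snprintf() returns.
 */
static int format_insteadof(char *buf, size_t len,
			    const char *mirror, const char *canonical)
{
	return snprintf(buf, len, "[url \"%s\"]\n\tinsteadOf = %s\n",
			mirror, canonical);
}
```

Calling it with "https://mirror.example/git/git.git" as the mirror and
"https://github.com/git/git.git" as the canonical name yields a stanza
that can be appended to a config file the cloner reads before cloning.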


Just to be clear, I'm not trying to dismiss what you're going for. I'm
looking at this from the lens of Git developers: how do existing Git
features fit into this space, and which features are missing that might
assist in a generalized way.

-Peff

^ permalink raw reply

* git-diff in a worktree is an order of magnitude slower?
From: D. Ben Knoble @ 2026-06-08 23:36 UTC (permalink / raw)
  To: Git

Hello all,

I'd like to report and offer to help fix what I view as a serious performance
bug:

    "git diff --no-ext-diff --quiet" performs about ~10x slower in a secondary
    worktree than in the main worktree.

Fortunately, this doesn't seem to extend to "--cached" (and "--no-ext-diff" and
"--quiet" are probably both red-herrings, since it _does_ extend to plain "git
diff").

Here's a short demo in Git:

    # git switch -d v2.54.0
    # ninja -C build # where my meson-generated build dir is
    # git worktree add --detach ../perf-test v2.54.0
    # hyperfine -N --warmup 10 './build/bin-wrappers/git diff'
    Benchmark 1: ./build/bin-wrappers/git diff
      Time (mean ± σ):       3.4 ms ±   0.5 ms    [User: 4.2 ms, System: 3.9 ms]
      Range (min … max):     2.5 ms …   5.8 ms    677 runs
    # pushd ../perf-test
    # hyperfine -N --warmup 10 '../git/build/bin-wrappers/git diff'
    Benchmark 1: ../git/build/bin-wrappers/git diff
      Time (mean ± σ):     223.3 ms ±  10.5 ms    [User: 210.4 ms, System: 19.1 ms]
      Range (min … max):   213.5 ms … 243.9 ms    13 runs

I've had a similar experience at $DAYJOB, where a large repo takes ~6ms for the
former and ~650ms for the latter. I noticed because the Bash prompt functions
execute "git diff --no-ext-diff --quiet", and that was (AFAICT) the largest
culprit for a slow shell prompt in a worktree. To squelch that from the prompt,
I have to go down the rabbit hole of the worktree config extension, so I figured
better to fix the slow diff if possible anyway.

2 questions:

1. Is this known, and if so is anybody working on it?
2. How can I help identify problem areas?

A little more
-------------

I've reproduced this as far back as v2.50.0, which is as far back as I could get
the meson build to work with little effort (so I can't rule out that this is an
old regression).

Using "perf record -F 99 -g -- <bin-wrappers/git> diff" in both trees and then
"perf report":

- it looks like the main worktree spends most of its time in preload_thread,
  threaded_has_symlink_leading_path, lstat_cache…
- the worktree spends a lot more time in ie_match_stat, ce_modified_check_fs,
  ce_compare_data, index_fd, would_convert_to_git_filter_fd…

Here's the relevant "perf stat":

main tree:

 Performance counter stats for './build/bin-wrappers/git diff':

                 0      context-switches:u               #      0,0 cs/sec  cs_per_second
                 0      cpu-migrations:u                 #      0,0 migrations/sec  migrations_per_second
               967      page-faults:u                    #  65036,4 faults/sec  page_faults_per_second
             14,87 msec task-clock:u                     #      0,3 CPUs  CPUs_utilized
            48 616      branch-misses:u                  #      3,2 % branch_miss_rate         (57,19%)
         3 571 630      branches:u                       #    240,2 M/sec  branch_frequency
        13 635 411      cpu-cycles:u                     #      0,9 GHz  cycles_frequency
        22 120 068      instructions:u                   #      1,9 instructions  insn_per_cycle  (85,61%)
         3 634 065      stalled-cycles-frontend:u        #     0,28 frontend_cycles_idle        (9,56%)

       0,006860098 seconds time elapsed

       0,001364000 seconds user
       0,015157000 seconds sys

worktree:

 Performance counter stats for '../git/build/bin-wrappers/git diff':

                 0      context-switches:u               #      0,0 cs/sec  cs_per_second
                 0      cpu-migrations:u                 #      0,0 migrations/sec  migrations_per_second
             1 585      page-faults:u                    #   5058,0 faults/sec  page_faults_per_second
            313,37 msec task-clock:u                     #      0,9 CPUs  CPUs_utilized
         2 481 188      branch-misses:u                  #      1,5 % branch_miss_rate         (48,94%)
       168 664 155      branches:u                       #    538,2 M/sec  branch_frequency     (51,21%)
     1 004 095 217      cpu-cycles:u                     #      3,2 GHz  cycles_frequency       (67,74%)
     3 864 851 223      instructions:u                   #      3,9 instructions  insn_per_cycle  (52,73%)
        70 755 234      stalled-cycles-frontend:u        #     0,07 frontend_cycles_idle        (49,29%)

       0,306707634 seconds time elapsed

       0,269027000 seconds user
       0,045512000 seconds sys

My observations:
- the worktree takes ~1.6 times as many page faults (1 585 vs. 967) and
- executes ~175 times as many instructions (3.86b compared to 22.1m).

(When I try to run some "perf" stats as root to access other counters, like
syscalls, "git diff" in the worktree says "not a git repository", so I'm not
counting the actual behavior. Ditto with DTrace.)

PS I almost CC'd Peff and Patrick, whose names stood out in "git
shortlog builtin/{worktree,diff}* object-file* | sort -t\( -k2 -g",
but decided they'd be their own best judge of whether they can
understand what's going on? :)

-- 
D. Ben Knoble

^ permalink raw reply

