Git development
 help / color / mirror / Atom feed
* Re: [PATCH v2 1/1] environment.c: move 'protect_hfs' and 'protect_ntfs' into 'repo_config_values'
From: Christian Couder @ 2026-06-12  7:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Tian Yuchen, git, phillip.wood123, Ayush Chandekar,
	Olamide Caleb Bello
In-Reply-To: <xmqqse6uwdnz.fsf@gitster.g>

On Wed, Jun 10, 2026 at 6:41 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Tian Yuchen <cat@malon.dev> writes:
>
> > +int repo_protect_ntfs(struct repository *repo)
> > +{
> > +     return repo->gitdir ?
> > +             repo_config_values(repo)->protect_ntfs :
> > +             PROTECT_NTFS_DEFAULT;
> > +}
> > +
> > +int repo_protect_hfs(struct repository *repo)
> > +{
> > +     return repo->gitdir ?
> > +             repo_config_values(repo)->protect_hfs :
> > +             PROTECT_HFS_DEFAULT;
> > +}
> > ...
> > @@ -123,6 +125,14 @@ int git_default_config(const char *, const char *,
> >  int git_default_core_config(const char *var, const char *value,
> >                           const struct config_context *ctx, void *cb);
> >
> > +/*
> > + * Getters for the `protect_hfs` and `protect_ntfs` fields of `struct repo_config_values`.
> > + * They check `repo->gitdir` to prevent calling repo_config_values()
> > + * before the configuration is loaded or in bare environments.
> > + */
> > +int repo_protect_hfs(struct repository *repo);
> > +int repo_protect_ntfs(struct repository *repo);
>
> I briefly wondered what *should* happen when repo->gitdir is not
> ready, as it feels almost a bug for a caller to call these two
> functions before the repository is ready to be used.
>
> When repo is not ready, these return their respective default
> values.  That's like the original code using the initial value of
> these global variables.
>
> IOW, this rewrite is bug-for-bug compatible, which is good.
>
> Shall we declare victory and mark the topic for 'next' now?

I would have preferred the commit subject to start with "environment:"
rather than "environment.c:" but it's a small nit and maybe you can
fix it while merging.

Otherwise the patch looks indeed good to me.

Thanks.

^ permalink raw reply

* Re: [PATCH v3 1/3] commit-reach: handle cycles in contains walk
From: Kristofer Karlsson @ 2026-06-12  6:53 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren
In-Reply-To: <20260611-ref-filter-memoized-contains-v3-1-b26af3dba285@gmail.com>

On Fri, 12 Jun 2026 at 05:00, Tamir Duberstein <tamird@gmail.com> wrote:
>
> The memoized contains traversal used by git tag assumes that commit
> ancestry is acyclic. Replacement refs can violate that assumption,
> causing it to keep pushing an already active commit until memory is
> exhausted.
>

The cycle detection itself makes sense, but would it be simpler to
just die() when a cycle is found rather than falling back to a
second reachability walk?

A cycle in the commit graph means replacement refs are
misconfigured.  The existing code already loops forever when it
hits one, so detecting and dying is strictly an improvement.  The
fallback adds a second codepath through the function, discards all
cached results (so later candidates redo work), and papers over
what is really a broken invariant.

do_lookup_replace_object() already dies when replacement refs
chain deeper than MAXREPLACEDEPTH (which covers cycles), so the
existing contract treats this as a fatal configuration error.
parse_commit_or_die() sets the same precedent within the walk
itself.

Kristofer

^ permalink raw reply

* Re: [PATCH 0/9] refs: stop using `chdir_notify_reparent()`
From: Patrick Steinhardt @ 2026-06-12  6:18 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Karthik Nayak
In-Reply-To: <20260611065346.GD2191159@coredump.intra.peff.net>

On Thu, Jun 11, 2026 at 02:53:46AM -0400, Jeff King wrote:
> On Wed, Jun 10, 2026 at 04:57:06PM +0200, Patrick Steinhardt wrote:
> 
> > this patch series is a follow-up of the discussion at [1]. It converts
> > the reference backends to always use absolute paths internally, which
> > then allows us to drop the calls to `chdir_notify_reparent()`.
> 
> We added chdir-notify to suport set_work_tree(). Commit 8500e0de3f
> (set_work_tree: use chdir_notify, 2018-03-30) mentions an optimization
> from 044bbbcb63 (Make git_dir a path relative to work_tree in
> setup_work_tree(), 2008-06-19). That commit demonstrates some measurable
> speedup from using relative versus absolute paths.

Oh, that is context I wasn't aware of. Not much of a surprise though,
given that this is from 2008 :) So thanks a lot for the pointer!

> If we move to a world of all absolute paths where chdir-notify is not
> necessary, will we lose that optimization?

Probably. Unfortunately, the commit doesn't have any repeatable
benchmarks in there, so it's hard to say whether we could still
reproduce those issues or not.

> I'm not sure how much it matters in practice these days, or if those
> timings could be repeated. And they weren't all _that_ big to start
> with. I guess it may depend on how deep your repo is within your
> filesystem, too.

Ideally, we'd have the best of both worlds: absolute paths everywhere
without the performance hit. A while back I had a discussion with
Torvalds on the securiy mailing list around this issue, and ultimately
the conclusion was that the best way forward would be to use openat(3p).

This wouldn't only allow us to optimize cases like this, but it also has
the added benefit that we're much less prone to TOCTOU-style issues and
we might even be able to use flags like O_BENEATH. So it would basically
be win-win. The only problem is of course that Windows doesn't have
openat(3p), so we'd have to emulate it, and that's where I always lost
the desire to do this.

When waking up this morning though I had the thought that we shouldn't
try to emulate openat(3p) directly, but instead create a higher-level
interface.

    struct fsroot;

    /*
     * Open a new filesystem root at the given directory. All subsequent
     * calls to open will be relative to this fsroot.
     */
    struct fsroot *fsroot_new(const char *dir);

    /*
     * Create a new fsroot from a subdirectory relative to the given
     * root directory.
     */
    struct fsroot *fsroot_new_subdir(struct fsroot *r, const char *dir);

    /*
     * Open a new file relative to the given fsroot. This will use the
     * equivalent of O_BENEATH so that we only ever open files that are
     * located below the fsroot.
     */
    int fsroot_open(struct fsroot *r, const char *path, int oflag, ...);

This is of course heavily inspired by similar interfaces that exist in
Go [1]. By having such a higher-level abstraction it should also be way
easier to port this to different platforms, where we can then add safety
features like O_BENEATH when available on any given platform.

The idea here would be that we can then convert some subsystems to use
those structures instead of tracking paths. I'd for example love for the
repository's working tree to use this mechanism so that we can squash a
whole class of potential security issues when checking out files that
end in locations we didn't intend to.

Thanks!

Patrick

[1]: https://pkg.go.dev/io/fs#FS

^ permalink raw reply

* Re: [PATCH 3/9] setup: don't apply "GIT_REFERENCE_BACKEND" without a repository
From: Patrick Steinhardt @ 2026-06-12  6:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak
In-Reply-To: <xmqqa4t2wbb5.fsf@gitster.g>

On Wed, Jun 10, 2026 at 10:32:46AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > When discovering a repository we eventually also apply the
> > "GIT_REFERENCE_BACKEND" environment variable to the repository. There's
> > two problems with that:
> >
> >   - We do this unconditionally, which is rather pointless: we really
> >     only have to configure the repository when we have found one.
> >
> >   - We have already applied the repository format at that point in time,
> >     so we need to manually reapply it.
> 
> Does the second point have a small typo, i.e., "if we have a
> repository, we have already applied the ref backend to it when we
> discovered it, so NO need to manually reapply"?

No, this is correct as-is. At the point in time where we handle
GIT_REFERENCE_BACKEND we have already discovered the repository format,
applied it to the repository, configured the reference database format
et al. So because we handle GIT_REFERENCE_BACKEND _after_ that whole
dance we basically have to re-configure the reference database format,
which is awkward.

Patrick

^ permalink raw reply

* Re: [PATCH v2 3/3] doc: git-config: escape erroneous highlight markup
From: Jeff King @ 2026-06-12  6:17 UTC (permalink / raw)
  To: Jean-Noël AVILA
  Cc: Tuomas Ahola, git, Kristoffer Haugsbakk, Junio C Hamano
In-Reply-To: <20260612051605.GB593075@coredump.intra.peff.net>

On Fri, Jun 12, 2026 at 01:16:06AM -0400, Jeff King wrote:

> So...weird. groff wants to add extra space for some reason. It happens
> even if I drop the bolding, and just have "#" on a line by itself. I
> guess maybe it is the trailing "." of the previous line putting groff
> into "oh, I'm starting a new sentence" mode and it uses two spaces.
> 
> But I think that is all outside the scope of your fix, and this is an
> existing issue that we are now just unlucky enough to hit. I'd be
> tempted to ignore it and possibly fix it later.

Poking at the groff manpage for possible fixes, I think it is either:

  1. Just turn off extra spaces between sentences, like this:

       .ss 12 0

     The first argument "12" is "the minimum word space is 12/12ths of
     the current font's word space. The second is "also add 0/12ths
     between sentences" (so, no extra space).

     That prevents the problem from happening anywhere, but maybe people
     really like the extra spaces in other contexts.

  2. There are some magic characters marked as "this doesn't start a new
     sentence". We can set that flag for "#" like this:

       .cflags 32 \#

I don't think docbook has parameter support for either of those, so we'd
have to do some kind of "shove this into the header" magic, which is a
bit gross.

-Peff

^ permalink raw reply

* Re: followRemoteHEAD management question
From: Matt Hunter @ 2026-06-12  6:11 UTC (permalink / raw)
  To: Bence Ferdinandy, Jeff King; +Cc: git
In-Reply-To: <DJ6IBPYNOTTY.3QKEZQ28P713V@ferdinandy.com>

On Thu Jun 11, 2026 at 4:36 PM EDT, Bence Ferdinandy wrote:
> On Thu Jun 11, 2026 at 08:01, Jeff King <peff@peff.net> wrote:
>>
>> My initial thought is that it might affect clone as well as fetch. But I
>> guess this feature does not kick in for clone, as it has its own logic
>> for handling the remote-tracking HEAD. Though arguably it should be
>> possible to configure it not to create one in the first place.
>
> If memory serves well clone has set the remote/HEAD well before this and
> I think it indeed uses a different mechanism/logic.

I'm a little interested to try to look into the clone case as well, but
I think I'll save it for a later patch series and keep the scope of this
one as it is.

> Bit late to the party, but happy to review/test patches if they come.

Greatly appreciated!
>
> Best,
> Bence

The first version of my patches went out.  You two are Cc'd on the cover
letter, but that didn't propagate to the patches themselves, oops.

^ permalink raw reply

* Re: [PATCH v2] update-ref: add --rename option
From: Patrick Steinhardt @ 2026-06-12  6:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqwlw4nccr.fsf@gitster.g>

On Thu, Jun 11, 2026 at 11:47:16AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > One thing that I'm missing from the commit message: what's the
> > motivation for this new mode?
> 
> Maintenance of merge-fix database, a kludgy way to manage evil
> merges that are needed to deal with inter-topic semantic crashes.
> 
> If you are really interested, see the appendix.

Thanks for the explanation!

Patrick

^ permalink raw reply

* Re: [PATCH v3] update-ref: add --rename option
From: Patrick Steinhardt @ 2026-06-12  6:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq7bo4n4ge.fsf@gitster.g>

On Thu, Jun 11, 2026 at 02:37:53PM -0700, Junio C Hamano wrote:
> Add a "--rename" option to "git update-ref" with the syntax:
> 
>  $ git update-ref --rename <old-refname> <new-refname>
> 
> It renames <old-refname> together with its reflog to <new-refname>;
> even when used on a local branch ref, the current value and the
> reflog of the ref are the only things that are renamed.  Document it
> and redirect casual users to "git branch -m" if that is what they
> wanted to do.

This reads much better, thanks.

> diff --git a/Documentation/git-update-ref.adoc b/Documentation/git-update-ref.adoc
> index 37a5019a8b..3b4df23a86 100644
> --- a/Documentation/git-update-ref.adoc
> +++ b/Documentation/git-update-ref.adoc
> @@ -9,6 +9,7 @@ SYNOPSIS
>  --------
>  [synopsis]
>  git update-ref [-m <reason>] [--no-deref] -d <ref> [<old-oid>]
> +git update-ref [-m <reason>] --rename <old-refname> <new-refname>
>  git update-ref [-m <reason>] [--no-deref] [--create-reflog] <ref> <new-oid> [<old-oid>]
>  git update-ref [-m <reason>] [--no-deref] --stdin [-z] [--batch-updates]

This slightly triggers my OCD, but oh, well. No need to change this.

> diff --git a/builtin/update-ref.c b/builtin/update-ref.c
> index 2d68c40ecb..65ee8af08c 100644
> --- a/builtin/update-ref.c
> +++ b/builtin/update-ref.c
> @@ -800,6 +802,32 @@ int cmd_update_ref(int argc,
>  	if (end_null)
>  		usage_with_options(git_update_ref_usage, options);
>  
> +	if (rename) {
> +		const char *oldref, *newref;
> +
> +		if (delete || argc != 2)
> +			usage_with_options(git_update_ref_usage, options);

Arguably, we should also complain when either "--no-deref" or "--deref"
were given, as they don't have any effect.

A slight tangent: this is part of why I really don't like commands that
determine their mode via flags: you now have to worry about every
combination of flags and whether they even make sense. With subcommands
we at least only have to worry about the set of flags that directly
apply to that given subcommand.

Makes me wonder whether I should have a look at extending git-refs(1)
further:

    git refs delete <ref> [<oldvalue>]
    git refs update <ref> <newvalue> [<oldvalue>]
    git refs rename <ref> <oldname> <newname>

I always wanted to do this eventually so that we have one top-level
command that knows how to do "everything refs".

Anyway, except for this nit the patch looks good to me, thanks!

Patrick

^ permalink raw reply

* [PATCH 7/7] fetch: fixup a misaligned comment
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 builtin/fetch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index a21bb82274d4..911ac8a47221 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1792,7 +1792,7 @@ static int set_head(const struct ref *remote_refs, struct remote *remote,
 		strbuf_addf(&b_head, "refs/remotes/%s/HEAD", remote->name);
 		strbuf_addf(&b_remote_head, "refs/remotes/%s/%s", remote->name, head_name);
 	}
-		/* make sure it's valid */
+	/* make sure it's valid */
 	if (!baremirror && !refs_ref_exists(refs, b_remote_head.buf)) {
 		result = 1;
 		goto cleanup;
-- 
2.54.0


^ permalink raw reply related

* [PATCH 6/7] fetch: add configuration option fetch.followRemoteHEAD
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

'fetch.followRemoteHEAD' is added as a generic option used by all
remotes for which 'remote.<name>.followRemoteHEAD' is undefined.  If
both options are undefined, a builtin default of "create" is in effect,
matching the previous behavior.

As mentioned in the previous patch, 'fetch.followRemoteHEAD' supports
all of the values that its 'remote' counterpart does _except_
warn-if-not-$branch, due to its tighter coupling to individual remote
repositories.

Documentation and advice messages for both of the followRemoteHEAD
options are reworded to better capture the relationship between the two.

The added tests assert feature parity between the two followRemoteHEAD
options, as well as the fact that 'remote.<name>.followRemoteHEAD'
always supersedes this new configurable default.

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 Documentation/config/fetch.adoc  |  19 ++++++
 Documentation/config/remote.adoc |  21 +++----
 builtin/fetch.c                  |  32 ++++++++--
 t/t5510-fetch.sh                 | 105 +++++++++++++++++++++++++++++++
 4 files changed, 160 insertions(+), 17 deletions(-)

diff --git a/Documentation/config/fetch.adoc b/Documentation/config/fetch.adoc
index 04ac90912d3a..f7de22a34a54 100644
--- a/Documentation/config/fetch.adoc
+++ b/Documentation/config/fetch.adoc
@@ -126,3 +126,22 @@ the new bundle URI.
 The creation token values are chosen by the provider serving the specific
 bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
 remove the value for the `fetch.bundleCreationToken` value before fetching.
+
+`fetch.followRemoteHEAD`::
+	When fetching using a default refspec, this option determines how to handle
+	differences between a fetched remote's `HEAD` and the local
+	`remotes/<name>/HEAD` symbolic-ref.  Its value is one of
++
+--
+`create`;;
+	Create `remotes/<name>/HEAD` if a ref exists on the remote, but not locally.
+	An existing symbolic-ref will not be touched.  This is the default value.
+`warn`;;
+	Display a warning if the remote advertises a different `HEAD` than what is
+	set locally.  Behaves like "create" if the local symbolic-ref doesn't exist.
+`always`;;
+	Silently update `remotes/<name>/HEAD` whenever the remote advertises a new
+	value.
+`never`;;
+	Never create or modify the `remotes/<name>/HEAD` symbolic-ref.
+--
diff --git a/Documentation/config/remote.adoc b/Documentation/config/remote.adoc
index eb9c8a3c4884..761bf4ba7d14 100644
--- a/Documentation/config/remote.adoc
+++ b/Documentation/config/remote.adoc
@@ -157,15 +157,12 @@ Blank values signal to ignore all previous values, allowing a reset of
 the list from broader config scenarios.
 
 remote.<name>.followRemoteHEAD::
-	How linkgit:git-fetch[1] should handle updates to `remotes/<name>/HEAD`
-	when fetching using the configured refspecs of a remote.
-	The default value is "create", which will create `remotes/<name>/HEAD`
-	if it exists on the remote, but not locally; this will not touch an
-	already existing local reference. Setting it to "warn" will print
-	a message if the remote has a different value than the local one;
-	in case there is no local reference, it behaves like "create".
-	A variant on "warn" is "warn-if-not-$branch", which behaves like
-	"warn", but if `HEAD` on the remote is `$branch` it will be silent.
-	Setting it to "always" will silently update `remotes/<name>/HEAD` to
-	the value on the remote.  Finally, setting it to "never" will never
-	change or create the local reference.
+	When fetching this remote using its default refspec, this option determines
+	how to handle differences between the remote's `HEAD` and the local
+	`remotes/<name>/HEAD` symbolic-ref.  Overrides the setting for
+	`fetch.followRemoteHEAD`.  See `fetch.followRemoteHEAD` for a description of
+	accepted values.
++
+In addition to the values supported by `fetch.followRemoteHEAD`, this option may
+also take on the value "warn-if-not-`$branch`", which behaves like "warn", but
+ignores the warning if the remote's `HEAD` is `remotes/<name>/$branch`.
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 3cc7efdd83a0..a21bb82274d4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -103,6 +103,7 @@ static struct string_list negotiation_include = STRING_LIST_INIT_NODUP;
 
 struct fetch_config {
 	enum display_format display_format;
+	enum follow_remote_head_settings follow_remote_head;
 	int all;
 	int prune;
 	int prune_tags;
@@ -173,6 +174,22 @@ static int git_fetch_config(const char *k, const char *v,
 			    "fetch.output", v);
 	}
 
+	if (!strcmp(k, "fetch.followremotehead")) {
+		if (!v)
+			return config_error_nonbool(k);
+		else if (!strcasecmp(v, "never"))
+			fetch_config->follow_remote_head = FOLLOW_REMOTE_NEVER;
+		else if (!strcasecmp(v, "create"))
+			fetch_config->follow_remote_head = FOLLOW_REMOTE_CREATE;
+		else if (!strcasecmp(v, "warn"))
+			fetch_config->follow_remote_head = FOLLOW_REMOTE_WARN;
+		else if (!strcasecmp(v, "always"))
+			fetch_config->follow_remote_head = FOLLOW_REMOTE_ALWAYS;
+		else
+			die(_("invalid value for '%s': '%s'"),
+				"fetch.followRemoteHEAD", v);
+	}
+
 	return git_default_config(k, v, ctx, cb);
 }
 
@@ -1697,11 +1714,13 @@ static const char *strip_refshead(const char *name){
 static void set_head_advice_msg(const char *remote, const char *head_name)
 {
 	const char message_advice_set_head[] =
-	N_("Run 'git remote set-head %s %s' to follow the change, or set\n"
-	   "'remote.%s.followRemoteHEAD' configuration option to a different value\n"
-	   "if you do not want to see this message. Specifically running\n"
-	   "'git config set remote.%s.followRemoteHEAD warn-if-not-%s'\n"
-	   "will disable the warning until the remote changes HEAD to something else.");
+	N_("Run 'git remote set-head %s %s' to follow the change, or modify\n"
+	   "either of the 'remote.%s.followRemoteHEAD' or 'fetch.followRemoteHEAD'\n"
+	   "configuration options to handle the situation differently.\n\n"
+
+	   "Using this specific option\n\n"
+	   "    git config set remote.%s.followRemoteHEAD warn-if-not-%s\n\n"
+	   "will suppress the warning until the remote changes HEAD to something else.");
 
 	advise_if_enabled(ADVICE_FETCH_SET_HEAD_WARN, _(message_advice_set_head),
 			remote, head_name, remote, remote, head_name);
@@ -1919,6 +1938,8 @@ static int do_fetch(struct transport *transport,
 
 	if (transport->remote->follow_remote_head)
 		follow_remote_head = transport->remote->follow_remote_head;
+	else if (config->follow_remote_head)
+		follow_remote_head = config->follow_remote_head;
 	else
 		follow_remote_head = BUILTIN_FOLLOW_REMOTE_HEAD_DFLT;
 
@@ -2477,6 +2498,7 @@ int cmd_fetch(int argc,
 {
 	struct fetch_config config = {
 		.display_format = DISPLAY_FORMAT_FULL,
+		.follow_remote_head = FOLLOW_REMOTE_UNCONFIGURED,
 		.prune = -1,
 		.prune_tags = -1,
 		.show_forced_updates = 1,
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 43190630e714..6f0ae1bdd798 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -140,6 +140,16 @@ test_expect_success "fetch test remote HEAD change" '
 	)
 '
 
+test_expect_success "fetch test default followRemoteHEAD never" '
+	git -C two update-ref --no-deref -d refs/remotes/origin/HEAD &&
+	test_config -C two fetch.followRemoteHEAD "never" &&
+	GIT_TRACE_PACKET=$PWD/trace.out git -C two fetch &&
+	# Confirm that we do not even ask for HEAD when we are
+	# not going to act on it.
+	test_grep ! "ref-prefix HEAD" trace.out &&
+	test_must_fail git -C two rev-parse --verify refs/remotes/origin/HEAD
+'
+
 test_expect_success "fetch test followRemoteHEAD never" '
 	git -C two update-ref --no-deref -d refs/remotes/origin/HEAD &&
 	test_config -C two remote.origin.followRemoteHEAD "never" &&
@@ -150,6 +160,21 @@ test_expect_success "fetch test followRemoteHEAD never" '
 	test_must_fail git -C two rev-parse --verify refs/remotes/origin/HEAD
 '
 
+test_expect_success "fetch test default followRemoteHEAD warn no change" '
+	git -C two rev-parse --verify refs/remotes/origin/other &&
+	git -C two remote set-head origin other &&
+	git -C two rev-parse --verify refs/remotes/origin/HEAD &&
+	git -C two rev-parse --verify refs/remotes/origin/main &&
+	test_config -C two fetch.followRemoteHEAD "warn" &&
+	git -C two fetch >output &&
+	echo "${SQ}HEAD${SQ} at ${SQ}origin${SQ} is ${SQ}main${SQ}," \
+		"but we have ${SQ}other${SQ} locally." >expect &&
+	test_cmp expect output &&
+	head=$(git -C two rev-parse refs/remotes/origin/HEAD) &&
+	branch=$(git -C two rev-parse refs/remotes/origin/other) &&
+	test "z$head" = "z$branch"
+'
+
 test_expect_success "fetch test followRemoteHEAD warn no change" '
 	git -C two rev-parse --verify refs/remotes/origin/other &&
 	git -C two remote set-head origin other &&
@@ -165,6 +190,17 @@ test_expect_success "fetch test followRemoteHEAD warn no change" '
 	test "z$head" = "z$branch"
 '
 
+test_expect_success "fetch test default followRemoteHEAD warn create" '
+	git -C two update-ref --no-deref -d refs/remotes/origin/HEAD &&
+	test_config -C two fetch.followRemoteHEAD "warn" &&
+	git -C two rev-parse --verify refs/remotes/origin/main &&
+	output=$(git -C two fetch) &&
+	test "z" = "z$output" &&
+	head=$(git -C two rev-parse refs/remotes/origin/HEAD) &&
+	branch=$(git -C two rev-parse refs/remotes/origin/main) &&
+	test "z$head" = "z$branch"
+'
+
 test_expect_success "fetch test followRemoteHEAD warn create" '
 	git -C two update-ref --no-deref -d refs/remotes/origin/HEAD &&
 	test_config -C two remote.origin.followRemoteHEAD "warn" &&
@@ -176,6 +212,18 @@ test_expect_success "fetch test followRemoteHEAD warn create" '
 	test "z$head" = "z$branch"
 '
 
+test_expect_success "fetch test default followRemoteHEAD warn detached" '
+	git -C two update-ref --no-deref -d refs/remotes/origin/HEAD &&
+	git -C two update-ref refs/remotes/origin/HEAD HEAD &&
+	HEAD=$(git -C two log --pretty="%H") &&
+	test_config -C two fetch.followRemoteHEAD "warn" &&
+	git -C two fetch >output &&
+	echo "${SQ}HEAD${SQ} at ${SQ}origin${SQ} is ${SQ}main${SQ}," \
+		"but we have a detached HEAD pointing to" \
+		"${SQ}${HEAD}${SQ} locally." >expect &&
+	test_cmp expect output
+'
+
 test_expect_success "fetch test followRemoteHEAD warn detached" '
 	git -C two update-ref --no-deref -d refs/remotes/origin/HEAD &&
 	git -C two update-ref refs/remotes/origin/HEAD HEAD &&
@@ -188,6 +236,19 @@ test_expect_success "fetch test followRemoteHEAD warn detached" '
 	test_cmp expect output
 '
 
+test_expect_success "fetch test default followRemoteHEAD warn quiet" '
+	git -C two rev-parse --verify refs/remotes/origin/other &&
+	git -C two remote set-head origin other &&
+	git -C two rev-parse --verify refs/remotes/origin/HEAD &&
+	git -C two rev-parse --verify refs/remotes/origin/main &&
+	test_config -C two fetch.followRemoteHEAD "warn" &&
+	output=$(git -C two fetch --quiet) &&
+	test "z" = "z$output" &&
+	head=$(git -C two rev-parse refs/remotes/origin/HEAD) &&
+	branch=$(git -C two rev-parse refs/remotes/origin/other) &&
+	test "z$head" = "z$branch"
+'
+
 test_expect_success "fetch test followRemoteHEAD warn quiet" '
 	git -C two rev-parse --verify refs/remotes/origin/other &&
 	git -C two remote set-head origin other &&
@@ -229,6 +290,18 @@ test_expect_success "fetch test followRemoteHEAD warn-if-not-branch branch is di
 	test "z$head" = "z$branch"
 '
 
+test_expect_success "fetch test default followRemoteHEAD always" '
+	git -C two rev-parse --verify refs/remotes/origin/other &&
+	git -C two remote set-head origin other &&
+	git -C two rev-parse --verify refs/remotes/origin/HEAD &&
+	git -C two rev-parse --verify refs/remotes/origin/main &&
+	test_config -C two fetch.followRemoteHEAD "always" &&
+	git -C two fetch &&
+	head=$(git -C two rev-parse refs/remotes/origin/HEAD) &&
+	branch=$(git -C two rev-parse refs/remotes/origin/main) &&
+	test "z$head" = "z$branch"
+'
+
 test_expect_success "fetch test followRemoteHEAD always" '
 	git -C two rev-parse --verify refs/remotes/origin/other &&
 	git -C two remote set-head origin other &&
@@ -241,6 +314,28 @@ test_expect_success "fetch test followRemoteHEAD always" '
 	test "z$head" = "z$branch"
 '
 
+test_expect_success 'per-remote followRemoteHEAD takes priority over fetch default' '
+	git -C two rev-parse --verify refs/remotes/origin/other &&
+	git -C two remote set-head origin other &&
+	git -C two rev-parse --verify refs/remotes/origin/HEAD &&
+	git -C two rev-parse --verify refs/remotes/origin/main &&
+	test_config -C two fetch.followRemoteHEAD "never" &&
+	test_config -C two remote.origin.followRemoteHEAD "always" &&
+	git -C two fetch &&
+	head=$(git -C two rev-parse refs/remotes/origin/HEAD) &&
+	branch=$(git -C two rev-parse refs/remotes/origin/main) &&
+	test "z$head" = "z$branch"
+'
+
+test_expect_success 'default followRemoteHEAD does not kick in with refspecs' '
+	git -C two remote set-head origin other &&
+	test_config -C two fetch.followRemoteHEAD always &&
+	git -C two fetch origin refs/heads/main:refs/remotes/origin/main &&
+	echo refs/remotes/origin/other >expect &&
+	git -C two symbolic-ref refs/remotes/origin/HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'followRemoteHEAD does not kick in with refspecs' '
 	git -C two remote set-head origin other &&
 	test_config -C two remote.origin.followRemoteHEAD always &&
@@ -250,6 +345,16 @@ test_expect_success 'followRemoteHEAD does not kick in with refspecs' '
 	test_cmp expect actual
 '
 
+test_expect_success 'default followRemoteHEAD create does not overwrite dangling symref' '
+	test_when_finished "git -C two remote remove custom-head" &&
+	git -C two remote add -m does-not-exist custom-head ../one &&
+	test_config -C two fetch.followRemoteHEAD create &&
+	git -C two fetch custom-head &&
+	echo refs/remotes/custom-head/does-not-exist >expect &&
+	git -C two symbolic-ref refs/remotes/custom-head/HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'followRemoteHEAD create does not overwrite dangling symref' '
 	test_when_finished "git -C two remote remove custom-head" &&
 	git -C two remote add -m does-not-exist custom-head ../one &&
-- 
2.54.0


^ permalink raw reply related

* [PATCH 5/7] fetch: refactor do_fetch handling of followRemoteHEAD
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

Update enum follow_remote_head_settings to include the value
FOLLOW_REMOTE_UNCONFIGURED as the new zero-initialized value for
followRemoteHEAD.  This will allow us to distinguish between the option
being unset vs. explicitly set to 'create', which is ultimately the
system default.  The unnecessary indentation is removed.

The do_fetch function is likewise updated to perform its own decision
making to determine the effective followRemoteHEAD mode, falling back to
the system default if necessary.  This will enable the next patch to
introduce a user-configurable fallback default option.

Function set_head now accepts this value as an argument rather than only
considering the value defined by the remote.

The use of the 'warn-if-not-$branch' value is awkward in the context of
a global default option, since the branches will differ between
individual remotes.  For this reason, it's left out of this scheme and
handling of the no_warn_branch variable is untouched.  Since a
remote-specific setting for followRemoteHEAD takes priority, we can
assume that if remote->no_warn_branch is set, then the remote is also
asserting FOLLOW_REMOTE_WARN as the effective operating mode, and it
will be honored by do_fetch.

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 builtin/fetch.c | 14 ++++++++++----
 remote.h        | 14 ++++++++------
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 9a45e1e7a44d..3cc7efdd83a0 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1729,12 +1729,12 @@ static void warn_set_head(const char *remote, const char *head_name,
 	strbuf_release(&buf_prefix);
 }
 
-static int set_head(const struct ref *remote_refs, struct remote *remote)
+static int set_head(const struct ref *remote_refs, struct remote *remote,
+			int follow_remote_head)
 {
 	int result = 0, create_only, baremirror, was_detached;
 	struct strbuf b_head = STRBUF_INIT, b_remote_head = STRBUF_INIT,
 		      b_local_head = STRBUF_INIT;
-	int follow_remote_head = remote->follow_remote_head;
 	const char *no_warn_branch = remote->no_warn_branch;
 	char *head_name = NULL;
 	struct ref *ref, *matches;
@@ -1901,6 +1901,7 @@ static int do_fetch(struct transport *transport,
 	struct ref_update_display_info_array display_array = { 0 };
 	struct strmap rejected_refs = STRMAP_INIT;
 	int summary_width = 0;
+	int follow_remote_head;
 
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
@@ -1916,6 +1917,11 @@ static int do_fetch(struct transport *transport,
 			goto cleanup;
 	}
 
+	if (transport->remote->follow_remote_head)
+		follow_remote_head = transport->remote->follow_remote_head;
+	else
+		follow_remote_head = BUILTIN_FOLLOW_REMOTE_HEAD_DFLT;
+
 	if (rs->nr) {
 		refspec_ref_prefixes(rs, &transport_ls_refs_options.ref_prefixes);
 	} else {
@@ -1924,7 +1930,7 @@ static int do_fetch(struct transport *transport,
 		if (transport->remote->fetch.nr) {
 			refspec_ref_prefixes(&transport->remote->fetch,
 					     &transport_ls_refs_options.ref_prefixes);
-			if (transport->remote->follow_remote_head != FOLLOW_REMOTE_NEVER)
+			if (follow_remote_head != FOLLOW_REMOTE_NEVER)
 				do_set_head = 1;
 		}
 		if (branch && branch_has_merge_config(branch) &&
@@ -2131,7 +2137,7 @@ static int do_fetch(struct transport *transport,
 		 * Way too many cases where this can go wrong so let's just
 		 * ignore errors and fail silently for now.
 		 */
-		set_head(remote_refs, transport->remote);
+		set_head(remote_refs, transport->remote, follow_remote_head);
 	}
 
 cleanup:
diff --git a/remote.h b/remote.h
index 54b17e4b028b..72a54d84ad51 100644
--- a/remote.h
+++ b/remote.h
@@ -62,12 +62,14 @@ struct remote_state {
 void remote_state_clear(struct remote_state *remote_state);
 struct remote_state *remote_state_new(void);
 
-	enum follow_remote_head_settings {
-		FOLLOW_REMOTE_NEVER = -1,
-		FOLLOW_REMOTE_CREATE = 0,
-		FOLLOW_REMOTE_WARN = 1,
-		FOLLOW_REMOTE_ALWAYS = 2,
-	};
+#define BUILTIN_FOLLOW_REMOTE_HEAD_DFLT FOLLOW_REMOTE_CREATE
+enum follow_remote_head_settings {
+	FOLLOW_REMOTE_UNCONFIGURED = 0,
+	FOLLOW_REMOTE_NEVER,
+	FOLLOW_REMOTE_CREATE,
+	FOLLOW_REMOTE_WARN,
+	FOLLOW_REMOTE_ALWAYS,
+};
 
 struct remote {
 	struct hashmap_entry ent;
-- 
2.54.0


^ permalink raw reply related

* [PATCH 4/7] fetch: rename function report_set_head
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

Update to the slightly more obvious name 'warn_set_head', which matches
the verbiage of the followRemoteHEAD options.

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 builtin/fetch.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 82969e230f5a..9a45e1e7a44d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1707,7 +1707,7 @@ static void set_head_advice_msg(const char *remote, const char *head_name)
 			remote, head_name, remote, remote, head_name);
 }
 
-static void report_set_head(const char *remote, const char *head_name,
+static void warn_set_head(const char *remote, const char *head_name,
 			struct strbuf *buf_prev, int updateres) {
 	struct strbuf buf_prefix = STRBUF_INIT;
 	const char *prev_head = NULL;
@@ -1787,7 +1787,7 @@ static int set_head(const struct ref *remote_refs, struct remote *remote)
 	if (verbosity >= 0 &&
 		follow_remote_head == FOLLOW_REMOTE_WARN &&
 		(!no_warn_branch || strcmp(no_warn_branch, head_name)))
-		report_set_head(remote->name, head_name, &b_local_head, was_detached);
+		warn_set_head(remote->name, head_name, &b_local_head, was_detached);
 
 cleanup:
 	free(head_name);
-- 
2.54.0


^ permalink raw reply related

* [PATCH 3/7] t5510: cleanup remote in followRemoteHEAD dangling ref test
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

A later patch will introduce a new test which closely mirrors this one.
Update this test to remove the 'custom-head' remote it creates.
Otherwise, the two tests will conflict with each other, as the second
one to execute will fail to create this remote (which already exists,
thanks to the first test).

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 t/t5510-fetch.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index eca9a973b5cb..43190630e714 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -251,6 +251,7 @@ test_expect_success 'followRemoteHEAD does not kick in with refspecs' '
 '
 
 test_expect_success 'followRemoteHEAD create does not overwrite dangling symref' '
+	test_when_finished "git -C two remote remove custom-head" &&
 	git -C two remote add -m does-not-exist custom-head ../one &&
 	test_config -C two remote.custom-head.followRemoteHEAD create &&
 	git -C two fetch custom-head &&
-- 
2.54.0


^ permalink raw reply related

* [PATCH 2/7] doc: explain fetchRemoteHEADWarn advice
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

When the user sets 'remote.<name>.followRemoteHEAD' to
'warn[-if-not-$branch]', git-fetch will report when a fetched HEAD
disagrees with the locally-configured remote's HEAD.  This additional
advice instructs the user how to deal with these warnings, but was
previously undocumented in git-config.

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 Documentation/config/advice.adoc | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/config/advice.adoc b/Documentation/config/advice.adoc
index 257db5891817..c3c190ba6a4f 100644
--- a/Documentation/config/advice.adoc
+++ b/Documentation/config/advice.adoc
@@ -48,6 +48,10 @@ all advice messages.
 		to create a local branch after the fact.
 	diverging::
 		Shown when a fast-forward is not possible.
+	fetchRemoteHEADWarn::
+		Shown when linkgit:git-fetch[1] reveals that a remote `HEAD`
+		differs from what is set locally and the user has opted into
+		receiving a warning in this situation.
 	fetchShowForcedUpdates::
 		Shown when linkgit:git-fetch[1] takes a long time
 		to calculate forced updates after ref updates, or to warn
-- 
2.54.0


^ permalink raw reply related

* [PATCH 1/7] fetch: fixup set_head advice for warn-if-not-branch
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git
In-Reply-To: <20260612055947.1499497-1-m@lfurio.us>

Specifying the word 'branch' in the command is not correct - a mismatch
with both the implementation in remote.c and the documentation.

Signed-off-by: Matt Hunter <m@lfurio.us>
---
 builtin/fetch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index c1d7c672f4e0..82969e230f5a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1700,7 +1700,7 @@ static void set_head_advice_msg(const char *remote, const char *head_name)
 	N_("Run 'git remote set-head %s %s' to follow the change, or set\n"
 	   "'remote.%s.followRemoteHEAD' configuration option to a different value\n"
 	   "if you do not want to see this message. Specifically running\n"
-	   "'git config set remote.%s.followRemoteHEAD warn-if-not-branch-%s'\n"
+	   "'git config set remote.%s.followRemoteHEAD warn-if-not-%s'\n"
 	   "will disable the warning until the remote changes HEAD to something else.");
 
 	advise_if_enabled(ADVICE_FETCH_SET_HEAD_WARN, _(message_advice_set_head),
-- 
2.54.0


^ permalink raw reply related

* [PATCH 0/7] Introduce fetch.followRemoteHEAD config option
From: Matt Hunter @ 2026-06-12  5:55 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Bence Ferdinandy
In-Reply-To: <DJ19CI50W6UH.17QLIBNTXBWXU@lfurio.us>

git-fetch presently offers some useful ways to control how remote HEAD
symbolic-refs are (or aren't) updated when fetching from remote
repositories.  Namely this is done via the
'remote.<name>.followRemoteHEAD' configuration option.

However, this option can be somewhat painful to use if you prefer a
default other than the "create" option, and often work with multiple
different remote repositories.

This series introduces the option 'fetch.followRemoteHEAD', which
provides a configurable default in place of per-remote settings.

'fetch.followRemoteHEAD' functions exactly the same as the original
option, except that it doesn't allow warning suppression via
'warn-if-not-$branch'.  Given that different remotes will vary their
HEAD and set of branches independently, setting a false-positive
globally in this way doesn't make logical sense.

While it is not mentioned by any of the patches in this series, note
also that the behavior introduced by 012bc566bad7 (remote set-head: set
followRemoteHEAD to "warn" if "always") is unaffected by this series,
and this feature continues to work for only the
'remote.<name>.followRemoteHEAD' option.

Matt Hunter (7):
  fetch: fixup set_head advice for warn-if-not-branch
  doc: explain fetchRemoteHEADWarn advice
  t5510: cleanup remote in followRemoteHEAD dangling ref test
  fetch: rename function report_set_head
  fetch: refactor do_fetch handling of followRemoteHEAD
  fetch: add configuration option fetch.followRemoteHEAD
  fetch: fixup a misaligned comment

 Documentation/config/advice.adoc |   4 ++
 Documentation/config/fetch.adoc  |  19 ++++++
 Documentation/config/remote.adoc |  21 +++---
 builtin/fetch.c                  |  52 +++++++++++----
 remote.h                         |  14 ++--
 t/t5510-fetch.sh                 | 106 +++++++++++++++++++++++++++++++
 6 files changed, 186 insertions(+), 30 deletions(-)


base-commit: 1ff279f3404a482a83fb04c7457e41ab26884aea
-- 
2.54.0


^ permalink raw reply

* Re: [PATCH v4 06/16] midx: support custom `--base` for incremental MIDX writes
From: SZEDER Gábor @ 2026-06-12  5:34 UTC (permalink / raw)
  To: Taylor Blau
  Cc: git, Junio C Hamano, Jeff King, Elijah Newren, Patrick Steinhardt
In-Reply-To: <1bbb387d6b6204045d97882fd8775dbff12dedbb.1779206240.git.me@ttaylorr.com>

On Tue, May 19, 2026 at 11:57:54AM -0400, Taylor Blau wrote:
> diff --git a/t/t5334-incremental-multi-pack-index.sh b/t/t5334-incremental-multi-pack-index.sh
> index 66d6894761b..68a103d13d2 100755
> --- a/t/t5334-incremental-multi-pack-index.sh
> +++ b/t/t5334-incremental-multi-pack-index.sh
> @@ -113,6 +113,36 @@ test_expect_success 'write non-incremental MIDX layer with --no-write-chain-file
>  	test_grep "cannot use --no-write-chain-file without --incremental" err
>  '
>  
> +test_expect_success 'write MIDX layer with --base without --no-write-chain-file' '
> +	test_must_fail git multi-pack-index write --bitmap --incremental \
> +		--base=none 2>err &&
> +	test_grep "cannot use --base without --no-write-chain-file" err
> +'
> +
> +test_expect_success 'write MIDX layer with --base=none and --no-write-chain-file' '
> +	test_commit base-none &&
> +	git repack -d &&
> +
> +	cp "$midx_chain" "$midx_chain.bak" &&
> +	layer="$(git multi-pack-index write --bitmap --incremental \
> +		--no-write-chain-file --base=none)" &&
> +
> +	test_cmp "$midx_chain.bak" "$midx_chain" &&
> +	test_path_is_file "$midxdir/multi-pack-index-$layer.midx"
> +'
> +
> +test_expect_success 'write MIDX layer with --base=<hash> and --no-write-chain-file' '
> +	test_commit base-hash &&
> +	git repack -d &&
> +
> +	cp "$midx_chain" "$midx_chain.bak" &&
> +	layer="$(git multi-pack-index write --bitmap --incremental \
> +		--no-write-chain-file --base="$(nth_line 1 "$midx_chain")")" &&

There is no 'nth_line' helper function in this test script.

> +
> +	test_cmp "$midx_chain.bak" "$midx_chain" &&
> +	test_path_is_file "$midxdir/multi-pack-index-$layer.midx"
> +'
> +
>  for reuse in false single multi
>  do
>  	test_expect_success "full clone (pack.allowPackReuse=$reuse)" '

^ permalink raw reply

* Re: [PATCH v2 3/3] doc: git-config: escape erroneous highlight markup
From: Jeff King @ 2026-06-12  5:16 UTC (permalink / raw)
  To: Jean-Noël AVILA
  Cc: Tuomas Ahola, git, Kristoffer Haugsbakk, Junio C Hamano
In-Reply-To: <5106812.31r3eYUQgx@piment-oiseau>

On Thu, Jun 11, 2026 at 10:43:44PM +0200, Jean-Noël AVILA wrote:

> Oh, this is the black magic regexp that is not considering # for keyword
> character. Should be solved by something like (and I really hate these .in 
> files):

Your patch looks reasonable to me, but for some reason I get this
doc-diff output when comparing before/after (what we don't see is the
improved output from bolding those chars, since doc-diff doesn't show
any markup):

diff --git a/73bccdb573fbdf3df4abf37570be486fc0f53d4a/home/peff/share/man/man1/git-config.1 b/926d9be4ab7ee1a57d4c570ec0f8aba28b3c2af5/home/peff/share/man/man1/git-config.1
index b77d79f..b2bced2 100644
--- a/73bccdb573fbdf3df4abf37570be486fc0f53d4a/home/peff/share/man/man1/git-config.1
+++ b/926d9be4ab7ee1a57d4c570ec0f8aba28b3c2af5/home/peff/share/man/man1/git-config.1
@@ -6983,7 +6983,7 @@ CONFIGURATION FILE
 
      status.displayCommentPrefix
          If set to true, git-status(1) will insert a comment prefix before each
-         output line (starting with core.commentChar, i.e. # by default). This
+         output line (starting with core.commentChar, i.e.  # by default). This
          was the behavior of git-status(1) in Git 1.8.4 and previous. Defaults
          to false.

And I can see the extra space when looking at the rendered manpage. The
XML output looks reasonable, though:

  (starting with <literal>core.commentChar</literal>, i.e. <literal>#</literal> by default

as does the HTML. So perhaps it is happening at the roff level? But that
looks like:

  will insert a comment prefix before each output line (starting with
  \fBcore\&.commentChar\fR, i\&.e\&.
  \fB#\fR
  by default)\&. This was the behavior of

So...weird. groff wants to add extra space for some reason. It happens
even if I drop the bolding, and just have "#" on a line by itself. I
guess maybe it is the trailing "." of the previous line putting groff
into "oh, I'm starting a new sentence" mode and it uses two spaces.

But I think that is all outside the scope of your fix, and this is an
existing issue that we are now just unlucky enough to hit. I'd be
tempted to ignore it and possibly fix it later.

-Peff

^ permalink raw reply related

* Re: [PATCH v3 0/3] doc: config: fix AsciiDoc glitches
From: Jeff King @ 2026-06-12  4:53 UTC (permalink / raw)
  To: Tuomas Ahola
  Cc: git, Kristoffer Haugsbakk, Junio C Hamano, Jean-Noël Avila
In-Reply-To: <20260611161946.12166-1-taahol@utu.fi>

On Thu, Jun 11, 2026 at 07:19:43PM +0300, Tuomas Ahola wrote:

> Tuomas Ahola (3):
>   doc: config: terminate runaway lists
>   doc: config/sideband: fix description list delimiter
>   doc: git-config: escape erroneous highlight markup

Thanks, this v3 looks good to me.

-Peff

^ permalink raw reply

* [PATCH v3] ls-files: filter pathspec before lstat
From: Tamir Duberstein @ 2026-06-12  4:31 UTC (permalink / raw)
  To: git
  Cc: René Scharfe, Patrick Steinhardt, Junio C Hamano, Jeff King,
	Tamir Duberstein
In-Reply-To: <20260608-ls-files-pathspec-lstat-v2-1-fb734b28422e@gmail.com>

In --deleted and --modified modes, show_files() calls lstat() for each
index entry before show_ce() applies the pathspec. prune_index() avoids
most of these calls for pathspecs with a common directory prefix, but
not for a top-level name or leading wildcard.

Match before lstat() to avoid accessing the worktree for entries that
cannot be shown. Treat this as a prefilter: do not update ps_matched,
and retain the match in show_ce() so --error-unmatch is satisfied only
by entries that the selected modes actually show.

Prefilter only a single pathspec item, bounding the added work for each
index entry. Applying match_pathspec() to multiple arguments can cost
more than the lstat() calls it avoids. In a synthetic repository with
10,000 clean files, passing every path to ls-files --modified increased
runtime from 112.5 ms to 494.1 ms when the prefilter was unconditional.

With $parent and $this exported as paths to binaries built from the
parent and this commit, on a repository with 881,290 index entries:

    hyperfine --warmup 0 --runs 3 \
        --command-name parent \
        '$parent -c core.fsmonitor=false ls-files --deleted -- README.md >/dev/null' \
        --command-name this-commit \
        '$this -c core.fsmonitor=false ls-files --deleted -- README.md >/dev/null'

reported means of 65.790 seconds for the parent and 4.987 seconds for
this commit.

Link: https://lore.kernel.org/r/xmqqfr2tnfk0.fsf@gitster.g
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
A selective pathspec should let ls-files --deleted and --modified avoid
statting entries that cannot be shown. Match a single pathspec before
accessing the worktree, while preserving the existing lstat-first order
for multiple pathspecs whose matching cost grows linearly.
---
Changes in v3:
- Explain the conservative single-pathspec cutoff without referring to
  prior revisions.
- Rerun the primary benchmark with the final implementation.
- Make no code changes.
- Link to v2: https://patch.msgid.link/20260608-ls-files-pathspec-lstat-v2-1-fb734b28422e@gmail.com

Changes in v2:
- Restrict early matching to one pathspec after measuring a regression
  with many pathspecs.
- Add all-matching and many-pathspec performance results.
- Drop the Assisted-by trailer.
- Link to v1: https://patch.msgid.link/20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com
---
 builtin/ls-files.c                  | 11 +++++++++++
 t/meson.build                       |  1 +
 t/perf/p3010-ls-files.sh            | 31 +++++++++++++++++++++++++++++++
 t/t3010-ls-files-killed-modified.sh | 18 ++++++++++++++++++
 4 files changed, 61 insertions(+)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1a22b41b9..8d7158652b 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -450,6 +450,17 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
+		/*
+		 * match_pathspec() is linear in pathspec.nr, so prefilter only
+		 * the single-pathspec case. Only entries shown by show_ce()
+		 * satisfy --error-unmatch.
+		 */
+		if (pathspec.nr == 1 &&
+		    !match_pathspec(repo->index, &pathspec, fullname.buf,
+				    fullname.len, max_prefix_len, NULL,
+				    S_ISDIR(ce->ce_mode) ||
+				    S_ISGITLINK(ce->ce_mode)))
+			continue;
 		stat_err = lstat(fullname.buf, &st);
 		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
 			error_errno("cannot lstat '%s'", fullname.buf);
diff --git a/t/meson.build b/t/meson.build
index 2af8d01279..ee8086e6ef 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -1140,6 +1140,7 @@ benchmarks = [
   'perf/p1500-graph-walks.sh',
   'perf/p1501-rev-parse-oneline.sh',
   'perf/p2000-sparse-operations.sh',
+  'perf/p3010-ls-files.sh',
   'perf/p3400-rebase.sh',
   'perf/p3404-rebase-interactive.sh',
   'perf/p4000-diff-algorithms.sh',
diff --git a/t/perf/p3010-ls-files.sh b/t/perf/p3010-ls-files.sh
new file mode 100755
index 0000000000..ae14449432
--- /dev/null
+++ b/t/perf/p3010-ls-files.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+
+test_description='Tests ls-files worktree performance'
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+test_expect_success 'select a zero-prefix pathspec' '
+	tracked_file=$(git ls-files | sed -n 1p) &&
+	test -n "$tracked_file" &&
+	pathspec="?${tracked_file#?}" &&
+	test_export pathspec
+'
+
+test_perf 'ls-files --deleted with pathspec' '
+	git -c core.fsmonitor=false ls-files --deleted \
+		-- "$pathspec" >/dev/null
+'
+
+test_perf 'ls-files --deleted with all-matching pathspec' '
+	git -c core.fsmonitor=false ls-files --deleted -- "*" >/dev/null
+'
+
+test_perf 'ls-files --modified with pathspec' '
+	git -c core.fsmonitor=false ls-files --modified \
+		-- "$pathspec" >/dev/null
+'
+
+test_done
diff --git a/t/t3010-ls-files-killed-modified.sh b/t/t3010-ls-files-killed-modified.sh
index 7af4532cd1..6e38e10219 100755
--- a/t/t3010-ls-files-killed-modified.sh
+++ b/t/t3010-ls-files-killed-modified.sh
@@ -124,4 +124,22 @@ test_expect_success 'validate git ls-files -m output.' '
 	test_cmp .expected .output
 '
 
+test_expect_success 'worktree modes honor wildcard pathspecs' '
+	cat >.expected <<-\EOF &&
+	path2/file2
+	path3/file3
+	EOF
+	git ls-files --deleted -- "path?/file?" >.output &&
+	test_cmp .expected .output &&
+
+	cat >.expected <<-\EOF &&
+	path7
+	path8
+	EOF
+	git ls-files --modified --error-unmatch -- "path[78]" >.output &&
+	test_cmp .expected .output &&
+
+	test_must_fail git ls-files --modified --error-unmatch -- path10
+'
+
 test_done

---
base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
change-id: 20260607-ls-files-pathspec-lstat-885125a5d644

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>


^ permalink raw reply related

* [PATCH v3 3/3] commit-reach: die on contains walk errors
From: Tamir Duberstein @ 2026-06-12  3:00 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Kristofer Karlsson,
	Tamir Duberstein
In-Reply-To: <20260611-ref-filter-memoized-contains-v3-0-b26af3dba285@gmail.com>

Without generation numbers, repo_is_descendant_of() can return -1 when
it cannot read commit ancestry. commit_contains() exposes that result
through a Boolean interface, so ref-filter treats it as true. This can
include a ref for --contains or exclude it for --no-contains without
failing the command.

Die when repo_is_descendant_of() reports an error. The memoized walk
already dies when it cannot parse a commit, so callers of the
non-memoized path no longer turn a failed walk into a match.

Reported-by: Jeff King <peff@peff.net>
Link: https://lore.kernel.org/r/20260611072942.GG2191159@coredump.intra.peff.net
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
 commit-reach.c                 |  8 +++++++-
 t/t6301-for-each-ref-errors.sh | 22 ++++++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index 572d2d47ff..af5563d70f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -820,10 +820,16 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 int commit_contains(struct ref_filter *filter, struct commit *commit,
 		    struct commit_list *list, struct contains_cache *cache)
 {
+	int result;
+
 	if (filter->with_commit_tag_algo ||
 	    generation_numbers_enabled(the_repository))
 		return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
-	return repo_is_descendant_of(the_repository, commit, list);
+
+	result = repo_is_descendant_of(the_repository, commit, list);
+	if (result < 0)
+		die(_("failed to check reachability"));
+	return result;
 }
 
 int can_all_from_reach_with_flag(struct object_array *from,
diff --git a/t/t6301-for-each-ref-errors.sh b/t/t6301-for-each-ref-errors.sh
index e06feb06e9..72b27c8be3 100755
--- a/t/t6301-for-each-ref-errors.sh
+++ b/t/t6301-for-each-ref-errors.sh
@@ -52,6 +52,28 @@ test_expect_success 'Missing objects are reported correctly' '
 	test_must_be_empty brief-err
 '
 
+test_expect_success 'missing ancestors are reported by contains filters' '
+	test_when_finished "git update-ref -d refs/heads/missing-parent" &&
+	{
+		echo "tree $(git rev-parse HEAD^{tree})" &&
+		echo "parent $MISSING" &&
+		git cat-file commit HEAD |
+			sed -n -e "/^author /p" -e "/^committer /p" &&
+		echo &&
+		echo "missing parent"
+	} >commit &&
+	broken=$(git hash-object -t commit -w commit) &&
+	git update-ref refs/heads/missing-parent "$broken" &&
+	for option in --contains --no-contains
+	do
+		test_must_fail git for-each-ref "$option=HEAD" \
+			refs/heads/missing-parent >out 2>err &&
+		test_must_be_empty out &&
+		test_grep "parse commit $MISSING" err ||
+		return 1
+	done
+'
+
 test_expect_success 'ahead-behind requires an argument' '
 	test_must_fail git for-each-ref \
 		--format="%(ahead-behind)" 2>err &&

-- 
2.54.0.548.gbe7bb2469c


^ permalink raw reply related

* [PATCH v3 2/3] ref-filter: memoize --contains with generations
From: Tamir Duberstein @ 2026-06-12  3:00 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Kristofer Karlsson,
	Tamir Duberstein
In-Reply-To: <20260611-ref-filter-memoized-contains-v3-0-b26af3dba285@gmail.com>

git branch and git for-each-ref run a separate reachability walk for
each ref considered by --contains and --no-contains. Refs with shared
history therefore traverse the same commits repeatedly.

git tag instead uses a depth-first walk that caches results across
refs. That walk can perform poorly without generation numbers: a
negative check may walk to the root instead of stopping at a nearby
divergence. Generation numbers let it stop below the oldest target.

Use the memoized walk for all ref-filter callers when generation
numbers are available. Keep git tag on its existing path without
generations. Caching still helps when many tags share deep history:
ffc4b8012d (tag: speed up --contains calculation, 2011-06-11) reduced
git tag --contains HEAD~200 in linux-2.6 from 15.417 to 5.329 seconds.

The new shared-history perf test improves from 0.72 to 0.03 seconds. In
a repository with 62,174 remote-tracking refs, running:

    git branch -r --contains c78ae85f3ce7e

improves from 104.365 seconds to 468 milliseconds.

Link: https://lore.kernel.org/git/1445163904-24611-1-git-send-email-Karthik.188@gmail.com/
Link: https://lore.kernel.org/r/20230324191009.GA536967@coredump.intra.peff.net
Link: https://lore.kernel.org/git/20260527070510.3510836-1-krka@spotify.com/
Link: https://lore.kernel.org/r/20260608223430.GA340696@coredump.intra.peff.net
Link: https://lore.kernel.org/r/CAOLa=ZSezQOj56-TezVaAcisUyczxhJmu4VghyFBHcBB_mKJ2A@mail.gmail.com
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
 commit-reach.c              |  3 ++-
 t/perf/p1500-graph-walks.sh | 28 +++++++++++++++++++++++++++-
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 1d34d66fe8..572d2d47ff 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -820,7 +820,8 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 int commit_contains(struct ref_filter *filter, struct commit *commit,
 		    struct commit_list *list, struct contains_cache *cache)
 {
-	if (filter->with_commit_tag_algo)
+	if (filter->with_commit_tag_algo ||
+	    generation_numbers_enabled(the_repository))
 		return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
 	return repo_is_descendant_of(the_repository, commit, list);
 }
diff --git a/t/perf/p1500-graph-walks.sh b/t/perf/p1500-graph-walks.sh
index 5b23ce5db9..d167b4f7e1 100755
--- a/t/perf/p1500-graph-walks.sh
+++ b/t/perf/p1500-graph-walks.sh
@@ -32,7 +32,16 @@ test_expect_success 'setup' '
 		echo "X:$line" >>test-tool-tags || return 1
 	done &&
 
-	commit=$(git commit-tree $(git rev-parse HEAD^{tree})) &&
+	git rev-list --first-parent --max-count=8192 HEAD >contains-commits &&
+	test_file_not_empty contains-commits &&
+	git update-ref refs/contains-perf-base "$(tail -n 1 contains-commits)" &&
+	awk "{
+		printf \"update refs/contains-perf/%04d %s\\n\", NR, \$1
+	}" contains-commits |
+		git update-ref --stdin &&
+	git pack-refs --include "refs/contains-perf/*" &&
+
+	commit=$(git commit-tree HEAD^{tree}) &&
 	git update-ref refs/heads/disjoint-base $commit &&
 
 	git commit-graph write --reachable
@@ -62,6 +71,23 @@ test_perf 'contains: git tag --merged' '
 	xargs git tag --merged=HEAD <tags
 '
 
+test_perf 'contains: git for-each-ref' '
+	git for-each-ref --contains=refs/contains-perf-base --stdin <refs
+'
+
+test_perf 'contains: git branch' '
+	xargs git branch --contains=refs/contains-perf-base <branches
+'
+
+test_perf 'contains: git tag' '
+	xargs git tag --contains=refs/contains-perf-base <tags
+'
+
+test_perf 'contains: synthetic shared history' '
+	git for-each-ref --contains=refs/contains-perf-base \
+		refs/contains-perf/ >/dev/null
+'
+
 test_perf 'is-base check: test-tool reach (refs)' '
 	test-tool reach get_branch_base_for_tip <test-tool-refs
 '

-- 
2.54.0.548.gbe7bb2469c


^ permalink raw reply related

* [PATCH v3 1/3] commit-reach: handle cycles in contains walk
From: Tamir Duberstein @ 2026-06-12  3:00 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Kristofer Karlsson,
	Tamir Duberstein
In-Reply-To: <20260611-ref-filter-memoized-contains-v3-0-b26af3dba285@gmail.com>

The memoized contains traversal used by git tag assumes that commit
ancestry is acyclic. Replacement refs can violate that assumption,
causing it to keep pushing an already active commit until memory is
exhausted.

Mark commits while they are active. If the traversal encounters an
active commit, discard the cache and retry the candidate with the
cycle-safe reachability walk. Cache the candidate's result so a later
walk that reaches it can reuse the answer. Die if the fallback cannot
read ancestry.

Signed-off-by: Tamir Duberstein <tamird@gmail.com>
---
 commit-reach.c | 29 +++++++++++++++++++++++++----
 commit-reach.h |  3 ++-
 t/t7004-tag.sh | 21 +++++++++++++++++++++
 3 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 9b3ea46d6f..1d34d66fe8 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -708,7 +708,8 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
 
 /*
  * Test whether the candidate is contained in the list.
- * Do not recurse to find out, though, but return -1 if inconclusive.
+ * Do not recurse to find out, though, but return CONTAINS_UNKNOWN if
+ * inconclusive.
  */
 static enum contains_result contains_test(struct commit *candidate,
 					  const struct commit_list *want,
@@ -744,7 +745,7 @@ static void push_to_contains_stack(struct commit *candidate, struct contains_sta
 }
 
 static enum contains_result contains_tag_algo(struct commit *candidate,
-					      const struct commit_list *want,
+					      struct commit_list *want,
 					      struct contains_cache *cache)
 {
 	struct contains_stack contains_stack = { 0, 0, NULL };
@@ -765,6 +766,7 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 	if (result != CONTAINS_UNKNOWN)
 		return result;
 
+	*contains_cache_at(cache, candidate) = CONTAINS_IN_PROGRESS;
 	push_to_contains_stack(candidate, &contains_stack);
 	while (contains_stack.nr) {
 		struct contains_stack_entry *entry = &contains_stack.contains_stack[contains_stack.nr - 1];
@@ -776,8 +778,8 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 			contains_stack.nr--;
 		}
 		/*
-		 * If we just popped the stack, parents->item has been marked,
-		 * therefore contains_test will return a meaningful yes/no.
+		 * A parent may have just been popped and marked, or may still
+		 * be active when replacement refs create a cycle.
 		 */
 		else switch (contains_test(parents->item, want, cache, cutoff)) {
 		case CONTAINS_YES:
@@ -787,13 +789,32 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
 		case CONTAINS_NO:
 			entry->parents = parents->next;
 			break;
+		case CONTAINS_IN_PROGRESS:
+			/*
+			 * Partial negative answers are not safe across a cycle.
+			 * Discard them and use the cycle-safe reachability walk.
+			 */
+			goto cycle;
 		case CONTAINS_UNKNOWN:
+			*contains_cache_at(cache, parents->item) =
+				CONTAINS_IN_PROGRESS;
 			push_to_contains_stack(parents->item, &contains_stack);
 			break;
 		}
 	}
 	free(contains_stack.contains_stack);
 	return contains_test(candidate, want, cache, cutoff);
+
+cycle:
+	free(contains_stack.contains_stack);
+	clear_contains_cache(cache);
+
+	result = repo_is_descendant_of(the_repository, candidate, want);
+	if (result < 0)
+		die(_("failed to check reachability after detecting a cycle"));
+	*contains_cache_at(cache, candidate) =
+		result ? CONTAINS_YES : CONTAINS_NO;
+	return result ? CONTAINS_YES : CONTAINS_NO;
 }
 
 int commit_contains(struct ref_filter *filter, struct commit *commit,
diff --git a/commit-reach.h b/commit-reach.h
index 3f3a563d8a..f908d305b1 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -73,7 +73,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
 enum contains_result {
 	CONTAINS_UNKNOWN = 0,
 	CONTAINS_NO,
-	CONTAINS_YES
+	CONTAINS_YES,
+	CONTAINS_IN_PROGRESS
 };
 
 define_commit_slab(contains_cache, enum contains_result);
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index d918005dd9..4044bab006 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1611,6 +1611,27 @@ test_expect_success 'checking that first commit is in all tags (hash)' '
 	test_cmp expected actual
 '
 
+test_expect_success 'tag --contains handles cyclic replacement histories' '
+	first=$(git rev-parse HEAD~2) &&
+	second=$(git rev-parse HEAD~) &&
+	third=$(git rev-parse HEAD) &&
+	test_when_finished "
+		git replace -d $first &&
+		git replace -d $third &&
+		git tag -d cycle-a cycle-b
+	" &&
+	git tag cycle-a "$first" &&
+	git tag cycle-b "$third" &&
+	git replace --graft "$first" "$third" "$second" &&
+	git replace --graft "$third" "$first" &&
+	cat >expected <<-\EOF &&
+	cycle-a
+	cycle-b
+	EOF
+	git tag --contains="$second" --list "cycle-*" >actual &&
+	test_cmp expected actual
+'
+
 # other ways of specifying the commit
 test_expect_success 'checking that first commit is in all tags (tag)' '
 	cat >expected <<-\EOF &&

-- 
2.54.0.548.gbe7bb2469c


^ permalink raw reply related

* [PATCH v3 0/3] Reuse --contains traversal results
From: Tamir Duberstein @ 2026-06-12  3:00 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Karthik Nayak, Junio C Hamano, Victoria Dye,
	Derrick Stolee, Elijah Newren, Kristofer Karlsson,
	Tamir Duberstein
In-Reply-To: <20260608-ref-filter-memoized-contains-v2-0-e72720344a7c@gmail.com>

git tag uses a memoized traversal for --contains, while git branch
and git for-each-ref repeat a reachability walk for each ref. Reuse
the memoized traversal when generation numbers can bound the walk.

The first patch makes the memoized traversal handle replacement
cycles. The last makes the non-memoized path report reachability
errors.

Signed-off-by: Tamir Duberstein <tamird@gmail.com>

---
Changes in v3:
- Split missing-ancestor error handling into its own patch.
- Use die() for reachability errors, remove redundant cache setup, and
  chain cycle-test cleanup.
- Drop the unrelated empty-target-list behavior change.
- Explain why git tag retains memoization without generation numbers.
- Add p1500 coverage for all three frontends and a shared-history
  case.
- Remove correctness checks from p1500 and drop output hashes.
- Link to v2: https://patch.msgid.link/20260608-ref-filter-memoized-contains-v2-0-e72720344a7c@gmail.com

Changes in v2:
- Split cycle handling into a preparatory patch.
- Exercise cycle handling through the existing git tag path.
- Move perf result verification out of setup.
- Link to v1: https://patch.msgid.link/20260607-ref-filter-memoized-contains-v1-1-a1972dde9c76@gmail.com

---
Tamir Duberstein (3):
      commit-reach: handle cycles in contains walk
      ref-filter: memoize --contains with generations
      commit-reach: die on contains walk errors

 commit-reach.c                 | 40 ++++++++++++++++++++++++++++++++++------
 commit-reach.h                 |  3 ++-
 t/perf/p1500-graph-walks.sh    | 28 +++++++++++++++++++++++++++-
 t/t6301-for-each-ref-errors.sh | 22 ++++++++++++++++++++++
 t/t7004-tag.sh                 | 21 +++++++++++++++++++++
 5 files changed, 106 insertions(+), 8 deletions(-)
---
base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
change-id: 20260607-ref-filter-memoized-contains-7cb6b3bccad1

Best regards,
--  
Tamir Duberstein <tamird@gmail.com>


^ permalink raw reply

* Re: [PATCH v2 2/2] ref-filter: memoize --contains with generations
From: Tamir Duberstein @ 2026-06-12  2:40 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Junio C Hamano, Victoria Dye, Derrick Stolee,
	Elijah Newren
In-Reply-To: <20260611082244.GH2191159@coredump.intra.peff.net>

On Thu, Jun 11, 2026 at 1:22 AM Jeff King <peff@peff.net> wrote:
>
> On Mon, Jun 08, 2026 at 07:36:35PM -0700, Tamir Duberstein wrote:
>
> > The wall-time standard deviations were 11.356 seconds and 133.8
> > milliseconds, respectively. Separate runs without redirection produced
> > the same output with SHA-256
> > 2466f6e2b72aa16b1a2126eddb81c8a1b2764ee251204ac034c191a925aa896f.
>
> Heh. Without the original repo, this sha256 hash is meaningless to us,
> isn't it? Ditto for the sha1 the earlier command.

Yeah, AI slop. Removed.

>
> >  int commit_contains(struct ref_filter *filter, struct commit *commit,
> >                   struct commit_list *list, struct contains_cache *cache)
> >  {
> > -     if (filter->with_commit_tag_algo)
> > +     int result;
> > +
> > +     if (!list)
> > +             return 1;
> > +     if (filter->with_commit_tag_algo ||
> > +         generation_numbers_enabled(the_repository))
> >               return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
> > -     return repo_is_descendant_of(the_repository, commit, list);
> > +
> > +     result = repo_is_descendant_of(the_repository, commit, list);
> > +     if (result < 0)
> > +             exit(128);
> > +     return result;
>
> There's a little more going on here than I expected from the commit
> message. Is it important for us to short-circuit the empty list and just
> return 1? Or did the existing helper functions already handle that?
>
> Looking at contains_tag_algo(), I think it would actually return
> CONTAINS_NO here (though I didn't test it). So this is actually a change
> in behavior for "git tag" if that's correct. I doubt it is triggerable
> in practice, though, as we would simply never call commit_contains() in
> the first place with an empty list. But if we are going to add in this
> logic, I think it makes sense to do so as a separate commit (describing
> what it is doing and why it's not (yet) a triggerable bug).
>
> Checking the result of repo_is_descendant_of() makes sense, as discussed
> earlier. But probably that should come as its own patch, since it's an
> independent bug-fix. I'm also tempted to say it should call die()
> instead of a direct exit, though it does look like the error exit paths
> from repo_is_descendant_of() would all have produced their own messages.

I dropped the empty-list change. The error check is now a separate
patch and uses die().

>
>
> And one side note. While looking at the implementation of
> repo_is_descendant_of(), I did notice something curious: it also
> switches algorithms based on the presence of generation numbers! So it
> should also be cutting off the traversal early when possible. But I
> guess its main problem is that we call it independently for each
> candidate, so it may traverse the same (useful) stretch of history
> multiple times.
>
> So probably an alternative approach to this patch would be feeding all
> of the candidates at once, the way we do with reach_filter() via
> filter_refs(). I'm not sure if we have the right functions available for
> that (naively, --contains and --merged are inversions of each other, so
> swapping the arguments to tips_reachable_from_bases() might work, but I
> didn't think very hard on it).

I tried the suggested argument swap, but tips_reachable_from_bases()
only reports whether a tip is reachable from any base. It cannot report
which candidate refs contain a target, which is what --contains needs.
I did not find an existing batched reachability helper that returns
those per-candidate answers.

>
> I wonder if that might perform better or worse. I'm content to leave it
> for another day, though, as switching to the memoizing depth-first algo
> here is a pretty easy change.
>
> > -     commit=$(git commit-tree $(git rev-parse HEAD^{tree})) &&
> > +     git rev-list --first-parent --max-count=8192 HEAD >contains-commits &&
> > +     test_file_not_empty contains-commits &&
> > +     git update-ref refs/contains-perf-base "$(tail -n 1 contains-commits)" &&
> > +     awk "{
> > +             printf \"update refs/contains-perf/%04d %s\\n\", NR, \$1
> > +     }" contains-commits |
> > +             git update-ref --stdin &&
> > +     git pack-refs --include "refs/contains-perf/*" &&
>
> My head almost exploded reading the embedded quoting in that awk
> invocation. But I can't think offhand of a better way to do it. You
> can't use test_seq because it needs both the number and the original
> string. You can do it with sed, but it probably ends up even more
> unreadable.
>
> But OK, we are making a bunch of refs based on first-parent history.
>
> > +     tree=$(git rev-parse HEAD^{tree}) &&
> > +     base=$(git rev-parse HEAD) &&
> > +     target=$(echo target | git commit-tree "$tree" -p "$base") &&
> > +     git update-ref refs/contains-diverged/target "$target" &&
> > +     for i in $(test_seq 1 4)
> > +     do
> > +             commit=$(echo candidate-$i |
> > +                     git commit-tree "$tree" -p "$base") &&
> > +             git update-ref refs/contains-diverged/candidate-$i "$commit" ||
> > +             return 1
> > +     done &&
>
> And then a few candidate refs that are not reachable from other refs, or
> from each other. OK.
>
> I think you could just write:
>
>   git commit-tree HEAD^{tree} -p HEAD
>
> instead of doing separate rev-parses, but it's probably not a big deal
> either way.
>
> > +test_expect_success 'verify contains results' '
> > +     git for-each-ref --contains=refs/contains-perf-base \
> > +             refs/contains-perf/ >actual &&
> > +     test_line_count = $(wc -l <contains-commits) actual &&
> > +
> > +     echo refs/contains-diverged/target >expect &&
> > +     GIT_TEST_COMMIT_GRAPH=0 \
> > +             git -c core.commitGraph=false for-each-ref \
> > +                     --format="%(refname)" \
> > +                     --contains=refs/contains-diverged/target \
> > +                     refs/contains-diverged/ >actual &&
> > +     test_cmp expect actual
> > +'
>
> This is a funny test to have in the middle of a perf script (which
> hardly anybody ever runs). If we are concerned about the correctness,
> should this be in a non-perf test script? Though I'd imagine something
> like it is already covered there.

I deleted that block rather than moving it. It only rechecked ordinary
--contains semantics already covered by t3201, t6302, and t7004; with
GIT_TEST_COMMIT_GRAPH=1, those tests exercise the newly selected
memoized path for branch and for-each-ref.

The series adds functional tests for the behavior that is actually new:
t7004 covers cyclic replacement histories, and t6301 covers unreadable
ancestry. The p1500 additions now measure performance only.

>
> There's a lot of subtlety in what we're verifying, too. In the first
> half, we are checking that all of the commits in contains-perf contain
> the base.  And that base is the final element of the contains-commits
> list. Which made me wonder what happens in a branch history, since that
> list is linearized. But because we used --first-parent to generate it,
> it _is_ linear, and the results work out. So OK, I don't think it's
> wrong, but I am struggling to understand the meaning of the test.
>
> The second half is just checking that...the other refs which are not
> contained in "target" are not mentioned? OK, but why do it only with
> commit graphs off. Why not both off and on? Again, I'm not sure I
> understand what we're trying to focus on here.
>
> > +test_perf 'contains: git for-each-ref --contains' '
> > +     git for-each-ref --contains=refs/contains-perf-base \
> > +             refs/contains-perf/ >/dev/null
> > +'
>
> Yay, actual perf tests. Here we have a ton of matches, and they all walk
> over the same chunk of history. Should get much faster, though it's
> mostly a synthetic test.
>
> For --merged, we already have separate tests with each of for-each-ref,
> branch, and tag. Should we have the same here for --contains? And should
> we be using the input repo data, rather than our synthetic test? It is
> nice to show off the performance with the synthetic test, but ultimately
> the point of the perf suite is feeding it real workloads and looking for
> regressions.

I added p1500 cases for all three frontends using refs from the input
repository, while retaining the synthetic shared-history case.

>
> > +test_perf 'contains without generations: divergent refs' '
> > +     GIT_TEST_COMMIT_GRAPH=0 \
> > +             git -c core.commitGraph=false for-each-ref \
> > +                     --contains=refs/contains-diverged/target \
> > +                     refs/contains-diverged/ >/dev/null
> > +'
>
> OK, and this one should find that most of them are not contained, but
> the depth-first algorithm could walk all the way down to the roots. But
> we don't run it at all, since we disable commit graphs!
>
> So what are we trying to measure here? If it left commit graphs enabled,
> I think we could demonstrate that using the depth-first algorithm with
> generation numbers does not make anything _worse_. I.e., that
> for-each-ref and branch did not regress from the change.

The divergent-ref test did not exercise the changed path, so I removed
it.

>
> > +test_expect_success 'missing ancestors are reported by contains filters' '
> > +     test_when_finished "git update-ref -d refs/heads/missing-parent" &&
> > +     {
> > +             echo "tree $(git rev-parse HEAD^{tree})" &&
> > +             echo "parent $MISSING" &&
> > +             git cat-file commit HEAD |
> > +                     sed -n -e "/^author /p" -e "/^committer /p" &&
> > +             echo &&
> > +             echo "missing parent"
> > +     } >commit &&
> > +     broken=$(git hash-object -t commit -w commit) &&
> > +     git update-ref refs/heads/missing-parent "$broken" &&
> > +     for option in --contains --no-contains
> > +     do
> > +             test_must_fail git for-each-ref "$option=HEAD" \
> > +                     refs/heads/missing-parent >out 2>err &&
> > +             test_must_be_empty out &&
> > +             test_grep "parse commit $MISSING" err ||
> > +             return 1
> > +     done
> > +'
>
> This is a great thing to test, but probably should be pulled out into
> a separate patch along with the fix to check the return code.

Done in v3.




>
> The commit construction looks OK, and is nicer than corrupting the
> repository by deleting a real object. Given that we are pulling the
> idents from an existing commit, it might be simpler to just use the
> whole commit as a template, like:
>
>   git cat-file commit HEAD |
>   sed "s/^parent /parent $MISSING/"
>
> but it may be a matter of taste.
>
> -Peff

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox