Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH v2] update-ref: add --rename option
From: Patrick Steinhardt @ 2026-06-11 13:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20260610212800.2892146-1-gitster@pobox.com>

On Wed, Jun 10, 2026 at 02:28:00PM -0700, Junio C Hamano wrote:
> Add a "--rename" option to "git update-ref" with the syntax:
> 
>  $ git update-ref --rename <old-refname> <new-refname>
> 
> It renames <old-refname> together with its reflog to <new-refname>
> (even when used on a local branch ref, the current value and the
> reflog of the ref are the only things that are renamed).  As the
> command is a low-level plumbing command, attempts to rename branches
> are not warned, but we document it to draw attention of unsuspecting
> users and protect them from burning themselves.

This sentence reads a tiny bit awkward now and is probably an artifact
of the change between v1 and v2.

> Because the "--stdin" mode wants to operate on its refs in a
> reference transaction, and the API function refs_rename_ref() does
> not work well as part of a transaction, it is currently not possible
> to add a corresponding "rename" verb to the "--stdin" mode before
> the underlying API learns to rename refs atomically inside a
> transaction.  It hence is left for a future refactoring.

Fair. I thought at times about extending reference transactions to also
fully support renames, but I never had a good use case where it would
really matter.

>  * The initial draft I sent had a warning when the command is used
>    to rename local branches, but that is unusual for plumbing
>    commands that should do one thing it is designed for consistently
>    well without being chatty.  This version only has words of warning
>    in the documentation.

Good, I just wanted to complain about that warning. I think it's good to
just accept this as-is. I really doubt that folks would go out of their
way to use git-update-ref(1) if all they want was to rename their
branch.

One thing that I'm missing from the commit message: what's the
motivation for this new mode?

> diff --git a/Documentation/git-update-ref.adoc b/Documentation/git-update-ref.adoc
> index 37a5019a8b..0c27efaa52 100644
> --- a/Documentation/git-update-ref.adoc
> +++ b/Documentation/git-update-ref.adoc
> @@ -39,6 +40,14 @@ the result of following the symbolic pointers.
>  With `-d`, it deletes the named <ref> after verifying that it
>  still contains <old-oid>.
>  
> +With `--rename`, it renames <old-refname> together with its reflog to
> +<new-refname>.  The command fails if <old-refname> does not exist, or
> +if <new-refname> already exists.  Because `git update-ref` does not
> +update active worktree `HEAD` symbolic references or `.git/config`
> +tracking settings when you rename a local branch in the `refs/heads/`
> +hierarchy, think twice before using this command to rename a local
> +branch (use `git branch -m` instead).

I'd rephrase this slightly to first document behaviour and then draw the
conclusion that it shouldn't be used in many cases separately. For
example:

    This command does not update any symbolic references pointing to
    the renamed reference, and neither does it update `.git/config`
    tracking settings. It is thus not recommended to use it for renaming
    local branches. Use `git branch -m` instead.

> diff --git a/builtin/update-ref.c b/builtin/update-ref.c
> index 2d68c40ecb..65ee8af08c 100644
> --- a/builtin/update-ref.c
> +++ b/builtin/update-ref.c
> @@ -787,7 +789,7 @@ int cmd_update_ref(int argc,
>  	}
>  
>  	if (read_stdin) {
> -		if (delete || argc > 0)
> +		if (delete || rename || argc > 0)
>  			usage_with_options(git_update_ref_usage, options);
>  		if (end_null)
>  			line_termination = '\0';

Okay, so here we make it mutually exclusive with "--delete".

> @@ -800,6 +802,32 @@ int cmd_update_ref(int argc,
>  	if (end_null)
>  		usage_with_options(git_update_ref_usage, options);
>  
> +	if (rename) {
> +		const char *oldref, *newref;
> +
> +		if (delete || argc != 2)
> +			usage_with_options(git_update_ref_usage, options);

And here with "-d". Good.

> +		if (!refs_ref_exists(get_main_ref_store(the_repository), oldref))
> +			die("no ref named '%s'", oldref);
> +
> +		if (refs_ref_exists(get_main_ref_store(the_repository), newref))
> +			die("ref '%s' already exists", newref);
> +
> +		if (refs_rename_ref(get_main_ref_store(the_repository),
> +				    oldref, newref, msg))
> +			die("rename failed");
> +		return 0;
> +	}

Hm. I think we're not using "--deref" / "--no-deref" at all, but we
document this flag as accepted in the synopsis.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v2] commit-reach: remove get_reachable_subset()
From: Derrick Stolee @ 2026-06-11 12:57 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git; +Cc: Kristofer Karlsson
In-Reply-To: <pull.2144.v2.git.1781178567862.gitgitgadget@gmail.com>

On 6/11/2026 7:49 AM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify.com>

>      * Added PQ mode to the existing test-reach tests so both DFS and PQ
>        paths are exercised by the test suite.

This is a substantial change that I don't think is merited. I
think that this makes the point of your change moot: we essentially
have two implementations in one complicated method instead of two
implementations that have different performance characteristics.

I'd rather leave the code as-is than take this complication. I don't
think your commit message justifies the merging of these
implementations, either.

Moreover, I thought the previous patch was fine, it just needed better
awareness of the performance implications of the change. Specifically,
it could be a regression for large repos without a commit-graph file
while simultaneously potentially being a performance boost for large
repos _with_ a commit-graph file.

_If_ we were to go this direction, then it should be two patches, with
the first introducing the new mode and testing it. The second patch
could change the callers of get_reachable_subset() and delete that
code.

Finally, a commentary: You seem to have a habit of responding to
review feedback only through new patch versions, but I'd rather see
some thoughts in the discussion thread as direct replies to the review,
especially if you think you will change direction like this. Saying
something like "Maybe I should update the method to have two walk modes"
in a reply would have given me an opportunity to respond and perhaps
avoided a new version that went in this direction.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v3 00/11] doc: interpret-trailers: explain key format
From: Kristoffer Haugsbakk @ 2026-06-11 12:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Christian Couder, jackmanb, Linus Arver, D. Ben Knoble
In-Reply-To: <f738e97b-1aa6-492f-82df-c284a2f94c6a@app.fastmail.com>

On Thu, Jun 11, 2026, at 14:05, Kristoffer Haugsbakk wrote:
>>>>[snip]
>>>> ...
>>
>> By the way, what I queued last night was missing [10/11] as I used
>> "b4 am" to grab the latest thread messages by giving the message-id
>> of the cover letter, but somehow [10/11] had a bogus value in the
>> e-mail header.
>>
>>     Subject: [PATCH v3 10/11] doc: interpret-trailers: rewrite
>> new-trailers paragraphs
>>     Date: Wed, 10 Jun 2026 23:21:28 +0200
>>     Message-ID: <>
>>     X-Mailer: git-send-email 2.54.0.22.g9e26862b904
>>
>> So, I reverted to the old and battle tested way to pick these 11
>> messages manually in my newsreader to replace the topic.
>>
>> If you have a chance, could you investigate where the send-out
>> process went wrong and gave one message a bogus ID?  I am worried if
>> you may have triggered a bug in send-email, in which case we would
>> want to fix it to avoid hurting other users.
>
> I’ll investigate. It’s 99.99% chance a problem on my end, created by me.
>
> Sorry for the trouble!

It was as stupid and embarrassing as it looked at first glance.
A search-and-replace on the format-patch output which was missing
a target value.

I’m very sorry again.

^ permalink raw reply

* Re: [PATCH] t1400: have fifo test clean after itself
From: Patrick Steinhardt @ 2026-06-11 12:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqo6hit6rn.fsf@gitster.g>

On Wed, Jun 10, 2026 at 02:39:08PM -0700, Junio C Hamano wrote:
> One test in this script creates a pair of FIFOs, "in" and "out",
> that are named so generically that later tests may be tempted to use
> them.  By the time those later tests run a command with its output
> redirected to the file (e.g., "git foobar >out"), however, nobody is
> reading from the lingering FIFO, and the test gets blocked forever.
> 
> Clean them up when the test finishes.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  t/t1400-update-ref.sh | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
> index db7f5444da..477af544bc 100755
> --- a/t/t1400-update-ref.sh
> +++ b/t/t1400-update-ref.sh
> @@ -1610,6 +1610,7 @@ test_expect_success 'transaction cannot restart ongoing transaction' '
>  '
>  
>  test_expect_success PIPE 'transaction flushes status updates' '
> +	test_when_finished "rm -f in out" &&
>  	mkfifo in out &&
>  	(git update-ref --stdin <in >out &) &&

I'd expect that such a test that tried to reuse the sockets would
probably break quite obviously, but I guess you never really know. In
any case, it doesn't hurt to clean up after the test.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v3 00/11] doc: interpret-trailers: explain key format
From: Kristoffer Haugsbakk @ 2026-06-11 12:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Kristoffer Haugsbakk, Christian Couder, jackmanb,
	Linus Arver, D. Ben Knoble
In-Reply-To: <xmqq1pedthkv.fsf@gitster.g>

On Thu, Jun 11, 2026, at 13:57, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> kristofferhaugsbakk@fastmail.com writes:
>>
>>> Interdiff against v2:
>>> diff --git a/Documentation/git-interpret-trailers.adoc b/Documentation/git-interpret-trailers.adoc
>>> ...
>
> By the way, what I queued last night was missing [10/11] as I used
> "b4 am" to grab the latest thread messages by giving the message-id
> of the cover letter, but somehow [10/11] had a bogus value in the
> e-mail header.
>
>     Subject: [PATCH v3 10/11] doc: interpret-trailers: rewrite
> new-trailers paragraphs
>     Date: Wed, 10 Jun 2026 23:21:28 +0200
>     Message-ID: <>
>     X-Mailer: git-send-email 2.54.0.22.g9e26862b904
>
> So, I reverted to the old and battle tested way to pick these 11
> messages manually in my newsreader to replace the topic.
>
> If you have a chance, could you investigate where the send-out
> process went wrong and gave one message a bogus ID?  I am worried if
> you may have triggered a bug in send-email, in which case we would
> want to fix it to avoid hurting other users.

I’ll investigate. It’s 99.99% chance a problem on my end, created by me.

Sorry for the trouble!

^ permalink raw reply

* Re: [PATCH v3 00/11] doc: interpret-trailers: explain key format
From: Junio C Hamano @ 2026-06-11 11:57 UTC (permalink / raw)
  To: kristofferhaugsbakk
  Cc: git, Kristoffer Haugsbakk, christian.couder, jackmanb,
	Linus Arver, D . Ben Knoble
In-Reply-To: <xmqqcxxyt4op.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> kristofferhaugsbakk@fastmail.com writes:
>
>> Interdiff against v2:
>> diff --git a/Documentation/git-interpret-trailers.adoc b/Documentation/git-interpret-trailers.adoc
>> ...

By the way, what I queued last night was missing [10/11] as I used
"b4 am" to grab the latest thread messages by giving the message-id
of the cover letter, but somehow [10/11] had a bogus value in the
e-mail header.

    Subject: [PATCH v3 10/11] doc: interpret-trailers: rewrite new-trailers paragraphs
    Date: Wed, 10 Jun 2026 23:21:28 +0200
    Message-ID: <>
    X-Mailer: git-send-email 2.54.0.22.g9e26862b904

So, I reverted to the old and battle tested way to pick these 11
messages manually in my newsreader to replace the topic.

If you have a chance, could you investigate where the send-out
process went wrong and gave one message a bogus ID?  I am worried if
you may have triggered a bug in send-email, in which case we would
want to fix it to avoid hurting other users.

Thanks.

^ permalink raw reply

* [PATCH v2] commit-reach: remove get_reachable_subset()
From: Kristofer Karlsson via GitGitGadget @ 2026-06-11 11:49 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Kristofer Karlsson, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2144.git.1781033285419.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

get_reachable_subset() and tips_reachable_from_bases() both answer
the same reachability question but use different traversal
strategies: priority queue vs depth-first search.  Consolidate them
into tips_reachable_from_bases() with a mode parameter to select
between DFS and PQ traversal, preserving the preferred strategy for
each caller.

This works cleanly because prio_queue already supports LIFO mode
(when compare is NULL), so a single prio_queue acts as either a
stack or a heap depending on the mode.

The unified traversal pushes all unseen parents at once rather than
peeking and pushing one parent at a time.  This eliminates merge
commit revisits entirely: a 2-parent merge now requires 1 visit
instead of 3.  For DFS (LIFO) mode, the first parent is pushed
last so it ends up on top of the stack, preserving first-parent
traversal order.

Parsing is deferred to pop time for DFS since parent objects carry
valid flags without a full repo_parse_commit() call.  PQ mode
parses before push so the heap can order by generation number.

Add exhaustive reachability tests that use every commit in the
grid as a tip, protecting against subtle traversal bugs such as
wrong parent ordering or premature pruning.  The existing tests
are also extended to exercise both DFS and PQ modes.

The flag in remote.c changes from 1 (bit 0) to TMP_MARK (bit 4)
because tips_reachable_from_bases() uses SEEN (bit 0) internally.
TMP_MARK is already used for deduplication earlier in the same
function and is cleared before the reachability check.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
    commit-reach: remove get_reachable_subset()
    
    This removes get_reachable_subset() and consolidates its only caller
    into tips_reachable_from_bases() with a mode parameter to select between
    DFS and priority queue traversal. Both functions answer the same
    reachability query and use the same generation-number pruning strategy,
    differing primarily in traversal order (DFS vs generation-ordered PQ).
    They were introduced years apart and never consolidated.
    
    The unified function uses prio_queue for both modes: LIFO (stack) when
    compare is NULL, min-heap when given a comparator.
    
    Changes since v1:
    
     * Replaced the commit_stack + reverse-push pattern with a simpler
       approach: push all unseen parents directly, with the first parent
       pushed last so it lands on top of the LIFO stack. This preserves
       first-parent DFS order without needing a temporary stack to reverse
       parent order.
    
     * Deferred repo_parse_commit() to pop time for DFS mode. Parent objects
       carry valid flags without a full parse, so we avoid parsing
       already-SEEN parents in the inner loop. PQ mode still parses before
       push so the heap can order by generation number.
    
     * Moved the generation floor check from the parent loop to pop time,
       simplifying the hot path at the cost of occasionally enqueueing
       commits that will later be discarded.
    
     * Added SEEN flag on base commits before enqueueing, preventing
       duplicate processing if bases overlap with the traversal.
    
     * Added PQ mode to the existing test-reach tests so both DFS and PQ
       paths are exercised by the test suite.
    
     * Added two new reachability tests that use all 100 commits in the grid
       as tips - the idea is to better detect subtle traversal bugs.
    
    Notes for reviewers:
    
     * The flag in remote.c changes from 1 to TMP_MARK because
       tips_reachable_from_bases() uses SEEN (bit 0) internally. TMP_MARK is
       already used earlier in the same function and is cleared before the
       reachability block.
    
     * Test helper and test names rename from get_reachable_subset to
       tips_reachable_from_bases to match the function being exercised.
    
    As Stolee noted in the v1 review, commit-graph is now stable and
    expected for repositories where this matters, making the DFS approach
    with dynamic floor raising an attractive default. The mode parameter
    could be dropped in a future patch if we decide to use DFS everywhere,
    but preserving each caller's existing strategy keeps this change
    conservative and avoids any behavior change for add_missing_tags().
    
    Performance was not a goal of this refactoring, but the simplified DFS
    traversal turned out to be a pleasant surprise -- eliminating merge
    commit revisits and deferring parse to pop time gives a consistent
    speedup across repositories:
    
    Benchmark results (median, 30 runs, pre-built binaries):
    
    Repository   Command                   Speedup
    -----------------------------------------------------------
    linux        branch --merged=HEAD       1.09x faster
    linux        for-each-ref --merged=HEAD 1.14x faster
    git          branch --merged=HEAD       neutral
    git          for-each-ref --merged=HEAD 1.12x faster
    

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2144%2Fspkrka%2Fkrka%2Fremove-get-reachable-subset-v2-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2144/spkrka/krka/remove-get-reachable-subset-v2-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2144

Range-diff vs v1:

 1:  df8e2de11b ! 1:  da01032a32 commit-reach: remove get_reachable_subset()
     @@ Metadata
       ## Commit message ##
          commit-reach: remove get_reachable_subset()
      
     -    get_reachable_subset() and tips_reachable_from_bases() answer the
     -    same question: given a set of bases and a set of tips, which tips
     -    are reachable from at least one base?
     +    get_reachable_subset() and tips_reachable_from_bases() both answer
     +    the same reachability question but use different traversal
     +    strategies: priority queue vs depth-first search.  Consolidate them
     +    into tips_reachable_from_bases() with a mode parameter to select
     +    between DFS and PQ traversal, preserving the preferred strategy for
     +    each caller.
      
     -    get_reachable_subset() was introduced in fcb2c0769d (2018-11-02)
     -    for add_missing_tags() in remote.c. tips_reachable_from_bases()
     -    was added in cbfe360b14 (2023-03-20) as part of the ahead-behind
     -    series. The two were never consolidated.
     +    This works cleanly because prio_queue already supports LIFO mode
     +    (when compare is NULL), so a single prio_queue acts as either a
     +    stack or a heap depending on the mode.
      
     -    With a commit-graph, tips_reachable_from_bases() can have an edge:
     -    its DFS raises the generation floor as lower targets are found,
     -    pruning more aggressively than the static floor in
     -    get_reachable_subset(). Without generation numbers, some edge cases
     -    may be slower with DFS instead of BFS since the date-ordered
     -    prio_queue naturally stays near the top of the graph, but this
     -    should not matter in practice -- worst case both visit the full
     -    graph down from the bases.
     +    The unified traversal pushes all unseen parents at once rather than
     +    peeking and pushing one parent at a time.  This eliminates merge
     +    commit revisits entirely: a 2-parent merge now requires 1 visit
     +    instead of 3.  For DFS (LIFO) mode, the first parent is pushed
     +    last so it ends up on top of the stack, preserving first-parent
     +    traversal order.
     +
     +    Parsing is deferred to pop time for DFS since parent objects carry
     +    valid flags without a full repo_parse_commit() call.  PQ mode
     +    parses before push so the heap can order by generation number.
     +
     +    Add exhaustive reachability tests that use every commit in the
     +    grid as a tip, protecting against subtle traversal bugs such as
     +    wrong parent ordering or premature pruning.  The existing tests
     +    are also extended to exercise both DFS and PQ modes.
      
          The flag in remote.c changes from 1 (bit 0) to TMP_MARK (bit 4)
          because tips_reachable_from_bases() uses SEEN (bit 0) internally.
     @@ commit-reach.c: int can_all_from_reach(struct commit_list *from, struct commit_l
       define_commit_slab(bit_arrays, struct bitmap *);
       static struct bit_arrays bit_arrays;
       
     +@@ commit-reach.c: static int compare_commit_and_index_by_generation(const void *va, const void *vb
     + void tips_reachable_from_bases(struct repository *r,
     + 			       struct commit_list *bases,
     + 			       struct commit **tips, size_t tips_nr,
     +-			       int mark)
     ++			       int mark, enum tips_reachable_mode mode)
     + {
     + 	struct commit_and_index *commits;
     ++	struct commit_list *p;
     ++	struct commit *c;
     + 	size_t min_generation_index = 0;
     + 	timestamp_t min_generation;
     +-	struct commit_list *stack = NULL;
     ++	struct prio_queue queue = { NULL };
     + 
     + 	if (!bases || !tips || !tips_nr)
     + 		return;
     + 
     + 	/*
     +-	 * Do a depth-first search starting at 'bases' to search for the
     +-	 * tips. Stop at the lowest (un-found) generation number. When
     +-	 * finding the lowest commit, increase the minimum generation
     +-	 * number to the next lowest (un-found) generation number.
     ++	 * Search starting at 'bases' looking for the tips. Stop at the
     ++	 * lowest un-found generation number, raising the floor as tips
     ++	 * are found. Use DFS by default; with TIPS_REACHABLE_PQ,
     ++	 * use a priority queue ordered by generation then commit date.
     + 	 */
     ++	if (mode == TIPS_REACHABLE_PQ)
     ++		queue.compare = compare_commits_by_gen_then_commit_date;
     + 
     + 	CALLOC_ARRAY(commits, tips_nr);
     + 
     +@@ commit-reach.c: void tips_reachable_from_bases(struct repository *r,
     + 
     + 	while (bases) {
     + 		repo_parse_commit(r, bases->item);
     +-		commit_list_insert(bases->item, &stack);
     ++		bases->item->object.flags |= SEEN;
     ++		prio_queue_put(&queue, bases->item);
     + 		bases = bases->next;
     + 	}
     + 
     +-	while (stack) {
     +-		int explored_all_parents = 1;
     +-		struct commit_list *p;
     +-		struct commit *c = stack->item;
     ++	while ((c = prio_queue_get(&queue))) {
     ++		struct commit *first_parent = NULL;
     ++
     ++		repo_parse_commit(r, c);
     ++
     ++		/* Skip if below the current generation floor. */
     ++		if (commit_graph_generation(c) < min_generation)
     ++			continue;
     + 
     + 		/* Does it match any of our tips? */
     + 		{
     +@@ commit-reach.c: void tips_reachable_from_bases(struct repository *r,
     + 		}
     + 
     + 		for (p = c->parents; p; p = p->next) {
     +-			repo_parse_commit(r, p->item);
     +-
     + 			/* Have we already explored this parent? */
     + 			if (p->item->object.flags & SEEN)
     + 				continue;
     + 
     +-			/* Is it below the current minimum generation? */
     +-			if (commit_graph_generation(p->item) < min_generation)
     +-				continue;
     +-
     + 			/* Ok, we will explore from here on. */
     + 			p->item->object.flags |= SEEN;
     +-			explored_all_parents = 0;
     +-			commit_list_insert(p->item, &stack);
     +-			break;
     ++			/* Parse before pushing in PQ mode for ordering. */
     ++			if (mode == TIPS_REACHABLE_PQ)
     ++				repo_parse_commit(r, p->item);
     ++			if (!first_parent)
     ++				first_parent = p->item;
     ++			else
     ++				prio_queue_put(&queue, p->item);
     + 		}
     +-
     +-		if (explored_all_parents)
     +-			pop_commit(&stack);
     ++		/*
     ++		 * Add the first parent last so that it is on top of
     ++		 * the LIFO queue, maintaining first-parent DFS order.
     ++		 */
     ++		if (first_parent)
     ++			prio_queue_put(&queue, first_parent);
     + 	}
     + 
     + done:
     +@@ commit-reach.c: done:
     + 		commits[i].commit->object.flags &= ~RESULT;
     + 	free(commits);
     + 	repo_clear_commit_marks(r, SEEN);
     +-	commit_list_free(stack);
     ++	clear_prio_queue(&queue);
     + }
     + 
     + /*
      
       ## commit-reach.h ##
      @@ commit-reach.h: int can_all_from_reach_with_flag(struct object_array *from,
     @@ commit-reach.h: int can_all_from_reach_with_flag(struct object_array *from,
       struct ahead_behind_count {
       	/**
       	 * As input, the *_index members indicate which positions in
     +@@ commit-reach.h: void ahead_behind(struct repository *r,
     +  * For all tip commits, add 'mark' to their flags if and only if they
     +  * are reachable from one of the commits in 'bases'.
     +  */
     ++enum tips_reachable_mode {
     ++	TIPS_REACHABLE_DFS,
     ++	TIPS_REACHABLE_PQ,
     ++};
     + void tips_reachable_from_bases(struct repository *r,
     + 			       struct commit_list *bases,
     + 			       struct commit **tips, size_t tips_nr,
     +-			       int mark);
     ++			       int mark, enum tips_reachable_mode mode);
     + 
     + /*
     +  * Given a 'tip' commit and a list potential 'bases', return the index 'i' that
     +
     + ## ref-filter.c ##
     +@@ ref-filter.c: static void reach_filter(struct ref_array *array,
     + 	tips_reachable_from_bases(the_repository,
     + 				  *check_reachable,
     + 				  to_clear, array->nr,
     +-				  UNINTERESTING);
     ++				  UNINTERESTING, TIPS_REACHABLE_DFS);
     + 
     + 	old_nr = array->nr;
     + 	array->nr = 0;
      
       ## remote.c ##
      @@ remote.c: static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
     @@ remote.c: static void add_missing_tags(struct ref *src, struct ref **dst, struct
      +			commit_list_insert(sent_tips.items[i], &bases);
      +		tips_reachable_from_bases(the_repository,
      +					 bases, src_commits.items,
     -+					 src_commits.nr, TMP_MARK);
     ++					 src_commits.nr, TMP_MARK,
     ++					 TIPS_REACHABLE_PQ);
      +		commit_list_free(bases);
       
       		for_each_string_list_item(item, &src_tag) {
     @@ t/helper/test-reach.c: int cmd__reach(int ac, const char **av)
      -				    oid_to_hex(&list->item->object.oid));
      -			count++;
      -		}
     -+	} else if (!strcmp(av[1], "tips_reachable_from_bases")) {
     ++	} else if (!strcmp(av[1], "tips_reachable_from_bases") ||
     ++		   !strcmp(av[1], "tips_reachable_from_bases_pq")) {
     ++		enum tips_reachable_mode mode =
     ++			!strcmp(av[1], "tips_reachable_from_bases_pq")
     ++			? TIPS_REACHABLE_PQ : TIPS_REACHABLE_DFS;
      +		struct commit_list *bases = NULL;
      +		struct commit_list *result = NULL;
      +
     @@ t/helper/test-reach.c: int cmd__reach(int ac, const char **av)
      +			commit_list_insert(X_stack.items[i], &bases);
      +		tips_reachable_from_bases(the_repository,
      +					 bases, Y_stack.items,
     -+					 Y_stack.nr, TMP_MARK);
     ++					 Y_stack.nr, TMP_MARK,
     ++					 mode);
      +		commit_list_free(bases);
      +
      +		printf("tips_reachable_from_bases(X,Y)\n");
     @@ t/t6600-test-reach.sh: test_expect_success 'get_reachable_subset:all' '
       			      commit-5-6 | sort
       	) >expect &&
      -	test_all_modes get_reachable_subset
     -+	test_all_modes tips_reachable_from_bases
     ++	test_all_modes tips_reachable_from_bases &&
     ++	test_all_modes tips_reachable_from_bases_pq
       '
       
      -test_expect_success 'get_reachable_subset:some' '
     @@ t/t6600-test-reach.sh: test_expect_success 'get_reachable_subset:some' '
       			      commit-1-7 | sort
       	) >expect &&
      -	test_all_modes get_reachable_subset
     -+	test_all_modes tips_reachable_from_bases
     ++	test_all_modes tips_reachable_from_bases &&
     ++	test_all_modes tips_reachable_from_bases_pq
       '
       
      -test_expect_success 'get_reachable_subset:none' '
     @@ t/t6600-test-reach.sh: test_expect_success 'get_reachable_subset:none' '
      -	echo "get_reachable_subset(X,Y)" >expect &&
      -	test_all_modes get_reachable_subset
      +	echo "tips_reachable_from_bases(X,Y)" >expect &&
     -+	test_all_modes tips_reachable_from_bases
     ++	test_all_modes tips_reachable_from_bases &&
     ++	test_all_modes tips_reachable_from_bases_pq
       '
       
       test_expect_success 'for-each-ref ahead-behind:linear' '
     +@@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref merged:duplicate at min generation' '
     + 		--format="%(refname)" --stdin
     + '
     + 
     ++test_expect_success 'for-each-ref merged:all reachable commits' '
     ++	for x in $(test_seq 1 10)
     ++	do
     ++		for y in $(test_seq 1 10)
     ++		do
     ++			echo "refs/heads/commit-$x-$y" || return 1
     ++		done
     ++	done >input &&
     ++	for x in $(test_seq 1 5)
     ++	do
     ++		for y in $(test_seq 1 5)
     ++		do
     ++			echo "refs/heads/commit-$x-$y" || return 1
     ++		done
     ++	done | sort >expect &&
     ++	run_all_modes git for-each-ref --merged=commit-5-5 \
     ++		--format="%(refname)" --stdin
     ++'
     ++
     ++test_expect_success 'for-each-ref merged:all reachable, multibase' '
     ++	for x in $(test_seq 1 10)
     ++	do
     ++		for y in $(test_seq 1 10)
     ++		do
     ++			echo "refs/heads/commit-$x-$y" || return 1
     ++		done
     ++	done >input &&
     ++	for x in $(test_seq 1 10)
     ++	do
     ++		for y in $(test_seq 1 10)
     ++		do
     ++			if { test $x -le 3 && test $y -le 7; } ||
     ++			   { test $x -le 7 && test $y -le 3; }
     ++			then
     ++				echo "refs/heads/commit-$x-$y" || return 1
     ++			fi
     ++		done
     ++	done | sort >expect &&
     ++	run_all_modes git for-each-ref \
     ++		--merged=commit-3-7 \
     ++		--merged=commit-7-3 \
     ++		--format="%(refname)" --stdin
     ++'
     ++
     + # For get_branch_base_for_tip, we only care about
     + # first-parent history. Here is the test graph with
     + # second parents removed:


 commit-reach.c        | 131 +++++++++++-------------------------------
 commit-reach.h        |  19 ++----
 ref-filter.c          |   2 +-
 remote.c              |  20 +++----
 t/helper/test-reach.c |  44 +++++++-------
 t/t6600-test-reach.sh |  65 ++++++++++++++++++---
 6 files changed, 129 insertions(+), 152 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 5df471a313..1cad7b211e 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1013,79 +1013,6 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to,
 	return result;
 }
 
-struct commit_list *get_reachable_subset(struct commit **from, size_t nr_from,
-					 struct commit **to, size_t nr_to,
-					 unsigned int reachable_flag)
-{
-	struct commit **item;
-	struct commit *current;
-	struct commit_list *found_commits = NULL;
-	struct commit **to_last = to + nr_to;
-	struct commit **from_last = from + nr_from;
-	timestamp_t min_generation = GENERATION_NUMBER_INFINITY;
-	int num_to_find = 0;
-
-	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
-
-	for (item = to; item < to_last; item++) {
-		timestamp_t generation;
-		struct commit *c = *item;
-
-		repo_parse_commit(the_repository, c);
-		generation = commit_graph_generation(c);
-		if (generation < min_generation)
-			min_generation = generation;
-
-		if (!(c->object.flags & PARENT1)) {
-			c->object.flags |= PARENT1;
-			num_to_find++;
-		}
-	}
-
-	for (item = from; item < from_last; item++) {
-		struct commit *c = *item;
-		if (!(c->object.flags & PARENT2)) {
-			c->object.flags |= PARENT2;
-			repo_parse_commit(the_repository, c);
-
-			prio_queue_put(&queue, *item);
-		}
-	}
-
-	while (num_to_find && (current = prio_queue_get(&queue)) != NULL) {
-		struct commit_list *parents;
-
-		if (current->object.flags & PARENT1) {
-			current->object.flags &= ~PARENT1;
-			current->object.flags |= reachable_flag;
-			commit_list_insert(current, &found_commits);
-			num_to_find--;
-		}
-
-		for (parents = current->parents; parents; parents = parents->next) {
-			struct commit *p = parents->item;
-
-			repo_parse_commit(the_repository, p);
-
-			if (commit_graph_generation(p) < min_generation)
-				continue;
-
-			if (p->object.flags & PARENT2)
-				continue;
-
-			p->object.flags |= PARENT2;
-			prio_queue_put(&queue, p);
-		}
-	}
-
-	clear_prio_queue(&queue);
-
-	clear_commit_marks_many(nr_to, to, PARENT1);
-	clear_commit_marks_many(nr_from, from, PARENT2);
-
-	return found_commits;
-}
-
 define_commit_slab(bit_arrays, struct bitmap *);
 static struct bit_arrays bit_arrays;
 
@@ -1212,22 +1139,26 @@ static int compare_commit_and_index_by_generation(const void *va, const void *vb
 void tips_reachable_from_bases(struct repository *r,
 			       struct commit_list *bases,
 			       struct commit **tips, size_t tips_nr,
-			       int mark)
+			       int mark, enum tips_reachable_mode mode)
 {
 	struct commit_and_index *commits;
+	struct commit_list *p;
+	struct commit *c;
 	size_t min_generation_index = 0;
 	timestamp_t min_generation;
-	struct commit_list *stack = NULL;
+	struct prio_queue queue = { NULL };
 
 	if (!bases || !tips || !tips_nr)
 		return;
 
 	/*
-	 * Do a depth-first search starting at 'bases' to search for the
-	 * tips. Stop at the lowest (un-found) generation number. When
-	 * finding the lowest commit, increase the minimum generation
-	 * number to the next lowest (un-found) generation number.
+	 * Search starting at 'bases' looking for the tips. Stop at the
+	 * lowest un-found generation number, raising the floor as tips
+	 * are found. Use DFS by default; with TIPS_REACHABLE_PQ,
+	 * use a priority queue ordered by generation then commit date.
 	 */
+	if (mode == TIPS_REACHABLE_PQ)
+		queue.compare = compare_commits_by_gen_then_commit_date;
 
 	CALLOC_ARRAY(commits, tips_nr);
 
@@ -1245,14 +1176,19 @@ void tips_reachable_from_bases(struct repository *r,
 
 	while (bases) {
 		repo_parse_commit(r, bases->item);
-		commit_list_insert(bases->item, &stack);
+		bases->item->object.flags |= SEEN;
+		prio_queue_put(&queue, bases->item);
 		bases = bases->next;
 	}
 
-	while (stack) {
-		int explored_all_parents = 1;
-		struct commit_list *p;
-		struct commit *c = stack->item;
+	while ((c = prio_queue_get(&queue))) {
+		struct commit *first_parent = NULL;
+
+		repo_parse_commit(r, c);
+
+		/* Skip if below the current generation floor. */
+		if (commit_graph_generation(c) < min_generation)
+			continue;
 
 		/* Does it match any of our tips? */
 		{
@@ -1276,25 +1212,26 @@ void tips_reachable_from_bases(struct repository *r,
 		}
 
 		for (p = c->parents; p; p = p->next) {
-			repo_parse_commit(r, p->item);
-
 			/* Have we already explored this parent? */
 			if (p->item->object.flags & SEEN)
 				continue;
 
-			/* Is it below the current minimum generation? */
-			if (commit_graph_generation(p->item) < min_generation)
-				continue;
-
 			/* Ok, we will explore from here on. */
 			p->item->object.flags |= SEEN;
-			explored_all_parents = 0;
-			commit_list_insert(p->item, &stack);
-			break;
+			/* Parse before pushing in PQ mode for ordering. */
+			if (mode == TIPS_REACHABLE_PQ)
+				repo_parse_commit(r, p->item);
+			if (!first_parent)
+				first_parent = p->item;
+			else
+				prio_queue_put(&queue, p->item);
 		}
-
-		if (explored_all_parents)
-			pop_commit(&stack);
+		/*
+		 * Add the first parent last so that it is on top of
+		 * the LIFO queue, maintaining first-parent DFS order.
+		 */
+		if (first_parent)
+			prio_queue_put(&queue, first_parent);
 	}
 
 done:
@@ -1302,7 +1239,7 @@ done:
 		commits[i].commit->object.flags &= ~RESULT;
 	free(commits);
 	repo_clear_commit_marks(r, SEEN);
-	commit_list_free(stack);
+	clear_prio_queue(&queue);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 3f3a563d8a..71e60d727a 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -96,19 +96,6 @@ int can_all_from_reach_with_flag(struct object_array *from,
 int can_all_from_reach(struct commit_list *from, struct commit_list *to,
 		       int commit_date_cutoff);
 
-
-/*
- * Return a list of commits containing the commits in the 'to' array
- * that are reachable from at least one commit in the 'from' array.
- * Also add the given 'flag' to each of the commits in the returned list.
- *
- * This method uses the PARENT1 and PARENT2 flags during its operation,
- * so be sure these flags are not set before calling the method.
- */
-struct commit_list *get_reachable_subset(struct commit **from, size_t nr_from,
-					 struct commit **to, size_t nr_to,
-					 unsigned int reachable_flag);
-
 struct ahead_behind_count {
 	/**
 	 * As input, the *_index members indicate which positions in
@@ -144,10 +131,14 @@ void ahead_behind(struct repository *r,
  * For all tip commits, add 'mark' to their flags if and only if they
  * are reachable from one of the commits in 'bases'.
  */
+enum tips_reachable_mode {
+	TIPS_REACHABLE_DFS,
+	TIPS_REACHABLE_PQ,
+};
 void tips_reachable_from_bases(struct repository *r,
 			       struct commit_list *bases,
 			       struct commit **tips, size_t tips_nr,
-			       int mark);
+			       int mark, enum tips_reachable_mode mode);
 
 /*
  * Given a 'tip' commit and a list potential 'bases', return the index 'i' that
diff --git a/ref-filter.c b/ref-filter.c
index 1da4c0e60d..9c8896d347 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3157,7 +3157,7 @@ static void reach_filter(struct ref_array *array,
 	tips_reachable_from_bases(the_repository,
 				  *check_reachable,
 				  to_clear, array->nr,
-				  UNINTERESTING);
+				  UNINTERESTING, TIPS_REACHABLE_DFS);
 
 	old_nr = array->nr;
 	array->nr = 0;
diff --git a/remote.c b/remote.c
index 00723b385e..0324c25743 100644
--- a/remote.c
+++ b/remote.c
@@ -1459,9 +1459,8 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 	 * sent to the other side.
 	 */
 	if (sent_tips.nr) {
-		const int reachable_flag = 1;
-		struct commit_list *found_commits;
 		struct commit_stack src_commits = COMMIT_STACK_INIT;
+		struct commit_list *bases = NULL;
 
 		for_each_string_list_item(item, &src_tag) {
 			struct ref *ref = item->util;
@@ -1479,11 +1478,13 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 			commit_stack_push(&src_commits, commit);
 		}
 
-		found_commits = get_reachable_subset(sent_tips.items,
-						     sent_tips.nr,
-						     src_commits.items,
-						     src_commits.nr,
-						     reachable_flag);
+		for (size_t i = 0; i < sent_tips.nr; i++)
+			commit_list_insert(sent_tips.items[i], &bases);
+		tips_reachable_from_bases(the_repository,
+					 bases, src_commits.items,
+					 src_commits.nr, TMP_MARK,
+					 TIPS_REACHABLE_PQ);
+		commit_list_free(bases);
 
 		for_each_string_list_item(item, &src_tag) {
 			struct ref *dst_ref;
@@ -1503,7 +1504,7 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 			 * Is this tag, which they do not have, reachable from
 			 * any of the commits we are sending?
 			 */
-			if (!(commit->object.flags & reachable_flag))
+			if (!(commit->object.flags & TMP_MARK))
 				continue;
 
 			/* Add it in */
@@ -1513,9 +1514,8 @@ static void add_missing_tags(struct ref *src, struct ref **dst, struct ref ***ds
 		}
 
 		clear_commit_marks_many(src_commits.nr, src_commits.items,
-					reachable_flag);
+					TMP_MARK);
 		commit_stack_clear(&src_commits);
-		commit_list_free(found_commits);
 	}
 
 	string_list_clear(&src_tag, 0);
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 5d86a96c17..66ee35e70d 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -7,6 +7,7 @@
 #include "hex.h"
 #include "object-name.h"
 #include "ref-filter.h"
+#include "revision.h"
 #include "setup.h"
 #include "string-list.h"
 #include "tag.h"
@@ -149,30 +150,31 @@ int cmd__reach(int ac, const char **av)
 
 		printf("%s(_,A,X,_):%d\n", av[1], commit_contains(&filter, A, X, &cache));
 		clear_contains_cache(&cache);
-	} else if (!strcmp(av[1], "get_reachable_subset")) {
-		const int reachable_flag = 1;
-		int count = 0;
-		struct commit_list *current;
-		struct commit_list *list = get_reachable_subset(X_stack.items, X_stack.nr,
-								Y_stack.items, Y_stack.nr,
-								reachable_flag);
-		printf("get_reachable_subset(X,Y)\n");
-		for (current = list; current; current = current->next) {
-			if (!(list->item->object.flags & reachable_flag))
-				die(_("commit %s is not marked reachable"),
-				    oid_to_hex(&list->item->object.oid));
-			count++;
-		}
+	} else if (!strcmp(av[1], "tips_reachable_from_bases") ||
+		   !strcmp(av[1], "tips_reachable_from_bases_pq")) {
+		enum tips_reachable_mode mode =
+			!strcmp(av[1], "tips_reachable_from_bases_pq")
+			? TIPS_REACHABLE_PQ : TIPS_REACHABLE_DFS;
+		struct commit_list *bases = NULL;
+		struct commit_list *result = NULL;
+
+		for (size_t i = 0; i < X_stack.nr; i++)
+			commit_list_insert(X_stack.items[i], &bases);
+		tips_reachable_from_bases(the_repository,
+					 bases, Y_stack.items,
+					 Y_stack.nr, TMP_MARK,
+					 mode);
+		commit_list_free(bases);
+
+		printf("tips_reachable_from_bases(X,Y)\n");
 		for (size_t i = 0; i < Y_stack.nr; i++) {
-			if (Y_stack.items[i]->object.flags & reachable_flag)
-				count--;
+			if (Y_stack.items[i]->object.flags & TMP_MARK)
+				commit_list_insert(Y_stack.items[i], &result);
 		}
+		print_sorted_commit_ids(result);
 
-		if (count < 0)
-			die(_("too many commits marked reachable"));
-
-		print_sorted_commit_ids(list);
-		commit_list_free(list);
+		clear_commit_marks_many(Y_stack.nr, Y_stack.items, TMP_MARK);
+		commit_list_free(result);
 	}
 
 	object_array_clear(&X_obj);
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index b5b314e570..b736d893d5 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -391,7 +391,7 @@ test_expect_success 'rev-list: symmetric difference topo-order' '
 	run_all_modes git rev-list --topo-order commit-3-8...commit-6-6
 '
 
-test_expect_success 'get_reachable_subset:all' '
+test_expect_success 'tips_reachable_from_bases:all' '
 	cat >input <<-\EOF &&
 	X:commit-9-1
 	X:commit-8-3
@@ -403,15 +403,16 @@ test_expect_success 'get_reachable_subset:all' '
 	Y:commit-5-6
 	EOF
 	(
-		echo "get_reachable_subset(X,Y)" &&
+		echo "tips_reachable_from_bases(X,Y)" &&
 		git rev-parse commit-3-3 \
 			      commit-1-7 \
 			      commit-5-6 | sort
 	) >expect &&
-	test_all_modes get_reachable_subset
+	test_all_modes tips_reachable_from_bases &&
+	test_all_modes tips_reachable_from_bases_pq
 '
 
-test_expect_success 'get_reachable_subset:some' '
+test_expect_success 'tips_reachable_from_bases:some' '
 	cat >input <<-\EOF &&
 	X:commit-9-1
 	X:commit-8-3
@@ -422,14 +423,15 @@ test_expect_success 'get_reachable_subset:some' '
 	Y:commit-5-6
 	EOF
 	(
-		echo "get_reachable_subset(X,Y)" &&
+		echo "tips_reachable_from_bases(X,Y)" &&
 		git rev-parse commit-3-3 \
 			      commit-1-7 | sort
 	) >expect &&
-	test_all_modes get_reachable_subset
+	test_all_modes tips_reachable_from_bases &&
+	test_all_modes tips_reachable_from_bases_pq
 '
 
-test_expect_success 'get_reachable_subset:none' '
+test_expect_success 'tips_reachable_from_bases:none' '
 	cat >input <<-\EOF &&
 	X:commit-9-1
 	X:commit-8-3
@@ -439,8 +441,9 @@ test_expect_success 'get_reachable_subset:none' '
 	Y:commit-7-6
 	Y:commit-2-8
 	EOF
-	echo "get_reachable_subset(X,Y)" >expect &&
-	test_all_modes get_reachable_subset
+	echo "tips_reachable_from_bases(X,Y)" >expect &&
+	test_all_modes tips_reachable_from_bases &&
+	test_all_modes tips_reachable_from_bases_pq
 '
 
 test_expect_success 'for-each-ref ahead-behind:linear' '
@@ -657,6 +660,50 @@ test_expect_success 'for-each-ref merged:duplicate at min generation' '
 		--format="%(refname)" --stdin
 '
 
+test_expect_success 'for-each-ref merged:all reachable commits' '
+	for x in $(test_seq 1 10)
+	do
+		for y in $(test_seq 1 10)
+		do
+			echo "refs/heads/commit-$x-$y" || return 1
+		done
+	done >input &&
+	for x in $(test_seq 1 5)
+	do
+		for y in $(test_seq 1 5)
+		do
+			echo "refs/heads/commit-$x-$y" || return 1
+		done
+	done | sort >expect &&
+	run_all_modes git for-each-ref --merged=commit-5-5 \
+		--format="%(refname)" --stdin
+'
+
+test_expect_success 'for-each-ref merged:all reachable, multibase' '
+	for x in $(test_seq 1 10)
+	do
+		for y in $(test_seq 1 10)
+		do
+			echo "refs/heads/commit-$x-$y" || return 1
+		done
+	done >input &&
+	for x in $(test_seq 1 10)
+	do
+		for y in $(test_seq 1 10)
+		do
+			if { test $x -le 3 && test $y -le 7; } ||
+			   { test $x -le 7 && test $y -le 3; }
+			then
+				echo "refs/heads/commit-$x-$y" || return 1
+			fi
+		done
+	done | sort >expect &&
+	run_all_modes git for-each-ref \
+		--merged=commit-3-7 \
+		--merged=commit-7-3 \
+		--format="%(refname)" --stdin
+'
+
 # For get_branch_base_for_tip, we only care about
 # first-parent history. Here is the test graph with
 # second parents removed:

base-commit: 1ff279f3404a482a83fb04c7457e41ab26884aea
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v4 06/10] reset: introduce ability to skip updating HEAD
From: Patrick Steinhardt @ 2026-06-11 11:47 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, Pablo Sabater, Junio C Hamano, Kristoffer Haugsbakk
In-Reply-To: <cfade236-75bc-4679-a74a-6da82e6a5135@gmail.com>

On Wed, Jun 10, 2026 at 02:11:04PM +0100, Phillip Wood wrote:
> On 10/06/2026 09:52, Patrick Steinhardt wrote:
> > In a subsequent commit we'll introduce a new caller to
> > `reset_working_tree()` that really only wants to update the index and
> > working tree, without updating any references. Introduce a new flag that
> > makes the caller opt in to updating HEAD and adapt all callers to set
> > that flag.
> > 
> > Note that in a previous iteration we instead introduced a flag that made
> > callers opt out of updating any references. This was somewhat awkward
> > though because we already have the `UPDATE_ORIG_HEAD` flag, so the
> > result was somewhat inconsistent.
> 
> Thanks for doing this. I've grepped for all the callers of reset_head() to
> confirm this patch adds RESET_HEAD_UPDATE_HEAD to them all.
> 
> I wonder if we should add a check for passing RESET_HEAD_UPDATE_ORIG_HEAD
> without RESET_HEAD_UPDATE_HEAD that calls BUG() as we don't support that.
> Everything else looks good.

Fair, can do. Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v4] git-gui: silence install recipes under "make -s"
From: Harald Nordgren @ 2026-06-11 11:17 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Johannes Sixt, git, Koji Nakamaru
In-Reply-To: <7b40cabe-d243-40dd-ab29-fc4dd91fa20d@app.fastmail.com>

Thanks for the explanation.


Harald

^ permalink raw reply

* Re: [PATCH 5/7] environment: split up concerns of `is_bare_repository_cfg`
From: Patrick Steinhardt @ 2026-06-11  9:19 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git
In-Reply-To: <ainesoUOuhspKxHF@denethor>

On Wed, Jun 10, 2026 at 05:22:04PM -0500, Justin Tobler wrote:
> On 26/06/10 08:56AM, Patrick Steinhardt wrote:
> > diff --git a/builtin/init-db.c b/builtin/init-db.c
> > index 52aa92fb0a..566732c9f4 100644
> > --- a/builtin/init-db.c
> > +++ b/builtin/init-db.c
> > @@ -81,7 +81,7 @@ int cmd_init_db(int argc,
> >  	const char *template_dir = NULL;
> >  	char *template_dir_to_free = NULL;
> >  	unsigned int flags = 0;
> > -	int bare = is_bare_repository_cfg;
> > +	int bare = startup_info->force_bare_repository ? 1 : -1;
> 
> Any particular reason to continue mapping `force_bare_repository=false`
> to -1? Or was this to just minimize changes?

The `-1` value doesn't mean "false", but it rather means "undecided".
The effect of this is that "core.bare" will eventually override this.

> > diff --git a/setup.c b/setup.c
> > index 71fc6b33da..2b690da8ca 100644
> > --- a/setup.c
> > +++ b/setup.c
> > @@ -795,10 +795,16 @@ static int check_repository_format_gently(const char *gitdir,
> >  		has_common = 0;
> >  	}
> >  
> > -	if (!has_common) {
> > -		if (candidate->is_bare != -1)
> > -			is_bare_repository_cfg = candidate->is_bare;
> > -	} else {
> > +	if (startup_info->force_bare_repository) {
> > +		candidate->is_bare = 1;
> > +		FREE_AND_NULL(candidate->work_tree);
> > +	} else if (has_common) {
> > +		/*
> > +		 * When sharing a common dir with another repository (e.g. a
> > +		 * linked worktree), do not let this repository's config
> > +		 * dictate bareness; it is inherited from the main worktree.
> > +		 */
> > +		candidate->is_bare = -1;
> >  		FREE_AND_NULL(candidate->work_tree);
> 
> Previously, when there was a common dir, `candidate->work_tree` was left
> untouched, but now we are expclicitly setting it. I'm not sure I fully
> understand this change.

I cannot blame you. All of this logic is so unbelievably tangled and
hard to follow.

In any case, I think you might have missed the fact that we `else if`
branch is now `has_common` as compared to `!has_common`? To explain the
different cases a bit:

  - When we have `force_bare_repository` we are being told that the
    repository should be treated as bare. So we set `is_bare` and also
    clear the work tree that may have been discovered.

  - When we have a commondir we know that we're in a worktree.
    Previously we did nothing in this case, and that had the implicit
    effect that `is_bare_repository_cfg` would remain at `-1`. So to
    match that behaviour we have to also reset the candidate's bareness
    to `-1`, so that we parse it via the repository's configuration at a
    later point in time.

    The other part here is that we also reset `candidate->work_tree`.
    This is because the expectation is that `$GIT_DIR/common` should
    override any "core.worktree" settings. Quoting git-config(1):

        If GIT_COMMON_DIR environment variable is set, core.worktree is
        ignored and not used for determining the root of working tree.

  - When we don't have a commondir we previosuly had to also adapt the
    global `is_bare_repository_cfg` variable. This part is not necessary
    anymore, so we basically just drop this case altogether.

Patrick

^ permalink raw reply

* Re: git-diff in a worktree is an order of magnitude slower?
From: Jeff King @ 2026-06-11  8:55 UTC (permalink / raw)
  To: D. Ben Knoble; +Cc: Git
In-Reply-To: <CALnO6CD+3sE1xQUnRsCFfWrZTsq2Edw7BWseLzasgT3dgtaq_Q@mail.gmail.com>

On Tue, Jun 09, 2026 at 01:15:11PM -0400, D. Ben Knoble wrote:

> > Which implies that the entries are stat dirty. And indeed, if I run:
> >
> >   git -C linux update-index --refresh
> >
> > now they both take ~20ms.
> 
> Ah, TIL about --refresh. I suppose it could be nice if "git diff"
> updated the index in this way, but that sounds like a band-aid. Maybe
> creating a fresh worktree should do the equivalent to make sure it's
> considered "fresh"?

I think "git diff" _does_ refresh the index internally (that's what
takes so long!). I thought we then wrote out the result, but maybe we
don't notice that it needs an update for some reason?

I'm pretty sure "git status" does something similar, though running it
in a slow working tree _does_ seem to make things faster. Maybe it's
more aggressive about doing the update.

I don't think that refreshing after making a worktree would help. The
problem is one of timestamps: we just wrote an index (so it _should_ be
totally up to date), but we err on the side of caution for some entries
because the file timestamps and the index timestamp are the same. So
what makes it "work" is that one second passed between writing those
files and running "update-index". If you ran it from the worktree
command automatically, it might all still happen in the same second.

And of course, it's not just worktrees. Any time we checkout we may
suffer from this problem, though initial clones and worktree creation
will write more files than most.

> At $DAYJOB, I _think_ some version of "git restore <stuff>" ended up
> also updating the index.

Yep, that would make sense. Any index write (after the second-hand
ticks) will make it go away, since it means updating the mtime of the
index.

> > I'd have thought USE_NSEC was the default these days, but looks like it
> > isn't? Try building with that and I'll bet it goes away entirely.
> 
> Thanks, I'll take a look.
> 
> I can see on my Macbook that at least Meson does automatically set
> either USE_ST_TIMESPEC or NO_NSEC automatically, but has no option to
> enabled USE_NSEC and try that. I can probably write that patch (which
> I'll do to test), and I can send it along with the "worktree add
> should refresh the index" if you think that's an appropriate thing to
> do.

I think NO_NSEC is about not looking at the nsec fields of stat structs
(since they might not exist). But we don't actually use them for stat
matching unless USE_NSEC is set.

I guess the distinction goes back to c06ff4908b (Record ns-timestamps if
possible, but do not use it without USE_NSEC, 2009-03-04), which details
some reasons you might not want USE_NSEC. Feels like it ought to be a
run-time config, though, and maybe even something that gets auto-probed
by git-init.

Definitely not an area I have looked at much, though, nor thought hard
about. So there might be gotchas. :)

-Peff

^ permalink raw reply

* Re: [PATCH v2] ls-files: filter pathspec before lstat
From: Jeff King @ 2026-06-11  8:41 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, René Scharfe, Patrick Steinhardt, Junio C Hamano
In-Reply-To: <CAJ-ks9mJk-=xp1hW77hAoZwwQAfpMukYO8OvvkLx646-2Z3_kg@mail.gmail.com>

On Tue, Jun 09, 2026 at 04:15:41PM -0700, Tamir Duberstein wrote:

> On Tue, Jun 9, 2026 at 3:41 AM Jeff King <peff@peff.net> wrote:
> >
> > On Mon, Jun 08, 2026 at 07:37:15PM -0700, Tamir Duberstein wrote:
> >
> > > +             /*
> > > +              * match_pathspec() is linear in pathspec.nr, so prefilter only
> > > +              * the single-pathspec case. Only entries shown by show_ce()
> > > +              * satisfy --error-unmatch.
> > > +              */
> > > +             if (pathspec.nr == 1 &&
> > > +                 !match_pathspec(repo->index, &pathspec, fullname.buf,
> > > +                                 fullname.len, max_prefix_len, NULL,
> > > +                                 S_ISDIR(ce->ce_mode) ||
> > > +                                 S_ISGITLINK(ce->ce_mode)))
> > > +                     continue;
> >
> > This feels...kind of arbitrary, no? Surely it's also faster with
> > pathspec.nr == 2, and so on up to some nr closer to the size of the
> > total index. It feels weird to be making an arbitrary cutoff based on
> > pathspec performance in calling code like this.
> >
> > It is not wrong, per se, as you are optimizing your case without trying
> > to hurt any others. But what do we do when somebody profiles it and
> > comes along trying to bump the number to 2, or 10?
> >
> > I dunno.
> 
> Yeah, absolutely it's arbitrary. The simplest answer is that others
> are welcome to bump this, provided they make the case for it.

OK. I can live with, I suppose, but I am tempted to say that it should
just kick in always (i.e., removing the pathspec.nr check).

Though I did show a case where the performance regresses, it was pretty
made-up and not something I'd expect in the real world. And you'd see
that same crappy performance with "git ls-files -- $(git ls-files)",
without the "-m".  The real solution is making the pathspec code less
crappy.

-Peff

^ permalink raw reply

* Re: [PATCH 0/3] config: allow disabling config includes
From: Jeff King @ 2026-06-11  8:39 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Derrick Stolee via GitGitGadget, git, gitster
In-Reply-To: <4d7834c0-d8ab-4dcd-8a7f-ed62c30cbe43@gmail.com>

On Tue, Jun 09, 2026 at 08:59:22AM -0400, Derrick Stolee wrote:

> > So I dunno. From the described motivation, this feels like a band-aid
> > that fixes only one narrow instance of a greater problem.
> > 
> > The notion of enabling/disabling includes per-command is itself a
> > flexible building block. So it's possible that it has other uses in
> > general. But it's also a fairly broad hammer that covers more than your
> > use case. If you're planning to use "git --no-includes" in some script,
> > then it breaks the config of anybody who uses includes in their
> > user-level ~/.gitconfig file.
> > 
> > So you may need some more directed limiting.
> 
> Are you suggesting some kind of internal sandbox to limit Git from
> accessing repository paths from config includes and other config-set keys?
> That would be a more complete solution, but I'm not sure how we could plug
> all of those holes at once. I'll think on it, though.

I don't know that I'm really suggesting anything, but more just thinking
out loud. My concern would be that we plug one such hole, and then later
find we need more. And then we are stuck with the solution for plugging
that hole, even if it may later become redundant.

I'm not sure I entirely understand the problematic case, though. The
user points to in-repo config (which we already tell people is a bad
idea), and then that config breaks for some reason? Because the include
is relative and git is run from another directory?

If you are going to make such an include, I'd hope you'd at least do it
from .git/config, which should reliably resolve relative paths based on
the source include file, not the current working directory.

> This is exactly the kind of case I was worried about. This specific case
> only impacts write operations, but some tools do those things. And this
> email case is a common one that users do in their global config to isolate
> personal and professional identities.

Yeah, I picked it because I suspect it is a very common use of includes.
But really it is up to users to do whatever they want with includes, and
they'd probably expect them to always work.

> I'm trying to think if there's a place where we'd have some config that is
> critical to the repo functioning not in its local config (like the repo
> format version or extensions). Perhaps borrowing from your work/personal
> example, a user could use a different credential helper for work than they
> use for personal repositories.

It depends what you mean by critical, I suppose. If I happen to prefer
shoving all of my diff.*.textconv commands into ~/.gitconfig-diff, then
including it with include.path=.gitconfig-diff in ~/.gitconfig, I'd
expect it to just work everywhere. But disabling includes would violate
that assumption. Probably not _critical_, but it could be annoying and
surprising.

> Or: are we venturing into territory where we don't even want to create a
> new foot-gun? If there were another way to solve the situation that I'm
> facing without these risks, then I'd be open to it. Any ideas?

Yeah, the more I think on it, the more it seems like a foot-gun. Like I
said, I'm not sure I entirely understand the use-case. If you could
flesh out an example, that might help.

-Peff

^ permalink raw reply

* Re: [PATCH v3] transport-helper: fix TSAN race in transfer_debug()
From: Jeff King @ 2026-06-11  8:33 UTC (permalink / raw)
  To: Pushkar Singh; +Cc: git, gitster
In-Reply-To: <20260609134741.4727-2-pushkarkumarsingh1970@gmail.com>

On Tue, Jun 09, 2026 at 01:47:42PM +0000, Pushkar Singh wrote:

> Changes since v2:
> - Treat an uninitialized transfer_debug_enabled as a BUG()
>   instead of silently treating it as disabled.
> - Follow Jeff King's suggestion to distinguish an
>   uninitialized state from a disabled state.

Thanks, this looks OK to me.

> +	if (transfer_debug_enabled < 0)
> +		BUG("somebody forgot to check GIT_TRANSLOOP_DEBUG!");

That is a somewhat silly message, but the point is that nobody is
supposed to see it ever. And it does lead them to roughly the right
spot, so I think it's sufficient. (Yes, I know it came from my earlier
response).

-Peff

^ permalink raw reply

* Re: [PATCH v4] git-gui: silence install recipes under "make -s"
From: Kristoffer Haugsbakk @ 2026-06-11  8:32 UTC (permalink / raw)
  To: Harald Nordgren, Johannes Sixt; +Cc: git, Koji Nakamaru
In-Reply-To: <CAHwyqnVSnf9K50xgUjeHFM395Rvj_uTVvZ1U8EZayNDZeMP4Bg@mail.gmail.com>

On Thu, Jun 11, 2026, at 10:26, Harald Nordgren wrote:
> On Thu, Jun 11, 2026 at 7:37 AM Johannes Sixt <j6t@kdbg.org> wrote:
>>
>> Am 10.06.26 um 15:19 schrieb Harald Nordgren:
>> > What does it mean for it to be queued here, should I expect it to show
>> > up on seen or next?
>> It means that I'll arrange that it will appear in the next Git release.
>> Until then you can find the commit in
>> https://github.com/j6t/git-gui/tree/hn/silence-make-s .
>
> Thanks! So does that mean that 'seen' and 'next' are branches that are
> added to only by Junio Hamano?

Yes. Because the git/git repository (on GitHub) is a mirror of his own
gitster/git repo (for these branches but not topic branches).

When he is away and there is an interim maintainer, they may push to
these branches on their own repos and use git/git as a mirror. So I
guess they all have commit access to git/git.

That is my understanding of the matter and what I have observed.

^ permalink raw reply

* Re: [PATCH v2 3/3] doc: git-config: escape erroneous highlight markup
From: Jeff King @ 2026-06-11  8:31 UTC (permalink / raw)
  To: Tuomas Ahola
  Cc: git, Kristoffer Haugsbakk, Junio C Hamano, Jean-Noël Avila
In-Reply-To: <20260611080242.lqXwi%taahol@utu.fi>

On Thu, Jun 11, 2026 at 11:02:42AM +0300, Tuomas Ahola wrote:

> > >  If _<message>_ begins with one or more whitespaces followed
> > > -by "#", it is used as-is.  If it begins with "#", a space is
> > > +by "\#", it is used as-is.  If it begins with "\#", a space is
> > >  prepended before it is used.  Otherwise, a string " # " (a
> > >  space followed by a hash followed by a space) is prepended
> > 
> > I saw the comment on round 1 about this second "#" on the line. But
> > while we are here, should we be doing the one in the context, too?
> > 
> > -Peff
> 
> It seems adding that second backslash was already too much, as doc-diff (which
> I neglected to run before submitting V2) shows:
> 
> ```
> $ ./doc-diff V1 V2
>  
>             If <message> begins with one or more whitespaces followed by "#",
> -           it is used as-is. If it begins with "#", a space is prepended
> +           it is used as-is. If it begins with "\#", a space is prepended
>             before it is used. Otherwise, a string " # " (a space followed by a
>             hash followed by a space) is prepended to it. The resulting string
>             is placed immediately after the value defined for the variable. The
> ```

Heh, it would not be the first time I am baffled by asciidoc's parsing. :)

Adding a backslash to the third instance "fixes" the second one to me,
but I wouldn't want to rely on that (plus it breaks the third instance).

Using backticks does work, though it always opens a typographical
question. When reading the source, you see `#`, so you get a punctuation
delimiter but no typographical one. In the rendered output, you'll see
it in a typewriter font (assuming we fix the config issue), but we'd
lose the visible punctuation. I could live with that.

But for " # ", it gets weirder. We need punctuation to call out the
spaces, but what should happen to the quotes? They are not really part
of the literal string, so should they go inside or outside the
backticks? I think it may be a moot point as "` # `" is not parsed as
you might hope by asciidoc. Doing `" # "` does work, and is probably OK
enough here.

-Peff

^ permalink raw reply

* Re: [PATCH v4] git-gui: silence install recipes under "make -s"
From: Harald Nordgren @ 2026-06-11  8:26 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Harald Nordgren via GitGitGadget
In-Reply-To: <fdf7f988-d345-4107-845f-e089d7829c16@kdbg.org>

On Thu, Jun 11, 2026 at 7:37 AM Johannes Sixt <j6t@kdbg.org> wrote:
>
> Am 10.06.26 um 15:19 schrieb Harald Nordgren:
> > What does it mean for it to be queued here, should I expect it to show
> > up on seen or next?
> It means that I'll arrange that it will appear in the next Git release.
> Until then you can find the commit in
> https://github.com/j6t/git-gui/tree/hn/silence-make-s .

Thanks! So does that mean that 'seen' and 'next' are branches that are
added to only by Junio Hamano?


Harald

^ permalink raw reply

* Re: [PATCH v2 2/2] ref-filter: memoize --contains with generations
From: Jeff King @ 2026-06-11  8:22 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, Karthik Nayak, Junio C Hamano, Victoria Dye, Derrick Stolee,
	Elijah Newren
In-Reply-To: <20260608-ref-filter-memoized-contains-v2-2-e72720344a7c@gmail.com>

On Mon, Jun 08, 2026 at 07:36:35PM -0700, Tamir Duberstein wrote:

> The wall-time standard deviations were 11.356 seconds and 133.8
> milliseconds, respectively. Separate runs without redirection produced
> the same output with SHA-256
> 2466f6e2b72aa16b1a2126eddb81c8a1b2764ee251204ac034c191a925aa896f.

Heh. Without the original repo, this sha256 hash is meaningless to us,
isn't it? Ditto for the sha1 the earlier command.

>  int commit_contains(struct ref_filter *filter, struct commit *commit,
>  		    struct commit_list *list, struct contains_cache *cache)
>  {
> -	if (filter->with_commit_tag_algo)
> +	int result;
> +
> +	if (!list)
> +		return 1;
> +	if (filter->with_commit_tag_algo ||
> +	    generation_numbers_enabled(the_repository))
>  		return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
> -	return repo_is_descendant_of(the_repository, commit, list);
> +
> +	result = repo_is_descendant_of(the_repository, commit, list);
> +	if (result < 0)
> +		exit(128);
> +	return result;

There's a little more going on here than I expected from the commit
message. Is it important for us to short-circuit the empty list and just
return 1? Or did the existing helper functions already handle that?

Looking at contains_tag_algo(), I think it would actually return
CONTAINS_NO here (though I didn't test it). So this is actually a change
in behavior for "git tag" if that's correct. I doubt it is triggerable
in practice, though, as we would simply never call commit_contains() in
the first place with an empty list. But if we are going to add in this
logic, I think it makes sense to do so as a separate commit (describing
what it is doing and why it's not (yet) a triggerable bug).

Checking the result of repo_is_descendant_of() makes sense, as discussed
earlier. But probably that should come as its own patch, since it's an
independent bug-fix. I'm also tempted to say it should call die()
instead of a direct exit, though it does look like the error exit paths
from repo_is_descendant_of() would all have produced their own messages.

And one side note. While looking at the implementation of
repo_is_descendant_of(), I did notice something curious: it also
switches algorithms based on the presence of generation numbers! So it
should also be cutting off the traversal early when possible. But I
guess its main problem is that we call it independently for each
candidate, so it may traverse the same (useful) stretch of history
multiple times.

So probably an alternative approach to this patch would be feeding all
of the candidates at once, the way we do with reach_filter() via
filter_refs(). I'm not sure if we have the right functions available for
that (naively, --contains and --merged are inversions of each other, so
swapping the arguments to tips_reachable_from_bases() might work, but I
didn't think very hard on it).

I wonder if that might perform better or worse. I'm content to leave it
for another day, though, as switching to the memoizing depth-first algo
here is a pretty easy change.

> -	commit=$(git commit-tree $(git rev-parse HEAD^{tree})) &&
> +	git rev-list --first-parent --max-count=8192 HEAD >contains-commits &&
> +	test_file_not_empty contains-commits &&
> +	git update-ref refs/contains-perf-base "$(tail -n 1 contains-commits)" &&
> +	awk "{
> +		printf \"update refs/contains-perf/%04d %s\\n\", NR, \$1
> +	}" contains-commits |
> +		git update-ref --stdin &&
> +	git pack-refs --include "refs/contains-perf/*" &&

My head almost exploded reading the embedded quoting in that awk
invocation. But I can't think offhand of a better way to do it. You
can't use test_seq because it needs both the number and the original
string. You can do it with sed, but it probably ends up even more
unreadable.

But OK, we are making a bunch of refs based on first-parent history.

> +	tree=$(git rev-parse HEAD^{tree}) &&
> +	base=$(git rev-parse HEAD) &&
> +	target=$(echo target | git commit-tree "$tree" -p "$base") &&
> +	git update-ref refs/contains-diverged/target "$target" &&
> +	for i in $(test_seq 1 4)
> +	do
> +		commit=$(echo candidate-$i |
> +			git commit-tree "$tree" -p "$base") &&
> +		git update-ref refs/contains-diverged/candidate-$i "$commit" ||
> +		return 1
> +	done &&

And then a few candidate refs that are not reachable from other refs, or
from each other. OK.

I think you could just write:

  git commit-tree HEAD^{tree} -p HEAD

instead of doing separate rev-parses, but it's probably not a big deal
either way.

> +test_expect_success 'verify contains results' '
> +	git for-each-ref --contains=refs/contains-perf-base \
> +		refs/contains-perf/ >actual &&
> +	test_line_count = $(wc -l <contains-commits) actual &&
> +
> +	echo refs/contains-diverged/target >expect &&
> +	GIT_TEST_COMMIT_GRAPH=0 \
> +		git -c core.commitGraph=false for-each-ref \
> +			--format="%(refname)" \
> +			--contains=refs/contains-diverged/target \
> +			refs/contains-diverged/ >actual &&
> +	test_cmp expect actual
> +'

This is a funny test to have in the middle of a perf script (which
hardly anybody ever runs). If we are concerned about the correctness,
should this be in a non-perf test script? Though I'd imagine something
like it is already covered there.

There's a lot of subtlety in what we're verifying, too. In the first
half, we are checking that all of the commits in contains-perf contain
the base.  And that base is the final element of the contains-commits
list. Which made me wonder what happens in a branch history, since that
list is linearized. But because we used --first-parent to generate it,
it _is_ linear, and the results work out. So OK, I don't think it's
wrong, but I am struggling to understand the meaning of the test.

The second half is just checking that...the other refs which are not
contained in "target" are not mentioned? OK, but why do it only with
commit graphs off. Why not both off and on? Again, I'm not sure I
understand what we're trying to focus on here.

> +test_perf 'contains: git for-each-ref --contains' '
> +	git for-each-ref --contains=refs/contains-perf-base \
> +		refs/contains-perf/ >/dev/null
> +'

Yay, actual perf tests. Here we have a ton of matches, and they all walk
over the same chunk of history. Should get much faster, though it's
mostly a synthetic test.

For --merged, we already have separate tests with each of for-each-ref,
branch, and tag. Should we have the same here for --contains? And should
we be using the input repo data, rather than our synthetic test? It is
nice to show off the performance with the synthetic test, but ultimately
the point of the perf suite is feeding it real workloads and looking for
regressions.

> +test_perf 'contains without generations: divergent refs' '
> +	GIT_TEST_COMMIT_GRAPH=0 \
> +		git -c core.commitGraph=false for-each-ref \
> +			--contains=refs/contains-diverged/target \
> +			refs/contains-diverged/ >/dev/null
> +'

OK, and this one should find that most of them are not contained, but
the depth-first algorithm could walk all the way down to the roots. But
we don't run it at all, since we disable commit graphs!

So what are we trying to measure here? If it left commit graphs enabled,
I think we could demonstrate that using the depth-first algorithm with
generation numbers does not make anything _worse_. I.e., that
for-each-ref and branch did not regress from the change.

> +test_expect_success 'missing ancestors are reported by contains filters' '
> +	test_when_finished "git update-ref -d refs/heads/missing-parent" &&
> +	{
> +		echo "tree $(git rev-parse HEAD^{tree})" &&
> +		echo "parent $MISSING" &&
> +		git cat-file commit HEAD |
> +			sed -n -e "/^author /p" -e "/^committer /p" &&
> +		echo &&
> +		echo "missing parent"
> +	} >commit &&
> +	broken=$(git hash-object -t commit -w commit) &&
> +	git update-ref refs/heads/missing-parent "$broken" &&
> +	for option in --contains --no-contains
> +	do
> +		test_must_fail git for-each-ref "$option=HEAD" \
> +			refs/heads/missing-parent >out 2>err &&
> +		test_must_be_empty out &&
> +		test_grep "parse commit $MISSING" err ||
> +		return 1
> +	done
> +'

This is a great thing to test, but probably should be pulled out into
a separate patch along with the fix to check the return code.

The commit construction looks OK, and is nicer than corrupting the
repository by deleting a real object. Given that we are pulling the
idents from an existing commit, it might be simpler to just use the
whole commit as a template, like:

  git cat-file commit HEAD |
  sed "s/^parent /parent $MISSING/"

but it may be a matter of taste.

-Peff

^ permalink raw reply

* Re: [PATCH v2 2/2] ref-filter: memoize --contains with generations
From: Karthik Nayak @ 2026-06-11  8:16 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, Jeff King, Junio C Hamano, Victoria Dye, Derrick Stolee,
	Elijah Newren
In-Reply-To: <CAJ-ks9ku=-675naKESOJJxOo0b5BmoH7=76aKZXXmUHM+=ZV0w@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5749 bytes --]

Tamir Duberstein <tamird@gmail.com> writes:

> On Wed, Jun 10, 2026 at 4:47 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>>
>> Tamir Duberstein <tamird@gmail.com> writes:
>>
>> > git branch and git for-each-ref call repo_is_descendant_of() for
>> > each candidate selected by --contains or --no-contains. Each call
>> > starts a new graph walk, so refs with shared history repeatedly
>> > traverse the same commits.
>> >
>> > ffc4b8012d (tag: speed up --contains calculation, 2011-06-11)
>> > introduced a depth-first walk for git tag that caches positive and
>> > negative answers across candidates. ee2bd06b0f (ref-filter: implement
>> > '--contains' option, 2015-07-07) preserved both implementations when
>> > ref-filter learned --contains.
>> >
>> > The memoized walk is not always faster. Without generation numbers,
>> > a negative check can walk to the root even when the breadth-first
>> > merge-base walk finds a nearby divergence. With generation numbers,
>> > the depth-first walk can stop below the oldest target while still
>> > reusing answers across candidates.
>> >
>> > Keep the existing memoized selection for git tag. Select it for other
>> > ref-filter callers when generation numbers are enabled, and retain
>> > the breadth-first walk otherwise.
>> >
>> > When generation numbers are unavailable, repo_is_descendant_of() can
>> > return -1 if ancestry cannot be read. The ref-filter Boolean interface
>> > treated that error as a match. Check it and exit instead. The memoized
>> > path already dies on the same parse failure, so both selected paths now
>> > fail rather than return a result.
>> >
>> > Add p1500 cases for up to 8,192 packed refs along one first-parent
>> > history and for sibling refs near the tip with generation numbers
>> > forced off.
>> >
>> > On a checkout with 62,174 remote-tracking refs and generation numbers
>> > enabled, I ran:
>> >
>> >     hyperfine --warmup 0 --runs 3 \
>> >         --command-name parent \
>> >         '"$parent" branch -r --contains c78ae85f3ce7e >/dev/null' \
>> >         --command-name this-commit \
>> >         '"$this" branch -r --contains c78ae85f3ce7e >/dev/null'
>> >
>> > The results were:
>> >
>> >              parent       this commit
>> >   elapsed    104.365 s     467.7 ms
>> >   user        93.702 s     220.2 ms
>> >   system       0.723 s     182.7 ms
>> >
>> > The wall-time standard deviations were 11.356 seconds and 133.8
>> > milliseconds, respectively. Separate runs without redirection produced
>> > the same output with SHA-256
>> > 2466f6e2b72aa16b1a2126eddb81c8a1b2764ee251204ac034c191a925aa896f.
>> >
>> > Both revisions were built with the default -O2 flags using Apple
>> > clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro (Mac16,6)
>> > with a 16-core Apple M4 Max (12 performance and four efficiency
>> > cores) and 128 GB RAM.
>> >
>> > Link: https://lore.kernel.org/git/1445163904-24611-1-git-send-email-Karthik.188@gmail.com/
>> > Link: https://lore.kernel.org/r/20230324191009.GA536967@coredump.intra.peff.net
>> > Link: https://lore.kernel.org/git/20260527070510.3510836-1-krka@spotify.com/
>> > Link: https://lore.kernel.org/r/20260608223430.GA340696@coredump.intra.peff.net
>> > Suggested-by: Jeff King <peff@peff.net>
>> > Signed-off-by: Tamir Duberstein <tamird@gmail.com>
>> > ---
>> >  commit-reach.c                 | 13 +++++++++--
>> >  commit-reach.h                 |  7 ++++++
>> >  t/perf/p1500-graph-walks.sh    | 49 +++++++++++++++++++++++++++++++++++++++++-
>> >  t/t6301-for-each-ref-errors.sh | 22 +++++++++++++++++++
>> >  4 files changed, 88 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/commit-reach.c b/commit-reach.c
>> > index 65b618959b..83a48004ef 100644
>> > --- a/commit-reach.c
>> > +++ b/commit-reach.c
>> > @@ -821,9 +821,18 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
>> >  int commit_contains(struct ref_filter *filter, struct commit *commit,
>> >                   struct commit_list *list, struct contains_cache *cache)
>> >  {
>> > -     if (filter->with_commit_tag_algo)
>> > +     int result;
>> > +
>> > +     if (!list)
>> > +             return 1;
>> > +     if (filter->with_commit_tag_algo ||
>> > +         generation_numbers_enabled(the_repository))
>>
>> What's stopping us from dropping `filter->with_commit_tag_algo`
>> completely and then doing?
>>
>>   if (generation_numbers_enabled(the_repository))
>>      return contains_algo(commit, list, cache) == CONTAINS_YES;
>>   return repo_is_descendant_of(the_repository, commit, list);
>
> Jeff raised this distinction during the v1 review:
>
> https://lore.kernel.org/r/20260608223430.GA340696@coredump.intra.peff.net/
>
> `with_commit_tag_algo` preserves the existing behavior of `git tag` when
> generation numbers are unavailable. `git tag --contains` has used the
> memoized walk since ffc4b8012d (tag: speed up --contains calculation,
> 2011-06-11). Dropping the flag would send it back through repeated
> `repo_is_descendant_of()` walks in repositories without usable generation
> numbers.
>

I did read that, my question is on top of that. Do we also want to use
the non-memoized walk for 'git tag' when there are no generation numbers
available or does that not work? If not, we should mention that too in
the commit message.

> The condition in v2 implements the rule discussed there: retain the
> existing memoized path for `git tag`, and use it for other ref-filter
> callers when generation numbers make the depth-first walk reliably
> advantageous.
>
> This is probably my fault for breaking the threading between this and
> v1. Sorry about that.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply

* Re: [PATCH v2 3/3] doc: git-config: escape erroneous highlight markup
From: Tuomas Ahola @ 2026-06-11  8:02 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Kristoffer Haugsbakk, Junio C Hamano, Jean-Noël Avila
In-Reply-To: <20260611061156.GC2187173@coredump.intra.peff.net>

Jeff King <peff@peff.net> wrote:

> On Thu, Jun 11, 2026 at 01:55:13AM +0300, Tuomas Ahola wrote:
> 
> > Paired octothorpes are used in AsciiDoc to mark highlighted text,
> > <mark> being the equivalent HTML tag.  To use the symbol as a literal
> > character, it can be escaped with a backslash.
> > 
> > Do so in git-config.adoc.
> 
> I think this works OK, but in general I think most uses of backslash for
> metacharacters should consider using literal backticks. That shields it
> from the special meaning for asciidoc, but also will render it
> differently for the user (usually with a typewriter font, which becomes
> bold in roff output).
> 
> Though curiously the case of `#` in git-fast-import seems not to get
> marked as <code> in the html output (even though the nearby `LF` does).
> I wonder if there is some special treatment of `#` or something.
> 
> >  If _<message>_ begins with one or more whitespaces followed
> > -by "#", it is used as-is.  If it begins with "#", a space is
> > +by "\#", it is used as-is.  If it begins with "\#", a space is
> >  prepended before it is used.  Otherwise, a string " # " (a
> >  space followed by a hash followed by a space) is prepended
> 
> I saw the comment on round 1 about this second "#" on the line. But
> while we are here, should we be doing the one in the context, too?
> 
> -Peff

It seems adding that second backslash was already too much, as doc-diff (which
I neglected to run before submitting V2) shows:

```
$ ./doc-diff V1 V2
 
            If <message> begins with one or more whitespaces followed by "#",
-           it is used as-is. If it begins with "#", a space is prepended
+           it is used as-is. If it begins with "\#", a space is prepended
            before it is used. Otherwise, a string " # " (a space followed by a
            hash followed by a space) is prepended to it. The resulting string
            is placed immediately after the value defined for the variable. The
```

--Tuomas

^ permalink raw reply

* Re: [PATCH v2 1/2] commit-reach: handle cycles in contains walk
From: Jeff King @ 2026-06-11  7:29 UTC (permalink / raw)
  To: Tamir Duberstein
  Cc: git, Karthik Nayak, Junio C Hamano, Victoria Dye, Derrick Stolee,
	Elijah Newren
In-Reply-To: <20260608-ref-filter-memoized-contains-v2-1-e72720344a7c@gmail.com>

On Mon, Jun 08, 2026 at 07:36:34PM -0700, Tamir Duberstein wrote:

> @@ -744,7 +745,7 @@ static void push_to_contains_stack(struct commit *candidate, struct contains_sta
>  }
>  
>  static enum contains_result contains_tag_algo(struct commit *candidate,
> -					      const struct commit_list *want,
> +					      struct commit_list *want,
>  					      struct contains_cache *cache)

OK, we must lose the const here because repo_is_descendant_of() does not
have it. We could add const to that function, though that cascades down
to a few other helpers (see below). I'm not sure if that is making the
world a better place, or if it is just const pedantry.

diff --git a/commit-reach.c b/commit-reach.c
index 5df471a313..8cede01f01 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -563,7 +563,7 @@ int repo_get_merge_bases(struct repository *r,
  */
 int repo_is_descendant_of(struct repository *r,
 			  struct commit *commit,
-			  struct commit_list *with_commit)
+			  const struct commit_list *with_commit)
 {
 	if (!with_commit)
 		return 1;
@@ -955,11 +955,12 @@ int can_all_from_reach_with_flag(struct object_array *from,
 	return result;
 }

-int can_all_from_reach(struct commit_list *from, struct commit_list *to,
+int can_all_from_reach(const struct commit_list *from,
+		       const struct commit_list *to,
 		       int cutoff_by_min_date)
 {
 	struct object_array from_objs = OBJECT_ARRAY_INIT;
-	struct commit_list *from_iter = from, *to_iter = to;
+	const struct commit_list *from_iter = from, *to_iter = to;
 	int result;
 	timestamp_t min_commit_date = cutoff_by_min_date ? from->item->date : 0;
 	timestamp_t min_generation = GENERATION_NUMBER_INFINITY;
diff --git a/commit-reach.h b/commit-reach.h
index 3f3a563d8a..76e82f827e 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -37,7 +37,7 @@ int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result)

 int repo_is_descendant_of(struct repository *r,
 			  struct commit *commit,
-			  struct commit_list *with_commit);
+			  const struct commit_list *with_commit);
 int repo_in_merge_bases(struct repository *r,
 			struct commit *commit,
 			struct commit *reference);
@@ -93,7 +93,8 @@ int can_all_from_reach_with_flag(struct object_array *from,
 				 unsigned int assign_flag,
 				 timestamp_t min_commit_date,
 				 timestamp_t min_generation);
-int can_all_from_reach(struct commit_list *from, struct commit_list *to,
+int can_all_from_reach(const struct commit_list *from,
+		       const struct commit_list *to,
 		       int commit_date_cutoff);

> +cycle:
> +	free(contains_stack.contains_stack);
> +	clear_contains_cache(cache);
> +	init_contains_cache(cache);
> +
> +	result = repo_is_descendant_of(the_repository, candidate, want);
> +	if (result < 0)
> +		exit(128);

We are feeding the whole initial "want" list, so we should get a correct
answer regardless of how far we got into the cycle, which would run into
problems (e.g., if the cycle existed only on some branch of the
history). But going back to the initial list will always be correct.
Good.

Two small points, though.

One, the call to init_contains_cache() is redundant here; the clear
function is documented as making things ready for use (it's a little
hard to grep for, due to macros, but the docs are in commit-slab.h).
It's probably not hurting anything.

Two, the call to exit(128) is unusual for our code base (I'd guess it
was cribbed off of the top-level exits in builtin/pull.c). We'd usually
die() instead. Even if repo_is_descendant_of() produced its own error
message, it may be useful to mention that we were falling back to it due
to a cycle.

But even better is if we can return the error up the stack. We do not
return errors from contains_tag_algo() currently, but it has only one
caller. And that caller may also directly return the result of
repo_is_descendant_of(). So could we just pass that along?

Perhaps not. Looking at the callers of commit_contains(), they treat the
result as a pure boolean. So probably calling die() is reasonable, and
we already do so via parse_commit_or_die() elsewhere in the algorithm.
That does leave a potential lurking bug for the non-tag-algo code path.

> +	*contains_cache_at(cache, candidate) =
> +		result ? CONTAINS_YES : CONTAINS_NO;
> +	return result ? CONTAINS_YES : CONTAINS_NO;

So we actually cache our discovered value. Cute, and it might save us
from hitting the cycle again, though not always. E.g., two candidates A
and B share a parent P, and the cycle starts at P but does not include A
or B. We discover the cycle and cache the value for A, but discover it
again for B.

We do lose all of the existing non-cycle cached values when we call
clear_contains_cache(). But we have to at least clear out all of the
IN_PROGRESS commits. It is hard to care too much about optimizing the
outcome for this case which we expect to happen approximately never.
So I think doing the simplest correct thing is OK.

> +test_expect_success 'tag --contains handles cyclic replacement histories' '
> +	first=$(git rev-parse HEAD~2) &&
> +	second=$(git rev-parse HEAD~) &&
> +	third=$(git rev-parse HEAD) &&
> +	test_when_finished "
> +		git replace -d $first
> +		git replace -d $third
> +		git tag -d cycle-a cycle-b
> +	" &&

We usually &&-chain the commands inside test_when_finished. If they
fail, the test harness will note this and complain (if the test was not
otherwise failing). It's usually not a big deal either way, though
sometimes it can catch silly mistakes (e.g., if you wrote $second
instead of $third and the "replace -d" is quietly doing nothing at all).

I'm a little surprised that the chainlint checker doesn't catch this,
but I guess it doesn't know to recurse into the snippet handed to
test_when_finished. It probably is not really worth the trouble to teach
it to do so.

Otherwise the test looks good to me.

-Peff

^ permalink raw reply related

* Re: trailers: --only-trailers normalizes URLs to trailers
From: Kristoffer Haugsbakk @ 2026-06-11  7:03 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20260611065616.GE2191159@coredump.intra.peff.net>

On Thu, Jun 11, 2026, at 08:56, Jeff King wrote:
>>[snip]
>> 
>> And again I don’t think that is likely to ever happen (with a knock
>> on wood).
>
> I didn't spend much effort on the patch I showed beyond running it once.
> It would probably need tests and a doc update. I wasn't planning to run
> with it, but if you feel like doing so, please feel free to use it as
> you like.

Yeah, I want to add some tests on top
and make a sumbission (but sent a
message to first confirm that you weren't
cooking anything more ;) ). Thanks.

-- 
Sent from mobile

^ permalink raw reply

* Re: [PATCH v3] index-pack: retain child bases in delta cache
From: Jeff King @ 2026-06-11  6:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Arijit Banerjee, Arijit Banerjee via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Arijit Banerjee
In-Reply-To: <xmqqldcmxxco.fsf@gitster.g>

On Wed, Jun 10, 2026 at 07:51:19AM -0700, Junio C Hamano wrote:

> Arijit Banerjee <arijit91@gmail.com> writes:
> 
> > Apologies, my earlier replies were sent through GitHub's notification
> > emails and appeared only as PR comments, so they did not reach the mailing
> > list.
> >
> > On Thu, Jun 4, 2026, Jeff King wrote:
> >> So I am happy with either v2 or v3.
> >
> > I also did not see a meaningful performance difference between v2 and v3.
> > I am happy with either direction and defer to the maintainers on whether
> > v3's more precise release is worth the added complexity.
> 
> I have no strong preference either way.

Nor me. I'd probably go with v2 simply because it is shorter and less
code. If there is an optimization whose effect we cannot measure, it is
probably not worth even the few lines to have it. It could always be
resurrected if somebody finds a case where it matters.

-Peff

^ permalink raw reply

* Re: trailers: --only-trailers normalizes URLs to trailers
From: Jeff King @ 2026-06-11  6:56 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git
In-Reply-To: <d8f7f827-27da-41fc-af8d-72d383b24fff@app.fastmail.com>

On Wed, Jun 10, 2026 at 04:21:29PM +0200, Kristoffer Haugsbakk wrote:

> > Yeah, though if you'll allow me to nitpick your subject a moment: I
> > don't think --only-trailers is really the culprit here. It demonstrates
> > the problem because it normalizes the "trailer" it found. But the loose
> > trailer matching is the more fundamental issue. For example:
> >
> >[snip]
> 
> Yeah, this is more precise. I focused a ton on the normalized output
> because that’s what makes it obvious. But the fundamental problem is
> interpreting URLs like trailers.

That makes sense. As the author of --only-trailers I immediately
wondered if I had introduced a bug in it, so I was partially motivated
by exonerating myself. ;) I agree that using it is the simplest way to
demonstrate the problem.

> > Agreed, though I think a rule like: ":// (with no whitespace)" is not a
> > valid separator. Something like this:
> 
> Yes, matching on `://` strictly is a better proposal. No need to care
> about `http`, `https`, `file`, etc. And both of these would *still* have
> to be true for this change to be a false negative w.r.t. the user’s
> intentions:
> 
> • They really input a trailer that looks like a URL, but it’s not meant
>   to be a URL
> • They really wanted the value to start with `//`
> 
> And again I don’t think that is likely to ever happen (with a knock
> on wood).

I didn't spend much effort on the patch I showed beyond running it once.
It would probably need tests and a doc update. I wasn't planning to run
with it, but if you feel like doing so, please feel free to use it as
you like.

-Peff

^ permalink raw reply

* Re: [PATCH 0/9] refs: stop using `chdir_notify_reparent()`
From: Jeff King @ 2026-06-11  6:53 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak
In-Reply-To: <20260610-b4-pks-refs-avoid-chdir-notify-reparent-v1-0-56c864b01c43@pks.im>

On Wed, Jun 10, 2026 at 04:57:06PM +0200, Patrick Steinhardt wrote:

> this patch series is a follow-up of the discussion at [1]. It converts
> the reference backends to always use absolute paths internally, which
> then allows us to drop the calls to `chdir_notify_reparent()`.

We added chdir-notify to suport set_work_tree(). Commit 8500e0de3f
(set_work_tree: use chdir_notify, 2018-03-30) mentions an optimization
from 044bbbcb63 (Make git_dir a path relative to work_tree in
setup_work_tree(), 2008-06-19). That commit demonstrates some measurable
speedup from using relative versus absolute paths.

If we move to a world of all absolute paths where chdir-notify is not
necessary, will we lose that optimization?

I'm not sure how much it matters in practice these days, or if those
timings could be repeated. And they weren't all _that_ big to start
with. I guess it may depend on how deep your repo is within your
filesystem, too.

-Peff

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox