Git development
 help / color / mirror / Atom feed
* Re: [PATCH/RFC 6/6] Documentation/technical: add paint-down-to-common doc
From: Derrick Stolee @ 2026-06-22 18:21 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git
  Cc: Elijah Newren, Kristofer Karlsson
In-Reply-To: <9cbfc67d724d91b9abc3621f03a3c97208c76a70.1781951820.git.gitgitgadget@gmail.com>

On 6/20/2026 6:36 AM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify.com>
> 
> Add a technical document describing the paint_down_to_common()
> algorithm used for merge-base computation.

I like the idea of documenting this so it's easier to understand.

There is risk of drift from the actual implementation. You may want
to add a comment to the method in commit-reach.c to indicate that
any change should be reflected in this document.

> +Termination
> +-----------
> +
> +Termination happens when we can prove that no extra progress is
> +possible. We are done with the main loop when one of the following
> +conditions holds:
> +
> +  1. The queue is empty.
> +  2. The queue only contains STALE entries.
> +  3. Side-exhaustion: the walk has reached the finite region and one
> +     of the sides is fully exhausted.
It could be an interesting exercise, but potentially wasteful, to
add this document as a Patch 1, but reflecting the old algorithm
and then to update the document at the same time as you update the
code.

The changes in your patch 2 would impact this doc in terms of the
data being tracked by the paint_queue data structure instead of the
nonstale_queue structure (though those details are not currently
handled in the current version). The change to the termination
condition would come along with patch 3.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH/RFC 5/6] t6099, t6600: add side-exhaustion regression tests
From: Derrick Stolee @ 2026-06-22 18:16 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git
  Cc: Elijah Newren, Kristofer Karlsson
In-Reply-To: <faf5bc98ede79965e23bfe1535127d6f52221680.1781951820.git.gitgitgadget@gmail.com>

On 6/20/2026 6:36 AM, Kristofer Karlsson via GitGitGadget wrote:
> Add t6099 to test the case where multiple merge-base candidates exist
> and one is an ancestor of another. This exercises the side-exhaustion
> optimization in paint_down_to_common together with the
> remove_redundant safety net in get_merge_bases_many_0.

Same as the previous patch: I'd like to see these before the code
change. And if we trace a count of commits walked, we'd be able to
see the number change in this specific case.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH/RFC 4/6] t6600: add test cases for side-exhaustion edge cases
From: Derrick Stolee @ 2026-06-22 18:15 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git; +Cc: Elijah Newren, Kristofer Karlsson
In-Reply-To: <91372b975fbe102538c05c7d2cdae356539d1bbd.1781951820.git.gitgitgadget@gmail.com>

On 6/20/2026 6:36 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Add test cases to t6600-test-reach.sh that exercise edge cases in the
> side-exhaustion optimization for paint_down_to_common():
> 
>  - in_merge_bases_many:self: commit is both A and one of the X inputs
>  - get_merge_bases_many:duplicate-twos: duplicate entries in X list
>  - get_merge_bases_many:pending-stale: STALE transition on an
>    already-painted commit (ps-* diamond topology)
>  - get_merge_bases_many:infinity-both-sides: both tips outside the
>    commit-graph with non-monotonic dates (pi-* topology)

It's usually my preference to see these tests show up before the
new code arrives, that way we can see that they already work with
the old logic and continue to work with the new logic.

It's minor, but putting them after your code change may be adding
enforcement of a change of behavior.

One thing that could be helpful here is to consider tracing a
count of "commits walked" in the merge-base code, then you could
have these tests demonstrate the performance benefit by checking
for that number changing.

In t6600, that tracing number would not be the same across the
three different data shapes (full graph, half graph, no graph) and
that could be valuable to demonstrate in tests.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH/RFC 3/6] commit-reach: terminate merge-base walk when one paint side is exhausted
From: Derrick Stolee @ 2026-06-22 18:12 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git
  Cc: Elijah Newren, Kristofer Karlsson
In-Reply-To: <ed12a5cb5b76925cff08d2ab61efeda382b4477a.1781951820.git.gitgitgadget@gmail.com>

On 6/20/2026 6:36 AM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify.com>
> 
> Add an early termination check to paint_down_to_common() using the
> per-side counters introduced in the previous commit. Once the walk
> enters the finite-generation region, terminate early when one side's
> exclusive count drops to zero -- no new merge-base can form without
> both paint sides meeting.
> 
> The check also waits for pending_merge_bases to reach zero, ensuring
> all merge-base candidates have been popped and recorded before
> exiting.
> 
> The INFINITY gate ensures correctness: commits without a commit-graph
> entry have GENERATION_NUMBER_INFINITY and are ordered by commit date,
> which is not topologically reliable. The optimization only fires
> once the walk enters the finite-generation region where ordering
> guarantees hold.
> 
> On large repositories with commit-graph, this yields 100-1000x
> speedups for merge-base queries where one side (e.g. a PR branch) is
> much smaller than the other.
> 
> Helped-by: Derrick Stolee <stolee@gmail.com>
> Helped-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Kristofer Karlsson <krka@spotify.com>
> ---
>  commit-reach.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/commit-reach.c b/commit-reach.c
> index ba1e896f0f..fcd1ad0167 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -201,6 +201,19 @@ static int paint_down_to_common(struct repository *r,
>  		if (queue.p1_count + queue.p2_count +
>  		    queue.pending_merge_bases == 0)
>  			break;
> +
> +		/*
> +		 * Side exhaustion: a new merge-base can only form
> +		 * when both PARENT1-only and PARENT2-only commits
> +		 * remain in the queue.  In the finite-generation
> +		 * region the queue is ordered topologically, so
> +		 * no future step can add paint to visited commits
> +		 * and an exhausted side cannot reappear.
> +		 */
> +		if (generation < GENERATION_NUMBER_INFINITY &&
> +		    queue.pending_merge_bases == 0 &&
> +		    (queue.p1_count == 0 || queue.p2_count == 0))
> +			break;
I mentioned it earlier, but I think this check should be in the
dequeueing method instead of in the tail of the loop.

But I think this is the correct ending case.

I like that you broke this out into its own patch to demonstrate
that this is the key performance boost. It may be good to have
some performance test numbers that demonstrate that patch 2 does
not add any substantial overhead (timing should match previous
code) and in patch 3 this single condition gets us a huge benefit,
though it requires the data tracking of patch 2 to work.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters
From: Derrick Stolee @ 2026-06-22 18:10 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git
  Cc: Elijah Newren, Kristofer Karlsson
In-Reply-To: <316e4dfe261043730c77142639f86f5c3cabe370.1781951820.git.gitgitgadget@gmail.com>

On 6/20/2026 6:36 AM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify.com>

> +	if (!(old_paint & STALE)) {
> +		switch (old_paint & (PARENT1 | PARENT2)) {
> +		case 0:                  break;
> +		case PARENT1:            queue->p1_count--; break;
> +		case PARENT2:            queue->p2_count--; break;
> +		case PARENT1 | PARENT2:  queue->pending_merge_bases--; break;
> +		default:                 BUG("unexpected paint state");
> +		}
> +	}
> +	if (!(new_paint & STALE)) {
> +		switch (new_paint & (PARENT1 | PARENT2)) {
> +		case 0:                  break;
> +		case PARENT1:            queue->p1_count++; break;
> +		case PARENT2:            queue->p2_count++; break;
> +		case PARENT1 | PARENT2:  queue->pending_merge_bases++; break;
> +		default:                 BUG("unexpected paint state");
> +		}
> +	}

While correct and compact, I don't believe that these switch
statements follow the coding guidelines. We should split the
lines appropriately so they are more standard, such as:

if (!(new_paint & STALE)) {
	switch (new_paint & (PARENT1 | PARENT2)) {
	case 0:
		break;

	case PARENT1:
		queue->p1_count++;
		break;

	case PARENT2:
		queue->p2_count++;
		break;

	case PARENT1 | PARENT2:
		queue->pending_merge_bases++;
		break;

	default:
		BUG("unexpected paint state");
	}
}

Also: technically "case 0" should be a BUG() state, right? We
shouldn't be walking any commit that isn't reachable from at
least one side. (case 0 does happen for old_paint, though.)

>  }
>  
> -static void clear_nonstale_queue(struct nonstale_queue *queue)
> +static void paint_queue_put(struct paint_queue *queue,
> +			    struct commit *c, unsigned add_flags)
>  {
> -	clear_prio_queue(&queue->pq);
> -	queue->max_nonstale = NULL;
> -}
> +	unsigned old_flags = c->object.flags;
> +	c->object.flags |= add_flags;
  
Diffs like this are part of the reason I'd like to see a _new_
data structure instead of replacing the old one. Keeping the
old one for ahead_behind seems like a good idea to me, but even
if we don't land on that end state then deleting the old code
_after_ adding the new code will make the diff more readable.

> -	struct nonstale_queue queue = {
> -		{ compare_commits_by_gen_then_commit_date }
> +	struct paint_queue queue = {
> +		.pq = { compare_commits_by_gen_then_commit_date }
>  	};

I didn't notice when reading the struct definition, but looking at
'pq' here makes me think that we shouldn't be using that abbreviation
as it could stand for "prio_queue" or "paint_queue".

> +	while ((commit = paint_queue_get(&queue))) {
...> +
> +		if (queue.p1_count + queue.p2_count +
> +		    queue.pending_merge_bases == 0)
> +			break;
>  	}
When possible, I like to try to make loops only have one terminating
condition. Should we have paint_queue_get() return NULL when it sees
this internal state condition?

Also, I'd rather see it of the form of (!count) instead of using
addition to make it clear that we care about each value being zero.

Finally, I think we actually want this case to get the benefit:

	if ((!queue.p1_count || !queue.p2_count) &&
	    !queue.pending_merge_bases)
	    
I do see that you have this condition in patch 3 with the extra
detail that the max generation in the queue is finite. I think this
is more reason to include this in the data structure method and not
in the loop.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH v3 0/2] environment: move ignore_case into repo_config_values
From: Junio C Hamano @ 2026-06-22 18:01 UTC (permalink / raw)
  To: Tian Yuchen; +Cc: git, ps, phillip.wood123, johannes.schindelin, stolee
In-Reply-To: <b5a9115a-c909-405f-b150-f956d866b1eb@malon.dev>

Tian Yuchen <cat@malon.dev> writes:

> On 6/22/26 04:16, Junio C Hamano wrote:
>> As the compat/ layer is not meant as a general purpose POSIX
>> emulation wrapper that is generally reusable to projects other than
>> us, if we have a knob settable by end users to affect behaviours of
>> lower layer in compat/, it is natural to make repo-settings
>> available to them.
>
> I see.
>
>> What is the perceived problem you have in mind, and what are your
>> proposed alternatives?
>
> Actually, my reason for showing this question wasn’t because I thought 
> there were any architectural problem, but because I felt that for a file 
> in compat/win32, which is more on the _downstream_ side (is that 
> correct?), we need to exercise extra caution and confirm with its 
> maintainer whether the changes are appropriate. That’s why I CC'd 
> Johannes Schindelin on this.
>
> Was that the right thing to do?

Yup, Dscho is the right person to decide on the design issues on
Windows build.

^ permalink raw reply

* Re: [PATCH/RFC 1/6] commit-reach: decouple ahead_behind from nonstale_queue
From: Derrick Stolee @ 2026-06-22 18:00 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget, git
  Cc: Elijah Newren, Kristofer Karlsson
In-Reply-To: <5492acda0ad05eab67198880a5262e84a3f22ba6.1781951820.git.gitgitgadget@gmail.com>

On 6/20/2026 6:36 AM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify.com>
> 
> Move ahead_behind() off the shared nonstale_queue abstraction to use
> a plain prio_queue with a local max_nonstale pointer. The nonstale
> tracking is inlined into insert_no_dup().
> 
> This prepares for replacing nonstale_queue with a paint_queue struct
> that tracks per-side commit counts, which ahead_behind() does not
> need. No behavior change.

This change is only needed if we are intending to delete the nonstale
queue struct, which is currently happening in your patch 2. But we
are essentially recreating its logic in a more disjointed way here,
leaving this code in a worse state.

I'd rather see patch 2 create a _new_ data structure instead of
_replacing_ one that already works for multiple callers. (It does
drop to only one caller, but that seems cleaner to me right now.)

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH 3/3] connected: search promisor objects generically
From: Junio C Hamano @ 2026-06-22 17:57 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <20260622-pks-connected-generic-promisor-checks-v1-3-25eba2698202@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> When performing connectivity checks we have to figure out whether any of
> the new objects are promisor objects, as we cannot assume full
> connectivity if so.
>
> This check is performed by iterating through all packfiles in the
> repository and searching each of them for the given object. Of course,
> this mechanism is quite specific to implementation details of the object
> database, as we assume that it uses packfiles in the first place.
>
> Refactor the logic so that we instead use `odb_for_each_object_ext()`
> with an object prefix filter and the `ODB_FOR_EACH_OBJECT_PROMISOR_ONLY`
> flag. This will yield all objects that have the exact object name and
> that are part of a promisor pack in a generic way.
> ...


> -		 *
> -		 * Before checking for promisor packs, be sure we have the
> -		 * latest pack-files loaded into memory.
>  		 */
> -		odb_reprepare(the_repository->objects);

Hmph?

>  		do {
> -			struct packed_git *p;
> -
> -			repo_for_each_pack(the_repository, p) {
> -				if (!p->pack_promisor)
> -					continue;
> -				if (find_pack_entry_one(oid, p))
> -					goto promisor_pack_found;
> +			opts.prefix = oid;
> +
> +			err = odb_for_each_object_ext(the_repository->objects,
> +						      NULL, promised_object_cb,
> +						      NULL, &opts);
> +			if (err < 0)
> +				break;
> +			if (err > 0) {
> +				err = 0;
> +				continue;
>  			}

So we used to manually iterate and stop when we have a matching pack
entry, but now "stop when we find" is done by promisor_object_cb
callback that returns 1.

What is the reason why we no longer odb_(re)prepare() upfront before
going into the loop?  Would it make us miss a newly added promisor
packs?  We will fall back to rev-list for correctness, so it may not
matter, though.

> +
>  			/*
>  			 * Fallback to rev-list with oid and the rest of the
>  			 * object IDs provided by fn.
>  			 */
>  			goto no_promisor_pack_found;
> -promisor_pack_found:
> -			;
>  		} while ((oid = fn(cb_data)) != NULL);
> +
>  		if (opt->err_fd)
>  			close(opt->err_fd);
> -		return 0;
> +		return err;
>  	}
>  
>  no_promisor_pack_found:

^ permalink raw reply

* Re: [PATCH 1/3] odb/source-packed: extract logic to skip certain packs
From: Junio C Hamano @ 2026-06-22 17:51 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <20260622-pks-connected-generic-promisor-checks-v1-1-25eba2698202@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> The caller can pass flags that allow them to filter out specific kinds
> of objects when iterating objects via `odb_for_each_object()`. This only
> works for "normal" iteration though, as we `BUG()` when the user passes
> flags and specifies an object prefix.
>
> This limitation will be lifted in the next commit. Prepare for this by
> extracting the logic that skips certain kinds of packs so that we can
> easily reuse it.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  odb/source-packed.c | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)

Quite straight-forward creation of a simple helper function.

> diff --git a/odb/source-packed.c b/odb/source-packed.c
> index 42c28fba0e..3afc4bf01f 100644
> --- a/odb/source-packed.c
> +++ b/odb/source-packed.c
> @@ -126,6 +126,22 @@ static int match_hash(unsigned len, const unsigned char *a, const unsigned char
>  	return 1;
>  }
>  
> +static bool should_exclude_pack(struct packed_git *p, enum odb_for_each_object_flags flags)
> +{
> +	if ((flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
> +		return true;
> +	if ((flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) &&
> +	    !p->pack_promisor)
> +		return true;
> +	if ((flags & ODB_FOR_EACH_OBJECT_SKIP_IN_CORE_KEPT_PACKS) &&
> +	    p->pack_keep_in_core)
> +		return true;
> +	if ((flags & ODB_FOR_EACH_OBJECT_SKIP_ON_DISK_KEPT_PACKS) &&
> +	    p->pack_keep)
> +		return true;
> +	return false;
> +}
> +
>  static int for_each_prefixed_object_in_midx(
>  	struct odb_source_packed *store,
>  	struct multi_pack_index *m,
> @@ -306,17 +322,9 @@ static int odb_source_packed_for_each_object(struct odb_source *source,
>  	for (e = packfile_store_get_packs(packed); e; e = e->next) {
>  		struct packed_git *p = e->pack;
>  
> -		if ((opts->flags & ODB_FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
> -			continue;
> -		if ((opts->flags & ODB_FOR_EACH_OBJECT_PROMISOR_ONLY) &&
> -		    !p->pack_promisor)
> -			continue;
> -		if ((opts->flags & ODB_FOR_EACH_OBJECT_SKIP_IN_CORE_KEPT_PACKS) &&
> -		    p->pack_keep_in_core)
> -			continue;
> -		if ((opts->flags & ODB_FOR_EACH_OBJECT_SKIP_ON_DISK_KEPT_PACKS) &&
> -		    p->pack_keep)
> +		if (should_exclude_pack(p, opts->flags))
>  			continue;
> +
>  		if (open_pack_index(p)) {
>  			pack_errors = 1;
>  			continue;

^ permalink raw reply

* Re: [GSoC Patch v7 1/3] path: extract append_formatted_path() and use in rev-parse
From: K Jayatheerth @ 2026-06-22 17:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: a3205153416, git, jltobler, kumarayushjha123, lucasseikioshiro,
	phillip.wood, sandals
In-Reply-To: <xmqq1pdy36me.fsf@gitster.g>

Hey Junio,

On Mon, Jun 22, 2026 at 2:32 AM Junio C Hamano <gitster@pobox.com> wrote:

> So, for the existing user of this logic, the preimage ...
>
> > -static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> >  {

...

> > -     free(cwd);
> >  }
>
> ... now becomes this postimage.
>

Yes that's right!

> > +static void print_path(const char *path, const char *prefix,
> > +                    enum format_type format, enum default_type def)
> >  {
> > +     struct strbuf sb = STRBUF_INIT;
> > +     enum path_format fmt;
> > +
> > +     if (format == FORMAT_RELATIVE) {
> > +             fmt = PATH_FORMAT_RELATIVE;
> > +     } else if (format == FORMAT_CANONICAL) {
> > +             fmt = PATH_FORMAT_CANONICAL;
> > +     } else /* FORMAT_DEFAULT */ {
> > +             switch (def) {
> > +             case DEFAULT_RELATIVE:
> > +                     fmt = PATH_FORMAT_RELATIVE;
> > +                     break;
> > +             case DEFAULT_RELATIVE_IF_SHARED:
> > +                     fmt = PATH_FORMAT_RELATIVE_IF_SHARED;
> > +                     break;
> > +             case DEFAULT_CANONICAL:
> > +                     fmt = PATH_FORMAT_CANONICAL;
> > +                     break;
> > +             case DEFAULT_UNMODIFIED:
> > +             default:
> > +                     fmt = PATH_FORMAT_UNMODIFIED;
> > +                     break;
> >               }
> >       }
> > +
> > +     append_formatted_path(&sb, path, prefix, fmt);
> > +     puts(sb.buf);
> > +
> > +     strbuf_release(&sb);
> >  }
>
> Mostly, the code translates FORMAT_FOO constants into the new
> PATH_FORMAT_FOO constants, and lets append_formatted_path() do the
> heavy lifting.
>
> It is a minor point, but wouldn't it make it simpler to handle
> format_default first?  I.e.,
>
>         if (format == FORMAT_DEFAULT)
>                 switch (def) {
>                 case DEFAULT_RELATIVE:
>                         format = DEFAULT_RELATIVE;
>                         break;
>                 ...
>                 case DEFAULT_UNMODIFIED:
>                 default:
>                         format = DEFAULT_UNMODIFIED;
>                         break;
>         }
>         switch (format) {
>         case FORMAT_RELATIVE: fmt = PATH_FORMAT_RELATIVE; break;
>         case FORMAT_CANONICAL: fmt = PATH_FORMAT_CANONICAL; break;
>         ...
>         }
>
> Perhaps yes, perhaps not.  I dunno.
>

I see you have continued this point further
I am going to respond to this in detail there.

> > diff --git a/path.c b/path.c
> > index d7e17bf174..6d8e892ada 100644
> > --- a/path.c
> > +++ b/path.c
> > @@ -1579,6 +1579,75 @@ char *xdg_cache_home(const char *filename)
> >       return NULL;
> >  }
> >
> > +void append_formatted_path(struct strbuf *dest, const char *path,
> > +                        const char *prefix, enum path_format format)
> > +{
> > +     switch (format) {
> > +     case PATH_FORMAT_UNMODIFIED:
> > +             strbuf_addstr(dest, path);
> > +             break;
>
> In the orignal "print_path()", DEFAULT/UNMODIFIED did this "show
> unmodified".  OK.
>
> > +     case PATH_FORMAT_RELATIVE: {
> > +             struct strbuf relative_buf = STRBUF_INIT;
> > +             struct strbuf real_path = STRBUF_INIT;
> > +             struct strbuf real_prefix = STRBUF_INIT;
> > +             char *cwd = NULL;
> > +
> > +             /*
> > +              * We don't ever produce a relative path if prefix is NULL,
> > +              * so set the prefix to the current directory so that we can
> > +              * produce a relative path whenever possible.
> > +              */
> > +             if (!prefix)
> > +                     prefix = cwd = xgetcwd();
>
> This is what was done in the original "print_path()" upfront, with
> a similar comment to explay why this happens.  Looking good.  Also
> we no longer call xgetcwd() when we do not need to, which is goodd.
>
> > +             if (!is_absolute_path(path)) {
> > +                     strbuf_realpath_forgiving(&real_path, path, 1);
> > +                     path = real_path.buf;
> > +             }
> > +             if (!is_absolute_path(prefix)) {
> > +                     strbuf_realpath_forgiving(&real_prefix, prefix, 1);
> > +                     prefix = real_prefix.buf;
> > +             }
>
> There used to be a comment explaining why we make realpath calls,
> which is now lost.  Perhaps what the comment said was so obvious
> that we are better off without it?  I offhand do not know.
>

When the logic was a single block, the comment felt necessary to
explain the flow.
By splitting it into explicit switch cases, the logic became a bit
more self-evident, so I removed it to reduce clutter.
I kept the other comments where the reasoning is less obvious.


> What is done to make the paths real is the same as before, which is
> good.
>
> > +             strbuf_addstr(dest, relative_path(path, prefix, &relative_buf));
> > +
> > +             strbuf_release(&relative_buf);
> > +             strbuf_release(&real_path);
> > +             strbuf_release(&real_prefix);
> > +             free(cwd);
> > +             break;
> > +     }
>
> OK.
>
> > +     case PATH_FORMAT_RELATIVE_IF_SHARED: {
> > +             struct strbuf relative_buf = STRBUF_INIT;
> > +
> > +             /*
> > +              * If we're using RELATIVE_IF_SHARED mode, then we want an
> > +              * absolute path unless the two share a common prefix, so don't
> > +              * default the prefix to the current working directory. Doing so
> > +              * would cause a relative path to always be produced if possible.
> > +              */

I thought this comment made sense keeping in for instance.

> Identical to the original, which is good.
> > +
> > +     case PATH_FORMAT_CANONICAL: {
> > +             struct strbuf canonical_buf = STRBUF_INIT;
> > +
> > +             strbuf_realpath_forgiving(&canonical_buf, path, 1);
> > +             strbuf_addbuf(dest, &canonical_buf);
> > +
> > +             strbuf_release(&canonical_buf);
> > +             break;
> > +     }
> > +
> > +     default:
> > +             BUG("unknown path_format value %d", format);
> > +     }
> > +}
>
> OK.
>
> > +/**
> > + * Format a path according to the specified formatting strategy and append
> > + * the result to the given strbuf.
> > + *
> > + * `dest`   : The string buffer to append the formatted path to.
> > + * `path`   : The path string that needs to be formatted.
> > + * `prefix` : The directory prefix to calculate relative offsets against.
> > + * Pass NULL to default to the current working directory where applicable.
> > + * `format` : The formatting behavior rule to execute.
> > + */
> > +void append_formatted_path(struct strbuf *dest, const char *path,
> > +                        const char *prefix, enum path_format format);
> > +
>
> It is slightly unsatisfying that this function is defined to
> "append" to any existing value in the dest strbuf, rather than
> storing the result in the dest strbuf.  The original caller
> print_path() passes an empty strbuf to this helper, so it can let
> strbuf_realpath_*() functions to strbuf_reset() it (e.g.,
> abspath.c:get_root_part() called by strbuf_realpath_1(), wihch in
> turn is called by strbuf_realpath() and strbuf_realpath_forgiving())
> it freely, which means that use of temporary strbuf like
> canonical_buf only to copy it out to dest is wasteful and unneeded.
> But other callers we will have for this helper later may want to
> append to what they already have, so perhaps it is OK (on the other
> hand, we could say that preserving and appending is what these
> callers can do themselves).
>

Hmm, I thought about this for a while.

Then I looked at what ls-tree.c does(using an accumulator).
They already routinely use temporary `strbuf`s to calculate
relative/absolute paths before
appending them to their main output string.

Because callers who need to accumulate can easily do the preserving
and appending
themselves with a temporary buffer, there is no reason to force that
overhead into our helper.

I will change the semantics from "append" to "replace", rename the
helper back to `format_path()`.
I hope I am looking at ls-tree.c correctly here : )

Eliminate the wasteful `canonical_buf` allocations so we can pass the
destination buffer directly to functions like
`strbuf_realpath_forgiving()`.
This is a good suggestion actually, thanks!

> Otherwise, looking good as a no-op bug-to-bug compatible rewrite,
> with a slight optimization (to skip xgetcwd()).
>
> Thanks.

On Mon, Jun 22, 2026 at 9:33 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:

> > ...
> > It is a minor point, but wouldn't it make it simpler to handle
> > format_default first?  I.e.,
> >
> >       if (format == FORMAT_DEFAULT)
> >               switch (def) {
> >               case DEFAULT_RELATIVE:
> >                       format = DEFAULT_RELATIVE;
> >                       break;
> >               ...
> >               case DEFAULT_UNMODIFIED:
> >               default:
> >                       format = DEFAULT_UNMODIFIED;
> >                       break;
> >       }
> >       switch (format) {
> >         case FORMAT_RELATIVE: fmt = PATH_FORMAT_RELATIVE; break;
> >       case FORMAT_CANONICAL: fmt = PATH_FORMAT_CANONICAL; break;
> >       ...
> >       }
> >
> > Perhaps yes, perhaps not.  I dunno.
>
> I do not consider the above an blocker, but it might make a
> difference if we are going to acquire more modes and formats, so
> once somebody tries to rewrite the logic and finds the resulting
> code harder to follow (or not easier to follow), I would be happy to
> see the above discarded ;-)
>

True, if new formats are introduced
this would instantly become sloppy.

I will change it to future proof since I am
looking to send v8 for append_formatted_path().

Although I would be surprised to see an example for a new format.

> >> +/**
> >> + * Format a path according to the specified formatting strategy and append
> >> + * the result to the given strbuf.
> >> + *
> >> + * `dest`   : The string buffer to append the formatted path to.
> >> + * `path`   : The path string that needs to be formatted.
> >> + * `prefix` : The directory prefix to calculate relative offsets against.
> >> + * Pass NULL to default to the current working directory where applicable.
> >> + * `format` : The formatting behavior rule to execute.
> >> + */
> >> +void append_formatted_path(struct strbuf *dest, const char *path,
> >> +                       const char *prefix, enum path_format format);
> >> +
> >
> > It is slightly unsatisfying that this function is defined to
> > "append" to any existing value in the dest strbuf, rather than
> > storing the result in the dest strbuf.  The original caller
> > print_path() passes an empty strbuf to this helper, so it can let
> > strbuf_realpath_*() functions to strbuf_reset() it (e.g.,
> > abspath.c:get_root_part() called by strbuf_realpath_1(), wihch in
> > turn is called by strbuf_realpath() and strbuf_realpath_forgiving())
> > it freely, which means that use of temporary strbuf like
> > canonical_buf only to copy it out to dest is wasteful and unneeded.
> > But other callers we will have for this helper later may want to
> > append to what they already have, so perhaps it is OK (on the other
> > hand, we could say that preserving and appending is what these
> > callers can do themselves).
>
> This one we may want to consider a bit more seriously, but it is
> entirely up to the future callers of the helper.  If it would make
> the callers much easier to write for this helper to have "append"
> semantics, I'd be happy to accept the semantics of the above as-is,
> but otherwise, I suspect it would be simpler to use if the helper is
> defined to replase dest with the result, instead of appending the
> result to dest.
>

I am still unsure if I am following ls-tree.c correctly.
If I am then I think it is a very good change to have for v8 as I
specified above.

> > Otherwise, looking good as a no-op bug-to-bug compatible rewrite,
> > with a slight optimization (to skip xgetcwd()).
>
> This part of the review does not change in any case.  The
> refactoring looks good.

Thank you ; )

Regards,
- K Jayatheerth

^ permalink raw reply

* Re: [PATCH v3 2/2] status: improve rebase todo list parsing
From: Junio C Hamano @ 2026-06-22 17:26 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git, Elijah Newren, Patrick Steinhardt
In-Reply-To: <b3514e9b1c9515bf1a7f7983b9f120d63edba97f.1782117361.git.phillip.wood@dunelm.org.uk>

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> When there is rebase in progress "git status" displays the last couple
> of completed and the next couple of pending commands from the todo
> list. When it does this it tries to abbreviate the object ids of
> the commits to be picked. Unfortunately it does not abbreviate the
> object ids when the line starts with "fixup -C" or "merge -C". It
> also mistakenly replaces the refname in "reset main" and "update-ref
> refs/heads/main" with the object id that the ref points to.
>
> Fix this by using the function added in the last commit to parse the
> command name and only try to abbreviate the argument for commands that
> take an object id. If a command accepts a label then try to resolve the
> object name as a label first and only if that fails try to resolve it
> as an object_id. When trying to abbreviate an object id, only replace
> the object name if it starts with the abbreviated object id so that
> tag or branch names that contain only hex digits are left unchanged.

;-)  

Strictly speaking, the original that said "if begins with
exed, x, label, or l, then don't bother" style can be extended
without using the function added in the last commit to do this, but
it certainly is much more pleasant to read the resulting code
presented here that uses parsed command enum and switches on it.

> Comments are now processed after stripping any leading
> whitespace from the line. This matches what the sequencer does in
> parse_insn_line(). The existing test cases are updated to test a
> wider variety of commands. Only the pending commands in the tests
> are changed to avoid removing existing coverage.
>
> Helped-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> diff --git a/wt-status.c b/wt-status.c
> index 479ccc3304b..4b15bda76f4 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1363,6 +1363,71 @@ static int split_commit_in_progress(struct wt_status *s)
>  	free(rebase_orig_head);
>  
>  	return split_in_progress;
> +}
> +
> +/*
> + * If the whitespace-delimited token starting at or just after *pp
> + * is a hex object id that is longer than its default abbreviation,
> + * abbreviate it in-place, shrinking `line` accordingly. On return
> + * *pp points one past the (possibly abbreviated) token. Leaves both
> + * `line` and *pp-advanced-past-the-token unchanged in all other cases
> + * (non-hex token, label name, unresolvable, or a refname that happens
> + * to consist only of hex digits).
> + */
> +static void abbrev_oid_in_line(struct repository *r, struct strbuf *scratch,
> +			       struct strbuf *line, bool maybe_label, char **pp)
> +{
> +	char *p = *pp;
> +	char *end_of_object_name, saved;
> +	const char *abbrev;
> +	struct object_id oid;
> +	bool have_oid;
> +
> +	p += strspn(p, " \t");
> +	end_of_object_name = p + strcspn(p, " \t");
> +	/*
> +	 * For "merge" and "reset" the object name may be a label or
> +	 * ref rather than a hex object id. Only abbreviate the object
> +	 * name if it is a hex object id.
> +	 */
> +	for (const char *q = p; q < end_of_object_name; q++) {
> +		if (!isxdigit(*q))
> +			goto out;
> +	}

OK.  If the string has non hexdigit, it cannot be a raw object name
so there is no point in rewriting.  OK.

> +	if (maybe_label) {
> +		strbuf_reset(scratch);
> +		strbuf_addf(scratch, "refs/rewritten/%.*s",
> +			    (int)(end_of_object_name - p), p);
> +		if (refs_ref_exists(get_main_ref_store(r), scratch->buf))
> +			goto out; /* object name was a label */
> +	}

If it could be a label, then we check if such a label exists, and if
so, we won't modify it.  OK.

> +	saved = *end_of_object_name;
> +	*end_of_object_name = '\0';
> +	have_oid = !repo_get_oid(r, p, &oid);
> +	*end_of_object_name = saved;
> +	if (!have_oid)
> +		goto out; /* invalid object name */

We obviously cannot abbreviate if we cannot even recognize it as an
object name.  OK.

> +	abbrev = repo_find_unique_abbrev(r, &oid, DEFAULT_ABBREV);
> +	if (!starts_with(p, abbrev))
> +		goto out; /* object name was a refname containing only xdigits */

OK, nice to see sufficient paranoia ;-)

> +	p += strlen(abbrev);
> +	strbuf_remove(line, p - line->buf, end_of_object_name - p);
> +	end_of_object_name = p;

By abbreviating, line->buf only shrinks so we won't risk getting
confused by a realloc() happening under the hood.  Upon entry to
this helper, *pp must be pointing into line->buf, or everything will
go awry but for a file-scope static helper function like this, it
probably is too obvious to anybody that it does not have to be
spelled out.  OK.

> +out:
> +	*pp = end_of_object_name;
> +}


> +/* Skip "[ \t]*(-[cC])?", returns true if "-c/-C" was skipped. */
> +static bool skip_dash_c(char **pp)
> +{
> +	bool ret;
> +	char *p = *pp;
> +
> +	p += strspn(p, " \t");
> +	ret = skip_prefix(p, "-C", &p) || skip_prefix(p, "-c", &p);
> +	*pp = p;
> +
> +	return ret;
>  }

OK.

> @@ -1371,29 +1436,68 @@ static int split_commit_in_progress(struct wt_status *s)
>   * into
>   * "pick d6a2f03 some message"
>   *
> - * The function assumes that the line does not contain useless spaces
> - * before or after the command.
> + * Returns false on comment lines, true otherwise
>   */
> -static void abbrev_oid_in_line(struct repository *r, struct strbuf *line)
> +static bool format_todo_line(struct repository *r, struct strbuf *line)
>  {
> -	struct string_list split = STRING_LIST_INIT_DUP;
> -	struct object_id oid;
> -
> -	if (starts_with(line->buf, "exec ") ||
> -	    starts_with(line->buf, "x ") ||
> -	    starts_with(line->buf, "label ") ||
> -	    starts_with(line->buf, "l "))
> -		return;
> -
> -	if ((2 <= string_list_split(&split, line->buf, " ", 2)) &&
> -	    !repo_get_oid(r, split.items[1].string, &oid)) {
> -		strbuf_reset(line);
> -		strbuf_addf(line, "%s ", split.items[0].string);
> -		strbuf_add_unique_abbrev(line, &oid, DEFAULT_ABBREV);
> -		for (size_t i = 2; i < split.nr; i++)
> -			strbuf_addf(line, " %s", split.items[i].string);
> -	}
> -	string_list_clear(&split, 0);

We essentially said, "do not molest exec and label, but everything
else, as long as there are two (or more) tokens and the second token
looks like an object name, replace it with its abbreviation",
regardless of what the actual command was.  Now we do the right
thing by ...

> +	enum todo_command cmd;
> +	struct strbuf scratch = STRBUF_INIT;
> +	char *p = line->buf;
> +
> +	if (!sequencer_parse_todo_command((const char**)&p, &cmd))
> +		return true; /* keep invalid lines */

... parsing out what the line is about, and ...

> +	switch (cmd) {

... switching on it, to make it clear that we cover all the cases
known to us (and the code will be maintained like so, by not having
a "default" arm).

> +	case TODO_COMMENT:
> +		return false;
> +
> +	case TODO_MERGE: {
> +		/*
> +		 * The argument to -C cannot be a label, but the parents
> +		 * can be labels.
> +		 */
> +		bool maybe_label = !skip_dash_c(&p);
> +
> +		while (true) {
> +			p += strspn(p, " \t");
> +			if (!p[0] || (p[0] == '#' && (!p[1] || isspace(p[1]))))
> +				break;
> +			abbrev_oid_in_line(r, &scratch, line, maybe_label, &p);
> +			maybe_label = true;
> +		}
> +		break;
> +	}
> +
> +	case TODO_FIXUP:
> +		skip_dash_c(&p);
> +		/* fallthrough */

Fixup always refers to raw object ID and never a label, so it
would be OK to just skip -c/-C here ...

> +	case TODO_DROP:
> +	case TODO_EDIT:
> +	case TODO_PICK:
> +	case TODO_REVERT:
> +	case TODO_REWORD:
> +	case TODO_SQUASH:

... and pass "false" for "maybe_label".  OK.

> +		abbrev_oid_in_line(r, &scratch, line, false, &p);
> +		break;
> +
> +	case TODO_RESET:
> +		abbrev_oid_in_line(r, &scratch, line, true, &p);
> +		break;
> +	/*
> +	 * Avoid "default" and instead list all the other commands so
> +	 * that -Wswitch (which is included in -Wall) warns if a new
> +	 * command is added without handling it in this function.
> +	 */
> +	case TODO_BREAK:
> +	case TODO_EXEC:
> +	case TODO_LABEL:
> +	case TODO_NOOP:
> +	case TODO_UPDATE_REF:
> +		break;
> +	}
> +
> +	strbuf_release(&scratch);
> +	return true;
>  }
>  
>  static int read_rebase_todolist(struct repository *r, const char *fname, struct string_list *lines)
> @@ -1411,13 +1515,9 @@ static int read_rebase_todolist(struct repository *r, const char *fname, struct
>  			  repo_git_path_replace(r, &buf, "%s", fname));
>  	}
>  	while (!strbuf_getline_lf(&buf, f)) {
> -		if (starts_with(buf.buf, comment_line_str))
> -			continue;
>  		strbuf_trim(&buf);
> -		if (!buf.len)
> -			continue;
> -		abbrev_oid_in_line(r, &buf);
> -		string_list_append(lines, buf.buf);
> +		if (format_todo_line(r, &buf))
> +			string_list_append(lines, buf.buf);
>  	}
>  	fclose(f);

This loop got much nicer than the original.

Looks good.

Thanks.

^ permalink raw reply

* Re: [PATCH v3 1/2] sequencer: factor out parsing of todo commands
From: Junio C Hamano @ 2026-06-22 17:00 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git, Elijah Newren, Patrick Steinhardt
In-Reply-To: <d27dddff93144f7b6d7fc89719bdf53b6856c9fc.1782117361.git.phillip.wood@dunelm.org.uk>

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> Move the code that parses todo commands into a separate function so
> that it can be shared with "git status" in the next commit. As we
> know the input is NUL terminated we do not pass a pointer to the end
> of the line and instead test for a blank line by looking for NUL, CR
> LF, or LF. We use starts_with() instead of starts_with_mem() for the
> same reason. This results in slightly different behavior when there
> a CR at the start of the line that is not followed by LF. Previously
> such a line was treated as a comment rather than an invalid line.

Meaning that the input validation is tighter than before?  I think
it is fine in this case, as I do not see a reason why anybody wants
to use a lone CR as comment introducer.

> +bool sequencer_parse_todo_command(const char **p, enum todo_command *cmd)
> +{
> +	const char *s = *p;
> +
> +	for (int i = 0; i < TODO_COMMENT; i++)
> +		if (is_command(i, p)) {
> +			*cmd = i;
> +			return true;
> +		}
> +
> +	if (starts_with(s, comment_line_str)) {
> +		*cmd = TODO_COMMENT;
> +		return true;
> +	} else if (s[0] == '\n' || (s[0] == '\r' && s[1] == '\n') || !s[0]) {
> +		*cmd = TODO_COMMENT;
> +		return true;
> +	}
> +
> +	return false;
>  }

I notice that the order of noticing concrete comments and comment
lines are swapped relative to the original.  There is no inherently
"natural" order between them, so the change is perfectly OK.  I just
got confused slightly while reading it until I realized that is what
you did.

>  static int check_label_or_ref_arg(enum todo_command command, const char *arg)
> @@ -2716,29 +2737,23 @@ static int parse_insn_line(struct repository *r, struct replay_opts *opts,
>  {
>  	struct object_id commit_oid;
>  	char *end_of_object_name;
> -	int i, saved, status, padding;
> +	int saved, status, padding;
>  
>  	item->flags = 0;
>  
>  	/* left-trim */
>  	bol += strspn(bol, " \t");
>  
> -	if (bol == eol || *bol == '\r' || starts_with_mem(bol, eol - bol, comment_line_str)) {
> -		item->command = TODO_COMMENT;
> -		item->commit = NULL;
> -		item->arg_offset = bol - buf;
> -		item->arg_len = eol - bol;
> -		return 0;
> -	}
> -
> -	for (i = 0; i < TODO_COMMENT; i++)
> -		if (is_command(i, &bol)) {
> -			item->command = i;
> -			break;
> -		}
> -	if (i >= TODO_COMMENT)
> +	if (!sequencer_parse_todo_command(&bol, &item->command))
>  		return error(_("invalid command '%.*s'"),
>  			     (int)strcspn(bol, " \t\r\n"), bol);
> +
> +	if (item->command == TODO_COMMENT) {
> +		item->commit = NULL;
> +		item->arg_offset = bol - buf;
> +		item->arg_len = eol - bol;
> +		return 0;
> +	}

And the extra stuff that are only relevant to a comment line is
naturally processed by the caller.  OK.

Thanks.  Looking good so far.

^ permalink raw reply

* Re: [PATCH v3 0/2] environment: move ignore_case into repo_config_values
From: Tian Yuchen @ 2026-06-22 16:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, ps, phillip.wood123, johannes.schindelin, stolee
In-Reply-To: <xmqqjyrr7ipf.fsf@gitster.g>

On 6/22/26 04:16, Junio C Hamano wrote:
> As the compat/ layer is not meant as a general purpose POSIX
> emulation wrapper that is generally reusable to projects other than
> us, if we have a knob settable by end users to affect behaviours of
> lower layer in compat/, it is natural to make repo-settings
> available to them.

I see.

> What is the perceived problem you have in mind, and what are your
> proposed alternatives?

Actually, my reason for showing this question wasn’t because I thought 
there were any architectural problem, but because I felt that for a file 
in compat/win32, which is more on the _downstream_ side (is that 
correct?), we need to exercise extra caution and confirm with its 
maintainer whether the changes are appropriate. That’s why I CC'd 
Johannes Schindelin on this.

Was that the right thing to do?

Regards, yuchen

^ permalink raw reply

* Re: [PATCH v3 0/4] pack-objects: support bitmaps and delta-islands with `--path-walk`
From: Junio C Hamano @ 2026-06-22 16:26 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Taylor Blau, git, Jeff King, Elijah Newren
In-Reply-To: <b6ed816c-030b-400a-9fb6-6671fd3cb0b0@gmail.com>

Derrick Stolee <stolee@gmail.com> writes:

> On 6/22/2026 3:35 AM, Junio C Hamano wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>
>>> Outside of the above, the series is functionally unchanged.
>>>
>>> Thanks in advance for another look.
>>>
>>> Taylor Blau (4):
>>>   t/perf: drop p5311's lookup-table permutation
>>>   pack-objects: support reachability bitmaps with `--path-walk`
>>>   pack-objects: extract `record_tree_depth()` helper
>>>   pack-objects: support `--delta-islands` with `--path-walk`
>> 
>> Very cleanly implemented.  I am not confident that I have followed
>> the detailed logic around delta islands in the last step but the
>> earlier three patches looked trivially good.
> I've been happy with the code, subject to the new data that is presented
> with this version confirming the expected performance benefits. I also
> lack confidence in the delta islands features, but based on my weak
> understanding it looks correct. I believe that Taylor has the right
> expertise here to make up for my lack of context.

Thanks.  Let me mark the topic for 'next' then.

^ permalink raw reply

* Re: [GSoC Patch v7 1/3] path: extract append_formatted_path() and use in rev-parse
From: Junio C Hamano @ 2026-06-22 16:03 UTC (permalink / raw)
  To: K Jayatheerth
  Cc: a3205153416, git, jltobler, kumarayushjha123, lucasseikioshiro,
	phillip.wood, sandals
In-Reply-To: <xmqqtsqv6204.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> K Jayatheerth <jayatheerthkulkarni2005@gmail.com> writes:
> ...
> It is a minor point, but wouldn't it make it simpler to handle
> format_default first?  I.e.,
>
> 	if (format == FORMAT_DEFAULT)
> 		switch (def) {
> 		case DEFAULT_RELATIVE:
> 			format = DEFAULT_RELATIVE;
> 			break;
> 		...
> 		case DEFAULT_UNMODIFIED:
> 		default:
> 			format = DEFAULT_UNMODIFIED; 
> 			break;
> 	}
> 	switch (format) {
>         case FORMAT_RELATIVE: fmt = PATH_FORMAT_RELATIVE; break;
> 	case FORMAT_CANONICAL: fmt = PATH_FORMAT_CANONICAL; break;
> 	...
> 	}
>
> Perhaps yes, perhaps not.  I dunno.

I do not consider the above an blocker, but it might make a
difference if we are going to acquire more modes and formats, so
once somebody tries to rewrite the logic and finds the resulting
code harder to follow (or not easier to follow), I would be happy to
see the above discarded ;-)

>> +/**
>> + * Format a path according to the specified formatting strategy and append
>> + * the result to the given strbuf.
>> + *
>> + * `dest`   : The string buffer to append the formatted path to.
>> + * `path`   : The path string that needs to be formatted.
>> + * `prefix` : The directory prefix to calculate relative offsets against.
>> + * Pass NULL to default to the current working directory where applicable.
>> + * `format` : The formatting behavior rule to execute.
>> + */
>> +void append_formatted_path(struct strbuf *dest, const char *path,
>> +			   const char *prefix, enum path_format format);
>> +
>
> It is slightly unsatisfying that this function is defined to
> "append" to any existing value in the dest strbuf, rather than
> storing the result in the dest strbuf.  The original caller
> print_path() passes an empty strbuf to this helper, so it can let
> strbuf_realpath_*() functions to strbuf_reset() it (e.g.,
> abspath.c:get_root_part() called by strbuf_realpath_1(), wihch in
> turn is called by strbuf_realpath() and strbuf_realpath_forgiving())
> it freely, which means that use of temporary strbuf like
> canonical_buf only to copy it out to dest is wasteful and unneeded.
> But other callers we will have for this helper later may want to
> append to what they already have, so perhaps it is OK (on the other
> hand, we could say that preserving and appending is what these
> callers can do themselves).

This one we may want to consider a bit more seriously, but it is
entirely up to the future callers of the helper.  If it would make
the callers much easier to write for this helper to have "append"
semantics, I'd be happy to accept the semantics of the above as-is,
but otherwise, I suspect it would be simpler to use if the helper is
defined to replase dest with the result, instead of appending the
result to dest.

> Otherwise, looking good as a no-op bug-to-bug compatible rewrite,
> with a slight optimization (to skip xgetcwd()).

This part of the review does not change in any case.  The
refactoring looks good.

Thanks.

^ permalink raw reply

* Re: [PATCH v3] config.mak.uname: avoid macOS dup-library warning
From: D. Ben Knoble @ 2026-06-22 15:49 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Harald Nordgren via GitGitGadget, git, Harald Nordgren,
	Patrick Steinhardt, pbonzini@redhat.com
In-Reply-To: <CALnO6CAgNdkg0PnN9Zy=zLurLUSb2hUXYAGe_qB0oceZNy_=gg@mail.gmail.com>

On Sat, Jun 20, 2026 at 4:58 PM D. Ben Knoble <ben.knoble@gmail.com> wrote:
>
> On Fri, Jun 19, 2026 at 6:27 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > "Harald Nordgren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > From: Harald Nordgren <haraldnordgren@gmail.com>
> > >
> > > Building on macOS with Xcode 15 or newer emits:
> > >
> > >     ld: warning: ignoring duplicate libraries: 'libgit.a',
> > >     'target/release/libgitcore.a'
> > >
> > > Some link recipes list the same archive twice, which is harmless.
> > > Quiet the warning instead.
> > >
> > > Pass -Wl,-no_warn_duplicate_libraries on Xcode 15 and newer, whose
> > > linkers added both the warning and the suppression flag (ld64-907
> > > and dyld-1009). Earlier linkers reject the flag, so gate on the
> > > linker version. Broaden the existing -fno-common version probe to
> > > also match the "ld64-NNN" and "dyld-NNN" forms Xcode 15 reports.
> > >
> > > Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
> > > ---
> >
> > Yeah, this looks like what I expected.
> >
> > A few things to note.
> >
> >  * Can folks with different versions of Xcode (or is 15 sufficiently
> >    old that practically nobody is expected to have anything older?)
> >    test this patch?
> >
> >  * We only patch Makefile here; can folks who use meson report how
> >    well your build goes?
> >
> > Thanks.
>
> On one (old) machine I have available:
>
>     $ pkgutil --pkg-info=com.apple.pkg.CLTools_Executables
>     [trimmed]
>     version: 14.2.0.0.1.1668646533
>
> On said machine, I don't get the duplicate warnings on a Meson build.
> No issues with the patch when running make.

That old machine has "ld -v":

    @(#)PROGRAM:ld  PROJECT:ld64-711
    BUILD 21:57:11 Nov 17 2021
    [trimmed]
    LTO support using: LLVM version 13.0.0, (clang-1300.0.29.30)
(static support for 27, runtime is 27)
    TAPI support using: Apple TAPI version 13.0.0 (tapi-1300.0.6.5)

> I think I have seen this on my other machine, which is much newer.
> When I get around to trying it there, I'll report back as well.

Here's those results:

    $ pkgutil --pkg-info=com.apple.pkg.CLTools_Executables
    [trimmed]
    version: 26.1.0.0.1.1761104275

sans patch:
- Meson: duplicate warning
- Make: duplicate warning

w/ patch:
- Meson: (unchanged, obviously)
- Make: no duplicate warning

Under Meson + Ninja the warning I get is (status line may not be
helpful given parallelism)

    [705/708] Linking target t/helper/test-tool
    ld: warning: ignoring duplicate libraries: '-lexpat', '-liconv',
'-lresolv', '-lz'

That's in addition to a dozen or so

    ld: warning: reducing alignment of section __DATA,__common from
0x8000 to 0x4000 because it exceeds segment maximum alignment

Under Make I get (surrounding info may not be helpful given parallelism)

        LINK git
    ld: warning: ignoring duplicate libraries: 'libgit.a',
'target/release/libgitcore.a'
        MKDIR -p t/unit-tests/bin
        LINK git-sh-i18n--envsubst
        LINK t/helper/test-tool
        LINK git-remote-http
    ld: warning: ignoring duplicate libraries: 'libgit.a',
'target/release/libgitcore.a'

(No alignment warnings this time.)

Linker version (ld -v) on this machine:

    @(#)PROGRAM:ld PROJECT:ld-1230.1
    BUILD 16:18:08 Oct 17 2025
    [trimmed]
    LTO support using: LLVM version 17.0.0 (static support for 29,
runtime is 29)
    TAPI support using: Apple TAPI version 17.0.0 (tapi-1700.3.8)

-- 
D. Ben Knoble

^ permalink raw reply

* Re: [PATCH v4 1/3] replay: refactor enum replay_mode into a bool
From: Junio C Hamano @ 2026-06-22 15:43 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Toon Claes, git, Elijah Newren
In-Reply-To: <ajk-YQxLWfspNWIm@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> On Mon, Jun 22, 2026 at 02:41:55PM +0200, Toon Claes wrote:
>> In 2760ee4983 (replay: add --revert mode to reverse commit changes,
>> 2026-03-26) the enum `replay_mode` was introduced. This has two possible
>> values:
>> 
>>  - The value `REPLAY_MODE_REVERT` is used when option `--revert` is
>>    passed to git-replay(1). When using this value the commits are
>>    processed in reverse order and the inverse of the changes are
>>    applied.
>> 
>>  - The value `REPLAY_MODE_PICK` is used when either option `--onto` or
>>    `--advance` is used. In both cases the commits are processed in
>>    normal order, and the changes are applied as-is.
>> 
>> Since there are only two possible values of this enum, simplify the code
>> by converting the enum into a bool. This avoids adding code paths that
>> check for invalid values of the enum, and shortens code where the value
>> is checked with a ternary operator.
>
> That's fair, and the result is easier to write. But is it really easier
> to read? And what if we ever have to create a third mode going forward?
>
> I'm generally no fan of booleans as parameters as they basically give
> you no information at all at the callsite, except if you're lucky and
> you already have an aptly-named variable available that you can pass.
> Which seems to be the case here, but I'm still not sure whether this
> change really improves the code.

I tend to agree with you on both counts.  The "what happens when
somebody else wants a third choice?" is a quesiton I would ask the
first thing as the maintainer of a project.

Even if the boolean parameter is so obviously named, the callsite
can only say "true" or "false", unlike some other popular languages
that lets you say

	my_function(use_revert_mode=true, verbose=false);

and you cannot tell what effect the author wanted out of that "true"
if all you can write were

	my_function(true, false);

Of course, we could go ultra verbose, like

	my_function(true, /* use_revert_mode */
		    false, /* verbose */);

but then we are often better off writing:

	my_function(REPLAY_MODE_REVERT, REPLAY_QUIET);

Thanks.

^ permalink raw reply

* Re: [PATCH v17 5/7] branch: add --delete-merged <branch>
From: Phillip Wood @ 2026-06-22 15:36 UTC (permalink / raw)
  To: Harald Nordgren via GitGitGadget, git
  Cc: Kristoffer Haugsbakk, Johannes Sixt, Harald Nordgren
In-Reply-To: <46da7c814056ddbc23accf19a61d0451b949127e.1782113388.git.gitgitgadget@gmail.com>

Hi Harald

On 22/06/2026 08:29, Harald Nordgren via GitGitGadget wrote:
> From: Harald Nordgren <haraldnordgren@gmail.com>
> 
> +static int collect_upstream(const struct reference *ref, void *cb_data)
> +{
> +	struct string_list *upstreams = cb_data;
> +	struct branch *branch = branch_get(ref->name);
> +	const char *upstream = branch_get_upstream(branch, NULL);
> +
> +	string_list_append(upstreams, ref->name)->util =
> +		xstrdup_or_null(upstream);
> +	return 0;
> +}
> +
> +/*
> + * Keep any branch that another, surviving branch tracks as its
> + * upstream, so we never delete a branch out from under one stacked on
> + * top of it.  Sparing a branch makes it a survivor whose own upstream
> + * then needs the same protection, so repeat until nothing changes.
> + */
> +static void spare_stacked_bases(struct ref_store *refs, struct strset *deletable)
> +{
> +	struct string_list upstreams = STRING_LIST_INIT_DUP;
> +	struct string_list_item *item;
> +	bool spared;
> +
> +	refs_for_each_branch_ref(refs, collect_upstream, &upstreams);
> +	do {
> +		spared = false;
> +		for_each_string_list_item(item, &upstreams) {
> +			const char *up = item->util, *up_short;
> +
> +			if (!up || strset_contains(deletable, item->string))
> +				continue;
> +			if (!skip_prefix(up, "refs/heads/", &up_short) ||
> +			    !strset_contains(deletable, up_short))
> +				continue;
> +
> +			strset_remove(deletable, up_short);
> +			spared = true;
> +		}
> +	} while (spared);
> +
> +	string_list_clear(&upstreams, 1);
> +}

This keeps the whole chain of branches, which is the safest thing to
do but potentially keeps unneeded branches around. It is only really
the upstream branches of unmerged branches which are useful so we
could just keep those, and if their upstream branch is going to be
deleted clear the upstream config for that branch.

For example, if we have branch feature-3 with upstream feature-2 which
has upstream feature-1, then if feature-1 and feature-2 are merged we'd
delete feature-1 but keep feature-2 and clear its upstream config. If we
also had feature-4 that was not merged and had upstream feature-1 we'd
keep feature-1 and leave the upstream config for feature-2 unchanged.

Here is a possible implementation for that. It compiles but I've not
written any new tests. It does pass the existing tests which means
either it is buggy or we don't have coverage for keeping a chain of
branches.

static int collect_upstreams(const struct reference *ref, void *cb_data)
{
	struct collect_upstream_data *data = cb_data;
	struct strset *deletable = data->deletable;
	struct strset *upstreams = data->upstreams;
	struct branch *branch;
	const char *upstream_ref, *upstream_name;

	/*
	 * We're only interested in the upstreams of branches that
	 * are not being deleted.
	 */
	if (strset_contains(deletable, ref->name))
		return 0;
	branch = branch_get(ref->name);
	if (!branch)
		return 0;
	upstream_ref = branch_get_upstream(branch, NULL);
	/*
	 * We're only interested in the upstream if it is going to
	 * be deleted.
	 */
	if (!upstream_ref ||
	    !skip_prefix(upstream_ref, "refs/heads/", &upstream_name) ||
	    !strset_contains(deletable, upstream_name))
		return 0;
	/*
	 * Do not delete this branch because it is the upstream of
	 * an unmerged branch. Also remember it so we can check if
	 * its upstream is marked for deletion once we've visited all
	 * branches
	 */
	strset_remove(deletable, upstream_name);
	strset_add(upstreams, upstream_name);

	return 0;
}

/*
  * Keep any branch that another, surviving branch tracks as its
  * upstream, so we never delete a branch out from under one stacked on
  * top of it.  If the upstream branch has an upstream set that is marked
  * for deletion clear its upstream config.
  */
static void spare_stacked_bases(struct ref_store *refs, struct strset *deletable)
{
	struct strset upstreams = STRSET_INIT;
	struct collect_upstream_data data = {
		.deletable = deletable,
		.upstreams = &upstreams,
	};
	struct strbuf buf = STRBUF_INIT;
	struct hashmap_iter iter;
	struct strmap_entry *entry;

	refs_for_each_branch_ref(refs, collect_upstreams, &data);
	strset_for_each_entry(&upstreams, &iter, entry) {
		const char *upstream_upstream;
		struct branch *upstream_branch;

		/* We know upstream_ref is a branch, skip "refs/heads/" */
		upstream_branch = branch_get(entry->key);
		upstream_upstream = branch_get_upstream(upstream_branch, NULL);
		if (upstream_upstream &&
		    strset_contains(deletable, upstream_upstream)) {
			/*
			 * This branch has been merged and is the upstream of
			 * an unmerged branch.  Its upstream is marked for
			 * deletion because it is not the upstream of any
			 * unmerged branch so clear its upstream config.
			 */
			strbuf_reset(&buf);
			strbuf_addf(&buf, "branch.%s.merge", upstream_branch->name);
			repo_config_set_gently(the_repository, buf.buf, NULL);
			strbuf_setlen(&buf, buf.len - 5);
			strbuf_addstr(&buf, "remote");
			repo_config_set_gently(the_repository, buf.buf, NULL);
		}
	}
	strbuf_release(&buf);
	strset_clear(&upstreams);

}


> +	for (i = 0; i < candidates.nr; i++) {
> +		const char *short_name;
> +
> +		if (skip_prefix(candidates.items[i]->refname, "refs/heads/",
> +				&short_name) &&
> +		    strset_contains(&deletable, short_name))
> +			strvec_push(&to_delete, short_name);
> +	}

It would be nicer to use strset_for_each_entry() here. First declare

	struct hashmap_iter iter;
	struct strmap_entry *entry;

at the top of the function and then replace the loop with

	strset_for_each_entry(&deletable, &iter, entry)
		strvec_push(&to_delete, entry->key);

Thanks

Phillip


> +	if (to_delete.nr)
> +		ret = delete_branches(to_delete.nr, to_delete.v,
> +				      FILTER_REFS_BRANCHES,
> +				      DELETE_BRANCH_SKIP_UNMERGED |
> +				      DELETE_BRANCH_NO_HEAD_FALLBACK |
> +				      flags);
> +
> +	strvec_clear(&to_delete);
> +	strset_clear(&deletable);
> +	ref_array_clear(&candidates);
> +	ref_filter_clear(&filter);
> +	return ret;
> +}
> +
>   static GIT_PATH_FUNC(edit_description, "EDIT_DESCRIPTION")
>   
>   static int edit_branch_description(const char *branch_name)
> @@ -746,6 +862,7 @@ int cmd_branch(int argc,
>   	/* possible actions */
>   	int delete = 0, rename = 0, copy = 0, list = 0,
>   	    unset_upstream = 0, show_current = 0, edit_description = 0;
> +	int delete_merged = 0;
>   	const char *new_upstream = NULL;
>   	int noncreate_actions = 0;
>   	/* possible options */
> @@ -799,6 +916,8 @@ int cmd_branch(int argc,
>   		OPT_BOOL(0, "create-reflog", &reflog, N_("create the branch's reflog")),
>   		OPT_BOOL(0, "edit-description", &edit_description,
>   			 N_("edit the description for the branch")),
> +		OPT_BOOL(0, "delete-merged", &delete_merged,
> +			N_("delete local branches whose upstream matches <branch> and are merged")),
>   		OPT__FORCE(&force, N_("force creation, move/rename, deletion"), PARSE_OPT_NOCOMPLETE),
>   		OPT_MERGED(&filter, N_("print only branches that are merged")),
>   		OPT_NO_MERGED(&filter, N_("print only branches that are not merged")),
> @@ -846,7 +965,8 @@ int cmd_branch(int argc,
>   			     0);
>   
>   	if (!delete && !rename && !copy && !edit_description && !new_upstream &&
> -	    !show_current && !unset_upstream && argc == 0)
> +	    !show_current && !unset_upstream && !delete_merged &&
> +	    argc == 0)
>   		list = 1;
>   
>   	if (filter.with_commit || filter.no_commit ||
> @@ -856,7 +976,7 @@ int cmd_branch(int argc,
>   
>   	noncreate_actions = !!delete + !!rename + !!copy + !!new_upstream +
>   			    !!show_current + !!list + !!edit_description +
> -			    !!unset_upstream;
> +			    !!unset_upstream + !!delete_merged;
>   	if (noncreate_actions > 1)
>   		usage_with_options(builtin_branch_usage, options);
>   
> @@ -898,6 +1018,10 @@ int cmd_branch(int argc,
>   				      (delete > 1 ? DELETE_BRANCH_FORCE : 0) |
>   				      (quiet ? DELETE_BRANCH_QUIET : 0));
>   		goto out;
> +	} else if (delete_merged) {
> +		ret = delete_merged_branches(argc, argv,
> +					     quiet ? DELETE_BRANCH_QUIET : 0);
> +		goto out;
>   	} else if (show_current) {
>   		print_current_branch_name();
>   		ret = 0;
> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> index 3104c555f6..1d372f95e8 100755
> --- a/t/t3200-branch.sh
> +++ b/t/t3200-branch.sh
> @@ -1839,4 +1839,155 @@ test_expect_success '--forked narrows a <pattern> argument' '
>   	test_cmp expect actual
>   '
>   
> +test_expect_success '--delete-merged: setup' '
> +	git init -b main upstream &&
> +	(
> +		cd upstream &&
> +		test_commit base &&
> +		git checkout -b next &&
> +		test_commit next-work &&
> +		git checkout main
> +	) &&
> +	git init -b main other &&
> +	test_commit -C other other-base &&
> +	git init -b main fork
> +'
> +
> +setup_repo_for_delete_merged () {
> +	rm -rf repo &&
> +	git clone upstream repo &&
> +	(
> +		cd repo &&
> +		git remote add fork ../fork &&
> +		git remote add other ../other &&
> +		git config remote.pushDefault fork &&
> +		git config push.default current &&
> +		git fetch other
> +	)
> +}
> +
> +merged_branch () {
> +	(
> +		cd repo &&
> +		git checkout -b "$1" "$2" &&
> +		git commit --allow-empty -m "$1 work" &&
> +		git push origin "$1:next" &&
> +		git fetch origin &&
> +		git branch --set-upstream-to="$2" "$1"
> +	)
> +}
> +
> +test_expect_success '--delete-merged deletes merged branches and spares the rest' '
> +	test_when_finished "rm -rf repo" &&
> +	setup_repo_for_delete_merged &&
> +	merged_branch merged origin/next &&
> +	(
> +		cd repo &&
> +		git checkout -b unmerged origin/next &&
> +		git commit --allow-empty -m "unmerged work" &&
> +		git branch --set-upstream-to=origin/next unmerged &&
> +		git checkout -b tracks-other other/main &&
> +		git branch --set-upstream-to=other/main tracks-other &&
> +		git checkout --detach
> +	) &&
> +	sha=$(git -C repo rev-parse --short merged) &&
> +
> +	git -C repo branch --delete-merged origin/next >actual 2>&1 &&
> +
> +	echo "Deleted branch merged (was $sha)." >expect &&
> +	test_cmp expect actual &&
> +	git -C repo for-each-ref --format="%(refname:short)" refs/heads/ >actual &&
> +	cat >expect <<-\EOF &&
> +	main
> +	tracks-other
> +	unmerged
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--delete-merged deletes merged branches and spares protected ones' '
> +	test_when_finished "rm -rf repo" &&
> +	setup_repo_for_delete_merged &&
> +	merged_branch on-next origin/next &&
> +	merged_branch checked-out origin/next &&
> +	merged_branch upstream-gone origin/next &&
> +	(
> +		cd repo &&
> +		git checkout -b mainline main &&
> +		git checkout -b on-local mainline &&
> +		git branch --set-upstream-to=mainline on-local &&
> +		git update-ref refs/remotes/origin/topic refs/remotes/origin/next &&
> +		git branch --set-upstream-to=origin/topic upstream-gone &&
> +		git update-ref -d refs/remotes/origin/topic &&
> +		git branch --set-upstream-to=origin/main main &&
> +		git config branch.main.pushRemote origin &&
> +		git checkout -b tracks-other other/main &&
> +		git branch --set-upstream-to=other/main tracks-other &&
> +		git checkout checked-out
> +	) &&
> +
> +	git -C repo branch --delete-merged origin/next mainline &&
> +
> +	git -C repo for-each-ref --format="%(refname:short)" refs/heads/ >actual &&
> +	cat >expect <<-\EOF &&
> +	checked-out
> +	main
> +	mainline
> +	tracks-other
> +	upstream-gone
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success '--delete-merged requires at least one <branch>' '
> +	test_must_fail git -C forked branch --delete-merged 2>err &&
> +	test_grep "requires at least one <branch>" err
> +'
> +
> +test_expect_success '--delete-merged keeps a branch that is an upstream' '
> +	test_when_finished "rm -rf repo" &&
> +	setup_repo_for_delete_merged &&
> +	merged_branch feature origin/next &&
> +	(
> +		cd repo &&
> +		git checkout -b topic feature &&
> +		git commit --allow-empty -m "topic work" &&
> +		git branch --set-upstream-to=feature topic &&
> +		git checkout --detach
> +	) &&
> +
> +	git -C repo branch --delete-merged origin/next 2>err &&
> +
> +	test_must_be_empty err &&
> +	git -C repo rev-parse --verify refs/heads/feature &&
> +	git -C repo rev-parse --verify refs/heads/topic
> +'
> +
> +test_expect_success '--delete-merged keeps a chain of upstreams of a kept branch' '
> +	test_when_finished "rm -rf repo" &&
> +	setup_repo_for_delete_merged &&
> +	(
> +		cd repo &&
> +		git branch b3 origin/next &&
> +		git branch --set-upstream-to=origin/next b3 &&
> +		git branch b2 origin/next &&
> +		git branch --set-upstream-to=b3 b2 &&
> +		git checkout -b b1 b2 &&
> +		git commit --allow-empty -m "b1 work" &&
> +		git branch --set-upstream-to=b2 b1 &&
> +		git checkout --detach
> +	) &&
> +
> +	git -C repo branch --delete-merged origin/next &&
> +
> +	git -C repo for-each-ref --format="%(refname:short)" refs/heads/ >actual &&
> +	cat >expect <<-\EOF &&
> +	b1
> +	b2
> +	b3
> +	main
> +	EOF
> +	test_cmp expect actual
> +'
> +
>   test_done


^ permalink raw reply

* Re: [PATCH v3] config.mak.uname: avoid macOS dup-library warning
From: Paolo Bonzini @ 2026-06-22 14:00 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Junio C Hamano, Harald Nordgren via GitGitGadget, git,
	Harald Nordgren
In-Reply-To: <ajk6QGB8raf85CPo@pks.im>

On Mon, Jun 22, 2026 at 3:36 PM Patrick Steinhardt <ps@pks.im> wrote:
> On Mon, Jun 22, 2026 at 03:13:05PM +0200, Paolo Bonzini wrote:
> > On 6/22/26 14:57, Junio C Hamano wrote:
> > > > [1]:https://github.com/mesonbuild/meson/
> > > > commit/17d1cc60ed8246b8e7f0786421bf1cdf5cb19254
> > > I took my inspiration for -Wl,-no-whatever from Paolo's other
> > > attempt, referenced in
> > >
> > >      https://github.com/mesonbuild/meson/issues/15553
> > >
> > > which is
> > >
> > >      https://github.com/mesonbuild/meson/
> > > commit/7c901d7a8af214e31788eb6d1a1edd5b75124e66
> >
> > Yeah, it makes sense for Meson to disable it unconditionally.  I wouldn't
> > bother adding a check in meson.build though, since as Patrick mentioned it's
> > mostly a nuisance.
>
> Is this something you want to implement in Meson yourself? Otherwise I'm
> happy to create a pull request.

Sure, please go ahead!

Paolo


^ permalink raw reply

* Re: [PATCH v4 3/3] replay: offer an option to linearize the commit topology
From: Patrick Steinhardt @ 2026-06-22 13:53 UTC (permalink / raw)
  To: Toon Claes; +Cc: git, Elijah Newren, Johannes Schindelin
In-Reply-To: <20260622-toon-git-replay-drop-merges-v4-3-ff257f534319@iotcl.com>

On Mon, Jun 22, 2026 at 02:41:57PM +0200, Toon Claes wrote:
> From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> 
> One of the stated goals of git-replay(1) is to allow implementing the
> git-rebase(1) functionality on the server side.
> 
> The default mode of git-rebase(1) is to act as if `--no-rebase-merges`
> was given. This mode drops merge commits instead of replaying them, and
> linearizes the commit history into a sequence of the
> regular (single-parent) commits.
> 
> Add option `--linearize` to git-replay(1) to do the same.

git-rebase(1) essentially knows about three different modes:

  - "--no-rebase-merges", which is the default and maps to your
    "--linearize".

  - "--rebase-merges", which by default doesn't rebase cousins by using
    "--ancestry-path" internally.

  - "--rebase-merges=rebase-cousins", which doesn't pass the above
    option.

So it's not a simple boolean there, which makes me wonder whether we
should mirror the same interface so that all of git-rebase(1)'s modes
can be represented, as well.

> diff --git a/replay.c b/replay.c
> index 7921d7dba3..5539daff00 100644
> --- a/replay.c
> +++ b/replay.c
> @@ -277,12 +277,16 @@ static struct commit *pick_regular_commit(struct repository *repo,
>  					  struct commit *onto,
>  					  struct merge_options *merge_opt,
>  					  struct merge_result *result,
> +					  struct commit *replayed_base,
>  					  bool reverse,
>  					  enum replay_empty_commit_action empty)
>  {
> -	struct commit *base, *replayed_base;
> +	struct commit *base;
>  	struct tree *pickme_tree, *base_tree, *replayed_base_tree;
>  
> +	if (replayed_base && reverse)
> +		BUG("Linearizing commits is not supported when replaying in reverse");

Nit: Error messages should typically start with a lower-case letter.

> @@ -430,12 +435,25 @@ int replay_revisions(struct rev_info *revs,
>  	while ((commit = get_revision(revs))) {
>  		const struct name_decoration *decoration;
>  
> -		if (commit->parents && commit->parents->next)
> -			die(_("replaying merge commits is not supported yet!"));
> +		if (commit->parents && commit->parents->next) {
> +			if (!opts->linearize)
> +				die(_("replaying merge commits is not supported yet!"));
> +			/*
> +			 * Drop the merge commit: do not pick it and leave
> +			 * last_commit unchanged, so its children (and any ref
> +			 * pointing at it) are reparented onto the previous
> +			 * non-merge commit, which the ref-update loop below uses.
> +			 */

One could add a hint here that tells the user to pass the option. But I
guess that might be somewhat weird, as we cannot assume that we're
called by git-replay(1) here.

In any case, this here is the core of the change where we stop dying in
case "--linearize" was passed, and instead we simply skip the commit
altogether. Makes sense.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v4 2/3] replay: add helper to put entry into mapped_commits
From: Patrick Steinhardt @ 2026-06-22 13:53 UTC (permalink / raw)
  To: Toon Claes; +Cc: git, Elijah Newren
In-Reply-To: <20260622-toon-git-replay-drop-merges-v4-2-ff257f534319@iotcl.com>

On Mon, Jun 22, 2026 at 02:41:56PM +0200, Toon Claes wrote:
> diff --git a/replay.c b/replay.c
> index 1f8e5b083b..7921d7dba3 100644
> --- a/replay.c
> +++ b/replay.c
> @@ -256,6 +256,21 @@ static struct commit *mapped_commit(kh_oid_map_t *replayed_commits,
>  	return kh_value(replayed_commits, pos);
>  }
>  
> +static void put_mapped_commit(kh_oid_map_t *replayed_commits,
> +			      struct commit *commit,
> +			      struct commit *new_commit)
> +{
> +	khint_t pos;
> +	int ret;
> +
> +	pos = kh_put_oid_map(replayed_commits, commit->object.oid, &ret);
> +	if (ret == 0)
> +		BUG("Duplicate rewritten commit: %s\n",
> +		    oid_to_hex(&commit->object.oid));
> +
> +	kh_value(replayed_commits, pos) = new_commit;
> +}

The khash map interfaces are quite awkward to use, so having a small
wrapper feels sensible to me. It is one of those interfaces that really
make you wish for generics in C.

Patrick

^ permalink raw reply

* Re: [PATCH v4 1/3] replay: refactor enum replay_mode into a bool
From: Patrick Steinhardt @ 2026-06-22 13:53 UTC (permalink / raw)
  To: Toon Claes; +Cc: git, Elijah Newren
In-Reply-To: <20260622-toon-git-replay-drop-merges-v4-1-ff257f534319@iotcl.com>

On Mon, Jun 22, 2026 at 02:41:55PM +0200, Toon Claes wrote:
> In 2760ee4983 (replay: add --revert mode to reverse commit changes,
> 2026-03-26) the enum `replay_mode` was introduced. This has two possible
> values:
> 
>  - The value `REPLAY_MODE_REVERT` is used when option `--revert` is
>    passed to git-replay(1). When using this value the commits are
>    processed in reverse order and the inverse of the changes are
>    applied.
> 
>  - The value `REPLAY_MODE_PICK` is used when either option `--onto` or
>    `--advance` is used. In both cases the commits are processed in
>    normal order, and the changes are applied as-is.
> 
> Since there are only two possible values of this enum, simplify the code
> by converting the enum into a bool. This avoids adding code paths that
> check for invalid values of the enum, and shortens code where the value
> is checked with a ternary operator.

That's fair, and the result is easier to write. But is it really easier
to read? And what if we ever have to create a third mode going forward?

I'm generally no fan of booleans as parameters as they basically give
you no information at all at the callsite, except if you're lucky and
you already have an aptly-named variable available that you can pass.
Which seems to be the case here, but I'm still not sure whether this
change really improves the code.

Patrick

^ permalink raw reply

* Re: [PATCH v3] config.mak.uname: avoid macOS dup-library warning
From: Patrick Steinhardt @ 2026-06-22 13:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Junio C Hamano, Harald Nordgren via GitGitGadget, git,
	Harald Nordgren
In-Reply-To: <cdb16758-dd92-4b8c-8e82-8c607151449f@redhat.com>

On Mon, Jun 22, 2026 at 03:13:05PM +0200, Paolo Bonzini wrote:
> On 6/22/26 14:57, Junio C Hamano wrote:
> > > [1]:https://github.com/mesonbuild/meson/
> > > commit/17d1cc60ed8246b8e7f0786421bf1cdf5cb19254
> > I took my inspiration for -Wl,-no-whatever from Paolo's other
> > attempt, referenced in
> > 
> >      https://github.com/mesonbuild/meson/issues/15553
> > 
> > which is
> > 
> >      https://github.com/mesonbuild/meson/
> > commit/7c901d7a8af214e31788eb6d1a1edd5b75124e66
> 
> Yeah, it makes sense for Meson to disable it unconditionally.  I wouldn't
> bother adding a check in meson.build though, since as Patrick mentioned it's
> mostly a nuisance.

Is this something you want to implement in Meson yourself? Otherwise I'm
happy to create a pull request.

Patrick

^ permalink raw reply

* Re: [PATCH v3 0/4] pack-objects: support bitmaps and delta-islands with `--path-walk`
From: Derrick Stolee @ 2026-06-22 13:36 UTC (permalink / raw)
  To: Junio C Hamano, Taylor Blau; +Cc: git, Jeff King, Elijah Newren
In-Reply-To: <xmqqmrwn3u4x.fsf@gitster.g>

On 6/22/2026 3:35 AM, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:

>> Outside of the above, the series is functionally unchanged.
>>
>> Thanks in advance for another look.
>>
>> Taylor Blau (4):
>>   t/perf: drop p5311's lookup-table permutation
>>   pack-objects: support reachability bitmaps with `--path-walk`
>>   pack-objects: extract `record_tree_depth()` helper
>>   pack-objects: support `--delta-islands` with `--path-walk`
> 
> Very cleanly implemented.  I am not confident that I have followed
> the detailed logic around delta islands in the last step but the
> earlier three patches looked trivially good.
I've been happy with the code, subject to the new data that is presented
with this version confirming the expected performance benefits. I also
lack confidence in the delta islands features, but based on my weak
understanding it looks correct. I believe that Taylor has the right
expertise here to make up for my lack of context.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v3] config.mak.uname: avoid macOS dup-library warning
From: Paolo Bonzini @ 2026-06-22 13:13 UTC (permalink / raw)
  To: Junio C Hamano, Patrick Steinhardt
  Cc: Harald Nordgren via GitGitGadget, git, Harald Nordgren
In-Reply-To: <xmqqldc63f8g.fsf@gitster.g>

On 6/22/26 14:57, Junio C Hamano wrote:
>> [1]:https://github.com/mesonbuild/meson/ 
>> commit/17d1cc60ed8246b8e7f0786421bf1cdf5cb19254
> I took my inspiration for -Wl,-no-whatever from Paolo's other
> attempt, referenced in
> 
>      https://github.com/mesonbuild/meson/issues/15553
> 
> which is
> 
>      https://github.com/mesonbuild/meson/ 
> commit/7c901d7a8af214e31788eb6d1a1edd5b75124e66

Yeah, it makes sense for Meson to disable it unconditionally.  I 
wouldn't bother adding a check in meson.build though, since as Patrick 
mentioned it's mostly a nuisance.

Paolo


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox