Git development

* [PATCH 1/3] git-config.adoc: fix paragraph break
From: Derrick Stolee via GitGitGadget @ 2026-06-08 13:57 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee, Derrick Stolee
In-Reply-To: <pull.2139.git.1780927027.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

The bulleted list about environment variables is missing a '+' between
some paragraphs that belong to the same bullet item. Without it, the
bulleted list is rendered as two separate lists with "See also FILES."
as a normal paragraph between them. Adding '+' unifies the lists.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git-config.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-config.adoc b/Documentation/git-config.adoc
index 00545b2054..044d776613 100644
--- a/Documentation/git-config.adoc
+++ b/Documentation/git-config.adoc
@@ -476,7 +476,7 @@ GIT_CONFIG_SYSTEM::
 GIT_CONFIG_NOSYSTEM::
 	Whether to skip reading settings from the system-wide
 	$(prefix)/etc/gitconfig file. See linkgit:git[1] for details.
-
++
 See also <<FILES>>.
 
 GIT_CONFIG_COUNT::
-- 
gitgitgadget



* [PATCH 0/3] config: allow disabling config includes
From: Derrick Stolee via GitGitGadget @ 2026-06-08 13:57 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee

This series introduces a new way to ignore config include directives via two
mechanisms:

 * GIT_CONFIG_INCLUDES=0 in the environment.
 * git --no-includes ... in the command line.

My motivation is from a tricky situation where users want to do the risky
thing and include a repo-tracked file for sharing aliases and other
recommended config. They are then struggling in a later build step that is
running Git commands (under a tool we don't control and can't change) that
then cause filesystem accesses outside of the build system's sandbox.

While git config has a --no-includes option, that doesn't impact the
behavior of other Git commands. We build upon that existing logic for
disabling includes, though.

Having had recent luck recommending GIT_ADVICE=0 when running Git commands
from third-party tools, I thought that a similar environment variable to
disable this functionality would be helpful, too.

One thing I do worry about is whether or not this would cause a significant
break in behavior, or if this is a relatively safe thing to allow.

The three patches are organized as follows:

 1. Patch 1 has a small typo fix in the config documentation that messes
    with the format of the bulleted list. I include it here because I add to
    that list in patch 2.
 2. Patch 2 adds the environment variable and tests it via 'git config' and
    the use of a Git alias.
 3. Patch 3 adds the '--no-includes' option at the top level.
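
As a rough illustration of how a wrapper or tool might interpret a boolean
environment variable such as GIT_CONFIG_INCLUDES, here is a minimal
standalone sketch. The helper name `env_flag_enabled()` and the set of
"false" spellings it accepts are assumptions made for this example; this is
not the code from the series, which builds on Git's existing logic for
disabling includes.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the series): return 0 when the named
 * environment variable is set to a "false" spelling, and `def` when it
 * is unset or empty. Any other value counts as true.
 */
static int env_flag_enabled(const char *name, int def)
{
	const char *v = getenv(name);

	if (!v || !*v)
		return def;
	if (!strcmp(v, "0") || !strcmp(v, "false") ||
	    !strcmp(v, "no") || !strcmp(v, "off"))
		return 0;
	return 1;
}
```

A caller would use something like `env_flag_enabled("GIT_CONFIG_INCLUDES", 1)`
so that includes stay enabled unless the variable explicitly turns them off.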

Thanks, -Stolee

Derrick Stolee (3):
  git-config.adoc: fix paragraph break
  config: add GIT_CONFIG_INCLUDES
  git: add --no-includes top-level option

 Documentation/git-config.adoc |  7 ++++++-
 Documentation/git.adoc        |  6 +++++-
 config.c                      |  7 ++++++-
 environment.h                 |  6 ++++++
 git.c                         |  6 +++++-
 t/t1305-config-include.sh     | 35 +++++++++++++++++++++++++++++++++++
 6 files changed, 63 insertions(+), 4 deletions(-)

base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2139%2Fderrickstolee%2Fconfig-include-override-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2139/derrickstolee/config-include-override-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2139
-- 
gitgitgadget


* Re: [PATCH 7/7] odb: use size_t for object_info.sizep and the size APIs
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <f3aeae983ac8b281d6ba54299961e19d16699c94.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:12AM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index fa45f774d7..fa6e396ddc 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -120,7 +120,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  	struct object_id oid;
>  	enum object_type type;
>  	char *buf;
> -	unsigned long size;
> +	size_t size;
>  	struct object_context obj_context = {0};
>  	struct object_info oi = OBJECT_INFO_INIT;
>  	unsigned flags = OBJECT_INFO_LOOKUP_REPLACE;
> @@ -166,7 +166,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
>  			size_t s = size;
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  
>  		printf("%"PRIuMAX"\n", (uintmax_t)size);

Can't we drop this local variable completely and instead supply `&size`
directly?
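
The suggestion can be illustrated in isolation. In this sketch,
`trim_trailing_newlines()` is a hypothetical stand-in for a helper that,
like `replace_idents_using_mailmap()`, takes a `size_t *` length; the two
wrapper functions are likewise made up. The only point is that once `size`
is itself a `size_t`, its address can be passed directly and the temporary
copy becomes unnecessary.

```c
#include <stddef.h>

/* Hypothetical stand-in for a helper taking a size_t in/out length. */
static char *trim_trailing_newlines(char *buf, size_t *len)
{
	while (*len && buf[*len - 1] == '\n')
		buf[--*len] = '\0';
	return buf;
}

/* Before: `size` was unsigned long, so a size_t temporary was needed. */
static unsigned long with_temporary(char *buf, unsigned long size)
{
	size_t s = size;

	trim_trailing_newlines(buf, &s);
	size = (unsigned long)s; /* narrowing copy back */
	return size;
}

/* After: `size` is already size_t, so its address is passed directly. */
static size_t without_temporary(char *buf, size_t size)
{
	trim_trailing_newlines(buf, &size);
	return size;
}
```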

> @@ -219,7 +225,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		if (use_mailmap) {
>  			size_t s = size;
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  
>  		/* otherwise just spit out the data */
> @@ -266,7 +272,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		if (use_mailmap) {
>  			size_t s = size;
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  		break;
>  	}
> @@ -446,7 +455,7 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
>  		if (use_mailmap) {
>  			size_t s = size;
>  			contents = replace_idents_using_mailmap(contents, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  
>  		if (type != data->type)

Likewise for these three instances.

> @@ -555,7 +564,7 @@ static void batch_object_write(const char *obj_name,
>  			if (!buf)
>  				die(_("unable to read %s"), oid_to_hex(&data->oid));
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			data->size = cast_size_t_to_ulong(s);
> +			data->size = s;
>  
>  			free(buf);
>  		}

And I think this site here can be adapted, as well.

> diff --git a/diff.c b/diff.c
> index 5a584fa1d5..816b89dc6c 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4594,8 +4594,9 @@ int diff_populate_filespec(struct repository *r,
>  		}
>  	}
>  	else {
> +		size_t size_st = 0;
>  		struct object_info info = {
> -			.sizep = &s->size
> +			.sizep = &size_st
>  		};
>  
>  		if (!(size_only || check_binary))
> @@ -4617,6 +4618,7 @@ int diff_populate_filespec(struct repository *r,
>  			die("unable to read %s", oid_to_hex(&s->oid));
>  
>  object_read:
> +		s->size = cast_size_t_to_ulong(size_st);
>  		if (size_only || check_binary) {
>  			if (size_only)
>  				return 0;
> @@ -4631,6 +4633,7 @@ object_read:
>  			if (odb_read_object_info_extended(r->objects, &s->oid, &info,
>  							  OBJECT_INFO_LOOKUP_REPLACE))
>  				die("unable to read %s", oid_to_hex(&s->oid));
> +			s->size = cast_size_t_to_ulong(size_st);
>  		}
>  		s->should_free = 1;
>  	}

The flow in this function is quite weird if you ask me, but that's a
preexisting issue. This does look correct to me, even if it's awkward.
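
The pattern in this hunk (let the wide API write into a local, then narrow
once with a check) can be sketched on its own. Everything named here is
hypothetical: `struct filespec_like`, `read_size()`, `populate()` and
`narrow_to_u32()` are not Git functions. Fixed-width types are used so the
narrow field is 32 bits wide on every platform, and the helper returns -1
where `cast_size_t_to_ulong()` would die, purely to keep the sketch
testable.

```c
#include <stdint.h>

/*
 * Hypothetical checked narrowing: store `v` into a 32-bit field and
 * report failure instead of silently truncating.
 */
static int narrow_to_u32(uint64_t v, uint32_t *out)
{
	if (v > UINT32_MAX)
		return -1;
	*out = (uint32_t)v;
	return 0;
}

struct filespec_like {
	uint32_t size; /* narrow field, like a 32-bit `unsigned long` */
};

/* Hypothetical wide-size reader standing in for the object lookup. */
static void read_size(uint64_t actual, uint64_t *sizep)
{
	*sizep = actual;
}

static int populate(struct filespec_like *s, uint64_t actual)
{
	uint64_t size_st = 0;

	read_size(actual, &size_st);             /* wide API fills the local */
	return narrow_to_u32(size_st, &s->size); /* narrow once, with a check */
}
```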

Patrick


* Re: [PATCH 6/7] packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <460d733feeaf2a94fe28d7509cc4128e9c0a7610.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:11AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> When I started the transition from `unsigned long` to `size_t`, in the
> interest of keeping the patches reviewable, I introduced these calls to
> prevent data type narrowing from silently failing to handle large object
> sizes. I also introduced `*_sz()` variants that would allow most of the
> callers to keep using that `unsigned long` that the 90s kindly asked to
> be returned.
> 
> After the preceding commits, the only places that called the narrow
> wrappers either no longer exist or already use the `_sz` form
> internally, so the wrappers just narrow values back through
> `cast_size_t_to_ulong()` for no reason.
> 
> Drop them and rename the `_sz` variants back to the natural names.

Aha, so you already address the comment I had on one of the preceding
patches :)

Patrick


* Re: [PATCH 4/7] packfile: widen unpack_entry()'s size out-parameter to size_t
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <bdebc36f21d1e2a13bc91d72a3ada1db3f7e184e.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:09AM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 82bc6dcc00..3dff898c43 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -1239,6 +1239,8 @@ static void *gfi_unpack_entry(
>  	unsigned long *sizep)
>  {
>  	enum object_type type;
> +	size_t size_st = 0;
> +	void *data;
>  	struct packed_git *p = all_packs[oe->pack_id];
>  	if (p == pack_data && p->pack_size < (pack_size + the_hash_algo->rawsz)) {
>  		/* The object is stored in the packfile we are writing to
> @@ -1260,7 +1262,10 @@ static void *gfi_unpack_entry(
>  		 */
>  		p->pack_size = pack_size + the_hash_algo->rawsz;
>  	}
> -	return unpack_entry(the_repository, p, oe->idx.offset, &type, sizep);
> +	data = unpack_entry(the_repository, p, oe->idx.offset, &type, &size_st);
> +	if (sizep)
> +		*sizep = cast_size_t_to_ulong(size_st);
> +	return data;
>  }

Nit, please feel free to ignore: do we want to add a NEEDSWORK comment
here?

Patrick


* Re: [PATCH 3/7] pack-objects(check_pack_inflate()): use size_t instead of unsigned long
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <ddb75326cde9695f1eb7bbbe77175424e6b77004.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:08AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> `write_reuse_object()` learned to track its packed-object size as
> `size_t` in 606c192380 (odb, packfile: use size_t for streaming
> object sizes, 2026-05-08), but the comparison sink it feeds,
> `check_pack_inflate()`, still takes the expected decompressed size
> as `unsigned long`. The call site bridges the mismatch with
> `cast_size_t_to_ulong()`, which on Windows turns a >4 GiB object
> into an immediate die().
> 
> That function only uses `expect` once: as the right-hand side of a
> `stream.total_out == expect` equality test against zlib's counter.
> zlib's own `total_out` counter is `uLong` and is therefore still
> 32-bit-bound on Windows. Widening `expect` to `size_t` cannot fix that,
> but it is a strict improvement nonetheless: instead of dying outright,
> an oversized object now simply makes the equality fail and lets
> `write_reuse_object()` fall back to `write_no_reuse_object()`, which
> decompresses and re-deflates the content (and which the larger
> pack-objects widening series targets separately).

Hm. I wonder whether it's possible to reset `stream.total_out` on every
iteration and instead have a local `size_t` variable that we use to
track the total number of inflated bytes?
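
A minimal sketch of that idea, without zlib: `struct narrow_stream` and
`fake_inflate()` are hypothetical stand-ins that only mimic a 32-bit
running counter such as zlib's `total_out` on platforms where `uLong` is
32 bits wide. The wide total is kept in a `uint64_t` here rather than a
`size_t` so the example behaves the same on 32-bit hosts. Whether the real
inflate loop tolerates having its counter reset is not something this
sketch demonstrates.

```c
#include <stdint.h>

/* Hypothetical stream with a 32-bit running output counter. */
struct narrow_stream {
	uint32_t total_out;
};

/* Pretend to inflate `n` bytes; the narrow counter wraps silently. */
static void fake_inflate(struct narrow_stream *st, uint32_t n)
{
	st->total_out += n;
}

/*
 * Sum the per-iteration output in a wide local instead of trusting
 * the stream's own running total.
 */
static uint64_t inflate_chunks(struct narrow_stream *st,
			       uint32_t chunk, unsigned int chunks)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < chunks; i++) {
		st->total_out = 0;      /* reset before each call */
		fake_inflate(st, chunk);
		total += st->total_out; /* accumulate in the wide type */
	}
	return total;
}
```

With five chunks of 1 GiB each, the wide local reaches 5 GiB, while a
counter left to run on its own would have wrapped around to 1 GiB.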

Patrick


* Re: [PATCH 2/7] patch-delta: use size_t for sizes
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <1fd7646ca14f7ec392c85fab10255f08d0d79368.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:07AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> `patch_delta()` takes the source and delta sizes by value and writes
> back the reconstructed target size through an `unsigned long *`.  That
> datatype cannot represent a value that exceeds 4 GiB on systems where
> `unsigned long` is 32-bit (notably 64-bit Windows builds), though, even
> though the delta encoding itself, the on-disk layout, and the in-memory
> buffers happily carry such sizes. A `size_t` companion to
> `get_delta_hdr_size()`, `get_delta_hdr_size_sz()`, was introduced in
> 17fa077596 (delta, packfile: use size_t for delta header sizes,
> 2026-05-08) precisely so that `patch_delta()` could be widened without
> changing the on-the-wire decoding helper's signature.
> 
> Widen `patch_delta()`'s three size parameters to `size_t` and switch
> its internal use of `get_delta_hdr_size()` to the `_sz` variant.
> Then propagate the wider type through the callers.

Does `get_delta_hdr_size()` have any remaining callers after this patch
series? I currently only spot two such callers, and you convert both of
them in this patch.

And can we reasonably add a test case that exercises this change?

> diff --git a/packfile.c b/packfile.c
> index 89366abfe3..e202f48837 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1964,10 +1964,8 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>  			      (uintmax_t)curpos, p->pack_name);
>  			data = NULL;
>  		} else {
> -			unsigned long sz;
>  			data = patch_delta(base, base_size, delta_data,
> -					   delta_size, &sz);
> -			size = sz;
> +					   delta_size, &size);

Nice that we get rid of this awkward construct.

Patrick


* Re: [PATCH v3 0/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Junio C Hamano @ 2026-06-08 13:36 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget
  Cc: git, René Scharfe, Kristofer Karlsson
In-Reply-To: <pull.2140.v3.git.1780832592.gitgitgadget@gmail.com>

"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Changes in v3:
>
>  * Adopted Rene's suggestion to move the flush logic below the LIFO
>    early-return (LIFO mode never sets get_pending, so flushing there is a
>    no-op).

Sensible.

>  * Went a step further and inlined the flush logic directly into get() and
>    peek(), eliminating the flush_get() helper and its forward declaration of
>    sift_down_root().

Hmph, unless there is a reason to allow the copies in get() and
peek() to deviate from each other, e.g., what flush_get() had to do
inside get() and peek() were slightly different, I am not sure if
this is a good move.  I do not know if the slight difference of the
"inlined" logic we have in the patch between the one in get() and
the other one in peek() has merit, either.  It certainly lets you
avoid an unnecessary clearing of the get_pending bit (when a get was
pending but the queue has more items to yield) immediately followed
by turning it back on again (which happens always unless the
function makes an early return for an empty queue) in get(), which
will never happen in flush() that will always clear the bit before
it returns, but is such an avoidance of an assignment really worth
it?  I suspect that with the static inline version of flush_get(),
compilers are smart enough to optimize it away, but I dunno.

>        void *prio_queue_get(struct prio_queue *queue)
>        {
>        	if (!queue->nr)
>        		return NULL;
>        	if (!queue->compare)
>      ++		return queue->array[--queue->nr].data;
>      ++
>      ++	if (queue->get_pending) {
>      ++		if (!--queue->nr) {
>      ++			queue->get_pending = 0;
>      ++			return NULL;
>      ++		}
>      ++		queue->array[0] = queue->array[queue->nr];
>      ++		sift_down_root(queue);
>      ++	}
>      + 

The above is from [1/2] (this is a minor point, but flipping the
order of the two patches to make the "nr_internal clean-up" a
preliminary step might have made this part easier to read and
comment on).  I wondered if it is easier to understand if the first
early return is guarded by a conditional that takes get_pending into
account.

	if (queue->nr_internal <= queue->get_pending)
		return NULL;

As I said, since the above hunk is immediately followed by an
unconditional assignment of "queue->get_pending = 1", clearing
get_pending = 0 only when we leave inside the if() block works as an
optimization that is not available on the peek() side.  But with the
"ah the queue is empty already, the queue->nr == 1 is due to the
lazy get that did not rebalance" tweak, it would become unneeded, I
think.

>      + void *prio_queue_peek(struct prio_queue *queue)
>      + {
>       +	if (!queue->nr_internal)
>        		return NULL;
>        	if (!queue->compare)
>       +		return queue->array[queue->nr_internal - 1].data;
>      + 
>      + 	if (queue->get_pending) {
>      + 		queue->get_pending = 0;
>      +-		if (!--queue->nr)
>      ++		if (!--queue->nr_internal)
>      + 			return NULL;
>      +-		queue->array[0] = queue->array[queue->nr];
>      ++		queue->array[0] = queue->array[queue->nr_internal];
>      + 		sift_down_root(queue);
>      + 	}

This is from [2/2]; the same 

	if (queue->nr_internal <= queue->get_pending)
		return NULL;

applies here, I think.
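
To make the combined emptiness check concrete, here is a minimal
standalone sketch of a min-heap with a deferred root removal. It is not
the prio_queue code from the series: `struct lazy_heap`, `lazy_put()`,
`lazy_get()`, the fixed capacity and the `int` payload are all assumptions
for illustration, and there is no LIFO mode. The first test in
`lazy_get()` is the single guard discussed above; when it fires, the state
is left untouched so that a later put can still overwrite the stale root.

```c
#include <stddef.h>

#define LAZY_CAP 64

/* Hypothetical min-heap of ints with a deferred ("lazy") root removal. */
struct lazy_heap {
	int array[LAZY_CAP];
	size_t nr_internal;       /* slots in use, including a stale root */
	unsigned int get_pending; /* 1 if the root was handed out already */
};

static void swap_slots(struct lazy_heap *h, size_t a, size_t b)
{
	int tmp = h->array[a];

	h->array[a] = h->array[b];
	h->array[b] = tmp;
}

static void sift_down_root(struct lazy_heap *h)
{
	size_t i = 0, child;

	while ((child = 2 * i + 1) < h->nr_internal) {
		if (child + 1 < h->nr_internal &&
		    h->array[child + 1] < h->array[child])
			child++;
		if (h->array[i] <= h->array[child])
			break;
		swap_slots(h, i, child);
		i = child;
	}
}

/* Returns 0 on success, -1 when the heap is full. */
static int lazy_put(struct lazy_heap *h, int v)
{
	size_t i;

	if (h->get_pending) {
		/* Fuse get+put: overwrite the stale root, sift down once. */
		h->get_pending = 0;
		h->array[0] = v;
		sift_down_root(h);
		return 0;
	}
	if (h->nr_internal == LAZY_CAP)
		return -1;
	i = h->nr_internal++;
	h->array[i] = v;
	while (i && h->array[(i - 1) / 2] > h->array[i]) {
		swap_slots(h, i, (i - 1) / 2);
		i = (i - 1) / 2;
	}
	return 0;
}

/* Returns 0 and stores the minimum in *out, or -1 when logically empty. */
static int lazy_get(struct lazy_heap *h, int *out)
{
	/* One check covers "no slots" and "only a stale root is left". */
	if (h->nr_internal <= h->get_pending)
		return -1;
	if (h->get_pending) {
		/* Flush the deferred removal before reading the new root. */
		h->array[0] = h->array[--h->nr_internal];
		sift_down_root(h);
	}
	*out = h->array[0];
	h->get_pending = 1;
	return 0;
}
```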


* Re: [PATCH RFC 2/2] builtin/history: print feedback after successful reword
From: Pablo Sabater @ 2026-06-08 13:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Kaartic Sivaraam
In-Reply-To: <xmqqqzmhz0pq.fsf@gitster.g>

On Mon, Jun 8, 2026 at 14:16, Junio C Hamano (<gitster@pobox.com>) wrote:
>
> Pablo Sabater <pabloosabaterr@gmail.com> writes:
>
> > Unlike `git commit --amend` and `git rebase -i`, `git history reword`
> > doesn't print anything, this makes it feel empty for a porcelain command
> > and hard to tell if the command did anything without using other
> > commands like `git log <commit>` to check if the reword was done.
> >
> > Print a message on successful rewords so the user has feedback about it.
> >
> > Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
> > ---
> >  builtin/history.c         |  4 ++++
> >  t/t3451-history-reword.sh | 14 ++++++++++++++
> >  2 files changed, 18 insertions(+)
> >
> > diff --git a/builtin/history.c b/builtin/history.c
> > index 51a22a9a1c..0f1ba3b531 100644
> > --- a/builtin/history.c
> > +++ b/builtin/history.c
> > @@ -739,6 +739,10 @@ static int cmd_history_reword(int argc,
> >               goto out;
> >       }
> >
> > +     fprintf(stderr, _("Successfully reworded commit %s to %s\n"),
> > +             repo_find_unique_abbrev(repo, &original->object.oid, DEFAULT_ABBREV),
> > +             repo_find_unique_abbrev(repo, &rewritten->object.oid, DEFAULT_ABBREV));
> > +
> >       ret = 0;
> >
> >  out:
>
> Do other commands in "git history" (split is in 'master', drop and
> fixup are cooking) behave with similar verbosity?  Consistency within
> the same "history" umbrella matters more than being similar with
> other commands that can be used for similar purposes.

They do not; they were designed with the rule of silence in mind.
However, I think this output is valuable information. I might have
explained myself better at [1], but my thought is:

git history reword aabb

Now that I have my commit aabb rewritten, I want to check it again just
to make sure I did what I wanted correctly. But git log aabb still shows
the old commit; the rewritten one has a different hash, which I do not
know unless I search for it. If it's far from HEAD, I'd have to run git
log --oneline, get the hash, and then run git log new_hash. I think that
git history reword, which does have the information about the new hash,
should print it to avoid this search.
What I want is something like:

git history reword aabb
Successfully reworded aabb to ccdd

So I can just git log ccdd without having to search.

I admit I haven't looked at split, drop and fixup as much as I'd like,
but I think it would be a good addition for them as well. In [1],
Patrick wrote about a --verbose option for git history. I think the
basic information (for reword, the new hash) should always be printed,
but if it's preferred it could go behind that option.

For split it can print the hashes of the new commits like:
"...split into ccdd and eeff."
For fixup the commit hash also changes, so the same as reword.
The one with the most friction would be drop, as it is the one that
doesn't end up with new commits.

[1]: https://lore.kernel.org/git/CAN5EUNSAOMRvmLGVfzQiwWoOn9VGNVU5rVMZizOryn_q2fbCNA@mail.gmail.com/

>
> > diff --git a/t/t3451-history-reword.sh b/t/t3451-history-reword.sh
> > index 54ea8a7207..4b22d761e3 100755
> > --- a/t/t3451-history-reword.sh
> > +++ b/t/t3451-history-reword.sh
> > @@ -416,4 +416,18 @@ test_expect_success 'aborts if the commit message is the same' '
> >       )
> >  '
> >
> > +test_expect_success 'prints feedback on successful reword' '
> > +     test_when_finished "rm -rf repo" &&
> > +     git init repo &&
> > +     (
> > +             cd repo &&
> > +             test_commit first &&
> > +
> > +             reword_with_message HEAD 2>err <<-EOF &&
> > +             first reworded
> > +             EOF
> > +             test_grep "Successfully reworded" err
> > +     )
> > +'
> > +
> >  test_done


* Re: [PATCH] ls-files: filter pathspec before lstat
From: Junio C Hamano @ 2026-06-08 13:06 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: git, René Scharfe, Patrick Steinhardt
In-Reply-To: <20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com>

On Sun, Jun 7, 2026 at 11:40, Tamir Duberstein wrote:
> show_files() checks whether each index entry is deleted or modified
> before show_ce() applies the pathspec. prune_index() avoids most of this
> work for pathspecs with a common directory prefix, but a top-level name
> or leading wildcard leaves every entry to be checked.
> 
> Match the pathspec before lstat() for the deleted and modified modes.
> Keep the later match in show_ce() so --error-unmatch is satisfied only
> by entries that are actually shown.

Adding an extra early `match_pathspec()` check before making slow
system calls like `lstat()` makes sense, especially when most of the
index entries need to be skipped.  But if most of them would match,
then we would end up doing the same match_pathspec() calls twice for
each path, and run lstat() anyway, so you may also be able to
construct a perf test that demonstrates a case where this approach
is not a clear win (or even degradation), perhaps?
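
The trade-off can be made visible with a small counting sketch. It is not
the ls-files code: `scan()`, `match()`, `fake_lstat()` and
`struct counters` are hypothetical, `fnmatch()` stands in for
`match_pathspec()`, and the expensive call is only counted, not made. The
later match after the stat is kept, as in the patch.

```c
#include <fnmatch.h>
#include <stddef.h>

struct counters {
	unsigned int matches; /* pattern-match calls */
	unsigned int stats;   /* simulated lstat() calls */
};

static int match(const char *pattern, const char *path, struct counters *c)
{
	c->matches++;
	return !fnmatch(pattern, path, 0);
}

/* Stand-in for lstat(): only counts how often it would be called. */
static void fake_lstat(const char *path, struct counters *c)
{
	(void)path;
	c->stats++;
}

/* Returns the number of paths shown. */
static unsigned int scan(const char *pattern, const char **paths, size_t nr,
			 int prefilter, struct counters *c)
{
	unsigned int shown = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (prefilter && !match(pattern, paths[i], c))
			continue; /* skipped before the costly call */
		fake_lstat(paths[i], c);
		if (match(pattern, paths[i], c)) /* the later match is kept */
			shown++;
	}
	return shown;
}
```

With four paths and a pattern that matches one of them, the pre-filter
cuts the simulated stat calls from four to one at the cost of one extra
match. With a pattern that matches everything, it saves no stat calls and
doubles the number of matches.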

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e1a22b41b9..702c607183 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -450,6 +450,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>  			continue;
>  		if (ce_skip_worktree(ce))
>  			continue;
> +		/* Only entries shown by show_ce() satisfy --error-unmatch. */
> +		if (pathspec.nr &&
> +		    !match_pathspec(repo->index, &pathspec, fullname.buf,
> +				    fullname.len, max_prefix_len, NULL,
> +				    S_ISDIR(ce->ce_mode) ||
> +				    S_ISGITLINK(ce->ce_mode)))
> +			continue;
>  		stat_err = lstat(fullname.buf, &st);
>  		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
>  			error_errno("cannot lstat '%s'", fullname.buf);

Hmph.  In the current code, because there is no such pre-filtering,
show_ce() would unconditionally recurse into active submodules when
told to with the "--recurse-submodules" flag, even if your pathspec
does not match the submodule.  With this change, such a submodule
whose path does not match the pathspec would not even be seen by
show_ce().  Would it cause a change in behaviour?


* [PATCH v3 2/2] compat/posix.h: simplify GIT_GNUC_PREREQ() comparison
From: Dominik Loidolt @ 2026-06-08 12:44 UTC (permalink / raw)
  To: ps; +Cc: git, gitster, asedeno, asedeno, avarab, Dominik Loidolt
In-Reply-To: <20260608124419.38905-1-dominik.loidolt@univie.ac.at>

Replace the glibc-style bit-shift version comparison with an explicit
major/minor comparison. This is easier to read and is consistent with
the format already used by GIT_CLANG_PREREQ() and many BSD
<sys/cdefs.h> headers.

This has no runtime impact, as the macro is evaluated at compile time.
It is also more future-proof, as it no longer assumes that GCC version
components stay below 65536.
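
The two comparison styles can be contrasted as ordinary functions. This is
only a sketch: `prereq_shift()` and `prereq_explicit()` are hypothetical
names, and the compiler's version is passed in as arguments instead of
being read from `__GNUC__` and `__GNUC_MINOR__`, so that out-of-range
values can be tried.

```c
/* Shift-based comparison, in the style the patch removes. */
static int prereq_shift(int major, int minor, int maj, int min)
{
	return ((major << 16) + minor) >= ((maj << 16) + min);
}

/* Explicit major/minor comparison, in the style the patch introduces. */
static int prereq_explicit(int major, int minor, int maj, int min)
{
	return (major > maj) || (major == maj && minor >= min);
}
```

For ordinary version numbers the two agree. They differ only once a minor
component reaches 65536: a made-up version 4.70000 checked against 5.0
passes the shift-based test but fails the explicit one.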

Signed-off-by: Dominik Loidolt <dominik.loidolt@univie.ac.at>
---
 compat/posix.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/compat/posix.h b/compat/posix.h
index ffdfd91c7b..deefc43f28 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -4,22 +4,24 @@
 #define _FILE_OFFSET_BITS 64
 
 /*
- * Derived from Linux "Features Test Macro" header
- * Convenience macros to test the versions of gcc (or
- * a compatible compiler).
+ * Convenience macros to test the versions of GCC (or a compatible compiler).
  * Use them like this:
  *  #if GIT_GNUC_PREREQ (2,8)
- *   ... code requiring gcc 2.8 or later ...
+ *   ... code requiring GCC 2.8 or later ...
  *  #endif
  *
+ * Note that Clang and other compilers define __GNUC__ for compatibility; use
+ * GIT_CLANG_PREREQ() to check for specific Clang versions.
+ *
  * This macro of course is not part of POSIX, but we need it for the UNUSED
  * macro which is used by some of our POSIX compatibility wrappers.
-*/
+ */
 #if defined(__GNUC__) && defined(__GNUC_MINOR__)
 # define GIT_GNUC_PREREQ(maj, min) \
-	((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
+	((__GNUC__ > (maj)) || \
+	 (__GNUC__ == (maj) && __GNUC_MINOR__ >= (min)))
 #else
- #define GIT_GNUC_PREREQ(maj, min) 0
+# define GIT_GNUC_PREREQ(maj, min) 0
 #endif
 
 /* Similar for Clang. */
-- 
2.54.0



* [PATCH v3 1/2] compat/posix.h: enable UNUSED warning messages for Clang
From: Dominik Loidolt @ 2026-06-08 12:44 UTC (permalink / raw)
  To: ps; +Cc: git, gitster, asedeno, asedeno, avarab, Dominik Loidolt
In-Reply-To: <20260605094647.94805-1-dominik.loidolt@univie.ac.at>

Use a dedicated Clang version check for the UNUSED macro.

Commit 7c07f36ad2 (git-compat-util.h: GCC deprecated message arg only in
GCC 4.5+, 2022-10-05) restricted use of the deprecated attribute's
message argument in the UNUSED macro to GCC 4.5 or newer.

Clang identifies itself as GNUC 4.2.1 for compatibility, so
GIT_GNUC_PREREQ(4, 5) does not detect whether Clang supports the
deprecated("...") form. Add GIT_CLANG_PREREQ() macro and use it to
enable the UNUSED warning message for Clang 2.9 and newer.

Signed-off-by: Dominik Loidolt <dominik.loidolt@univie.ac.at>
---
v3:
- fix comment style nit
- remove unnecessary parentheses around __clang_minor__ >= (min)

 compat/posix.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/compat/posix.h b/compat/posix.h
index faaae1b655..ffdfd91c7b 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -22,6 +22,15 @@
  #define GIT_GNUC_PREREQ(maj, min) 0
 #endif

+/* Similar for Clang. */
+#if defined(__clang__) && defined(__clang_minor__) && defined(__clang_major__)
+# define GIT_CLANG_PREREQ(maj, min) \
+	((__clang_major__ > (maj)) || \
+	 (__clang_major__ == (maj) && __clang_minor__ >= (min)))
+#else
+# define GIT_CLANG_PREREQ(maj, min) 0
+#endif
+
 /*
  * UNUSED marks a function parameter that is always unused.  It also
  * can be used to annotate a function, a variable, or a type that is
@@ -35,7 +44,7 @@
  * When a parameter may be used or unused, depending on conditional
  * compilation, consider using MAYBE_UNUSED instead.
  */
-#if GIT_GNUC_PREREQ(4, 5)
+#if GIT_GNUC_PREREQ(4, 5) || GIT_CLANG_PREREQ(2, 9)
 #define UNUSED __attribute__((unused)) \
 	__attribute__((deprecated ("parameter declared as UNUSED")))
 #elif defined(__GNUC__)

base-commit: a89346e34a937f001e5d397ee62224e3e9852040
--
2.54.0



* [PATCH v2] parse-options: introduce die_for_missing_opt()
From: Siddharth Shrimali @ 2026-06-08 12:44 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, siddharthasthana31, toon, jn.avila,
	r.siddharth.shrimali
In-Reply-To: <20260603111044.39116-1-r.siddharth.shrimali@gmail.com>

Introduce die_for_missing_opt() to check if a dependent option is
present without its required prerequisite. This provides a centralized
API for simple option dependencies (X requires Y), inspired by and
matching the style of die_for_incompatible_opt{2,3,4}().

Use the new helper in builtin/add.c to replace the manual prerequisite
check for '--pathspec-file-nul' (requires '--pathspec-from-file'). This
case is already exercised by existing tests in t3704-add-pathspec-file.sh
and several other pathspec-file test scripts, ensuring the new helper is
verified without additional test code.
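
For readers who want to try the "X requires Y" check outside of Git, here
is a standalone sketch. `check_missing_opt()` is a hypothetical, non-fatal
variant: it writes the same message into a caller-supplied buffer and
returns -1 instead of calling die(), and it skips the translation wrapper.
The helper added by this patch is die_for_missing_opt().

```c
#include <stdio.h>

/*
 * Hypothetical non-fatal variant of the "X requires Y" check: write
 * the message into `err` and return -1 instead of calling die().
 */
static int check_missing_opt(int dependent_opt, const char *dependent_opt_name,
			     int required_opt, const char *required_opt_name,
			     char *err, size_t errlen)
{
	if (dependent_opt && !required_opt) {
		snprintf(err, errlen, "the option '%s' requires '%s'",
			 dependent_opt_name, required_opt_name);
		return -1;
	}
	return 0;
}
```

Only the combination "dependent option given, required option absent" is
reported; the other three combinations pass.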

Suggested-by: Christian Couder <christian.couder@gmail.com>
Suggested-by: Jean-Noël AVILA <jn.avila@free.fr>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
---
Changes since v1:
  - Squashed the implementation patch and the caller patch into a single,
    unified patch as suggested by Christian.
  - Renamed the helper function from die_for_require_opt() to
    die_for_missing_opt() to improve clarity.
  - Updated the argument names and logic order to better match the style of
    die_for_incompatible_opt*().
  - Dropped the conversion of the '--ignore-missing' check in builtin/add.c
    to keep this initial iteration strictly focused on a single, clean
    example ('--pathspec-file-nul').

 builtin/add.c   | 4 ++--
 parse-options.c | 7 +++++++
 parse-options.h | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index c859f66519..505834ad3f 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -462,6 +462,8 @@ int cmd_add(int argc,
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
 
+	die_for_missing_opt(pathspec_file_nul, "--pathspec-file-nul",
+			    !!pathspec_from_file, "--pathspec-from-file");
 	if (pathspec_from_file) {
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
@@ -470,8 +472,6 @@ int cmd_add(int argc,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
-	} else if (pathspec_file_nul) {
-		die(_("the option '%s' requires '%s'"), "--pathspec-file-nul", "--pathspec-from-file");
 	}
 
 	if (require_pathspec && pathspec.nr == 0) {
diff --git a/parse-options.c b/parse-options.c
index a676da86f5..11e40669eb 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1558,3 +1558,10 @@ void die_for_incompatible_opt4(int opt1, const char *opt1_name,
 		break;
 	}
 }
+
+void die_for_missing_opt(int dependent_opt, const char *dependent_opt_name,
+			 int required_opt, const char *required_opt_name)
+{
+	if (dependent_opt && !required_opt)
+		die(_("the option '%s' requires '%s'"), dependent_opt_name, required_opt_name);
+}
diff --git a/parse-options.h b/parse-options.h
index 0d1f738f8d..5b41d2fd39 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -460,6 +460,9 @@ static inline void die_for_incompatible_opt2(int opt1, const char *opt1_name,
 				  0, "");
 }
 
+void die_for_missing_opt(int dependent_opt, const char *dependent_opt_name,
+			 int required_opt, const char *required_opt_name);
+
 /*
  * Use these assertions for callbacks that expect to be called with NONEG and
  * NOARG respectively, and do not otherwise handle the "unset" and "arg"
-- 
2.54.0



* Re: [PATCH] describe: limit default ref iteration to tags
From: Junio C Hamano @ 2026-06-08 12:36 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: git, Jeff King, Patrick Steinhardt
In-Reply-To: <20260607-describe-tag-ref-scope-v1-1-653d232b86b5@gmail.com>

Tamir Duberstein <tamird@gmail.com> writes:

[jc: Removing Shawn from CC who passed away quite a while ago, RIP].

> Unless --all is given, get_name() rejects every ref outside refs/tags/.
> The rejection happens only after the ref backend has enumerated the ref,
> so repositories with many other refs spend most of a simple describe
> invocation visiting refs which cannot affect its result.
> ...
> Both revisions were built with -O3, -mcpu=native, and ThinLTO using
> Apple clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro
> (Mac16,6) with a 16-core Apple M4 Max (12 performance and four
> efficiency cores) and 128 GB RAM.
>
> Signed-off-by: Tamir Duberstein <tamird@gmail.com>
> ---
>  builtin/describe.c       |  3 +++
>  t/perf/p6100-describe.sh | 20 ++++++++++++++++++++
>  2 files changed, 23 insertions(+)

Interesting.  How would this relate to and work well with
<20260601233727.43558-1-jacob.e.keller@intel.com>?

> +test_lazy_prereq PERF_REFFILES '
> +	test "$(git rev-parse --show-ref-format)" = files
> +'
> +
> +ref_count=10000
> +
>  # clear out old tags and give us a known state
>  test_expect_success 'set up tags' '
>  	git for-each-ref --format="delete %(refname)" refs/tags >to-delete &&
> @@ -27,4 +33,18 @@ test_perf 'describe HEAD with one tag' '
>  	git describe --match=new HEAD
>  '
>  
> +test_expect_success PERF_REFFILES 'set up many unrelated refs' '
> +	git tag -m tip tip HEAD &&
> +	for i in $(test_seq $ref_count)
> +	do
> +		printf "create refs/heads/describe-perf/%05d HEAD\n" $i ||
> +		return 1
> +	done >instructions &&
> +	git update-ref --stdin <instructions
> +'
> +
> +test_perf 'describe exact tag with many loose refs' --prereq PERF_REFFILES '
> +	git describe --exact-match HEAD
> +'
> +

Is there a strong reason to guard this new test behind
`PERF_REFFILES`?

Even though the penalty of enumerating 10,000 unrelated loose
references may be most pronounced in the `files` backend, skipping
unnecessary reference enumeration is an architectural win for other
backends (like `reftable` or a fully packed repository) as well.

If we drop `PERF_REFFILES` and retitle the test to "describe exact
tag with many unrelated refs", we could run it unconditionally to
benchmark the improvement across all storage formats.


* Re: [PATCH v3 4/6] diff: add long-running diff process via diff.<driver>.process
From: Junio C Hamano @ 2026-06-08 12:26 UTC (permalink / raw)
  To: Michael Montalbo
  Cc: Johannes Schindelin, Michael Montalbo via GitGitGadget, git
In-Reply-To: <CAC2QwmJwxpnrPNW6YLm2uXKaYjkUwjVsPN_U+c52m0rNe95_Nw@mail.gmail.com>

Michael Montalbo <mmontalbo@gmail.com> writes:

> On Sun, Jun 7, 2026 at 7:36 AM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>>
>> Hi Michael,
>>
>> I stumbled upon this patch when it broke CI in Git for Windows, where we
>> do _not_ use `NO_PYTHON`, even though Python is unavailable in the
>> build/test CI jobs. The existing tests handle this situation gracefully,
>> this here patch does not:
>> ...
>> Given the complexity of what t4080 tries to test (error, abort, crash,
>> bad-sync, no-hunks, multiple files in one session, capability
>> negotiation), it would unfortunately be infeasible to use `test-tool
>> pkt-line` from a shell script implementing that `diff.*.process` protocol.
>>
>> So I've spiked a demo of what the `test-tool diff-process-backend` could look
>> like (letting Opus do the menial typing, so that I can enjoy at least part
>> of a sunny Sunday outside), which also passes the CI build and test:
>> https://github.com/dscho/git/commit/b6e3c93381b00929476c3a00155f7cf7334a22e6
>>
>> That commit is of course not intended to be used as-is; Feel free to pick
>> code parts of it and integrate them into your topic branch. Or write your
>> own test-tool helper from scratch if that's more your jam.
>>
>
> Johannes, thank you for the great feedback. The historical context is
> really helpful and
> the concerns you raise make a lot of sense. I will take a look at your
> spike and also work
> on removing Python from the test.

Another request.

Please do not force readers to scroll through a ~800 line message
just to read only 5 lines of response from you.  Keep relevant parts
of the message you are responding to in your message to help readers
understand the context in which your response was made, but trim
everything else that is not relevant from your quote.

Thanks.


* Re: [PATCH RFC 1/2] builtin/history: abort reword on unchanged message
From: Junio C Hamano @ 2026-06-08 12:16 UTC (permalink / raw)
  To: Pablo Sabater; +Cc: git, Patrick Steinhardt, Kaartic Sivaraam
In-Reply-To: <20260607-ps-history-reword-v1-1-ba43a3cbb81b@gmail.com>

Pablo Sabater <pabloosabaterr@gmail.com> writes:

> When using `git history reword` if the new message is the same as the
> original it continues anyway creating a new commit with the same
> message and updates its descendants, modifying the history after this
> 'reworded' commit even though there was no actual change.
>
> `git commit --amend` and `git rebase -i` + reword share this behavior,
> however `git history reword` is different:
> 1. Works in-memory without touching the index or the worktree [1], so
>    there are no side effects like staged files that could justify
>    rewriting the history when the commit message is the same.
> 2. `git history` by default updates all the branches [2] that contain the
>    original commit making it more costly than `git rebase -i` that only
>    updates the current branch.

I think the reasoning is flawed.

Both "git commit --amend" and "git rebase -i", even with no changes
to the tree, parents, or the message, update the committer timestamp
(and perhaps the committer identity running the command may be
different from the original).  Updating this info is one of the
important effects of the command.

And "history" being more capable than "rebase" is a wrong excuse to
make the system behave inconsistently between commands that have
similar features [*1*].  In a situation where letting 'history'
update all the relevant branches is desirable, if the command behaves
differently from the way the user likes (and if the way 'rebase -i'
works is the one the user likes), you'd end up forcing the user to
use 'rebase -i' when 'history' would have been more appropriate.

Having said that, I personally think that the current behaviours of
`commit --amend` and `history reword` are both _wrong_ [*2*].

You may start `git commit --amend`, and after staring at the
existing commit log message for some time in your editor, it is
quite natural for you to decide that leaving the commit as-is is the
right thing [*3*] in your situation.  It may have been a better
design for the system to notice this situation and leave the commit
as-is, with an override option `--force` to allow users to forcibly
update the committer ident and timestamp in the commit header.  I am
not a `history reword` user (yet), but from the motivation you
described for this patch, I sense that the story is the same there.

`git rebase -i A`, when A is truly an ancestor at the bottom of a
linear history leading to HEAD, behaves slightly better.  It gives
you a todo list with a bunch of `pick` insns, and when you do not
edit the earliest 'pick's in the todo list, these earliest commits are
left as-is.  It may still share the same issue that a 'reword' that
you ended up not rewording (or an 'edit' where you ended up not
touching the tree or log message) does still recreate a new commit
object, though.

`git rebase -i` may have an excuse because it, unlike "git
commit --amend", operates on multiple commits by design.  A single
"--force" option given to the command would not have worked as an
escape hatch to allow the user to tell the command "in this reword
of this particular commit, I ended up doing nothing, but I still
want an updated committer log timestamp".  Perhaps giving the
"--force" (or "--force-rewrite") option at "rebase --continue" time
may work, but in any case, unless we plan to transition to this
"better" default behaviour at a big version boundary, speculating
what a "better" behaviour would have been may be fun but not very
productive.

[Footnote]

 *1* Besides, doesn't "--update-refs" in "rebase -i" allow you to
     adjust the branches?

 *2* But it is an established behaviour people _rely_ on, so even
     though it may have been better if these commands behaved
     differently, it probably is a bit too late to change it now.

 *3* This includes the case where the original author is especially
     difficult to work with and would complain about any change to their
     commits, even if the only change you made for them is a
     typofix.  Fixing a small typo/grammo may not be worth your time
     and unpleasant exchanges with them after touching their commit.


* Re: [PATCH RFC 2/2] builtin/history: print feedback after successful reword
From: Junio C Hamano @ 2026-06-08 12:16 UTC (permalink / raw)
  To: Pablo Sabater; +Cc: git, Patrick Steinhardt, Kaartic Sivaraam
In-Reply-To: <20260607-ps-history-reword-v1-2-ba43a3cbb81b@gmail.com>

Pablo Sabater <pabloosabaterr@gmail.com> writes:

> Unlike `git commit --amend` and `git rebase -i`, `git history reword`
> doesn't print anything, this makes it feel empty for a porcelain command
> and hard to tell if the command did anything without using other
> commands like `git log <commit>` to check if the reword was done.
>
> Print a message on successful rewords so the user has feedback about it.
>
> Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
> ---
>  builtin/history.c         |  4 ++++
>  t/t3451-history-reword.sh | 14 ++++++++++++++
>  2 files changed, 18 insertions(+)
>
> diff --git a/builtin/history.c b/builtin/history.c
> index 51a22a9a1c..0f1ba3b531 100644
> --- a/builtin/history.c
> +++ b/builtin/history.c
> @@ -739,6 +739,10 @@ static int cmd_history_reword(int argc,
>  		goto out;
>  	}
>  
> +	fprintf(stderr, _("Successfully reworded commit %s to %s\n"),
> +		repo_find_unique_abbrev(repo, &original->object.oid, DEFAULT_ABBREV),
> +		repo_find_unique_abbrev(repo, &rewritten->object.oid, DEFAULT_ABBREV));
> +
>  	ret = 0;
>  
>  out:

Do other commands in "git history" (split is in 'master', drop and
fixup are cooking) behave with similar verbosity?  Consistency within
the same "history" umbrella matters more than being similar with
other commands that can be used for similar purposes.

> diff --git a/t/t3451-history-reword.sh b/t/t3451-history-reword.sh
> index 54ea8a7207..4b22d761e3 100755
> --- a/t/t3451-history-reword.sh
> +++ b/t/t3451-history-reword.sh
> @@ -416,4 +416,18 @@ test_expect_success 'aborts if the commit message is the same' '
>  	)
>  '
>  
> +test_expect_success 'prints feedback on successful reword' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit first &&
> +
> +		reword_with_message HEAD 2>err <<-EOF &&
> +		first reworded
> +		EOF
> +		test_grep "Successfully reworded" err
> +	)
> +'
> +
>  test_done


* Re: [PATCH v3 4/6] diff: add long-running diff process via diff.<driver>.process
From: Junio C Hamano @ 2026-06-08 12:06 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Michael Montalbo via GitGitGadget, git, Michael Montalbo
In-Reply-To: <c7987f11-9181-3975-552c-14e74abb2c97@gmx.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> So the conscious project direction has been: fold pkt-line test backends
> into `test-tool` and drop the scripting-language prereq. Reintroducing the
> same shape in Python would walk this back.
> ...
> The `PYTHON` prereq exists in exactly five files today, all `git p4`
> related (where Python is an intrinsic prerequisite given that `git-p4.py`
> _is_ written in Python): `t/lib-git-p4.sh`, `t/t9802-git-p4-filetype.sh`,
> `t/t9810-git-p4-rcs.sh`, `t/t9835-git-p4-metadata-encoding-python2.sh`,
> and `t/t9836-git-p4-metadata-encoding-python3.sh`.
> ...
> That commit is of course not intended to be used as-is; Feel free to pick
> code parts of it and integrate them into your topic branch. Or write your
> own test-tool helper from scratch if that's more your jam.

Showing better direction to new folks with such clear thinking is
very much appreciated.  Even though it is natural and perfectly OK
for tests that interact with parts of Git that are written in these
languages to depend on them (e.g., we are OK for gitweb tests to
require Perl), we should consciously keep ourselves clean and not
add unnecessary dependencies.


* Re: [PATCH v3 0/8] setup: centralize object database creation
From: Junio C Hamano @ 2026-06-08 12:06 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk
In-Reply-To: <CAOLa=ZQwVbLsOcajaxQwtkTPm=4St7EiGEEyL6_B0o3Tt1v1pw@mail.gmail.com>

Karthik Nayak <karthik.188@gmail.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> Hi,
>>
>> this small patch series refactors the logic for how we discover and
>> configure repositories. Most importantly, this involves the following
>> two steps:
>>
>>   1. We unify the logic to apply the repository format, which is
>>      currently open-coded across multiple sites. These sites have
>>      already diverged, where some repository extensions are not
>>      consistently applied.
>>
>>   2. We then centralize creation of the object database to happen at the
>>      same time we apply the repository format.
>>
>> The end result is that we apply the repository format exactly once, and
>> that's also the point in time where we can finalize the setup of the
>> repo's data structures as we know about all details of the repo at that
>> time. Ultimately, this makes it trivial to introduce the "objectStorage"
>> extension, even though that's not part of this patch series.
>> ...
>> 4:  81b92bca7f = 4:  b0d7c11fe6 repository: stop initializing the object database in `repo_set_gitdir()`
>> 5:  807fc56353 = 5:  d0af56fdae setup: stop creating the object database in `setup_git_env()`
>> 6:  96563ff99f = 6:  3e75c5b0a6 setup: stop initializing object database without repository
>> 7:  c14f45169c = 7:  50fa2fdb3c repository: stop reading loose object map twice on repo init
>> 8:  e67c6e66d6 = 8:  4dff9d1794 setup: construct object database in `apply_repository_format()`
>>
>
> The range-diff looks good and as expected. Thanks!

Thanks, both of you.  Let me mark the topic for 'next', then.


* Re: [PATCH v2] prio-queue: use cascade-down for faster extract-min
From: Junio C Hamano @ 2026-06-08 11:56 UTC (permalink / raw)
  To: René Scharfe
  Cc: Kristofer Karlsson, Kristofer Karlsson via GitGitGadget, git
In-Reply-To: <1aa5b755-0f74-46d5-bd6e-a9cb7f3fbb12@web.de>

René Scharfe <l.s.r@web.de> writes:

> I think I mostly understand it now: cascade is better in prio_queue_get()
> because the sift-down item is from the bottom and will likely end up back
> at the bottom, just of a different branch of the heap.  Thus a sift-down
> costs 3 compares times the number of levels, while a cascade costs just
> 2 compares times the number of levels and there is likely little to no
> need to sift it back up.
>
> For prio_queue_replace() we sift down a random item, though; we don't
> know where it will end up.  If it belongs at the very top then sift-down
> just needs 3 compares, while cascade needs 2 compares times the number
> of levels to bring the hole down and the same to bring the item up.

An excellent observation, showing a clear and analytical mind.  This is
one of the reasons why I love reading review messages from you (and
also the explanations in the proposed commit log messages in your
patches).

Thanks.
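
To make the trade-off quoted above concrete, here is a small sketch that counts comparisons for both strategies. It is a textbook binary min-heap of ints, not the code in prio-queue.c, and all names in it are made up for illustration. Its per-level constants (2 compares for sift-down, 1 for moving the hole down) are therefore lower than the 3 and 2 quoted above, but the shape of the trade-off is the same. Both functions assume nr >= 1 and treat heap[0] as a hole whose old value has already been taken out.

```c
#include <stddef.h>

/* Incremented on every comparison so the strategies can be compared. */
static unsigned long compare_count;

static int less_than(int a, int b)
{
	compare_count++;
	return a < b;
}

/*
 * Classic sift-down: move the hole at the root down only as far as
 * needed, then drop `item` into it.  Stops early when the item belongs
 * near the top.
 */
void place_with_sift_down(int *heap, size_t nr, int item)
{
	size_t hole = 0, child;

	while ((child = 2 * hole + 1) < nr) {
		if (child + 1 < nr && less_than(heap[child + 1], heap[child]))
			child++;
		if (!less_than(heap[child], item))
			break;
		heap[hole] = heap[child];
		hole = child;
	}
	heap[hole] = item;
}

/*
 * Cascade: move the hole all the way to a leaf by promoting the smaller
 * child at each level without looking at `item`, then sift the item up
 * from that leaf.
 */
void place_with_cascade(int *heap, size_t nr, int item)
{
	size_t hole = 0, child, parent;

	while ((child = 2 * hole + 1) < nr) {
		if (child + 1 < nr && less_than(heap[child + 1], heap[child]))
			child++;
		heap[hole] = heap[child];
		hole = child;
	}
	while (hole > 0) {
		parent = (hole - 1) / 2;
		if (!less_than(item, heap[parent]))
			break;
		heap[hole] = heap[parent];
		hole = parent;
	}
	heap[hole] = item;
}
```

With the seven-slot heap { hole, 2, 3, 4, 5, 6, 7 }, placing 100 (which belongs at the bottom) takes 4 compares with sift-down and 3 with cascade, while placing 1 (which belongs at the top) takes 2 compares with sift-down and 4 with cascade. The first case resembles prio_queue_get(), the second the unlucky case for prio_queue_replace().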


* inconsistent order of --diff-algorithm variants in man pages
From: Vincent Lefevre @ 2026-06-08 11:26 UTC (permalink / raw)
  To: git

In Documentation/diff-algorithm-option.adoc, which is used by the
git-blame(1) and git-diff(1) man pages:

`--diff-algorithm=(patience|minimal|histogram|myers)`::
        Choose a diff algorithm. The variants are as follows:
+
--
   `default`;;
   `myers`;;
        The basic greedy diff algorithm. Currently, this is the default.
   `minimal`;;
        Spend extra time to make sure the smallest possible diff is
        produced.
   `patience`;;
        Use "patience diff" algorithm when generating patches.
   `histogram`;;
        This algorithm extends the patience algorithm to "support
        low-occurrence common elements".
--

I think that using the same order in the --diff-algorithm line and
in the description that follows would be better, i.e.

  --diff-algorithm=(myers|minimal|patience|histogram)

FYI, the text was added in 07924d4d50e5304fb53eb60aaba8aef31d4c4e5e
in 2013, but without any explanation on this difference.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)


* Re: [BUG] "git diff --word-diff" gives a diff while they are only space changes
From: Vincent Lefevre @ 2026-06-08 10:58 UTC (permalink / raw)
  To: Michael Montalbo; +Cc: Junio C Hamano, Chris Torek, Johannes Sixt, git
In-Reply-To: <CAC2QwmKjr2eiFNPPmERq7n-UjE-SF2vE4eHDanYE-4heWxzQVw@mail.gmail.com>

On 2026-05-28 12:25:01 -0700, Michael Montalbo wrote:
> > Thanks for the ideas, Chris. Here is my attempt at synthesizing Chris'
> > suggestions and Junio's feedback:
> >
> >   The `--word-diff` option operates by taking the same line-by-line
> >   diff that is produced without the option and computing
> >   word-by-word changes within each hunk.  This may produce a
> >   larger diff than a dedicated word-diff tool would.  If Git
> >   acquires a different implementation in the future, the output
> >   may change.  Note that this is similar to the `--diff-algorithm`
> >   option, which may also change the output.
> >
> > Does this work?
> 
> Updated the patch with the revised wording:
> https://lore.kernel.org/git/pull.2113.git.1778686956622.gitgitgadget@gmail.com/T/#t
> 
> Please feel free to pick up, modify, or drop as appropriate.

Just to say that this new text is fine for me.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)


* Re: [PATCH v3] doc: fix typos via codespell
From: Junio C Hamano @ 2026-06-08 10:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Andrew Kreimer, git
In-Reply-To: <3398ef40-1547-4324-2cfc-97b9e2b24854@gmx.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> I'll squash the fix-up I already had into [v2] that I have queued,
>> which should be sufficient to get to the state this [v3] should have
>> been, I think.
>
> The mechanical nature of these fixes explains another issue: One typo fix
> touched two test fixtures which might seem harmless at first, but those
> fixtures are littered with checksums that relied on the original
> (misspelled) form.
>
> Please adopt this follow-up into ak/typofixes:

Thanks.  You often keep your eyes peeled to spot these forgotten
bits, which is very very much appreciated.

I briefly wondered if this was to be done by me or it was a request
to Andrew, but since I've promised to squash the update into what I
have myself, I'll do another squashing into the result, instead of
asking Andrew to update the v2+v3 with these fixes.

Luckily, b8b38eee85 is *not* yet in 'next', so we can just squash
the [v3] from Andrew and this fixup from you into it to keep the
test passing with or without the "typo fix", to maintain
bisectability.

Somebody has to come up with a bit of a tweak to the log message to
explain what has been done in all these three pieces when it
happens.  I may ask some agent to prepare a draft, review it myself,
and perhaps redo it myself from the originals without taking
anything from agent output, as I am still skeptical about all this
AI hype ;-).


> -- snipsnap --
> From 54aa4f7f7adf0c0e02b5463b5f7f64547e80cbce Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Sat, 6 Jun 2026 22:09:04 +0200
> Subject: [PATCH] svn-test-dumps: restore checksums after the `hapenning` typo
>  fix
>
> b8b38eee85 (doc: fix typos via codespell, 2026-05-31) ran codespell
> against the entire tree and rewrote `hapenning` to `happening`
> inside the body of `t/t9150/svk-merge.dump` and
> `t/t9151/svn-mergeinfo.dump`. Both files are Subversion dump
> files: each `Node-path:` block embeds `Text-content-md5` /
> `Text-content-sha1` for the new content and, on copy operations,
> `Text-copy-source-md5` / `Text-copy-source-sha1` for the source
> content as observed at the cited revision. None of those
> checksums were updated, so loading the dumps with svnadmin 1.14.5
> (present in `ubuntu:rolling`'s CI image) fails immediately with
> `E200014: Checksum mismatch for '/trunk/Makefile'` and the two
> tests stop before any of the assertions they actually exercise can
> run. The CI failure has been visible on every `seen`-based
> linux-sha256 / linux-reftable build since 2026-06-02 (the first
> run that picked up b8b38eee85).
>
> Because `happening` and `hapenning` have the same length, no
> header byte counts need updating; only the embedded checksums do.
> Recompute the MD5 and SHA1 of every text body in the two dumps,
> and for every `Node-copyfrom-path` consult the path's most
> recently defined content to refresh the corresponding
> `Text-copy-source-md5` / `Text-copy-source-sha1`. After this,
> `svnadmin load -q` accepts both dumps cleanly and t9150 and t9151
> get past their setup steps.
>
> This commit only touches the two dump files; the typo correction
> in their surrounding human-readable comment is preserved.
>
> Assisted-by: Opus 4.7
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  t/t9150/svk-merge.dump     | 10 ++++----
>  t/t9151/svn-mergeinfo.dump | 48 +++++++++++++++++++-------------------
>  2 files changed, 29 insertions(+), 29 deletions(-)
>
> diff --git a/t/t9150/svk-merge.dump b/t/t9150/svk-merge.dump
> index 6a8ac81b11e6..3c46afc18a65 100644
> --- a/t/t9150/svk-merge.dump
> +++ b/t/t9150/svk-merge.dump
> @@ -71,7 +71,7 @@ Node-kind: file
>  Node-action: add
>  Prop-content-length: 10
>  Text-content-length: 2401
> -Text-content-md5: bfd8ff778d1492dc6758567373176a89
> +Text-content-md5: d6a3917748b0c09ad85c2783f1d4dac1
>  Content-length: 2411
>  
>  PROPS-END
> @@ -201,7 +201,7 @@ Node-path: branches/left/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2465
> -Text-content-md5: 16e38d9753b061731650561ce01b1195
> +Text-content-md5: 3f413450a7a26596d9e512ee385a9b19
>  Content-length: 2465
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -305,7 +305,7 @@ Node-path: trunk/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2521
> -Text-content-md5: 0668418a621333f4aa8b6632cd63e2a0
> +Text-content-md5: 89788781014278d76ff23648b8b08b2d
>  Content-length: 2521
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -412,7 +412,7 @@ Node-path: branches/left/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2593
> -Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
> +Text-content-md5: 706d73919e6f319a0e624aa50c8b8b38
>  Content-length: 2593
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -529,7 +529,7 @@ Node-path: trunk/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2713
> -Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
> +Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
>  Content-length: 2713
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> diff --git a/t/t9151/svn-mergeinfo.dump b/t/t9151/svn-mergeinfo.dump
> index d5e169563745..ad741400104e 100644
> --- a/t/t9151/svn-mergeinfo.dump
> +++ b/t/t9151/svn-mergeinfo.dump
> @@ -80,8 +80,8 @@ Node-kind: file
>  Node-action: add
>  Prop-content-length: 10
>  Text-content-length: 2401
> -Text-content-md5: bfd8ff778d1492dc6758567373176a89
> -Text-content-sha1: 103205ce331f7d64086dba497574734f78439590
> +Text-content-md5: d6a3917748b0c09ad85c2783f1d4dac1
> +Text-content-sha1: 9ffe895eb95d4a7c2ee2712dcf7a13637edee6a9
>  Content-length: 2411
>  
>  PROPS-END
> @@ -194,8 +194,8 @@ Node-kind: file
>  Node-action: add
>  Node-copyfrom-rev: 2
>  Node-copyfrom-path: trunk/Makefile
> -Text-copy-source-md5: bfd8ff778d1492dc6758567373176a89
> -Text-copy-source-sha1: 103205ce331f7d64086dba497574734f78439590
> +Text-copy-source-md5: d6a3917748b0c09ad85c2783f1d4dac1
> +Text-copy-source-sha1: 9ffe895eb95d4a7c2ee2712dcf7a13637edee6a9
>  
>  
>  Revision-number: 4
> @@ -228,8 +228,8 @@ Node-kind: file
>  Node-action: add
>  Node-copyfrom-rev: 2
>  Node-copyfrom-path: trunk/Makefile
> -Text-copy-source-md5: bfd8ff778d1492dc6758567373176a89
> -Text-copy-source-sha1: 103205ce331f7d64086dba497574734f78439590
> +Text-copy-source-md5: d6a3917748b0c09ad85c2783f1d4dac1
> +Text-copy-source-sha1: 9ffe895eb95d4a7c2ee2712dcf7a13637edee6a9
>  
>  
>  Revision-number: 5
> @@ -254,8 +254,8 @@ Node-path: branches/left/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2465
> -Text-content-md5: 16e38d9753b061731650561ce01b1195
> -Text-content-sha1: 36da4b84ea9b64218ab48171dfc5c48ae025f38b
> +Text-content-md5: 3f413450a7a26596d9e512ee385a9b19
> +Text-content-sha1: b3cd389d63c5e3af4fe22b7464cf97968662ad1a
>  Content-length: 2465
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -359,8 +359,8 @@ Node-path: branches/right/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2521
> -Text-content-md5: 0668418a621333f4aa8b6632cd63e2a0
> -Text-content-sha1: 4f29afd038e52f45acb5ef8c41acfc70062a741a
> +Text-content-md5: 89788781014278d76ff23648b8b08b2d
> +Text-content-sha1: f52afb2d6230e5a418416b77c3c9ad610edfd202
>  Content-length: 2521
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -467,8 +467,8 @@ Node-path: branches/left/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2529
> -Text-content-md5: f6b197cc3f2e89a83e545d4bb003de73
> -Text-content-sha1: 2f656677cfec0bceec85e53036ffb63e25126f8e
> +Text-content-md5: abcac8d04eb061b0a3053e359e44a2a0
> +Text-content-sha1: 866caf95e04809a5ed897aea41075b24833612ea
>  Content-length: 2529
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -572,8 +572,8 @@ Node-path: branches/left/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2593
> -Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
> -Text-content-sha1: a13de8e23f1483efca3e57b2b64b0ae6f740ce10
> +Text-content-md5: 706d73919e6f319a0e624aa50c8b8b38
> +Text-content-sha1: 9992d5a9aea960c7856ef6a9364aedd5b710ef53
>  Content-length: 2593
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -689,8 +689,8 @@ Node-kind: file
>  Node-action: add
>  Node-copyfrom-rev: 8
>  Node-copyfrom-path: branches/left/Makefile
> -Text-copy-source-md5: 5ccff689fb290e00b85fe18ee50c54ba
> -Text-copy-source-sha1: a13de8e23f1483efca3e57b2b64b0ae6f740ce10
> +Text-copy-source-md5: 706d73919e6f319a0e624aa50c8b8b38
> +Text-copy-source-sha1: 9992d5a9aea960c7856ef6a9364aedd5b710ef53
>  
>  
>  
> @@ -761,8 +761,8 @@ Node-path: trunk/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2593
> -Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
> -Text-content-sha1: a13de8e23f1483efca3e57b2b64b0ae6f740ce10
> +Text-content-md5: 706d73919e6f319a0e624aa50c8b8b38
> +Text-content-sha1: 9992d5a9aea960c7856ef6a9364aedd5b710ef53
>  Content-length: 2593
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -942,8 +942,8 @@ Node-path: trunk/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2713
> -Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
> -Text-content-sha1: 46d9377d783e67a9b581da110352e799517c8a14
> +Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
> +Text-content-sha1: 0cba212974e2b288389d73317f3220be11158e00
>  Content-length: 2713
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -1166,8 +1166,8 @@ Node-path: branches/left-sub/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2713
> -Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
> -Text-content-sha1: 46d9377d783e67a9b581da110352e799517c8a14
> +Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
> +Text-content-sha1: 0cba212974e2b288389d73317f3220be11158e00
>  Content-length: 2713
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
> @@ -1408,8 +1408,8 @@ Node-path: branches/left/Makefile
>  Node-kind: file
>  Node-action: change
>  Text-content-length: 2713
> -Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
> -Text-content-sha1: 46d9377d783e67a9b581da110352e799517c8a14
> +Text-content-md5: 1c05266da99e8f01a5ccf816be47a484
> +Text-content-sha1: 0cba212974e2b288389d73317f3220be11158e00
>  Content-length: 2713
>  
>  # -DCOLLISION_CHECK if you believe that SHA1's
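
The commit message quoted above argues that only the checksums need refreshing because the typo fix keeps the byte count the same. A tiny sketch of that bookkeeping follows. The function names are made up for illustration, and the rule that Content-length is the sum of the property and text lengths is read off the headers shown in the diff (10 + 2401 = 2411), not taken from a Subversion API.

```c
#include <string.h>

/* Returns 1 when replacing `before` with `after` keeps the byte count. */
int same_byte_length(const char *before, const char *after)
{
	return strlen(before) == strlen(after);
}

/*
 * Content-length as it appears in the node headers of the dumps above:
 * the property section plus the text body.  Nodes without a
 * Prop-content-length header are treated as having 0 property bytes.
 */
long node_content_length(long prop_content_length, long text_content_length)
{
	return prop_content_length + text_content_length;
}
```

Since "hapenning" and "happening" are both nine bytes, Text-content-length and Content-length stay as they were, and only the MD5 and SHA1 headers change.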


* Re: [PATCH RFC 1/2] builtin/history: abort reword on unchanged message
From: Pablo Sabater @ 2026-06-08 10:52 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Kaartic Sivaraam
In-Reply-To: <aiaLxNwGPko5HS2G@pks.im>

On Mon, Jun 8, 2026 at 11:30, Patrick Steinhardt (<ps@pks.im>) wrote:
>
> On Sun, Jun 07, 2026 at 10:07:20PM +0200, Pablo Sabater wrote:
> > When using `git history reword` if the new message is the same as the
> > original it continues anyway creating a new commit with the same
> > message and updates its descendants, modifying the history after this
> > 'reworded' commit even though there was no actual change.
> >
> > `git commit --amend` and `git rebase -i` + reword share this behavior,
> > however `git history reword` is different:
> > 1. Works in-memory without touching the index or the worktree [1], so
> >    there are no side effects like staged files that could justify
> >    rewriting the history when the commit message is the same.
> > 2. `git history` by default updates all the branches [2] that contain the
> >    original commit making it more costly than `git rebase -i` that only
> >    updates the current branch.
> >
> > Add a check if the original commit message is the same as the new one
> > and abort if so.
> >
> > [1]: https://lore.kernel.org/git/20260113-b4-pks-history-builtin-v11-8-e74ebfa2652d@pks.im/
> > [2]: https://git-scm.com/docs/git-history#_description
>
> Nit: I feel like both of the links don't really add much value.

I'll just drop em.

>
> > Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
> > ---
> >  builtin/history.c         | 10 ++++++++++
> >  t/t3451-history-reword.sh | 20 ++++++++++++++++++++
> >  2 files changed, 30 insertions(+)
> >
> > diff --git a/builtin/history.c b/builtin/history.c
> > index 0fc06fb204..51a22a9a1c 100644
> > --- a/builtin/history.c
> > +++ b/builtin/history.c
> > @@ -135,6 +135,13 @@ static int commit_tree_ext(struct repository *repo,
> >                                         original_body, action, &commit_message);
> >               if (ret < 0)
> >                       goto out;
> > +
> > +             if (!strcmp(original_body, commit_message.buf)) {
> > +                     fprintf(stderr, _("Message unchanged,"
> > +                                       " aborting reword.\n"));
> > +                     ret = 1;
> > +                     goto out;
> > +             }
> >       } else {
> >               strbuf_addstr(&commit_message, original_body);
> >       }
>
> We also execute this logic via "git history fixup --reedit-message", and
> here it wouldn't make sense to abort the commit in case the message is
> unchanged.

True, I hadn't thought of that. I put the check here because this is
where both the original and the new message are available before the
new commit is created. We could let ret = 1 mean that the commit
message is unchanged; cmd_history_fixup would then ignore ret = 1,
while cmd_history_reword would handle the abort.
What do you think?
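
A minimal, standalone sketch of that idea, just to make the control
flow concrete. It is not the actual builtin/history.c code: the enum,
check_message(), reword_sketch() and fixup_sketch() are hypothetical
names, and plain C strings stand in for the strbuf handling.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes; the patch itself only sets ret = 1. */
enum message_status {
	MESSAGE_CHANGED = 0,
	MESSAGE_UNCHANGED = 1,
};

/*
 * Stand-in for the message-editing step in commit_tree_ext(): compare
 * the original body with the edited message and report the result
 * instead of aborting at this point.
 */
static int check_message(const char *original_body, const char *edited)
{
	return strcmp(original_body, edited) ? MESSAGE_CHANGED
					     : MESSAGE_UNCHANGED;
}

/* reword: an unchanged message means there is nothing to do, so abort. */
static int reword_sketch(const char *original_body, const char *edited)
{
	if (check_message(original_body, edited) == MESSAGE_UNCHANGED) {
		fprintf(stderr, "Message unchanged, aborting reword.\n");
		return 1;
	}
	return 0; /* would go on to rewrite the commit */
}

/* fixup --reedit-message: the tree changed, so proceed either way. */
static int fixup_sketch(const char *original_body, const char *edited)
{
	(void)check_message(original_body, edited); /* ignored on purpose */
	return 0; /* would go on to rewrite the commit */
}
```

With this split the shared helper only reports what it saw, and each
subcommand decides whether an unchanged message is a reason to stop.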

>
> Patrick

--
Pablo


* Re: [PATCH RFC 2/2] builtin/history: print feedback after successful reword
From: Pablo Sabater @ 2026-06-08 10:45 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Kaartic Sivaraam
In-Reply-To: <aiaLyQvo8kqfv4js@pks.im>

On Mon, Jun 8, 2026 at 11:30, Patrick Steinhardt (<ps@pks.im>) wrote:
>
> On Sun, Jun 07, 2026 at 10:07:21PM +0200, Pablo Sabater wrote:
> > Unlike `git commit --amend` and `git rebase -i`, `git history reword`
> > doesn't print anything, this makes it feel empty for a porcelain command
> > and hard to tell if the command did anything without using other
> > commands like `git log <commit>` to check if the reword was done.
> >
> > Print a message on successful rewords so the user has feedback about it.
>
> I dunno about this one. My take here is that a command should be silent
> unless it has something to say, for example when it couldn't honor the
> user's request [1].

But neither `git commit --amend` nor `git rebase -i` follows this rule
of silence.
>
> > diff --git a/builtin/history.c b/builtin/history.c
> > index 51a22a9a1c..0f1ba3b531 100644
> > --- a/builtin/history.c
> > +++ b/builtin/history.c
> > @@ -739,6 +739,10 @@ static int cmd_history_reword(int argc,
> >               goto out;
> >       }
> >
> > +     fprintf(stderr, _("Successfully reworded commit %s to %s\n"),
> > +             repo_find_unique_abbrev(repo, &original->object.oid, DEFAULT_ABBREV),
> > +             repo_find_unique_abbrev(repo, &rewritten->object.oid, DEFAULT_ABBREV));
> > +
>
> Seeing the implementation also raises a couple of questions:
>
>   - Why do we mention the rewritten commit, only? Shouldn't we also
>     print the changed HEAD?

Because `git history reword <commit>` is for a single commit. After
the reword the hash changes and the original hash is no longer useful
to check the rewritten message. If I want to see how it is now:

  $ git history reword aabb
  $ git log aabb <- I can't check how it is now because this is the old one

So to check the new one I have to look up the new hash. If it's the
first of 20 commits with long messages, I have to run git log
--oneline, find the hash and then run git log new_hash, which IMO is
unnecessary when git history reword can output the new hash itself.

>
>   - Why don't we print any of the other rewritten branches?

I hadn't thought of that. It's nice that it modifies all branches; I
just assumed that the most relevant piece of information is the new
commit hash on the current branch. The commits rewritten on the other
branches have the same commit message, just different hashes.

>
>   - What makes "git history reword" so special as compared to for
>     example "git history fixup" or "git history split" so that it needs
>     a message while the others don't?

Nothing. I wanted this specifically for reword and kept the patch very
simple, sending it as an RFC to discuss the idea. I could extend it to
the other subcommands where it fits.

>
> It might make sense to maybe introduce a verbose mode where we do print
> such information. But if so, we should have good answers to the above
> questions and implement this in a way that makes sense for the other
> subcommands, too, so that we can apply the same principle to all of
> them.

I like the verbose mode idea, but I still think that something should
be printed in non-verbose mode. Verbose mode could additionally print
all the rewritten commits (though that could get very noisy), the
changed HEAD, etc.
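
Roughly what I have in mind, as a standalone sketch rather than a
patch. Everything here is an assumption for illustration: there is no
verbosity option in the series, and report_level, report_reword() and
the "Updated ..." wording are made-up names, with plain strings
standing in for abbreviated object IDs and ref names.

```c
#include <stdio.h>

/* Hypothetical verbosity levels; no such option exists in the patch. */
enum report_level { REPORT_DEFAULT, REPORT_VERBOSE };

/*
 * Write the report to `out` and return the number of lines written.
 * The default level prints only the old and new commit; the verbose
 * level also lists every updated ref.
 */
static int report_reword(FILE *out, enum report_level level,
			 const char *old_id, const char *new_id,
			 const char **refs, size_t nr_refs)
{
	int lines = 0;
	size_t i;

	fprintf(out, "Reworded commit %s to %s\n", old_id, new_id);
	lines++;
	if (level < REPORT_VERBOSE)
		return lines;

	for (i = 0; i < nr_refs; i++) {
		fprintf(out, "Updated %s\n", refs[i]);
		lines++;
	}
	return lines;
}
```

The same helper shape could be reused by fixup and split, so that all
subcommands report in a consistent way.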

>
> Thanks!
>
> Patrick
>
> [1]: https://www.linfo.org/rule_of_silence.html

--
Pablo
