Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH v3 0/8] setup: centralize object database creation
From: Junio C Hamano @ 2026-06-08 12:06 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Patrick Steinhardt, git, Kristoffer Haugsbakk
In-Reply-To: <CAOLa=ZQwVbLsOcajaxQwtkTPm=4St7EiGEEyL6_B0o3Tt1v1pw@mail.gmail.com>

Karthik Nayak <karthik.188@gmail.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> Hi,
>>
>> this small patch series refactors the logic for how we discover and
>> configure repositories. Most importantly, this involves the following
>> two steps:
>>
>>   1. We unify the logic to apply the repository format, which is
>>      currently open-coded across multiple sites. These sites have
>>      already diverged, where some repository extensions are not
>>      consistently applied.
>>
>>   2. We then centralize creation of the object database to happen at the
>>      same time we apply the repository format.
>>
>> The end result is that we apply the repository format exactly once, and
>> that's also the point in time where we can finalize the setup of the
>> repo's data structures as we know about all details of the repo at that
>> time. Ultimately, this makes it trivial to introduce the "objectStorage"
>> extension, even though that's not part of this patch series.
>> ...
>> 4:  81b92bca7f = 4:  b0d7c11fe6 repository: stop initializing the object database in `repo_set_gitdir()`
>> 5:  807fc56353 = 5:  d0af56fdae setup: stop creating the object database in `setup_git_env()`
>> 6:  96563ff99f = 6:  3e75c5b0a6 setup: stop initializing object database without repository
>> 7:  c14f45169c = 7:  50fa2fdb3c repository: stop reading loose object map twice on repo init
>> 8:  e67c6e66d6 = 8:  4dff9d1794 setup: construct object database in `apply_repository_format()`
>>
>
> The range-diff looks good and as expected. Thanks!

Thanks, both of you.  Let me mark the topic for 'next', then.

^ permalink raw reply

* Re: [PATCH v3 4/6] diff: add long-running diff process via diff.<driver>.process
From: Junio C Hamano @ 2026-06-08 12:06 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Michael Montalbo via GitGitGadget, git, Michael Montalbo
In-Reply-To: <c7987f11-9181-3975-552c-14e74abb2c97@gmx.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> So the conscious project direction has been: fold pkt-line test backends
> into `test-tool` and drop the scripting-language prereq. Reintroducing the
> same shape in Python would walk this back.
> ...
> The `PYTHON` prereq exists in exactly five files today, all `git p4`
> related (where Python is an intrinsic prerequisite given that `git-p4.py`
> _is_ written in Python): `t/lib-git-p4.sh`, `t/t9802-git-p4-filetype.sh`,
> `t/t9810-git-p4-rcs.sh`, `t/t9835-git-p4-metadata-encoding-python2.sh`,
> and `t/t9836-git-p4-metadata-encoding-python3.sh`.
> ...
> That commit is of course not intended to be used as-is; Feel free to pick
> code parts of it and integrate them into your topic branch. Or write your
> own test-tool helper from scratch if that's more your jam.

Showing better direction to new folks with such a clear thinking is
very much appreciated.  Even though it is natural and perfectly OK
for tests that interacts with parts of Git that are written in these
languages (e.g., we are OK for gitweb tests to require Perl), we
should consciously keep ourselves clean and not adding unnecessary
dependencies.

^ permalink raw reply

* Re: [PATCH RFC 2/2] builtin/history: print feedback after successful reword
From: Junio C Hamano @ 2026-06-08 12:16 UTC (permalink / raw)
  To: Pablo Sabater; +Cc: git, Patrick Steinhardt, Kaartic Sivaraam
In-Reply-To: <20260607-ps-history-reword-v1-2-ba43a3cbb81b@gmail.com>

Pablo Sabater <pabloosabaterr@gmail.com> writes:

> Unlike `git commit --amend` and `git rebase -i`, `git history reword`
> doesn't print anything, this makes it feel empty for a porcelain command
> and hard to tell if the command did anything without using other
> commands like `git log <commit>` to check if the reword was done.
>
> Print a message on successful rewords so the user has feedback about it.
>
> Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
> ---
>  builtin/history.c         |  4 ++++
>  t/t3451-history-reword.sh | 14 ++++++++++++++
>  2 files changed, 18 insertions(+)
>
> diff --git a/builtin/history.c b/builtin/history.c
> index 51a22a9a1c..0f1ba3b531 100644
> --- a/builtin/history.c
> +++ b/builtin/history.c
> @@ -739,6 +739,10 @@ static int cmd_history_reword(int argc,
>  		goto out;
>  	}
>  
> +	fprintf(stderr, _("Successfully reworded commit %s to %s\n"),
> +		repo_find_unique_abbrev(repo, &original->object.oid, DEFAULT_ABBREV),
> +		repo_find_unique_abbrev(repo, &rewritten->object.oid, DEFAULT_ABBREV));
> +
>  	ret = 0;
>  
>  out:

Do other commands in "git history" (split is in 'master', drop and
fixup are cooking) behave with similar verbosity?  Consistency within
the same "history" umbrella matters more than being similar with
other commands that can be used for similar purposes.

> diff --git a/t/t3451-history-reword.sh b/t/t3451-history-reword.sh
> index 54ea8a7207..4b22d761e3 100755
> --- a/t/t3451-history-reword.sh
> +++ b/t/t3451-history-reword.sh
> @@ -416,4 +416,18 @@ test_expect_success 'aborts if the commit message is the same' '
>  	)
>  '
>  
> +test_expect_success 'prints feedback on successful reword' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit first &&
> +
> +		reword_with_message HEAD 2>err <<-EOF &&
> +		first reworded
> +		EOF
> +		test_grep "Successfully reworded" err
> +	)
> +'
> +
>  test_done

^ permalink raw reply

* Re: [PATCH RFC 1/2] builtin/history: abort reword on unchanged message
From: Junio C Hamano @ 2026-06-08 12:16 UTC (permalink / raw)
  To: Pablo Sabater; +Cc: git, Patrick Steinhardt, Kaartic Sivaraam
In-Reply-To: <20260607-ps-history-reword-v1-1-ba43a3cbb81b@gmail.com>

Pablo Sabater <pabloosabaterr@gmail.com> writes:

> When using `git history reword` if the new message is the same as the
> original it continues anyway creating a new commit with the same
> message and updates its descendants, modifying the history after this
> 'reworded' commit even though there was no actual change.
>
> `git commit --amend` and `git rebase -i` + reword share this behavior,
> however `git history reword` is different:
> 1. Works in-memory without touching the index or the worktree [1], so
>    there are no side effects like staged files that could justify
>    rewriting the history when the commit message is the same.
> 2. `git history` by default updates all the branches [2] that contain the
>    original commit making it more costly than `git rebase -i` that only
>    updates the current branch.

I think the reasoning is flawed.

Both "git commit --amend" and "git rebase -i", even with no changes
to the tree, parents, or the message, update the committer timestamp
(and perhaps the committer identity running the command may be
different from the original).  Updating this info is one of the
important effects of the command.

And "history" being more capable than "rebase" is a wrong excuse to
make the system behave inconsistently between commands that have
similar features [*1*].  In a situation where letting 'history'
update all the relevant branches, if a command behaves differently
from the way the user likes (and if the way 'rebase -i' works is the
one the user likes), you'd end up forcing the user to use 'rebase
-i' when 'history' would have been more appropriate.

Having said that, I personally think that the current behaviour of
`commit --amend` and `history reword` are both _wrong_ [*2*].

You may start `git commit --amend`, and after staring at the
existing commit log message for some time in your editor, it is
quite natural for you to decide that leaving the commit as-is is the
right thing [*3*] in your situation.  It may have been a better
design for the system to notice this situation and leave the commit
as-is, with an override option `--force` to allow users to forcibly
update the committer ident and timestamp in the commit header.  I am
not a `history reword` user (yet), but from the motivation you
described for this patch, I sense that the story is the same there.

`git rebase -i A`, when A is truly an ancestor at the bottom of a
linear history leading to HEAD, behaves slightly better.  It gives
you a todo list with a bunch of `pick` insns, and when you do not
edit earliest 'pick's the todo list, these earliest commits are left
as-is.  It may still share the same issue that a 'reword' that you
ended up not rewording (or 'edit' that you ended up not touching its
tree or log message) does still recreate a new commit object, though.

`git rebase -i` may have an excuse that because it, unlike "git
commit --amend", operates on multiple commits by design.  A single
"--force" option given to the command would not have worked as an
escape hatch to allow the user to tell the command "in this reword
of this particular commit, I ended up doing nothing, but I still
want an updated committer log timestamp".  Perhaps giving the
"--force" (or --force-rewrite") option at "rebase --continue" time
may work, but in any case, unless we plan to transition to these
"better" default behaviour at a big version boundary, speculating
what a "better" behaviour would have been may be fun but not very
productive.

[Footnote]

 *1* Besides, doesn't "--update-refs" in "rebase -i" allow you to
     adjust the branches?

 *2* But it is an established behaviour people _rely_ on, so even
     though it may have been better if these commands behaved
     differently, it probably is a bit too late to change it now.

 *3* This includes the case where the original author is especially
     difficult to work with and would complain any change to their
     commits, even if the only change you made for them is a
     typofix.  Fixing a small typo/grammo may not be worth your time
     and unpleasant exchanges with them after touching their commit.

^ permalink raw reply

* Re: [PATCH v3 4/6] diff: add long-running diff process via diff.<driver>.process
From: Junio C Hamano @ 2026-06-08 12:26 UTC (permalink / raw)
  To: Michael Montalbo
  Cc: Johannes Schindelin, Michael Montalbo via GitGitGadget, git
In-Reply-To: <CAC2QwmJwxpnrPNW6YLm2uXKaYjkUwjVsPN_U+c52m0rNe95_Nw@mail.gmail.com>

Michael Montalbo <mmontalbo@gmail.com> writes:

> On Sun, Jun 7, 2026 at 7:36 AM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>>
>> Hi Michael,
>>
>> I stumbled about this patch when it broke CI in Git for Windows, where we
>> do _not_ use `NO_PYTHON`, even though Python is unavailable in the
>> build/test CI jobs. The existing tests handle this situation gracefully,
>> this here patch does not:
>> ...
>> Given the complexity of what t4080 tries to test (error, abort, crash,
>> bad-sync, no-hunks, multiple files in one session, capability
>> negotiation), it would unfortunately be infeasible to use `test-tool
>> pkt-line` from a shell script implementing that `diff.*.process` protocol.
>>
>> So I've spiked a demo how the `test-tool diff-process-backend` could look
>> like (letting Opus do the menial typing, so that I can enjoy at least part
>> of a sunny Sunday outside), which also passes the CI build and test:
>> https://github.com/dscho/git/commit/b6e3c93381b00929476c3a00155f7cf7334a22e6
>>
>> That commit is of course not intended to be used as-is; Feel free to pick
>> code parts of it and integrate them into your topic branch. Or write your
>> own test-tool helper from scratch if that's more your jam.
>>
>
> Johannes, thank you for the great feedback. The historical context is
> really helpful and
> the concerns you raise make a lot of sense. I will take a look at your
> spike and also work
> on removing Python from the test.

Another request.

Please do not force readers to scroll through a ~800 line message
just to read only 5 lines of response from you.  Keep relevant parts
of the message you are responding to in your message to help readers
understand the context in which your response was made, but trim
everything else that is not relevant from your quote.

Thanks.

^ permalink raw reply

* Re: [PATCH] describe: limit default ref iteration to tags
From: Junio C Hamano @ 2026-06-08 12:36 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: git, Jeff King, Patrick Steinhardt
In-Reply-To: <20260607-describe-tag-ref-scope-v1-1-653d232b86b5@gmail.com>

Tamir Duberstein <tamird@gmail.com> writes:

[jc: Removing Shawn from CC who passed away quite a while ago, RIP].

> Unless --all is given, get_name() rejects every ref outside refs/tags/.
> The rejection happens only after the ref backend has enumerated the ref,
> so repositories with many other refs spend most of a simple describe
> invocation visiting refs which cannot affect its result.
> ...
> Both revisions were built with -O3, -mcpu=native, and ThinLTO using
> Apple clang 21.0.0 on macOS 26.5. The machine was a MacBook Pro
> (Mac16,6) with a 16-core Apple M4 Max (12 performance and four
> efficiency cores) and 128 GB RAM.
>
> Signed-off-by: Tamir Duberstein <tamird@gmail.com>
> ---
>  builtin/describe.c       |  3 +++
>  t/perf/p6100-describe.sh | 20 ++++++++++++++++++++
>  2 files changed, 23 insertions(+)

Interesting.  How would this relate to and work well with
<20260601233727.43558-1-jacob.e.keller@intel.com>?

> +test_lazy_prereq PERF_REFFILES '
> +	test "$(git rev-parse --show-ref-format)" = files
> +'
> +
> +ref_count=10000
> +
>  # clear out old tags and give us a known state
>  test_expect_success 'set up tags' '
>  	git for-each-ref --format="delete %(refname)" refs/tags >to-delete &&
> @@ -27,4 +33,18 @@ test_perf 'describe HEAD with one tag' '
>  	git describe --match=new HEAD
>  '
>  
> +test_expect_success PERF_REFFILES 'set up many unrelated refs' '
> +	git tag -m tip tip HEAD &&
> +	for i in $(test_seq $ref_count)
> +	do
> +		printf "create refs/heads/describe-perf/%05d HEAD\n" $i ||
> +		return 1
> +	done >instructions &&
> +	git update-ref --stdin <instructions
> +'
> +
> +test_perf 'describe exact tag with many loose refs' --prereq PERF_REFFILES '
> +	git describe --exact-match HEAD
> +'
> +

Is there a strong reason to guard this new test behind
`PERF_REFFILES`?

Even though the penalty of enumerating 10,000 unrelated loose
references may be most pronounced in the `files` backend, skipping
unnecessary reference enumeration is an architectural win for other
backends (like `reftable` or a fully packed repository) as well.

If we drop `PERF_REFFILES` and retitle the test to "describe exact
tag with many unrelated refs", we could run it unconditionally to
benchmark the improvement across all storage formats.

^ permalink raw reply

* [PATCH v2] parse-options: introduce die_for_missing_opt()
From: Siddharth Shrimali @ 2026-06-08 12:44 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, siddharthasthana31, toon, jn.avila,
	r.siddharth.shrimali
In-Reply-To: <20260603111044.39116-1-r.siddharth.shrimali@gmail.com>

Introduce die_for_missing_opt() to check if a dependent option is
present without its required prerequisite. This provides a centralized
API for simple option dependencies (X requires Y), inspired by and
matching the style of die_for_incompatible_opt{2,3,4}().

Use the new helper in builtin/add.c to replace the manual prerequisite
check for '--pathspec-file-nul' (requires '--pathspec-from-file'). This
case is already exercised by existing tests in t3704-add-pathspec-file.sh
and several other pathspec-file test scripts, ensuring the new helper is
verified without additional test code.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Suggested-by: Jean-Noël AVILA <jn.avila@free.fr>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
---
Changes since v1:
  - Squashed the implementation patch and the caller patch into a single,
    unified patch as suggested by Christian.
  - Renamed the helper function from die_for_require_opt() to
    die_for_missing_opt() to improve clarity.
  - Updated the argument names and logic order to better match the style of
    die_for_incompatible_opt*().
  - Dropped the conversion of the '--ignore-missing' check in builtin/add.c
    to keep this initial iteration strictly focused on a single, clean
    example ('--pathspec-file-nul').

 builtin/add.c   | 4 ++--
 parse-options.c | 7 +++++++
 parse-options.h | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index c859f66519..505834ad3f 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -462,6 +462,8 @@ int cmd_add(int argc,
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
 
+	die_for_missing_opt(pathspec_file_nul, "--pathspec-file-nul",
+			    !!pathspec_from_file, "--pathspec-from-file");
 	if (pathspec_from_file) {
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
@@ -470,8 +472,6 @@ int cmd_add(int argc,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
-	} else if (pathspec_file_nul) {
-		die(_("the option '%s' requires '%s'"), "--pathspec-file-nul", "--pathspec-from-file");
 	}
 
 	if (require_pathspec && pathspec.nr == 0) {
diff --git a/parse-options.c b/parse-options.c
index a676da86f5..11e40669eb 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1558,3 +1558,10 @@ void die_for_incompatible_opt4(int opt1, const char *opt1_name,
 		break;
 	}
 }
+
+void die_for_missing_opt(int dependent_opt, const char *dependent_opt_name,
+			 int required_opt, const char *required_opt_name)
+{
+	if (dependent_opt && !required_opt)
+		die(_("the option '%s' requires '%s'"), dependent_opt_name, required_opt_name);
+}
diff --git a/parse-options.h b/parse-options.h
index 0d1f738f8d..5b41d2fd39 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -460,6 +460,9 @@ static inline void die_for_incompatible_opt2(int opt1, const char *opt1_name,
 				  0, "");
 }
 
+void die_for_missing_opt(int dependent_opt, const char *dependent_opt_name,
+			 int required_opt, const char *required_opt_name);
+
 /*
  * Use these assertions for callbacks that expect to be called with NONEG and
  * NOARG respectively, and do not otherwise handle the "unset" and "arg"
-- 
2.54.0


^ permalink raw reply related

* [PATCH v3 1/2] compat/posix.h: enable UNUSED warning messages for Clang
From: Dominik Loidolt @ 2026-06-08 12:44 UTC (permalink / raw)
  To: ps; +Cc: git, gitster, asedeno, asedeno, avarab, Dominik Loidolt
In-Reply-To: <20260605094647.94805-1-dominik.loidolt@univie.ac.at>

Use a dedicated Clang version check for the UNUSED macro.

Commit 7c07f36ad2 (git-compat-util.h: GCC deprecated message arg only in
GCC 4.5+, 2022-10-05) restricted use of the deprecated attribute's
message argument in the UNUSED macro to GCC 4.5 or newer.

Clang identifies itself as GNUC 4.2.1 for compatibility, so
GIT_GNUC_PREREQ(4, 5) does not detect whether Clang supports the
deprecated("...") form. Add GIT_CLANG_PREREQ() macro and use it to
enable the UNUSED warning message for Clang 2.9 and newer.

Signed-off-by: Dominik Loidolt <dominik.loidolt@univie.ac.at>
---
v3:
- fix comment style nit
- remove unnecessary parentheses around __clang_minor__ >= (min)

 compat/posix.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/compat/posix.h b/compat/posix.h
index faaae1b655..ffdfd91c7b 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -22,6 +22,15 @@
  #define GIT_GNUC_PREREQ(maj, min) 0
 #endif

+/* Similar for Clang. */
+#if defined(__clang__) && defined(__clang_minor__) && defined(__clang_major__)
+# define GIT_CLANG_PREREQ(maj, min) \
+	((__clang_major__ > (maj)) || \
+	 (__clang_major__ == (maj) && __clang_minor__ >= (min)))
+#else
+# define GIT_CLANG_PREREQ(maj, min) 0
+#endif
+
 /*
  * UNUSED marks a function parameter that is always unused.  It also
  * can be used to annotate a function, a variable, or a type that is
@@ -35,7 +44,7 @@
  * When a parameter may be used or unused, depending on conditional
  * compilation, consider using MAYBE_UNUSED instead.
  */
-#if GIT_GNUC_PREREQ(4, 5)
+#if GIT_GNUC_PREREQ(4, 5) || GIT_CLANG_PREREQ(2, 9)
 #define UNUSED __attribute__((unused)) \
 	__attribute__((deprecated ("parameter declared as UNUSED")))
 #elif defined(__GNUC__)

base-commit: a89346e34a937f001e5d397ee62224e3e9852040
--
2.54.0


^ permalink raw reply related

* [PATCH v3 2/2] compat/posix.h: simplify GIT_GNUC_PREREQ() comparison
From: Dominik Loidolt @ 2026-06-08 12:44 UTC (permalink / raw)
  To: ps; +Cc: git, gitster, asedeno, asedeno, avarab, Dominik Loidolt
In-Reply-To: <20260608124419.38905-1-dominik.loidolt@univie.ac.at>

Replace the glibc-style bit-shift version comparison with an explicit
major/minor comparison. This is easier to read and is consistent with
the format already used by GIT_CLANG_PREREQ() and many BSD
<sys/cdefs.h> headers.

This has no runtime impact, as the macro is evaluated at compile time.
It is also more future-proof, as it no longer assumes that GCC version
components stay below 65536.

Signed-off-by: Dominik Loidolt <dominik.loidolt@univie.ac.at>
---
 compat/posix.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/compat/posix.h b/compat/posix.h
index ffdfd91c7b..deefc43f28 100644
--- a/compat/posix.h
+++ b/compat/posix.h
@@ -4,22 +4,24 @@
 #define _FILE_OFFSET_BITS 64
 
 /*
- * Derived from Linux "Features Test Macro" header
- * Convenience macros to test the versions of gcc (or
- * a compatible compiler).
+ * Convenience macros to test the versions of GCC (or a compatible compiler).
  * Use them like this:
  *  #if GIT_GNUC_PREREQ (2,8)
- *   ... code requiring gcc 2.8 or later ...
+ *   ... code requiring GCC 2.8 or later ...
  *  #endif
  *
+ * Note that Clang and other compilers define __GNUC__ for compatibility; use
+ * GIT_CLANG_PREREQ() to check for specific Clang versions.
+ *
  * This macro of course is not part of POSIX, but we need it for the UNUSED
  * macro which is used by some of our POSIX compatibility wrappers.
-*/
+ */
 #if defined(__GNUC__) && defined(__GNUC_MINOR__)
 # define GIT_GNUC_PREREQ(maj, min) \
-	((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
+	((__GNUC__ > (maj)) || \
+	 (__GNUC__ == (maj) && __GNUC_MINOR__ >= (min)))
 #else
- #define GIT_GNUC_PREREQ(maj, min) 0
+# define GIT_GNUC_PREREQ(maj, min) 0
 #endif
 
 /* Similar for Clang. */
-- 
2.54.0


^ permalink raw reply related

* Re: [PATCH] ls-files: filter pathspec before lstat
From: Junio C Hamano @ 2026-06-08 13:06 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: git, René Scharfe, Patrick Steinhardt
In-Reply-To: <20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com>

On Sun, Jun 7, 2026 at 11:40, Tamir Duberstein wrote:
> show_files() checks whether each index entry is deleted or modified
> before show_ce() applies the pathspec. prune_index() avoids most of this
> work for pathspecs with a common directory prefix, but a top-level name
> or leading wildcard leaves every entry to be checked.
> 
> Match the pathspec before lstat() for the deleted and modified modes.
> Keep the later match in show_ce() so --error-unmatch is satisfied only
> by entries that are actually shown.

Adding an extra early `match_pathspec()` check before making slow
system calls like `lstat()` makes sense, especially when most of the
index entries need to be skipped.  But if most of them would match,
then we would end up doing the same match_pathspec() calls twice for
each path, and run lstat() anyway, so you may also be able to
construct a perf test that demonstrates a case where this approach
is not a clear win (or even degradation), perhaps?

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e1a22b41b9..702c607183 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -450,6 +450,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>  			continue;
>  		if (ce_skip_worktree(ce))
>  			continue;
> +		/* Only entries shown by show_ce() satisfy --error-unmatch. */
> +		if (pathspec.nr &&
> +		    !match_pathspec(repo->index, &pathspec, fullname.buf,
> +				    fullname.len, max_prefix_len, NULL,
> +				    S_ISDIR(ce->ce_mode) ||
> +				    S_ISGITLINK(ce->ce_mode)))
> +			continue;
>  		stat_err = lstat(fullname.buf, &st);
>  		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
>  			error_errno("cannot lstat '%s'", fullname.buf);

Hmph.  In the current code, because there is no such pre-filtering,
show_ce() would unconditionally recurse into active submodules when
told to with the "--recurse-submodules" flag, even if your pathspec
coes not match the submodule.  With this change, such a submodule
whose path does not match the pathspec would not even be seen by
show_ce().  Would it cause a change in behaviour?

^ permalink raw reply

* Re: [PATCH RFC 2/2] builtin/history: print feedback after successful reword
From: Pablo Sabater @ 2026-06-08 13:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt, Kaartic Sivaraam
In-Reply-To: <xmqqqzmhz0pq.fsf@gitster.g>

El lun, 8 jun 2026 a las 14:16, Junio C Hamano (<gitster@pobox.com>) escribió:
>
> Pablo Sabater <pabloosabaterr@gmail.com> writes:
>
> > Unlike `git commit --amend` and `git rebase -i`, `git history reword`
> > doesn't print anything, this makes it feel empty for a porcelain command
> > and hard to tell if the command did anything without using other
> > commands like `git log <commit>` to check if the reword was done.
> >
> > Print a message on successful rewords so the user has feedback about it.
> >
> > Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
> > ---
> >  builtin/history.c         |  4 ++++
> >  t/t3451-history-reword.sh | 14 ++++++++++++++
> >  2 files changed, 18 insertions(+)
> >
> > diff --git a/builtin/history.c b/builtin/history.c
> > index 51a22a9a1c..0f1ba3b531 100644
> > --- a/builtin/history.c
> > +++ b/builtin/history.c
> > @@ -739,6 +739,10 @@ static int cmd_history_reword(int argc,
> >               goto out;
> >       }
> >
> > +     fprintf(stderr, _("Successfully reworded commit %s to %s\n"),
> > +             repo_find_unique_abbrev(repo, &original->object.oid, DEFAULT_ABBREV),
> > +             repo_find_unique_abbrev(repo, &rewritten->object.oid, DEFAULT_ABBREV));
> > +
> >       ret = 0;
> >
> >  out:
>
> Do other commands in "git history" (split is in 'master', drop and
> fixup are cooking) behave with similar verbosity?  Consistency within
> the same "history" umbrella matters more than being similar with
> other commands that can be used for similar purposes.

They do not, they are thought with the rule of silence in mind.
However I think that this output is valuable information I might have
explained myself better at [1] but my thought is:

git history reword aabb

Now that I have my commit aabb rewritten I want to check it again just
to make sure I did what I wanted correctly, but git log aabb is still
the old commit, the rewritten one has a different hash which I do not
know unless I search for it, if it's far from HEAD I'd have to git log
--oneline, get the hash and then git log new_hash. I think that git
history reword that does have the information about the new hash
should print it to avoid this search.
What I want is something like:

git history reword aabb
Successfully reworded aabb to ccdd

So I can just git log ccdd without having to search.

I want to say I haven't looked as much as I'd like to split, drop and
fixup, but I think it would be a good addition for them also. On [1]
Patrick wrote about a --verbose for git history, I think that the
basic information i.e. at reword which is the new hash should be
always printed but if it's preferred it could go there.

For split it can print the hashes of the new commits like:
"...split into ccdd and eeff."
For fixup the commit hash also changes, so the same as reword.
The one that will have more friction would be drop is the one that
doesn't end up with new commits.

[1]: https://lore.kernel.org/git/CAN5EUNSAOMRvmLGVfzQiwWoOn9VGNVU5rVMZizOryn_q2fbCNA@mail.gmail.com/

>
> > diff --git a/t/t3451-history-reword.sh b/t/t3451-history-reword.sh
> > index 54ea8a7207..4b22d761e3 100755
> > --- a/t/t3451-history-reword.sh
> > +++ b/t/t3451-history-reword.sh
> > @@ -416,4 +416,18 @@ test_expect_success 'aborts if the commit message is the same' '
> >       )
> >  '
> >
> > +test_expect_success 'prints feedback on successful reword' '
> > +     test_when_finished "rm -rf repo" &&
> > +     git init repo &&
> > +     (
> > +             cd repo &&
> > +             test_commit first &&
> > +
> > +             reword_with_message HEAD 2>err <<-EOF &&
> > +             first reworded
> > +             EOF
> > +             test_grep "Successfully reworded" err
> > +     )
> > +'
> > +
> >  test_done

^ permalink raw reply

* Re: [PATCH v3 0/2] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Junio C Hamano @ 2026-06-08 13:36 UTC (permalink / raw)
  To: Kristofer Karlsson via GitGitGadget
  Cc: git, René Scharfe, Kristofer Karlsson
In-Reply-To: <pull.2140.v3.git.1780832592.gitgitgadget@gmail.com>

"Kristofer Karlsson via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Changes in v3:
>
>  * Adopted Rene's suggestion to move the flush logic below the LIFO
>    early-return (LIFO mode never sets get_pending, so flushing there is a
>    no-op).

Sensible.

>  * Went a step further and inlined the flush logic directly into get() and
>    peek(), eliminating the flush_get() helper and its forward declaration of
>    sift_down_root().

Hmph, unless there is a reason to allow the copies in get() and
peek() to deviate from each other, e.g., what flush_get() had to do
inside get() and peek() were slightly different, I am not sure if
this is a good move.  I do not know if the slight difference of the
"inlined" logic we have in the patch between the one in get() and
the other one in peek() has merit, either.  It certainly lets you
avoid an unnecessary clearing of the get_pending bit (when a get was
pending but the queue has more items to yield) immediately followed
by turning it back on again (which happens always unless the
function makes an early return for an empty queue) in get(), which
will never happen in flush() that will always clear the bit before
it returns, but is such an avoidance of an assignment really worth
it?  I suspect that with the static inline version of flush_get(),
compilers are smart enough to optimize it away, but I dunno.

>        void *prio_queue_get(struct prio_queue *queue)
>        {
>        	if (!queue->nr)
>        		return NULL;
>        	if (!queue->compare)
>      ++		return queue->array[--queue->nr].data;
>      ++
>      ++	if (queue->get_pending) {
>      ++		if (!--queue->nr) {
>      ++			queue->get_pending = 0;
>      ++			return NULL;
>      ++		}
>      ++		queue->array[0] = queue->array[queue->nr];
>      ++		sift_down_root(queue);
>      ++	}
>      + 

The above is from [1/2] (this is a minor point, but flipping the
order of two patches to make the "nr_internal clean-up" as a
preliminary step might have made commenting on this part easier to
read).  I wondered if it is easier to understand if the first early
return is guarded by a conditional that takes get_pending into
account.

	if (queue->nr_internal <= queue->get_pending)
		return NULL;

As I said, since the above hunk is immediately followed by an
unconditional assignment of "queue->get_pending = 1", clearing
get_pending = 0 only when we leave inside the if() block works as an
optimization that is not available on the peek() side.  But with the
"ah the queue is empty already, the queue->ne == 1 is due to the
lazy get that did not rebalance" tweak, it would become unneeded, I
think.

>      + void *prio_queue_peek(struct prio_queue *queue)
>      + {
>       +	if (!queue->nr_internal)
>        		return NULL;
>        	if (!queue->compare)
>       +		return queue->array[queue->nr_internal - 1].data;
>      + 
>      + 	if (queue->get_pending) {
>      + 		queue->get_pending = 0;
>      +-		if (!--queue->nr)
>      ++		if (!--queue->nr_internal)
>      + 			return NULL;
>      +-		queue->array[0] = queue->array[queue->nr];
>      ++		queue->array[0] = queue->array[queue->nr_internal];
>      + 		sift_down_root(queue);
>      + 	}

This is from [2/2]; the same 

	if (queue->nr_internal <= queue->get_pending)
		return NULL;

applies here, I think.

^ permalink raw reply

* Re: [PATCH 2/7] patch-delta: use size_t for sizes
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <1fd7646ca14f7ec392c85fab10255f08d0d79368.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:07AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> `patch_delta()` takes the source and delta sizes by value and writes
> back the reconstructed target size through an `unsigned long *`.  That
> datatype cannot represent a value that exceeds 4 GiB on systems where
> `unsigned long` is 32-bit (notably 64-bit Windows builds), though, even
> though the delta encoding itself, the on-disk layout, and the in-memory
> buffers happily carry such sizes. A `size_t` companion to
> `get_delta_hdr_size()`, `get_delta_hdr_size_sz()`, was introduced in
> 17fa077596 (delta, packfile: use size_t for delta header sizes,
> 2026-05-08) precisely so that `patch_delta()` could be widened without
> changing the on-the-wire decoding helper's signature.
> 
> Widen `patch_delta()`'s three size parameters to `size_t` and switch
> its internal use of `get_delta_hdr_size()` to the `_sz` variant.
> Then propagate the wider type through the callers.

Does `get_delta_hdr_size()` have any remaining callers after this patch
series? I currently only spot two such callers, and you convert both of
them in this patch.

And can we reasonably add a test case that exercises this change?

> diff --git a/packfile.c b/packfile.c
> index 89366abfe3..e202f48837 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1964,10 +1964,8 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>  			      (uintmax_t)curpos, p->pack_name);
>  			data = NULL;
>  		} else {
> -			unsigned long sz;
>  			data = patch_delta(base, base_size, delta_data,
> -					   delta_size, &sz);
> -			size = sz;
> +					   delta_size, &size);

Nice that we get rid of this awkward construct.

Patrick

^ permalink raw reply

* Re: [PATCH 3/7] pack-objects(check_pack_inflate()): use size_t instead of unsigned long
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <ddb75326cde9695f1eb7bbbe77175424e6b77004.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:08AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> `write_reuse_object()` learned to track its packed-object size as
> `size_t` in 606c192380 (odb, packfile: use size_t for streaming
> object sizes, 2026-05-08), but the comparison sink it feeds,
> `check_pack_inflate()`, still takes the expected decompressed size
> as `unsigned long`. The call site bridges the mismatch with
> `cast_size_t_to_ulong()`, which on Windows turns a >4 GiB object
> into an immediate die().
> 
> That function only uses `expect` once: as the right-hand side of a
> `stream.total_out == expect` equality test against zlib's counter.
> zlib's own `total_out` counter is `uLong` and is therefore still
> 32-bit-bound on Windows. Widening `expect` to `size_t` cannot fix that,
> but it is a strict improvement nonetheless: instead of dying outright,
> an oversized object now simply makes the equality fail and lets
> `write_reuse_object()` fall back to `write_no_reuse_object()`, which
> decompresses and re-deflates the content (and which the larger
> pack-objects widening series targets separately).

Hm. I wonder whether it's possible to reset `stream.total_out` on every
iteration and instead have a local `size_t` variable that we use to
track the total number of inflated bytes?

Patrick

^ permalink raw reply

* Re: [PATCH 4/7] packfile: widen unpack_entry()'s size out-parameter to size_t
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <bdebc36f21d1e2a13bc91d72a3ada1db3f7e184e.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:09AM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 82bc6dcc00..3dff898c43 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -1239,6 +1239,8 @@ static void *gfi_unpack_entry(
>  	unsigned long *sizep)
>  {
>  	enum object_type type;
> +	size_t size_st = 0;
> +	void *data;
>  	struct packed_git *p = all_packs[oe->pack_id];
>  	if (p == pack_data && p->pack_size < (pack_size + the_hash_algo->rawsz)) {
>  		/* The object is stored in the packfile we are writing to
> @@ -1260,7 +1262,10 @@ static void *gfi_unpack_entry(
>  		 */
>  		p->pack_size = pack_size + the_hash_algo->rawsz;
>  	}
> -	return unpack_entry(the_repository, p, oe->idx.offset, &type, sizep);
> +	data = unpack_entry(the_repository, p, oe->idx.offset, &type, &size_st);
> +	if (sizep)
> +		*sizep = cast_size_t_to_ulong(size_st);
> +	return data;
>  }

Nit, please feel free to ignore: do we want to add a NEEDSWORK comment
here?

Patrick

^ permalink raw reply

* Re: [PATCH 6/7] packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <460d733feeaf2a94fe28d7509cc4128e9c0a7610.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:11AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> When I started the transition from `unsigned long` to `size_t`, in the
> interest of keeping the patches reviewable, I introduced these calls to
> prevent data type narrowing from silently failing to handle large object
> sizes. I also introduced `*_sz()` variants that would allow most of the
> callers to keep using that `unsigned long` that the 90s kindly asked to
> be returned.
> 
> After the preceding commits, the only places that called the narrow
> wrappers either no longer exist or already use the `_sz` form
> internally, so the wrappers just narrow values back through
> `cast_size_t_to_ulong()` for no reason.
> 
> Drop them and rename the `_sz` variants back to the natural names.

Aha, so you already address my comment I had on one of the preceding
patches :)

Patrick

^ permalink raw reply

* Re: [PATCH 7/7] odb: use size_t for object_info.sizep and the size APIs
From: Patrick Steinhardt @ 2026-06-08 13:53 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Kristofer Karlsson, Johannes Schindelin
In-Reply-To: <f3aeae983ac8b281d6ba54299961e19d16699c94.1780570273.git.gitgitgadget@gmail.com>

On Thu, Jun 04, 2026 at 10:51:12AM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index fa45f774d7..fa6e396ddc 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -120,7 +120,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  	struct object_id oid;
>  	enum object_type type;
>  	char *buf;
> -	unsigned long size;
> +	size_t size;
>  	struct object_context obj_context = {0};
>  	struct object_info oi = OBJECT_INFO_INIT;
>  	unsigned flags = OBJECT_INFO_LOOKUP_REPLACE;
> @@ -166,7 +166,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		if (use_mailmap && (type == OBJ_COMMIT || type == OBJ_TAG)) {
>  			size_t s = size;
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  
>  		printf("%"PRIuMAX"\n", (uintmax_t)size);

Can't we drop this local variable completely and instead supply `&size`
directly?

> @@ -219,7 +225,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		if (use_mailmap) {
>  			size_t s = size;
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  
>  		/* otherwise just spit out the data */
> @@ -266,7 +272,7 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>  		if (use_mailmap) {
>  			size_t s = size;
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  		break;
>  	}
> @@ -446,7 +455,7 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
>  		if (use_mailmap) {
>  			size_t s = size;
>  			contents = replace_idents_using_mailmap(contents, &s);
> -			size = cast_size_t_to_ulong(s);
> +			size = s;
>  		}
>  
>  		if (type != data->type)

Likewise for these three instances.

> @@ -555,7 +564,7 @@ static void batch_object_write(const char *obj_name,
>  			if (!buf)
>  				die(_("unable to read %s"), oid_to_hex(&data->oid));
>  			buf = replace_idents_using_mailmap(buf, &s);
> -			data->size = cast_size_t_to_ulong(s);
> +			data->size = s;
>  
>  			free(buf);
>  		}

And I think this site here can be adapted, as well.

> diff --git a/diff.c b/diff.c
> index 5a584fa1d5..816b89dc6c 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4594,8 +4594,9 @@ int diff_populate_filespec(struct repository *r,
>  		}
>  	}
>  	else {
> +		size_t size_st = 0;
>  		struct object_info info = {
> -			.sizep = &s->size
> +			.sizep = &size_st
>  		};
>  
>  		if (!(size_only || check_binary))
> @@ -4617,6 +4618,7 @@ int diff_populate_filespec(struct repository *r,
>  			die("unable to read %s", oid_to_hex(&s->oid));
>  
>  object_read:
> +		s->size = cast_size_t_to_ulong(size_st);
>  		if (size_only || check_binary) {
>  			if (size_only)
>  				return 0;
> @@ -4631,6 +4633,7 @@ object_read:
>  			if (odb_read_object_info_extended(r->objects, &s->oid, &info,
>  							  OBJECT_INFO_LOOKUP_REPLACE))
>  				die("unable to read %s", oid_to_hex(&s->oid));
> +			s->size = cast_size_t_to_ulong(size_st);
>  		}
>  		s->should_free = 1;
>  	}

The flow in this function is quite weird if you ask me, but that's a
preexisting issue. This does look correct to me, even if it's awkward.

Patrick

^ permalink raw reply

* [PATCH 0/3] config: allow disabling config includes
From: Derrick Stolee via GitGitGadget @ 2026-06-08 13:57 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee

This series introduces a new way to ignore config include directives via two
mechanisms:

 * GIT_CONFIG_INCLUDES=0 in the environment.
 * git --no-includes ... in the command line.

My motivation is from a tricky situation where users want to do the risky
thing and include a repo-tracked file for sharing aliases and other
recommended config. They are then struggling in a later build step that is
running Git commands (under a tool we don't control and can't change) that
then cause filesystem accesses outside of the build system's sandbox.

While git config has a --no-includes option, that doesn't impact the
behavior of other Git commands. We build upon that existing logic for
disabling includes, though.

Having had recent luck recommending GIT_ADVICE=0 when running Git commands
from third-party tools, I thought that a similar environment variable to
disable this functionality would be helpful, too.

One thing I do worry about is whether or not this would cause a significant
break in behavior, or if this is a relatively safe thing to allow.

The three patches are organized as follows:

 1. Patch 1 has a small typo fix in the config documentation that messes
    with the format of the bulleted list. I include it here because I add to
    that list in patch 2.
 2. Patch 2 adds the environment variable and tests it via 'git config' and
    the use of a Git alias.
 3. Patch 3 adds the '--no-includes' option at the top level.

Thanks, -Stolee

Derrick Stolee (3):
  git-config.adoc: fix paragraph break
  config: add GIT_CONFIG_INCLUDES
  git: add --no-includes top-level option

 Documentation/git-config.adoc |  7 ++++++-
 Documentation/git.adoc        |  6 +++++-
 config.c                      |  7 ++++++-
 environment.h                 |  6 ++++++
 git.c                         |  6 +++++-
 t/t1305-config-include.sh     | 35 +++++++++++++++++++++++++++++++++++
 6 files changed, 63 insertions(+), 4 deletions(-)

base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2139%2Fderrickstolee%2Fconfig-include-override-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2139/derrickstolee/config-include-override-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2139
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/3] git-config.adoc: fix paragraph break
From: Derrick Stolee via GitGitGadget @ 2026-06-08 13:57 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee, Derrick Stolee
In-Reply-To: <pull.2139.git.1780927027.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

The bulletted list about environment variables is missing a '+' between
some paragraphs that belong to the same bullet item. Without it, the
bulletted list is rendered as two separate lists with "See also FILES."
as a normal paragraph between them. Adding '+' unifies the lists.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git-config.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-config.adoc b/Documentation/git-config.adoc
index 00545b2054..044d776613 100644
--- a/Documentation/git-config.adoc
+++ b/Documentation/git-config.adoc
@@ -476,7 +476,7 @@ GIT_CONFIG_SYSTEM::
 GIT_CONFIG_NOSYSTEM::
 	Whether to skip reading settings from the system-wide
 	$(prefix)/etc/gitconfig file. See linkgit:git[1] for details.
-
++
 See also <<FILES>>.
 
 GIT_CONFIG_COUNT::
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 2/3] config: add GIT_CONFIG_INCLUDES
From: Derrick Stolee via GitGitGadget @ 2026-06-08 13:57 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee, Derrick Stolee
In-Reply-To: <pull.2139.git.1780927027.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

The config keys 'include.path' and 'includeIf.*' allow users to specify
config stored in a location outside of the typical list of config files
(system, global, local, etc.). For example, users who accept the risk
can specify helpful aliases via a file checked into the repo by pointing
'include.path' to the position of that file in the working directory.
This is dangerous, but people do it.

What becomes tricky is that this modifies all Git behavior, including
operations that are intended to be limited in activity or sandboxed in
some way. These include directives can provide surprising changes to
behavior, especially when expecting a specific list of allowed file
accesses. This could lead to failed builds, for instance.

To allow for these user-desired features when they are running commands,
add a new GIT_CONFIG_INCLUDES environment variable that disables these
redirections of config when set to zero. This variable can be set by
automation, such as build tooling, to avoid these strange behaviors.
This could be considered a recommended option for tools executing Git
commands, the same as GIT_ADVICE=0.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git-config.adoc |  5 +++++
 config.c                      |  7 ++++++-
 environment.h                 |  6 ++++++
 t/t1305-config-include.sh     | 31 +++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-config.adoc b/Documentation/git-config.adoc
index 044d776613..c9b5159501 100644
--- a/Documentation/git-config.adoc
+++ b/Documentation/git-config.adoc
@@ -502,6 +502,11 @@ GIT_CONFIG::
 	historical compatibility; there is generally no reason to use it
 	instead of the `--file` option.
 
+GIT_CONFIG_INCLUDES::
+	If GIT_CONFIG_INCLUDES is set to 0, then Git will not follow
+	`include.path` or `includeIf.*.path` directives when reading
+	configuration files.
+
 [[EXAMPLES]]
 EXAMPLES
 --------
diff --git a/config.c b/config.c
index a1b92fe083..85edd05672 100644
--- a/config.c
+++ b/config.c
@@ -1595,9 +1595,14 @@ int config_with_options(config_fn_t fn, void *data,
 			const struct config_options *opts)
 {
 	struct config_include_data inc = CONFIG_INCLUDE_INIT;
+	int respect_includes = opts->respect_includes;
 	int ret;
 
-	if (opts->respect_includes) {
+	if (respect_includes &&
+	    !git_env_bool(CONFIG_INCLUDES_ENVIRONMENT, 1))
+		respect_includes = 0;
+
+	if (respect_includes) {
 		inc.fn = fn;
 		inc.data = data;
 		inc.opts = opts;
diff --git a/environment.h b/environment.h
index 9eb97b3869..2c57ae2533 100644
--- a/environment.h
+++ b/environment.h
@@ -52,6 +52,12 @@
  */
 #define GIT_ADVICE_ENVIRONMENT "GIT_ADVICE"
 
+/*
+ * Environment variable used to prevent following include.path or includeIf.*
+ * config directives.
+ */
+#define CONFIG_INCLUDES_ENVIRONMENT "GIT_CONFIG_INCLUDES"
+
 /*
  * Environment variable used in handshaking the wire protocol.
  * Contains a colon ':' separated list of keys with optional values
diff --git a/t/t1305-config-include.sh b/t/t1305-config-include.sh
index f3892578e4..270e4b89ab 100755
--- a/t/t1305-config-include.sh
+++ b/t/t1305-config-include.sh
@@ -396,4 +396,35 @@ test_expect_success 'onbranch without repository but explicit nonexistent Git di
 	test_must_fail nongit git --git-dir=nonexistent config get foo.bar
 '
 
+test_expect_success 'GIT_CONFIG_INCLUDES=0 disables include.path and includeIf' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		git config set include.path config.inc &&
+		git config set "includeIf.gitdir:*.path" config2.inc &&
+		git config set -f .git/config.inc foo.bar from-include &&
+		git config set -f .git/config2.inc foo.baz from-includeif &&
+		git config get foo.bar &&
+		git config get foo.baz &&
+		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get foo.bar &&
+		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get foo.baz &&
+		git config get --includes foo.bar &&
+		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get --includes foo.bar
+	)
+'
+
+test_expect_success 'GIT_CONFIG_INCLUDES=0 blocks included alias override' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		git config set alias.test false &&
+		git config set include.path config.inc &&
+		git config set -f .git/config.inc alias.test status &&
+		git test &&
+		test_must_fail env GIT_CONFIG_INCLUDES=0 git test
+	)
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 3/3] git: add --no-includes top-level option
From: Derrick Stolee via GitGitGadget @ 2026-06-08 13:57 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee, Derrick Stolee
In-Reply-To: <pull.2139.git.1780927027.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

The previous change added a GIT_CONFIG_INCLUDES=0 override in the
environment, similar to GIT_ADVICE=0. Follow the same model as
--no-advice to add a --no-includes option to the top-level Git options.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git.adoc    | 6 +++++-
 git.c                     | 6 +++++-
 t/t1305-config-include.sh | 8 ++++++--
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/git.adoc b/Documentation/git.adoc
index 8a5cdd3b3d..f220427930 100644
--- a/Documentation/git.adoc
+++ b/Documentation/git.adoc
@@ -12,7 +12,7 @@ SYNOPSIS
 'git' [-v | --version] [-h | --help] [-C <path>] [-c <name>=<value>]
     [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
     [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--no-lazy-fetch]
-    [--no-optional-locks] [--no-advice] [--bare] [--git-dir=<path>]
+    [--no-optional-locks] [--no-advice] [--no-includes] [--bare] [--git-dir=<path>]
     [--work-tree=<path>] [--namespace=<name>] [--config-env=<name>=<envvar>]
     <command> [<args>]
 
@@ -194,6 +194,10 @@ If you just want to run git as if it was started in `<path>` then use
 --no-advice::
 	Disable all advice hints from being printed.
 
+--no-includes::
+	Disable all `include.path` and `includeIf.*` config directives.
+	See linkgit:git-config[1] for more information.
+
 --literal-pathspecs::
 	Treat pathspecs literally (i.e. no globbing, no pathspec magic).
 	This is equivalent to setting the `GIT_LITERAL_PATHSPECS` environment
diff --git a/git.c b/git.c
index 36f08891ef..52cfbf0e23 100644
--- a/git.c
+++ b/git.c
@@ -40,7 +40,7 @@ const char git_usage_string[] =
 	N_("git [-v | --version] [-h | --help] [-C <path>] [-c <name>=<value>]\n"
 	   "           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]\n"
 	   "           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--no-lazy-fetch]\n"
-	   "           [--no-optional-locks] [--no-advice] [--bare] [--git-dir=<path>]\n"
+	   "           [--no-optional-locks] [--no-advice] [--no-includes] [--bare] [--git-dir=<path>]\n"
 	   "           [--work-tree=<path>] [--namespace=<name>] [--config-env=<name>=<envvar>]\n"
 	   "           <command> [<args>]");
 
@@ -354,6 +354,10 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			setenv(GIT_ADVICE_ENVIRONMENT, "0", 1);
 			if (envchanged)
 				*envchanged = 1;
+		} else if (!strcmp(cmd, "--no-includes")) {
+			setenv(CONFIG_INCLUDES_ENVIRONMENT, "0", 1);
+			if (envchanged)
+				*envchanged = 1;
 		} else {
 			fprintf(stderr, _("unknown option: %s\n"), cmd);
 			usage(git_usage_string);
diff --git a/t/t1305-config-include.sh b/t/t1305-config-include.sh
index 270e4b89ab..b636e5ae7b 100755
--- a/t/t1305-config-include.sh
+++ b/t/t1305-config-include.sh
@@ -409,8 +409,11 @@ test_expect_success 'GIT_CONFIG_INCLUDES=0 disables include.path and includeIf'
 		git config get foo.baz &&
 		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get foo.bar &&
 		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get foo.baz &&
+		test_must_fail git --no-includes config get foo.bar &&
+		test_must_fail git --no-includes config get foo.baz &&
 		git config get --includes foo.bar &&
-		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get --includes foo.bar
+		test_must_fail env GIT_CONFIG_INCLUDES=0 git config get --includes foo.bar &&
+		test_must_fail git --no-includes config get --includes foo.bar
 	)
 '
 
@@ -423,7 +426,8 @@ test_expect_success 'GIT_CONFIG_INCLUDES=0 blocks included alias override' '
 		git config set include.path config.inc &&
 		git config set -f .git/config.inc alias.test status &&
 		git test &&
-		test_must_fail env GIT_CONFIG_INCLUDES=0 git test
+		test_must_fail env GIT_CONFIG_INCLUDES=0 git test &&
+		test_must_fail git --no-includes test
 	)
 '
 
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH] prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
From: Junio C Hamano @ 2026-06-08 14:07 UTC (permalink / raw)
  To: Kristofer Karlsson
  Cc: Kristofer Karlsson via GitGitGadget, git, René Scharfe
In-Reply-To: <CAL71e4MbC+tdTuN6p1HiHtE1XYuS1gBM-KSejFZJ1wbftxNveg@mail.gmail.com>

Kristofer Karlsson <krka@spotify.com> writes:

> Agreed, that's the right fix. I looked for existing ways of marking
> fields as private, internal or hidden but the only thing I found was
> the convention of using a code comment: /* for internal use only */
>
> I will apply a rename and submit a v2. Perhaps something like
> nr_internal to make it look less like a public API.

I think we often use trailing underscore (e.g., "n_") to mark
variables for "not to be used casually, there are better ways to
access this information", which pre-ANSI C people probably used
leading underscore (e.g. "_n") for.

This is often used in callback functions where the types of their
formal parameters are specified by the API and use of them with
casting at every use site is awkward.  For example, qsort() and
friends often take a pointer to the location that stores the value
to be compared, but it is awkward, so we do cast just once upfront
like so:

static int compare_callback(const void *a_, const void *b_)
{
	const a_type a = *((const a_type *)a_);
	const a_type b = *((const a_type *)b_);

        ... use values 'a' and 'b', without having to cast a_ or b_ ...

	return a - b;
}

I think the technique/convention can be used in a similar way for
"this is hidden and unless you can tell if you should be using it
directly, you probably shouldn't" kind of structure members.

So, nr_internal is perfectly fine, but if you find it too long, nr_
is probably just as good.

^ permalink raw reply

* Re: [PATCH 2/3] config: add GIT_CONFIG_INCLUDES
From: Patrick Steinhardt @ 2026-06-08 14:34 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, gitster, Derrick Stolee
In-Reply-To: <b48fe9f7abe794864ac4470c2620048c2e5e6b53.1780927027.git.gitgitgadget@gmail.com>

On Mon, Jun 08, 2026 at 01:57:05PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
> 
> The config keys 'include.path' and 'includeIf.*' allow users to specify
> config stored in a location outside of the typical list of config files
> (system, global, local, etc.). For example, users who accept the risk
> can specify helpful aliases via a file checked into the repo by pointing
> 'include.path' to the position of that file in the working directory.
> This is dangerous, but people do it.

Huh, I never even considered this use case. But of course, this is
possible, even though it's quite scary.

> What becomes tricky is that this modifies all Git behavior, including
> operations that are intended to be limited in activity or sandboxed in
> some way. These include directives can provide surprising changes to
> behavior, especially when expecting a specific list of allowed file
> accesses. This could lead to failed builds, for instance.
> 
> To allow for these user-desired features when they are running commands,
> add a new GIT_CONFIG_INCLUDES environment variable that disables these
> redirections of config when set to zero. This variable can be set by
> automation, such as build tooling, to avoid these strange behaviors.
> This could be considered a recommended option for tools executing Git
> commands, the same as GIT_ADVICE=0.

I don't know about this part though. I could see use cases where the
tools _should_ read the project-relative configuration. It might also be
the case that the user may want to evaluate some includes, but not all
of them.

That raises the question whether we can introduce the configuration in a
way that it allows a bit more flexibility than just "yes"/"no", like for
example an allow-list of locations that should be evaluated. But maybe
I'm overthinking this.

Patrick

^ permalink raw reply

* Re: [PATCH 0/2] worktree: copy-on-write creation and shared-branch worktrees
From: Junio C Hamano @ 2026-06-08 14:36 UTC (permalink / raw)
  To: Jason Newton via GitGitGadget; +Cc: git, Jason Newton
In-Reply-To: <pull.2317.git.git.1780685368.gitgitgadget@gmail.com>

"Jason Newton via GitGitGadget" <gitgitgadget@gmail.com> writes:

> When many worktrees share one repository -- e .g. a fleet of agents each
> needing an isolated checkout -- "git worktree add" is costly at scale.
> Objects are shared via the common dir, but the working tree is not: each add
> rewrites every tracked file, so N worktrees cost N full checkouts of disk
> and I/O.

Are the "CoW" semantics offered by these underlying mechanisms,
which may differ per operating system and possibly filesystem type,
all meant as mere storage-space optimization, or do some of them
trade potential space saving with some limitation of the features,
i.e., what you can do in the CoW copy and original, or increased
runtime cost, either at the clone time or the time of first
modification?

What I am trying to get at is why should this be even an opt-in
feature.  If "cp treeA treeB" at the shell level would make all the
files in treeA under identical names and contents in treeB, and let
you edit/update/delete copies in either tree without affecting the
other tree, then in practice you would not even be able to _tell_ if
CoW is in use, no?

It may tilt the scale if there is a downside associated with the use
of CoW, like at the first modification of one copy, the system may
need to do real copies of other copies, but even such a cost should
not be outrageously worse than the cost of copying everything once
at the worktree creation time.

So I would understand "whenever we say git_copy_file(A, B), we
always use CoW facility under the hood if available, regardless of
the purpose of the operation to copy one file to another location---
it may include, but does not have to be limited to, populating
working file trees in a new worktree", and I think it is a welcome
change.

But I do not quite get "... only if the user gives --reflink
option".  Why is it even necessary to offer a choice?  Especially
since you seem to have auto-probe, we should be able to implement a
low-level operation to materialize contents identified by a_hash at
a_path in the working tree in two different ways, switching on the
availablity of CoW, e.g.,

	if (CoW available && we can find existing path with a_hash) {
	        copy-cow the found path to a_path;
	} else {
		write object identified by a_hash to a_path;
	}

>  And a branch can only be checked out in one worktree.

This is a safety feature that has nothing to do with shared files
across worktrees, no?  Your two worktrees may think they have a
checkout of the same branch (thus the same commit), one worktree
makes changes and commits, the other worktree suddenly starts seeing
a totally different output from its "git diff HEAD" that mixes what
it did relative to where it started (which is what we want) plus the
reversion of what was done in the other worktree (which is definitely
not what we want).

^ permalink raw reply

* Re: [PATCH GSoC RFC v12 03/12] cat-file: add declaration of variable i inside its for loop
From: Junio C Hamano @ 2026-06-08 14:52 UTC (permalink / raw)
  To: Pablo Sabater
  Cc: eric.peijian, calvinwan, chriscool, git, jltobler, jonathantanmy,
	karthik.188, toon, chandrapratap3519
In-Reply-To: <20260608-ps-eric-work-rebase-v12-3-5338b766e658@gmail.com>

Pablo Sabater <pabloosabaterr@gmail.com> writes:

> From: Eric Ju <eric.peijian@gmail.com>
> Subject: Re: [PATCH GSoC RFC v12 03/12] cat-file: add declaration of variable i inside its for loop

"add" sounds a bit strange, as the existing code wouldn't have
compiled if the variable were never declared.  What the patch did
was to move (not add) the declaration of a function scope variable
that is used to control for() loops.  Would any of these work?

Subject: [PATCH GSOC v12 03/12] cat-file: narrow scope of loop counter
Subject: [PATCH GSOC v12 03/12] cat-file: declare loop counter inside for()

> Some code used in this series declares variable i and only uses it
> in a for loop, not in any other logic outside the loop.
>
> Change the declaration of i to be inside the for loop for readability.
> While at it, we also change its type from "int" to "size_t" where the latter makes more sense.

Curious single line that is overly long?

> Helped-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Eric Ju <eric.peijian@gmail.com>
> Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
> ---
>  builtin/cat-file.c | 11 +++--------
>  fetch-pack.c       |  3 +--
>  2 files changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index fa45f774d7..c060fd4800 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -726,12 +726,10 @@ static void dispatch_calls(struct batch_options *opt,
>  		struct queued_cmd *cmd,
>  		int nr)
>  {
> -	int i;
> -
>  	if (!opt->buffer_output)
>  		die(_("flush is only for --buffer mode"));
>  
> -	for (i = 0; i < nr; i++)
> +	for (size_t i = 0; i < nr; i++)
>  		cmd[i].fn(opt, cmd[i].line, output, data);

The loop limit "nr" will not become as large as size_t because the
caller passes a platform natural "int" to the function.  Wouldn't a
stupid compiler give us warning on comparing unsigned size_t with
signed int here?

> @@ -739,9 +737,7 @@ static void dispatch_calls(struct batch_options *opt,
>  
>  static void free_cmds(struct queued_cmd *cmd, size_t *nr)
>  {
> -	size_t i;
> -
> -	for (i = 0; i < *nr; i++)
> +	for (size_t i = 0; i < *nr; i++)
>  		FREE_AND_NULL(cmd[i].line);

No type change, so the result is as safe as the original.

> @@ -768,7 +764,6 @@ static void batch_objects_command(struct batch_options *opt,
>  	size_t alloc = 0, nr = 0;
>  
>  	while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) {
> -		int i;
>  		const struct parse_cmd *cmd = NULL;
>  		const char *p = NULL, *cmd_end;
>  		struct queued_cmd call = {0};
> @@ -778,7 +773,7 @@ static void batch_objects_command(struct batch_options *opt,
>  		if (isspace(*input.buf))
>  			die(_("whitespace before command: '%s'"), input.buf);
>  
> -		for (i = 0; i < ARRAY_SIZE(commands); i++) {
> +		for (size_t i = 0; i < ARRAY_SIZE(commands); i++) {
>  			if (!skip_prefix(input.buf, commands[i].name, &cmd_end))
>  				continue;

ARRAY_SIZE() is some arithmetic over sizeof(*commands) and
sizeof(commands), which is of type size_t, so this is better than
the original.  Use of size_t i of course is a natural way to index
into commands[] array, so the result is just fine.

> diff --git a/fetch-pack.c b/fetch-pack.c
> index 120e01f3cf..f13951d154 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1388,9 +1388,8 @@ static void write_fetch_command_and_capabilities(struct strbuf *req_buf,
>  	if (advertise_sid && server_supports_v2("session-id"))
>  		packet_buf_write(req_buf, "session-id=%s", trace2_session_id());
>  	if (server_options && server_options->nr) {
> -		int i;
>  		ensure_server_supports_v2("server-option");
> -		for (i = 0; i < server_options->nr; i++)
> +		for (size_t i = 0; i < server_options->nr; i++)
>  			packet_buf_write(req_buf, "server-option=%s",
>  					 server_options->items[i].string);

server_options is a string_list whose .nr member is of type size_t,
so this comparison is perfectly fine.  Ditto for ->items[i].string
that is a natural way to index into an array.

>  	}

v

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox