Git development

Git development
 help / color / mirror / Atom feed

* [PATCH v3 7/8] repository: stop reading loose object map twice on repo init
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

When initializing a repository via `repo_init()` we end up reading the
loose object map twice:

  - `apply_repository_format()` calls `repo_set_compat_hash_algo()`,
    which in turn calls `repo_read_loose_object_map()` if we have a
    compatibility hash configured.

  - `repo_init()` calls `repo_read_loose_object_map()` directly a second
    time.

Drop the second read of the loose object map in `repo_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/repository.c b/repository.c
index 2c2395105f..61dfbb8be6 100644
--- a/repository.c
+++ b/repository.c
@@ -301,9 +301,6 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
-	if (repo->compat_hash_algo)
-		repo_read_loose_object_map(repo);
-
 	clear_repository_format(&format);
 	strbuf_release(&err);
 	return 0;

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 8/8] setup: construct object database in `apply_repository_format()`
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

With the preceding changes we now always construct the repository's
object database before applying the repository format. Remove this
duplication by constructing it in `apply_repository_format()` instead.

Note that we create the object database _after_ having set up the
repository's hash algorithm, but _before_ setting the compat hash
algorithm. This is intentional:

  - Constructing the object database may require knowledge of its
    intended object format.

  - Setting up the compatibility hash requires the object database to be
    initialized already, because we immediately read the loose object
    map.

The first point is sensible, the second maybe a little less so. Ideally,
it should be the responsibility of the object database itself to
initialize any data structures required for the compatibility hash. But
this would require further changes, so this is kept as-is for now.

Further note that this requires us to move handling of the environment
variables GIT_OBJECT_DIRECTORY and GIT_ALTERNATE_OBJECT_DIRECTORIES into
the repository format, as well. This allows the caller more flexibility
around whether or not those environment variables are being honored, as
we want to respect them in "setup.c", but not in "repository.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c |  4 +---
 setup.c      | 45 +++++++++++++++++++++------------------------
 setup.h      | 10 ++++++++++
 3 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/repository.c b/repository.c
index 61dfbb8be6..187dd471c4 100644
--- a/repository.c
+++ b/repository.c
@@ -291,13 +291,11 @@ int repo_init(struct repository *repo,
 	if (read_repository_format_from_commondir(&format, repo->commondir))
 		goto error;
 
-	if (apply_repository_format(repo, &format, &err) < 0) {
+	if (apply_repository_format(repo, &format, 0, &err) < 0) {
 		warning("%s", err.buf);
 		goto error;
 	}
 
-	repo->objects = odb_new(repo, NULL, NULL);
-
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
diff --git a/setup.c b/setup.c
index 4a8d6230b1..513fc88749 100644
--- a/setup.c
+++ b/setup.c
@@ -1752,12 +1752,22 @@ enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
 
 int apply_repository_format(struct repository *repo,
 			    const struct repository_format *format,
+			    enum apply_repository_format_flags flags,
 			    struct strbuf *err)
 {
+	char *object_directory = NULL, *alternate_object_directories = NULL;
+
 	if (verify_repository_format(format, err) < 0)
 		return -1;
 
+	if (flags & APPLY_REPOSITORY_FORMAT_HONOR_ENV) {
+		object_directory = xstrdup_or_null(getenv(DB_ENVIRONMENT));
+		alternate_object_directories = xstrdup_or_null(getenv(ALTERNATE_DB_ENVIRONMENT));
+	}
+
 	repo_set_hash_algo(repo, format->hash_algo);
+	repo->objects = odb_new(repo, object_directory,
+				alternate_object_directories);
 	repo_set_compat_hash_algo(repo, format->compat_hash_algo);
 	repo_set_ref_storage_format(repo,
 				    format->ref_storage_format,
@@ -1773,6 +1783,8 @@ int apply_repository_format(struct repository *repo,
 	repo->repository_format_precious_objects =
 		format->precious_objects;
 
+	free(alternate_object_directories);
+	free(object_directory);
 	return 0;
 }
 
@@ -1785,7 +1797,8 @@ int apply_repository_format(struct repository *repo,
  * If successful and fmt is not NULL, fill fmt with data.
  */
 static void check_and_apply_repository_format(struct repository *repo,
-					      struct repository_format *fmt)
+					      struct repository_format *fmt,
+					      enum apply_repository_format_flags flags)
 {
 	struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
 	struct strbuf err = STRBUF_INIT;
@@ -1794,7 +1807,7 @@ static void check_and_apply_repository_format(struct repository *repo,
 		fmt = &repo_fmt;
 
 	check_repository_format_gently(repo_get_git_dir(repo), fmt, NULL);
-	if (apply_repository_format(repo, fmt, &err) < 0)
+	if (apply_repository_format(repo, fmt, flags, &err) < 0)
 		die("%s", err.buf);
 	startup_info->have_repository = 1;
 
@@ -1874,15 +1887,9 @@ const char *enter_repo(struct repository *repo, const char *path, unsigned flags
 	}
 
 	if (is_git_directory(".")) {
-		struct strvec to_free = STRVEC_INIT;
-
 		set_git_dir(repo, ".", 0);
-		repo->objects = odb_new(repo,
-					getenv_safe(&to_free, DB_ENVIRONMENT),
-					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-		check_and_apply_repository_format(repo, NULL);
-
-		strvec_clear(&to_free);
+		check_and_apply_repository_format(repo, NULL,
+						  APPLY_REPOSITORY_FORMAT_HONOR_ENV);
 		return path;
 	}
 
@@ -2034,8 +2041,6 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 	    startup_info->have_repository ||
 	    /* GIT_DIR_EXPLICIT */
 	    getenv(GIT_DIR_ENVIRONMENT)) {
-		struct strvec to_free = STRVEC_INIT;
-
 		if (!repo->gitdir) {
 			const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 			if (!gitdir)
@@ -2046,17 +2051,13 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 		if (startup_info->have_repository) {
 			struct strbuf err = STRBUF_INIT;
 
-			repo->objects = odb_new(repo,
-						getenv_safe(&to_free, DB_ENVIRONMENT),
-						getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-			if (apply_repository_format(repo, &repo_fmt, &err) < 0)
+			if (apply_repository_format(repo, &repo_fmt,
+						    APPLY_REPOSITORY_FORMAT_HONOR_ENV, &err) < 0)
 				die("%s", err.buf);
 
 			clear_repository_format(&repo_fmt);
 			strbuf_release(&err);
 		}
-
-		strvec_clear(&to_free);
 	}
 	/*
 	 * Since precompose_string_if_needed() needs to look at
@@ -2805,7 +2806,6 @@ int init_db(struct repository *repo,
 	int exist_ok = flags & INIT_DB_EXIST_OK;
 	char *original_git_dir = real_pathdup(git_dir, 1);
 	struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
-	struct strvec to_free = STRVEC_INIT;
 
 	if (real_git_dir) {
 		struct stat st;
@@ -2826,16 +2826,14 @@ int init_db(struct repository *repo,
 	}
 	startup_info->have_repository = 1;
 
-	repo->objects = odb_new(repo, getenv_safe(&to_free, DB_ENVIRONMENT),
-				getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-
 	/*
 	 * Check to see if the repository version is right.
 	 * Note that a newly created repository does not have
 	 * config file, so this will not fail.  What we are catching
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
-	check_and_apply_repository_format(repo, &repo_fmt);
+	check_and_apply_repository_format(repo, &repo_fmt,
+					  APPLY_REPOSITORY_FORMAT_HONOR_ENV);
 
 	repository_format_configure(repo, &repo_fmt, hash, ref_storage_format);
 
@@ -2892,7 +2890,6 @@ int init_db(struct repository *repo,
 	}
 
 	clear_repository_format(&repo_fmt);
-	strvec_clear(&to_free);
 	free(original_git_dir);
 	return 0;
 }
diff --git a/setup.h b/setup.h
index efbb82fdbf..19679fe78f 100644
--- a/setup.h
+++ b/setup.h
@@ -221,6 +221,15 @@ void clear_repository_format(struct repository_format *format);
 int verify_repository_format(const struct repository_format *format,
 			     struct strbuf *err);
 
+enum apply_repository_format_flags {
+	/*
+	 * Honor environment variables when applying the repository format to
+	 * the repository. For now, this only covers environment variables that
+	 * relate to the object database.
+	 */
+	APPLY_REPOSITORY_FORMAT_HONOR_ENV = (1 << 0),
+};
+
 /*
  * Apply the given repository format to the repo. This initializes extensions
  * and basic data structures required for normal operation. Returns 0 on
@@ -229,6 +238,7 @@ int verify_repository_format(const struct repository_format *format,
  */
 int apply_repository_format(struct repository *repo,
 			    const struct repository_format *format,
+			    enum apply_repository_format_flags flags,
 			    struct strbuf *err);
 
 const char *get_template_dir(const char *option_template);

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* Re: [PATCH 1/2] parse-options: introduce die_for_required_opt()
From: Christian Couder @ 2026-06-04  8:00 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: git, Siddharth Shrimali, gitster, toon
In-Reply-To: <2412618.ElGaqSPkdT@piment-oiseau>

Hi,

On Wed, Jun 3, 2026 at 9:49 PM Jean-Noël AVILA <jn.avila@free.fr> wrote:

> To me, "die_for_required_opt" is a misnomer as the function does not die for
> an existing "required" condition, unlike the other functions such as
> die_for_incompatible_opt<n>.
>
> The names of the parameters do not indicate that the test is not symmetrical
> (not failing on XOR).
>
> Maybe something like "die_for_missing_opt(int tested_opt, const char
> *tested_opt_name, int required_opt, const char *required_opt_name)
>
> would make it more understandable.

Yeah, I agree it's better.

With "dependent_opt" instead of "tested_opt", I think it would be even better.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/2] parse-options: introduce die_for_required_opt()
From: Christian Couder @ 2026-06-04  8:10 UTC (permalink / raw)
  To: Siddharth Shrimali; +Cc: git, gitster, toon, jn.avila
In-Reply-To: <20260603111044.39116-2-r.siddharth.shrimali@gmail.com>

On Wed, Jun 3, 2026 at 1:11 PM Siddharth Shrimali
<r.siddharth.shrimali@gmail.com> wrote:
>
> Introduce a new helper function die_for_required_opt() to check if a
> given option is present without its required prerequisite option.
>
> This provides a centralized API for handling simple option dependencies
> (i.e., X requires Y), matching the style of the existing mutual-exclusion
> helpers like die_for_incompatible_opt{2,3,4}().
>
> Suggested-by: Christian Couder <christian.couder@gmail.com>

In general it's simpler for GSoC contributors to mention all your
mentors in "Mentored-by: ..." trailers in all your patches during your
GSoC, rather than keeping track of who helped you with each patch.

> Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
> ---
>  parse-options.c | 7 +++++++
>  parse-options.h | 3 +++
>  2 files changed, 10 insertions(+)

I think it would be nice if the new function could actually be used in
a single *.c file. It would be even nicer if there was an existing
test that already checked that the dependent option needs the required
option. This way we would also already ensure that the new helper is
working properly.

^ permalink raw reply

* git history feedback
From: Rasmus Villemoes @ 2026-06-04  8:17 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt

Hi

As soon as I saw the announcement of 'git history', I knew that was
something I was gonna use a lot. Especially the split functionality has
always been somewhat of a hassle (at least for me) to do via an
interactive rebase. I've played around with it a little, and it seems to
work as it says on the tin.

So today I had occasion to put it to real use, and then I found two
things I'd like to be able to do with it:

When a commit needs to be split into three or more commits, it is a
little cumbersome to do iteratively, since the new commit to split
obviously has a new sha, so one first has to figure out what that new id
is and then do another "git history split". For higher values of "three"
that becomes rather tedious. So it would be nice if there was an
iterative mode, which after splitting off the first commit would
automatically start again with the new child commit.

If "git add -p" had an answer meaning "yes to this hunk and all
following in this file and all remaining files as well", this could
probably even be the default behaviour of "git history split", as it
would just require that one extra answer to be given after the first
commit is split off in order to keep the current behaviour. Otherwise,
I'd also be happy to have "git history iter-split" or "git history split
--iter" or any other spelling.

The other thing I'd like is a sort of ultimate version of the above:
What I needed in the concrete case at hand was actually to split two
commit into n individual hunks each, then do an interactive rebase to combine
those 2n commits to n commits (I had done changes "row-wise", but needed
to change them to "column-wise"). For that, I would like to have had a
completely automatic "git history atomize" that would split a commit
into individual hunks, prefixing the commit subject with
e.g. "[<filename> -- hunk #nn]". A subsequent 'git rebase -i' could then easily
rearrange those and squash the related hunks.

Aside: are experimental commands eligible for teaching the completion
logic about them? I.e., can we add a __git_history() to
git-completion.bash? Aside from the obvious "let it know about existing
subcommands", I'd love for "git history split <TAB>" to show the most
recent ~20 (or something) commits in one-line format, stopping if
there's a merge commit.

Thanks,
Rasmus

^ permalink raw reply

* Re: [PATCH 2/2] builtin/add: use die_for_required_opt() helper
From: Christian Couder @ 2026-06-04  8:27 UTC (permalink / raw)
  To: Siddharth Shrimali; +Cc: git, gitster, toon, jn.avila
In-Reply-To: <20260603111044.39116-3-r.siddharth.shrimali@gmail.com>

On Wed, Jun 3, 2026 at 1:11 PM Siddharth Shrimali
<r.siddharth.shrimali@gmail.com> wrote:
>
> Clean up manual option dependency checks by replacing explicit conditional
> blocks with the newly introduced die_for_required_opt() helper function.
>
> Specifically, simplify the prerequisite check logic for both
> '--ignore-missing' (which requires '--dry-run') and '--pathspec-file-nul'
> (which requires '--pathspec-from-file').

It's a good idea to use the new helper function for
'--pathspec-file-nul' requiring '--pathspec-from-file' because it
looks like this is tested a lot already:

$ git grep requires | grep 'the option'
t2026-checkout-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err
t2072-restore-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t3601-rm-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t3704-add-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t3909-stash-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err
t7107-reset-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t7526-commit-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&

You could mention this in the commit message.

Also it might be worth squashing this patch into the previous one.

Thanks.

^ permalink raw reply

* Re: git history feedback
From: Patrick Steinhardt @ 2026-06-04  8:39 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git
In-Reply-To: <87ecimhg8s.fsf@rasmusvillemoes.dk>

On Thu, Jun 04, 2026 at 10:17:07AM +0200, Rasmus Villemoes wrote:
> Hi
> 
> As soon as I saw the announcement of 'git history', I knew that was
> something I was gonna use a lot. Especially the split functionality has
> always been somewhat of a hassle (at least for me) to do via an
> interactive rebase. I've played around with it a little, and it seems to
> work as it says on the tin.

Happy to hear!

> So today I had occasion to put it to real use, and then I found two
> things I'd like to be able to do with it:
> 
> When a commit needs to be split into three or more commits, it is a
> little cumbersome to do iteratively, since the new commit to split
> obviously has a new sha, so one first has to figure out what that new id
> is and then do another "git history split". For higher values of "three"
> that becomes rather tedious. So it would be nice if there was an
> iterative mode, which after splitting off the first commit would
> automatically start again with the new child commit.

That's fair, and also something that was discussed when initially
introducing the "split" subcommand. We don't have that mode right now,
but I think it would make sense to maybe add a new option that makes us
split repeatedly until all chunks have been exploded into separate
commits.

> If "git add -p" had an answer meaning "yes to this hunk and all
> following in this file and all remaining files as well", this could
> probably even be the default behaviour of "git history split", as it
> would just require that one extra answer to be given after the first
> commit is split off in order to keep the current behaviour. Otherwise,
> I'd also be happy to have "git history iter-split" or "git history split
> --iter" or any other spelling.

Yeah. From my perspective, having this available as an option would be
the best path forward. But I'm also open to alternatives as you suggest.

> The other thing I'd like is a sort of ultimate version of the above:
> What I needed in the concrete case at hand was actually to split two
> commit into n individual hunks each, then do an interactive rebase to combine
> those 2n commits to n commits (I had done changes "row-wise", but needed
> to change them to "column-wise"). For that, I would like to have had a
> completely automatic "git history atomize" that would split a commit
> into individual hunks, prefixing the commit subject with
> e.g. "[<filename> -- hunk #nn]". A subsequent 'git rebase -i' could then easily
> rearrange those and squash the related hunks.

It could easily be another option: `git history split --explode` for
example, which seems to be a common term for such an operation.

Regarding the subsequent rebase: I have one more patch series cooking
locally that implements `git history move` to move around commits, and I
did have the plan to eventually also implement `git history squash` to
squash commit A into commit B. But when handling many commits it might
even be easier to use interactive rebases instead.

Well, unless we eventually maybe even get something like a graphical
interface. I was playing around with a TUI interface for git-history(1)
that lets you move commits around without the hassle of the command
line. But that's certainly something that's still going to take a while
to materialize, if it ever does.

> Aside: are experimental commands eligible for teaching the completion
> logic about them? I.e., can we add a __git_history() to
> git-completion.bash? Aside from the obvious "let it know about existing
> subcommands", I'd love for "git history split <TAB>" to show the most
> recent ~20 (or something) commits in one-line format, stopping if
> there's a merge commit.

I just haven't gotten around to it yet. I'd certainly welcome the
addition of command line completion.

Thanks for your feedback!

Patrick

^ permalink raw reply

* Re: [PATCH v2 5/9] reset: introduce ability to skip reference updates
From: Patrick Steinhardt @ 2026-06-04  9:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Pablo Sabater
In-Reply-To: <xmqqqzmnqj1o.fsf@gitster.g>

On Thu, Jun 04, 2026 at 08:51:47AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > @@ -112,6 +113,9 @@ int reset_head(struct repository *r, const struct reset_head_opts *opts)
> >  	if (opts->branch_msg && !opts->branch)
> >  		BUG("branch reflog message given without a branch");
> >  
> > +	if (skip_ref_updates && (opts->branch || refs_only))
> > +		BUG("asked to perform ref updates and skip them at the same time");
> 
> ;-)  That's certainly a careful safety valve.
> 
> Would we also want to catch skip_ref_updates && update_orig_head
> being both set as a bogus request?

Yeah, I think that's a good idea.

Patrick

^ permalink raw reply

* Re: [PATCH v2 9/9] builtin/history: implement "drop" subcommand
From: Patrick Steinhardt @ 2026-06-04  9:02 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git, Pablo Sabater, Junio C Hamano
In-Reply-To: <4b4672de-17cc-426f-8498-6384b1ad0d06@app.fastmail.com>

On Wed, Jun 03, 2026 at 09:04:33PM +0200, Kristoffer Haugsbakk wrote:
> On Wed, Jun 3, 2026, at 18:14, Patrick Steinhardt wrote:
> > diff --git a/Documentation/git-history.adoc
> > b/Documentation/git-history.adoc
> > index 2ba8121795..4eac732fd2 100644
> > --- a/Documentation/git-history.adoc
> > +++ b/Documentation/git-history.adoc
> > @@ -51,13 +52,28 @@ be stateful operations. The limitation can be
> > lifted once (if) Git learns about
> >  first-class conflicts.
> >
> >  When using `fixup` with `--empty=drop`, dropping the root commit is not yet
> > -supported.
> > +supported. Likewise, `drop` cannot remove the root commit or a merge commit.
> >
> >  COMMANDS
> >  --------
> >
> >  The following commands are available to rewrite history in different ways:
> >
> > +`drop <commit>`::
> > +	Remove the specified commit from the history. All descendants of the
> > +	commit are replayed directly onto its parent.
> > ++
> > +The root commit cannot be dropped as that may lead to edge cases where refs
> > +end up with no commits anymore. Merge commits cannot be dropped either; see
> > +LIMITATIONS.
> 
> Should section names be “bare” or quoted like "LIMITATIONS"?
> I don’t know.
> 
> Maybe add “above” since it’s a previous section.

Hm. I think I'd prefer to keep this as-is, as we already have
preexisting documentation in the same manpage that does it exactly like
the above.

> > ++
> > +If `HEAD` points at a commit that is to be rewritten, the index and working
> >[snip]
> > +Drop a commit
> > +~~~~~~~~~~~~~
> > +
> > +----------
> > +$ git log --oneline
> > +abc1234 (HEAD -> main) third
> > +def5678 second
> > +ghi9012 first
> > +
> > +$ git history drop def5678
> 
> I know this is only the most simple example. And I might be dragging in
> something beyond the scope of this example. But I recall one
> demonstration on the first git-history(1) series which used a lot of
> revision expressions and someone saying that they couldn’t imagine a
> workflow where this would be more interactive than bringing up the
> git-rebase(1) todo editor.
> 
> (I couldn’t find back to this right now.)

Yeah, I remember this discussion.

> Although it is slower in terms of machine cycles, the keyboard instinct
> for dropping a nearby commit might be to do `git rebase -i @~10`
> (sufficiently high number) and navigating quickly in the configured
> editor, deleting the line or using the keybind for `drop`. This example
> which by implication brings up the log in order to paste the abbreviated
> hash isn’t as ergonomic in comparison.
> 
> But using a revision expression like searching the subject with
> `main^{/second}`, while not quicker probably, does distinguish itself
> from git-rebase(1) by being a pretty fast ad hoc invocation that can be
> done in one command without futzing with some weird sed(1) editor in
> order to navigate to the `second` line and deleting it, or
> something. And that’s a small win in isolation, but it segues much more
> naturally into letting you script, say, dropping the last commit that
> starts with the subject `TEMP`.
> 
> Or maybe revision expressions is too much in this context?

No, I think that's actually a good idea to demonstrate how this can be
used without being too unergomonic.

Eventually I still have the idea in my mind to implement a TUI around
git-history(1) that allows to drive its several subcommands without
having to laboriously select the right commits. Ideally, you'd simply
select the commits you care about and then pick specific actions you
want to do on them, with a nice visual graph that updates after the
operation. But that's a bit into the future, I guess :)

> > diff --git a/t/t3454-history-drop.sh b/t/t3454-history-drop.sh
> > new file mode 100755
> > index 0000000000..37d8413e7e
> > --- /dev/null
> > +++ b/t/t3454-history-drop.sh
> > @@ -0,0 +1,513 @@
> > +#!/bin/sh
> > +
> > +test_description='tests for git-history drop subcommand'
> > +
> > +. ./test-lib.sh
> > +. "$TEST_DIRECTORY/lib-log-graph.sh"
> > +
> > +expect_graph () {
> > +	cat >expect &&
> > +	lib_test_cmp_graph --format=%s "$@"
> > +}
> > +
> > +expect_log () {
> > +	git log --format="%s" "$@" >actual &&
> > +	cat >expect &&
> > +	test_cmp expect actual
> > +}
> > +
> > +test_expect_success 'errors on missing commit argument' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +		test_commit initial &&
> > +		test_must_fail git history drop 2>err &&
> > +		test_grep "command expects a single revision" err
> 
> Why not `test_cmp` since it’s a fixed error?
> 
> Same for a few other tests like `errors on unknown revision`.

We tend to use test_grep for cases like this, even for static error
messages. I haven't really figured out myself when exactly we tend to
use one form over the other, to be honest.

> >[snip]
> > +test_expect_success 'errors with invalid --empty= value' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	test_commit -C repo initial &&
> > +	test_commit -C repo second &&
> > +	test_must_fail git -C repo history drop --empty=bogus HEAD 2>err &&
> > +	test_grep "unrecognized.*--empty.*bogus" err
> > +'
> 
> Style related I guess. Most tests here use a subshell but this one uses
> `git -C`? Why is that?

No good reason, will fix.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v2 9/9] builtin/history: implement "drop" subcommand
From: Patrick Steinhardt @ 2026-06-04  9:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Pablo Sabater
In-Reply-To: <xmqqmrxbqir4.fsf@gitster.g>

On Thu, Jun 04, 2026 at 08:58:07AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > +static int update_worktree(struct repository *repo,
> > +			   const struct commit *old_head,
> > +			   const struct commit *new_head,
> > +			   bool dry_run)
> > +{
> > +	struct reset_head_opts opts = {
> > +		.oid_from = &old_head->object.oid,
> > +		.oid = &new_head->object.oid,
> > +		.flags = RESET_HEAD_SKIP_REF_UPDATES,
> > +	};
> > +	if (dry_run)
> > +		opts.flags |= RESET_HEAD_DRY_RUN;
> > +	return reset_head(repo, &opts);
> > +}
> > + ...
> > +	/*
> > +	 * If HEAD will move as a result of the rewrite then we'll have to
> > +	 * merge in the changes into the worktree and index. This merge can of
> > +	 * course conflict, which will cause the whole operation to abort.
> > +	 *
> > +	 * If we had already updated the refs at that point then we'd have an
> > +	 * inconsistent repository state. So we first perform a dry-run merge
> > +	 * here before updating refs.
> > +	 */
> > +	if (!dry_run && !is_bare_repository()) {
> > +		ret = find_head_tree_change(repo, &result, &old_head,
> > +					    &new_head, &head_moves);
> > +		if (ret < 0)
> > +			goto out;
> > +
> > +		if (head_moves && update_worktree(repo, old_head, new_head, true) < 0) {
> > +			ret = error(_("dropping this commit would "
> > +				      "overwrite local changes; aborting"));
> > +			goto out;
> > +		}
> > +	}
> 
> This block is skipped under --dry-run, but update_worktree is
> equipped to (and indeed run unconditionally here) run in the dry-run
> mode.  Does it mean that "git history drop --dry-run" that user runs
> to see which refs may be updated will not get warned about possible
> worktree conflicts that would prevent the real run from happening?
> Unless there is a compelling reason not to, I think --dry-run should
> be a close simulation of what would happen without it.

That's an oversight on my part indeed. We _should_ run this block with
dry-run, and it shows that we're definitely lacking test coverage here.
Will fix, thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v2 2/4] doc: replay: improve config description
From: Patrick Steinhardt @ 2026-06-04  9:05 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Junio C Hamano, Siddharth Asthana, git
In-Reply-To: <fcc2cf52-cb10-4799-a4c9-eb5916187075@app.fastmail.com>

On Thu, Jun 04, 2026 at 08:31:57AM +0200, Kristoffer Haugsbakk wrote:
> On Thu, Jun 4, 2026, at 08:27, Patrick Steinhardt wrote:
> > On Wed, Jun 03, 2026 at 06:04:23PM +0200,
> > kristofferhaugsbakk@fastmail.com wrote:
> >> diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
> >> index f9ca2db2833..4de85088d6c 100644
> >> --- a/Documentation/git-replay.adoc
> >> +++ b/Documentation/git-replay.adoc
> >> @@ -211,6 +211,7 @@ to use bare commit IDs instead of branch names.
> >>
> >>  CONFIGURATION
> >>  -------------
> >> +:git-replay: 1
> >>  include::config/replay.adoc[]
> >
> > Not quite sure, but was this change supposed to be part of the preceding
> > commit, where you also added the include?
> 
> No, because the conditional is only being put to use now. That was the
> intention anyway. Maybe there is some reason to put it in the first
> commit?

Probably not. It just read funny, but I guess that it's my ignorance
about the adoc format that was also at play here.

Patrick

^ permalink raw reply

* Re: Mirror repositories for submodules
From: Simon Richter @ 2026-06-04  9:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Benson Muite, git
In-Reply-To: <20260604061605.GA3194609@coredump.intra.peff.net>

Hi,

On 6/4/26 3:16 PM, Jeff King wrote:

> Here's a thought experiment. What if you put the UUID into a URL, like:
>    repoid://123456789.git

Yes, that's the idea, except I would want to use a relative URL, like

     ../123456789.git

This could solve the "naive cloning" problem, because it creates an 
expectation that the submodules can be found on the same server, or in a 
nearby path.

I'm aware that this is *also* bad for decentralization, because it makes 
it easier to use one of the big forges where the repositories for 
often-used submodules are are already likely to be present, but it plays 
into our use case, where we want to share the repositories for 
often-used subprojects.

> Now, all of that said, do we still need uuids at all? If the canonical
> submodule name is https://github.com/git/git.git, then anybody can just
> rewrite that locally in the same way using url.*.insteadOf config.

Yes, but we'd then need a mechanism for a server to indicate "for 
cloning, you should use these 'insteadOf' settings, which is a massive 
can of worms from a security standpoint.

I also don't think these canonical URLs can ever be stable if they refer 
to infrastructure that is not under the control of the maintainer -- it 
would tie the project identity to the hosting provider, and increase the 
inertia to overcome for moves (such as the current exodus from github 
and gitlab towards codeberg).

> Which makes me wonder if I am missing something about the original
> request that started this thread. But it sounds to me like it is just
> asking for the existing URL-rewriting feature.

The original mail has a similar problem as we do in Debian, and as my 
employer has: CI jobs should exclusively talk to in-house 
infrastructure, because continuously cloning repositories for each build 
is bad for the environment.

The common goal is that a naive clone should get submodules from a local 
server, ideally without us having to write some tool to make an initial 
checkout, enumerate submodules, create insteadOf settings, clone first 
layer of submodules, enumerate second layer, ...

    Simon

^ permalink raw reply

* Re: git history feedback
From: Kristoffer Haugsbakk @ 2026-06-04  9:34 UTC (permalink / raw)
  To: Rasmus Villemoes, git; +Cc: Patrick Steinhardt
In-Reply-To: <87ecimhg8s.fsf@rasmusvillemoes.dk>

On Thu, Jun 4, 2026, at 10:17, Rasmus Villemoes wrote:
>[snip]
>
> So today I had occasion to put it to real use, and then I found two
> things I'd like to be able to do with it:
>
> When a commit needs to be split into three or more commits, it is a
> little cumbersome to do iteratively, since the new commit to split
> obviously has a new sha, so one first has to figure out what that new id
> is and then do another "git history split". For higher values of "three"
> that becomes rather tedious. So it would be nice if there was an
> iterative mode, which after splitting off the first commit would
> automatically start again with the new child commit.

For commit subject `anchor` I would do something like this.

1. `git history split :/anchor`
2. Split out the first commit with a new message; keep the `anchor`
   subject of the original
3. Repeat (1)

>[snip]

^ permalink raw reply

* [PATCH v3 0/8] t: fix broken TAP output
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260602-pks-t7527-fix-tap-output-v1-0-db3da2a1b137@pks.im>

Hi,

this small patch series fixes another instance of broken TAP output that
has landed via 4d11b9c218 (Merge branch 'pt/fsmonitor-linux', 2026-05-31).

As this has happened multiple times by now I decided to have a look at
whether we can fix this class of issues a bit more holistically. So this
series also contains a change that makes prove bail out when it sees
invalid TAP output, which uncovers a small set of preexisting issues in
our test suite.

Changes in v3:
  - Fix a test gap for AlmaLinux and Debian in GitLab CI, which uncovers
    an issue flagged by Peff.
  - Fix TAP breakage in t7810.
  - Link to v2: https://patch.msgid.link/20260603-pks-t7527-fix-tap-output-v2-0-cf3af5694e20@pks.im

Changes in v2:
  - Fix waiting for p4d, and deduplicate the logic that does this.
  - Link to v1: https://patch.msgid.link/20260602-pks-t7527-fix-tap-output-v1-0-db3da2a1b137@pks.im

Test runs can be found at [1] and [2]. Note that GitHub-side tests are
failing on Windows, but that is a preexisting failure on "master".

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/585
[2]: https://github.com/git/git/pull/2320

---
Patrick Steinhardt (8):
      gitlab-ci: rearrange Linux jobs to match GitHub's order
      gitlab-ci: add missing Linux jobs
      ci: unify Linux images across GitLab and GitHub
      t7527: fix broken TAP output
      t7810: turn MB_REGEX check into a lazy prereq
      t/test-lib: silence EBUSY errors on Windows during test cleanup
      t/lib-git-p4: silence output when killing p4d and its watchdog
      t: let prove fail when parsing invalid TAP output

 .github/workflows/main.yml   |  2 +-
 .gitlab-ci.yml               | 23 +++++++++++++++--------
 ci/lib.sh                    |  2 +-
 t/lib-git-p4.sh              |  4 ++--
 t/t7527-builtin-fsmonitor.sh |  7 ++++---
 t/t7810-grep.sh              |  5 +++--
 t/test-lib.sh                | 10 ++++++++--
 7 files changed, 34 insertions(+), 19 deletions(-)

Range-diff versus v2:

-:  ---------- > 1:  5e817b102f gitlab-ci: rearrange Linux jobs to match GitHub's order
-:  ---------- > 2:  83646cc834 gitlab-ci: add missing Linux jobs
-:  ---------- > 3:  cca1567fbf ci: unify Linux images across GitLab and GitHub
1:  52abbd5280 = 4:  430bc51818 t7527: fix broken TAP output
-:  ---------- > 5:  78ef22df8d t7810: turn MB_REGEX check into a lazy prereq
2:  ea1f1eb466 = 6:  7bbaeff48c t/test-lib: silence EBUSY errors on Windows during test cleanup
3:  e97a515470 = 7:  abf2be09e6 t/lib-git-p4: silence output when killing p4d and its watchdog
4:  436d7d8cf3 = 8:  04367c34be t: let prove fail when parsing invalid TAP output

---
base-commit: 1666c1265231b0bc5f613fbbf3f0a9896cdef76e
change-id: 20260601-pks-t7527-fix-tap-output-105da1d73df0


^ permalink raw reply

* [PATCH v3 1/8] gitlab-ci: rearrange Linux jobs to match GitHub's order
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

Rearrange the order of Linux jobs that we have defined in GitLab CI so
that it matches the order on GitHub's side. This makes it easier to
compare whether the list of jobs actually matches on both sides.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e0b9a0d82b..8cb41baa14 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -42,15 +42,15 @@ test:linux:
       - jobname: linux-reftable
         image: ubuntu:rolling
         CC: clang
+      - jobname: linux-TEST-vars
+        image: ubuntu:20.04
+        CC: gcc
+        CC_PACKAGE: gcc-8
       - jobname: linux-breaking-changes
         image: ubuntu:20.04
         CC: gcc
       - jobname: fedora-breaking-changes-meson
         image: fedora:latest
-      - jobname: linux-TEST-vars
-        image: ubuntu:20.04
-        CC: gcc
-        CC_PACKAGE: gcc-8
       - jobname: linux-leaks
         image: ubuntu:rolling
         CC: gcc
@@ -60,13 +60,14 @@ test:linux:
       - jobname: linux-asan-ubsan
         image: ubuntu:rolling
         CC: clang
+      - jobname: linux-meson
+        image: ubuntu:rolling
+        CC: gcc
       - jobname: linux-musl-meson
         image: alpine:latest
+        # Supported until 2025-04-02.
       - jobname: linux32
         image: i386/ubuntu:20.04
-      - jobname: linux-meson
-        image: ubuntu:rolling
-        CC: gcc
   artifacts:
     paths:
       - t/failed-test-artifacts

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 2/8] gitlab-ci: add missing Linux jobs
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

The GitLab CI definitions are missing jobs for AlmaLinux and Debian,
both of which exist in GitHub Workflows. Plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 6 ++++++
 ci/lib.sh      | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 8cb41baa14..a5bdec5159 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -68,6 +68,12 @@ test:linux:
         # Supported until 2025-04-02.
       - jobname: linux32
         image: i386/ubuntu:20.04
+      # A RHEL 8 compatible distro.  Supported until 2029-05-31.
+      - jobname: almalinux-8
+        image: almalinux:8
+      # Supported until 2026-08-31.
+      - jobname: debian-11
+        image: debian:11
   artifacts:
     paths:
       - t/failed-test-artifacts
diff --git a/ci/lib.sh b/ci/lib.sh
index 6e3799cfc3..b939110a6e 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -254,7 +254,7 @@ then
 		CI_OS_NAME=osx
 		JOBS=$(nproc)
 		;;
-	*,alpine:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
+	*,almalinux:*|*,alpine:*|*,debian:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
 		CI_OS_NAME=linux
 		JOBS=$(nproc)
 		;;

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 3/8] ci: unify Linux images across GitLab and GitHub
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

The image for the "linux-breaking-changes" job has drifted apart across
GitHub and GitLab. Adapt it to use "ubuntu:rolling" on both systems.

With this change there's only one difference remaining: GitHub uses
"ubuntu:focal" for the "linux32" job while GitLab uses "ubuntu:20.04".
These are different names for the same image, so there is no actual
difference here. Adjust GitHub to use the "20.04" tag -- this matches
all the other jobs which use version numbers, and you don't have to
learn Ubuntu's release names by heart.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 2 +-
 .gitlab-ci.yml             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 3da5326f0b..cf341d74db 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -407,7 +407,7 @@ jobs:
           image: alpine:latest
         # Supported until 2025-04-02.
         - jobname: linux32
-          image: i386/ubuntu:focal
+          image: i386/ubuntu:20.04
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
         - jobname: almalinux-8
           image: almalinux:8
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index a5bdec5159..49f3689b6a 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -47,7 +47,7 @@ test:linux:
         CC: gcc
         CC_PACKAGE: gcc-8
       - jobname: linux-breaking-changes
-        image: ubuntu:20.04
+        image: ubuntu:rolling
         CC: gcc
       - jobname: fedora-breaking-changes-meson
         image: fedora:latest

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 4/8] t7527: fix broken TAP output
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

Before running the tests in t7527 we first verify whether the fsmonitor
even works, which seems to depend on the actual filesystem that is in
use. The verification executes outside of any prerequisite or test body,
so its stdout/stderr is not being redirected.

The consequence of this is that any command that prints to stdout/stderr
may break the TAP specification by printing invalid lines. And in fact
we already do that, as git-init(1) prints the path to the created Git
repository by default.

Fix this issue by moving the logic into a lazy prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7527-builtin-fsmonitor.sh | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index b63c162f9b..d881e27466 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -25,7 +25,8 @@ maybe_timeout () {
 		"$@"
 	fi
 }
-verify_fsmonitor_works () {
+
+test_lazy_prereq FSMONITOR_WORKS '
 	git init test_fsmonitor_smoke || return 1

 	GIT_TRACE_FSMONITOR="$PWD/smoke.trace" &&
@@ -50,9 +51,9 @@ verify_fsmonitor_works () {
 	ret=$?
 	rm -rf test_fsmonitor_smoke smoke.trace
 	return $ret
-}
+'

-if ! verify_fsmonitor_works
+if ! test_have_prereq FSMONITOR_WORKS
 then
 	skip_all="filesystem does not deliver fsmonitor events (container/overlayfs?)"
 	test_done

-- 
2.54.0.1064.gd145956f57.dirty

^ permalink raw reply related

* [PATCH v3 5/8] t7810: turn MB_REGEX check into a lazy prereq
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

In t7810 we verify whether the system has proper multibyte locale
support by executing `test-tool regex` with a unicode character. When
this check fails though we'll output an error that breaks the TAP
format.

Fix this issue by turning the logic into a lazy prerequisite.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7810-grep.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1b195bee59..d61c4a4d73 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -18,8 +18,9 @@ test_invalid_grep_expression() {
 	'
 }
 
-LC_ALL=en_US.UTF-8 test-tool regex '^.$' '¿' &&
-  test_set_prereq MB_REGEX
+test_lazy_prereq MB_REGEX '
+	LC_ALL=en_US.UTF-8 test-tool regex "^.$" "¿"
+'
 
 cat >hello.c <<EOF
 #include <assert.h>

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 6/8] t/test-lib: silence EBUSY errors on Windows during test cleanup
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

When tests have finished we clean up the trash directory via `rm -rf`.
On Windows this can fail with EBUSY in cases where a process still holds
some of the files open, for example when we have spawned a daemonized
process that wasn't properly terminated. We thus retry several times,
but every failure will result in error messages being printed, and that
in turn breaks the TAP output format.

One such case where this is causing issues is in t921x, which contains
tests related to Scalar. Some tests spawn the fsmonitor daemon, and we
never properly terminate it.

The obvious fix would be to ensure that we never leak any processes, but
that gets ugly fast. Instead, let's work around the issue by silencing
error messages printed by the `rm -rf` calls. We already know to print
an error when the retry loop fails, so we don't loose much.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/test-lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4a7357b547..d1d24c4124 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1299,10 +1299,10 @@ test_done () {
 			error "Tests passed but trash directory already removed before test cleanup; aborting"

 			cd "$TRASH_DIRECTORY/.." &&
-			rm -fr "$TRASH_DIRECTORY" || {
+			rm -fr "$TRASH_DIRECTORY" 2>/dev/null || {
 				# try again in a bit
 				sleep 5;
-				rm -fr "$TRASH_DIRECTORY"
+				rm -fr "$TRASH_DIRECTORY" 2>/dev/null
 			} ||
 			error "Tests passed but test cleanup failed; aborting"
 		fi

-- 
2.54.0.1064.gd145956f57.dirty

^ permalink raw reply related

* [PATCH v3 7/8] t/lib-git-p4: silence output when killing p4d and its watchdog
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

When stopping the p4d watchdog process via "kill -9", the shell may
print a job-control notification like:

  ./test-lib.sh: line 1269: 57960 Killed: 9               while true; do
      if test $nr_tries_left -eq 0; then
          kill -9 $p4d_pid; exit 1;
      fi; sleep 1; nr_tries_left=$(($nr_tries_left - 1));
  done 2> /dev/null 4>&2  (wd: ~)

This message is printed asynchronously by the shell when it reaps the
process. While harmless right now, this will cause breakage once we
enable strict parsing of the TAP protocol in a subsequent commit.

Fix this by using `wait` so that we can synchronously reap the watchdog
process and swallow the diagnostic.

While at it, deduplicate the logic we have in `stop_p4d_and_watchdog ()`
and `stop_and_cleanup_p4d ()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/lib-git-p4.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/lib-git-p4.sh b/t/lib-git-p4.sh
index d22e9c684a..9108868187 100644
--- a/t/lib-git-p4.sh
+++ b/t/lib-git-p4.sh
@@ -65,6 +65,7 @@ pidfile="$TRASH_DIRECTORY/p4d.pid"
 
 stop_p4d_and_watchdog () {
 	kill -9 $p4d_pid $watchdog_pid
+	wait $p4d_pid $watchdog_pid 2>/dev/null
 }
 
 # git p4 submit generates a temp file, which will
@@ -174,8 +175,7 @@ retry_until_success () {
 }
 
 stop_and_cleanup_p4d () {
-	kill -9 $p4d_pid $watchdog_pid
-	wait $p4d_pid
+	stop_p4d_and_watchdog
 	rm -rf "$db" "$cli" "$pidfile"
 }
 

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 8/8] t: let prove fail when parsing invalid TAP output
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

To make the result of our tests accessible we use the TAP protocol. This
protocol is parsed by either prove or by Meson. Unfortunately, these two
tools differ when it comes to their strictness when parsing the
protocol:

  - Prove by default happily accepts lines not specified by the
    protocol.

  - Meson will also accept such lines, but prints a big and ugly warning
    message.

We have fixed our test suite in the past to not print invalid TAP lines
anymore via b1dc2e796e (Merge branch 'ps/meson-tap-parse', 2025-06-17).
But as none of our tools perform a strict check it's still possible for
broken tests to sneak back in, like for example in 362f69547f (Merge
branch 'ps/t1006-tap-fix', 2025-07-16). This doesn't hurt at all when
using prove, but it's quite annoying when using Meson due to the
generated warnings.

Unfortunately, there doesn't seem to be a portable way to make all tools
complain about violations of the TAP format. The TAP 14 specification
has added pragmas to the protocol that would allow us to say `pragma
+strict`, and the effect of that would be to treat invalid TAP lines as
a test failure. But the release of TAP 14 is still rather recent, and
Test-Harness for example only gained support for it in version 3.48,
which was released in 2023.

In fact though, this pragma was already introduced as an inofficial
extension of the TAP protocol with Test-Harness 3.10, released in 2008.
So while not all tools understand the pragma, at least prove does for a
long time.

Unconditionally enable the pragma when using prove so that we'll detect
tests that emit broken TAP output right away. This would have detected
the issues fixed in preceding commits:

    $ prove t7527-builtin-fsmonitor.sh
    t7527-builtin-fsmonitor.sh .. All 69 subtests passed
            (less 6 skipped subtests: 63 okay)

    Test Summary Report
    -------------------
    t7527-builtin-fsmonitor.sh (Wstat: 0 Tests: 69 Failed: 0)
      Parse errors: Unknown TAP token: "Initialized empty Git repository in /tmp/git/test_fsmonitor_smoke/.git/"

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/test-lib.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index d1d24c4124..ceefb99bff 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1532,6 +1532,12 @@ then
 	BAIL_OUT 'You need to build test-tool; Run "make t/helper/test-tool" in the source (toplevel) directory'
 fi

+if test -n "$HARNESS_ACTIVE"
+then
+	say "TAP version 13"
+	say "pragma +strict"
+fi
+
 # Are we running this test at all?
 remove_trash=
 this_test=${0##*/}

-- 
2.54.0.1064.gd145956f57.dirty

^ permalink raw reply related

* Re: [PATCH] transport-helper: fix TSAN race in transfer_debug()
From: Pushkar Singh @ 2026-06-04 10:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff
In-Reply-To: <xmqqv7bzp0vc.fsf@gitster.g>

Hi Junio,

On Thu, Jun 4, 2026 at 6:39 AM Junio C Hamano <gitster@pobox.com> wrote:

> Would it be possible that transfer_debug_enabled is still -1 at this
> point?  We would proceed in such a case, which is a bit different from
> what would have happened in the original.
>
> Perhaps
>
>         if (transfer_debug_enabled <= 0)
>                 return;
>
> is what you want?  I dunno.

You're right. The original code would never proceed while the value was still
negative, whereas my change would.

I'll update it to use <= 0 and send a v2.

Thanks,
Pushkar

^ permalink raw reply

* [PATCH 0/7] More work supporting objects larger than 4GB on Windows
From: Johannes Schindelin via GitGitGadget @ 2026-06-04 10:51 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson, Johannes Schindelin

This patch series tries to address the problems pointed out by the expensive
tests that now run in CI: t5608 and t7508 verify various aspects about
objects larger than 4GB, which Git does not currently handle correctly when
run on a platform where size_t is 64-bit and unsigned long is 32-bit.

Unfortunately, this conflicts heavily with ps/odb-source-loose. I rebased
the branch onto seen and pushed the result to
https://github.com/dscho/git/tree/refs/heads/objects-larger-than-4gb-on-windows-pt2-seen,
to make it easier to resolve merge conflicts. Here is the relevant
range-diff:

1:  f3aeae983a ! 1:  62adeb9818 odb: use size_t for object_info.sizep and the size APIs
    @@ builtin/log.c: static int show_blob_object(const struct object_id *oid, struct r
     
      ## builtin/ls-files.c ##
     @@ builtin/ls-files.c: static void expand_objectsize(struct repository *repo, struct strbuf *line,
    - 			      const enum object_type type, unsigned int padded)
    - {
    + 	size_t len;
    + 
      	if (type == OBJ_BLOB) {
     -		unsigned long size;
     +		size_t size;
    @@ builtin/ls-files.c: static void expand_objectsize(struct repository *repo, struc
     
      ## builtin/ls-tree.c ##
     @@ builtin/ls-tree.c: static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
    - 			      const enum object_type type, unsigned int padded)
    - {
    + 	size_t len;
    + 
      	if (type == OBJ_BLOB) {
     -		unsigned long size;
     +		size_t size;
    @@ notes.c: static void format_note(struct notes_tree *t, const struct object_id *o
      	if (!t)
     
      ## object-file.c ##
    -@@ object-file.c: static int parse_loose_header(const char *hdr, struct object_info *oi)
    +@@ object-file.c: int parse_loose_header(const char *hdr, struct object_info *oi)
      	}
      
      	if (oi->sizep)
    @@ object-file.c: static int parse_loose_header(const char *hdr, struct object_info
      
      	/*
      	 * The length must be followed by a zero byte
    -@@ object-file.c: static int read_object_info_from_path(struct odb_source *source,
    - 	void *map = NULL;
    - 	git_zstream stream, *stream_to_end = NULL;
    - 	char hdr[MAX_HEADER_LEN];
    --	unsigned long size_scratch;
    -+	size_t size_scratch;
    - 	enum object_type type_scratch;
    - 	struct stat st;
    - 
     @@ object-file.c: int force_object_loose(struct odb_source *source,
    - {
    + 	struct odb_source_files *files = odb_source_files_downcast(source);
      	const struct git_hash_algo *compat = source->odb->repo->compat_hash_algo;
      	void *buf;
     -	unsigned long len;
    @@ object-file.c: int read_loose_object(struct repository *repo,
      
      	fd = git_open(path);
      	if (fd >= 0)
    -@@ object-file.c: int odb_source_loose_read_object_stream(struct odb_read_stream **out,
    - 	struct object_info oi = OBJECT_INFO_INIT;
    - 	struct odb_loose_read_stream *st;
    - 	unsigned long mapsize;
    --	unsigned long size_ul;
    - 	void *mapped;
    - 
    - 	mapped = odb_source_loose_map_object(source, oid, &mapsize);
    -@@ object-file.c: int odb_source_loose_read_object_stream(struct odb_read_stream **out,
    - 		goto error;
    - 	}
    - 
    --	/*
    --	 * object_info.sizep is unsigned long* (32-bit on Windows), but
    --	 * st->base.size is size_t (64-bit). Use temporary variable.
    --	 * Note: loose objects >4GB would still truncate here, but such
    --	 * large loose objects are uncommon (they'd normally be packed).
    --	 */
    --	oi.sizep = &size_ul;
    -+	oi.sizep = &st->base.size;
    - 	oi.typep = &st->base.type;
    - 
    - 	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
    - 		goto error;
    --	st->base.size = size_ul;
    - 
    - 	st->mapped = mapped;
    - 	st->mapsize = mapsize;
     
      ## object.c ##
     @@ object.c: struct object *parse_object_with_flags(struct repository *r,
    @@ odb.h: int odb_read_object_info_extended(struct object_database *odb,
      enum odb_has_object_flags {
      	/* Retry packed storage after checking packed and loose storage */
     
    + ## odb/source-loose.c ##
    +@@ odb/source-loose.c: static int read_object_info_from_path(struct odb_source_loose *loose,
    + 	void *map = NULL;
    + 	git_zstream stream, *stream_to_end = NULL;
    + 	char hdr[MAX_HEADER_LEN];
    +-	unsigned long size_scratch;
    ++	size_t size_scratch;
    + 	enum object_type type_scratch;
    + 	struct stat st;
    + 
    +@@ odb/source-loose.c: static int odb_source_loose_read_object_stream(struct odb_read_stream **out,
    + 	struct object_info oi = OBJECT_INFO_INIT;
    + 	struct odb_loose_read_stream *st;
    + 	unsigned long mapsize;
    +-	unsigned long size_ul;
    + 	void *mapped;
    + 
    + 	mapped = odb_source_loose_map_object(loose, oid, &mapsize);
    +@@ odb/source-loose.c: static int odb_source_loose_read_object_stream(struct odb_read_stream **out,
    + 		goto error;
    + 	}
    + 
    +-	/*
    +-	 * object_info.sizep is unsigned long* (32-bit on Windows), but
    +-	 * st->base.size is size_t (64-bit). Use temporary variable.
    +-	 * Note: loose objects >4GB would still truncate here, but such
    +-	 * large loose objects are uncommon (they'd normally be packed).
    +-	 */
    +-	oi.sizep = &size_ul;
    ++	oi.sizep = &st->base.size;
    + 	oi.typep = &st->base.type;
    + 
    + 	if (parse_loose_header(st->hdr, &oi) < 0 || st->base.type < 0)
    + 		goto error;
    +-	st->base.size = size_ul;
    + 
    + 	st->mapped = mapped;
    + 	st->mapsize = mapsize;
    +
      ## odb/streaming.c ##
     @@ odb/streaming.c: static int open_istream_incore(struct odb_read_stream **out,
      		.base.read = read_istream_incore,


Johannes Schindelin (7):
  compat/msvc: use _chsize_s for ftruncate
  patch-delta: use size_t for sizes
  pack-objects(check_pack_inflate()): use size_t instead of unsigned
    long
  packfile: widen unpack_entry()'s size out-parameter to size_t
  pack-objects: use size_t for in-core object sizes
  packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
  odb: use size_t for object_info.sizep and the size APIs

 apply.c                       |  8 ++--
 archive.c                     |  4 +-
 attr.c                        |  2 +-
 bisect.c                      |  2 +-
 blame.c                       | 15 +++++--
 builtin/cat-file.c            | 39 ++++++++++++-------
 builtin/difftool.c            |  2 +-
 builtin/fast-export.c         |  7 +++-
 builtin/fast-import.c         | 29 ++++++++++----
 builtin/fsck.c                |  2 +-
 builtin/grep.c                | 12 +++---
 builtin/index-pack.c          | 10 ++---
 builtin/log.c                 |  2 +-
 builtin/ls-files.c            |  2 +-
 builtin/ls-tree.c             |  4 +-
 builtin/merge-tree.c          |  6 +--
 builtin/mktag.c               |  2 +-
 builtin/notes.c               |  6 +--
 builtin/pack-objects.c        | 73 +++++++++++++++++++++--------------
 builtin/repo.c                |  4 +-
 builtin/tag.c                 |  4 +-
 builtin/unpack-file.c         |  2 +-
 builtin/unpack-objects.c      |  8 ++--
 bundle.c                      |  2 +-
 combine-diff.c                |  4 +-
 commit.c                      | 10 ++---
 compat/msvc-posix.h           | 24 +++++++++++-
 config.c                      |  2 +-
 delta.h                       | 20 +++-------
 diff.c                        |  5 ++-
 dir.c                         |  2 +-
 entry.c                       |  4 +-
 fmt-merge-msg.c               |  4 +-
 fsck.c                        |  2 +-
 grep.c                        |  4 +-
 http-push.c                   |  2 +-
 list-objects-filter.c         |  2 +-
 mailmap.c                     |  2 +-
 match-trees.c                 |  4 +-
 merge-blobs.c                 |  6 +--
 merge-blobs.h                 |  2 +-
 merge-ort.c                   |  2 +-
 notes-cache.c                 |  2 +-
 notes-merge.c                 |  2 +-
 notes.c                       |  8 ++--
 object-file.c                 | 18 +++------
 object.c                      |  2 +-
 odb.c                         | 12 +++---
 odb.h                         | 10 ++---
 odb/streaming.c               | 13 +------
 pack-bitmap.c                 |  4 +-
 pack-check.c                  |  5 +--
 pack-objects.h                |  2 +-
 packfile.c                    | 54 ++++++++++----------------
 packfile.h                    |  5 ++-
 patch-delta.c                 |  8 ++--
 path-walk.c                   |  2 +-
 protocol-caps.c               |  5 ++-
 read-cache.c                  |  6 +--
 ref-filter.c                  |  2 +-
 reflog.c                      |  2 +-
 rerere.c                      |  2 +-
 submodule-config.c            |  2 +-
 t/helper/test-delta.c         | 10 +++--
 t/helper/test-pack-deltas.c   |  3 +-
 t/helper/test-partial-clone.c |  2 +-
 t/unit-tests/u-odb-inmemory.c |  2 +-
 tag.c                         |  4 +-
 tree-walk.c                   | 10 +++--
 tree.c                        |  2 +-
 xdiff-interface.c             |  2 +-
 71 files changed, 296 insertions(+), 253 deletions(-)


base-commit: 9ac3f193c05c2237e2b14ebaa1149e9fc8a1abe0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2137%2Fdscho%2Fobjects-larger-than-4gb-on-windows-pt2-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2137/dscho/objects-larger-than-4gb-on-windows-pt2-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2137
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/7] compat/msvc: use _chsize_s for ftruncate
From: Johannes Schindelin via GitGitGadget @ 2026-06-04 10:51 UTC (permalink / raw)
  To: git; +Cc: Kristofer Karlsson, Johannes Schindelin, Johannes Schindelin
In-Reply-To: <pull.2137.git.1780570272.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

On Windows, `unsigned long` and `long` are 32 bits even on 64-bit
builds. The MSVC compatibility header has shimmed `ftruncate()` with

	#define ftruncate _chsize

ever since `compat/msvc-posix.h` was introduced. `_chsize()` takes a
32-bit `long` for the new length, which silently truncates files (and
the requested size) to 2 GiB. That is enough to make t7508 test 126
"git add fails gracefully with 4 GiB and 8 GiB files" fail under
MSVC: `test-tool truncate` creates a sparse 4 GiB or 8 GiB file via
the shimmed `ftruncate()`, and the test never gets off the ground.

`_chsize_s()` is the modern replacement, accepts a 64-bit `__int64`
length, and is the only sensible target on Windows. The catch is that
it does not follow the POSIX `-1` + `errno` convention: it returns
`0` on success and an errno value (a small positive integer) on
failure. A plain `#define ftruncate _chsize_s` would therefore
silently break callers that test the return value as `< 0` or against
`-1`, of which there are several: `http.c`, `parallel-checkout.c`,
and `t/helper/test-truncate.c` among them.

Introduce a `static inline` wrapper that calls `_chsize_s()`, copies
its errno return into `errno`, and translates the result to the
familiar `-1` / `0` convention, then point `ftruncate` at the
wrapper. Place the wrapper after `#include "mingw-posix.h"` so the
`off_t` parameter resolves to the already-widened `off64_t` rather
than the 32-bit `_off_t` from `compat/vcbuild/include/unistd.h`.

MinGW is unaffected: its `ftruncate()` already takes `off_t` and
routes through `ftruncate64()` when `_FILE_OFFSET_BITS=64`, which is
the default in our build.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/msvc-posix.h | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/compat/msvc-posix.h b/compat/msvc-posix.h
index c500b8b4aa..7ce39b8d3f 100644
--- a/compat/msvc-posix.h
+++ b/compat/msvc-posix.h
@@ -16,7 +16,6 @@
 #define __attribute__(x)
 #define strcasecmp   _stricmp
 #define strncasecmp  _strnicmp
-#define ftruncate    _chsize
 #define strtoull     _strtoui64
 #define strtoll      _strtoi64

@@ -30,4 +29,27 @@ typedef int sigset_t;

 #include "mingw-posix.h"

+/*
+ * MSVC's `_chsize()` takes a 32-bit `long` and silently truncates files
+ * to 2 GiB. `_chsize_s()` accepts a 64-bit length but returns 0 on
+ * success or an errno value on failure, rather than the -1/errno
+ * convention POSIX `ftruncate()` callers expect. Wrap it so callers
+ * that test the return value as `< 0` or against `-1` keep working.
+ *
+ * Note: this declaration must follow `#include "mingw-posix.h"` so
+ * `off_t` resolves to `off64_t` and the parameter type matches the
+ * underlying `_chsize_s()` width.
+ */
+static inline int msvc_ftruncate(int fd, off_t length)
+{
+	int err = _chsize_s(fd, length);
+
+	if (err) {
+		errno = err;
+		return -1;
+	}
+	return 0;
+}
+#define ftruncate msvc_ftruncate
+
 #endif /* COMPAT_MSVC_POSIX_H */
-- 
gitgitgadget

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox