Git development
 help / color / mirror / Atom feed
* [PATCH v3 6/8] t/test-lib: silence EBUSY errors on Windows during test cleanup
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

When tests have finished we clean up the trash directory via `rm -rf`.
On Windows this can fail with EBUSY in cases where a process still holds
some of the files open, for example when we have spawned a daemonized
process that wasn't properly terminated. We thus retry several times,
but every failure will result in error messages being printed, and that
in turn breaks the TAP output format.

One such case where this is causing issues is in t921x, which contains
tests related to Scalar. Some tests spawn the fsmonitor daemon, and we
never properly terminate it.

The obvious fix would be to ensure that we never leak any processes, but
that gets ugly fast. Instead, let's work around the issue by silencing
error messages printed by the `rm -rf` calls. We already know to print
an error when the retry loop fails, so we don't loose much.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/test-lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4a7357b547..d1d24c4124 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1299,10 +1299,10 @@ test_done () {
 			error "Tests passed but trash directory already removed before test cleanup; aborting"
 
 			cd "$TRASH_DIRECTORY/.." &&
-			rm -fr "$TRASH_DIRECTORY" || {
+			rm -fr "$TRASH_DIRECTORY" 2>/dev/null || {
 				# try again in a bit
 				sleep 5;
-				rm -fr "$TRASH_DIRECTORY"
+				rm -fr "$TRASH_DIRECTORY" 2>/dev/null
 			} ||
 			error "Tests passed but test cleanup failed; aborting"
 		fi

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 5/8] t7810: turn MB_REGEX check into a lazy prereq
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

In t7810 we verify whether the system has proper multibyte locale
support by executing `test-tool regex` with a unicode character. When
this check fails though we'll output an error that breaks the TAP
format.

Fix this issue by turning the logic into a lazy prerequisite.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7810-grep.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1b195bee59..d61c4a4d73 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -18,8 +18,9 @@ test_invalid_grep_expression() {
 	'
 }
 
-LC_ALL=en_US.UTF-8 test-tool regex '^.$' '¿' &&
-  test_set_prereq MB_REGEX
+test_lazy_prereq MB_REGEX '
+	LC_ALL=en_US.UTF-8 test-tool regex "^.$" "¿"
+'
 
 cat >hello.c <<EOF
 #include <assert.h>

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 4/8] t7527: fix broken TAP output
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

Before running the tests in t7527 we first verify whether the fsmonitor
even works, which seems to depend on the actual filesystem that is in
use. The verification executes outside of any prerequisite or test body,
so its stdout/stderr is not being redirected.

The consequence of this is that any command that prints to stdout/stderr
may break the TAP specification by printing invalid lines. And in fact
we already do that, as git-init(1) prints the path to the created Git
repository by default.

Fix this issue by moving the logic into a lazy prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7527-builtin-fsmonitor.sh | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index b63c162f9b..d881e27466 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -25,7 +25,8 @@ maybe_timeout () {
 		"$@"
 	fi
 }
-verify_fsmonitor_works () {
+
+test_lazy_prereq FSMONITOR_WORKS '
 	git init test_fsmonitor_smoke || return 1
 
 	GIT_TRACE_FSMONITOR="$PWD/smoke.trace" &&
@@ -50,9 +51,9 @@ verify_fsmonitor_works () {
 	ret=$?
 	rm -rf test_fsmonitor_smoke smoke.trace
 	return $ret
-}
+'
 
-if ! verify_fsmonitor_works
+if ! test_have_prereq FSMONITOR_WORKS
 then
 	skip_all="filesystem does not deliver fsmonitor events (container/overlayfs?)"
 	test_done

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 3/8] ci: unify Linux images across GitLab and GitHub
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

The image for the "linux-breaking-changes" job has drifted apart across
GitHub and GitLab. Adapt it to use "ubuntu:rolling" on both systems.

With this change there's only one difference remaining: GitHub uses
"ubuntu:focal" for the "linux32" job while GitLab uses "ubuntu:20.04".
These are different names for the same image, so there is no actual
difference here. Adjust GitHub to use the "20.04" tag -- this matches
all the other jobs which use version numbers, and you don't have to
learn Ubuntu's release names by heart.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 2 +-
 .gitlab-ci.yml             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 3da5326f0b..cf341d74db 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -407,7 +407,7 @@ jobs:
           image: alpine:latest
         # Supported until 2025-04-02.
         - jobname: linux32
-          image: i386/ubuntu:focal
+          image: i386/ubuntu:20.04
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
         - jobname: almalinux-8
           image: almalinux:8
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index a5bdec5159..49f3689b6a 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -47,7 +47,7 @@ test:linux:
         CC: gcc
         CC_PACKAGE: gcc-8
       - jobname: linux-breaking-changes
-        image: ubuntu:20.04
+        image: ubuntu:rolling
         CC: gcc
       - jobname: fedora-breaking-changes-meson
         image: fedora:latest

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 2/8] gitlab-ci: add missing Linux jobs
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

The GitLab CI definitions are missing jobs for AlmaLinux and Debian,
both of which exist in GitHub Workflows. Plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 6 ++++++
 ci/lib.sh      | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 8cb41baa14..a5bdec5159 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -68,6 +68,12 @@ test:linux:
         # Supported until 2025-04-02.
       - jobname: linux32
         image: i386/ubuntu:20.04
+      # A RHEL 8 compatible distro.  Supported until 2029-05-31.
+      - jobname: almalinux-8
+        image: almalinux:8
+      # Supported until 2026-08-31.
+      - jobname: debian-11
+        image: debian:11
   artifacts:
     paths:
       - t/failed-test-artifacts
diff --git a/ci/lib.sh b/ci/lib.sh
index 6e3799cfc3..b939110a6e 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -254,7 +254,7 @@ then
 		CI_OS_NAME=osx
 		JOBS=$(nproc)
 		;;
-	*,alpine:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
+	*,almalinux:*|*,alpine:*|*,debian:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
 		CI_OS_NAME=linux
 		JOBS=$(nproc)
 		;;

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 1/8] gitlab-ci: rearrange Linux jobs to match GitHub's order
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260604-pks-t7527-fix-tap-output-v3-0-7d766ed481e4@pks.im>

Rearrange the order of Linux jobs that we have defined in GitLab CI so
that it matches the order on GitHub's side. This makes it easier to
compare whether the list of jobs actually matches on both sides.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e0b9a0d82b..8cb41baa14 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -42,15 +42,15 @@ test:linux:
       - jobname: linux-reftable
         image: ubuntu:rolling
         CC: clang
+      - jobname: linux-TEST-vars
+        image: ubuntu:20.04
+        CC: gcc
+        CC_PACKAGE: gcc-8
       - jobname: linux-breaking-changes
         image: ubuntu:20.04
         CC: gcc
       - jobname: fedora-breaking-changes-meson
         image: fedora:latest
-      - jobname: linux-TEST-vars
-        image: ubuntu:20.04
-        CC: gcc
-        CC_PACKAGE: gcc-8
       - jobname: linux-leaks
         image: ubuntu:rolling
         CC: gcc
@@ -60,13 +60,14 @@ test:linux:
       - jobname: linux-asan-ubsan
         image: ubuntu:rolling
         CC: clang
+      - jobname: linux-meson
+        image: ubuntu:rolling
+        CC: gcc
       - jobname: linux-musl-meson
         image: alpine:latest
+        # Supported until 2025-04-02.
       - jobname: linux32
         image: i386/ubuntu:20.04
-      - jobname: linux-meson
-        image: ubuntu:rolling
-        CC: gcc
   artifacts:
     paths:
       - t/failed-test-artifacts

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 0/8] t: fix broken TAP output
From: Patrick Steinhardt @ 2026-06-04 10:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King
In-Reply-To: <20260602-pks-t7527-fix-tap-output-v1-0-db3da2a1b137@pks.im>

Hi,

this small patch series fixes another instance of broken TAP output that
has landed via 4d11b9c218 (Merge branch 'pt/fsmonitor-linux', 2026-05-31).

As this has happened multiple times by now I decided to have a look at
whether we can fix this class of issues a bit more holistically. So this
series also contains a change that makes prove bail out when it sees
invalid TAP output, which uncovers a small set of preexisting issues in
our test suite.

Changes in v3:
  - Fix a test gap for AlmaLinux and Debian in GitLab CI, which uncovers
    an issue flagged by Peff.
  - Fix TAP breakage in t7810.
  - Link to v2: https://patch.msgid.link/20260603-pks-t7527-fix-tap-output-v2-0-cf3af5694e20@pks.im

Changes in v2:
  - Fix waiting for p4d, and deduplicate the logic that does this.
  - Link to v1: https://patch.msgid.link/20260602-pks-t7527-fix-tap-output-v1-0-db3da2a1b137@pks.im

Test runs can be found at [1] and [2]. Note that GitHub-side tests are
failing on Windows, but that is a preexisting failure on "master".

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/585
[2]: https://github.com/git/git/pull/2320

---
Patrick Steinhardt (8):
      gitlab-ci: rearrange Linux jobs to match GitHub's order
      gitlab-ci: add missing Linux jobs
      ci: unify Linux images across GitLab and GitHub
      t7527: fix broken TAP output
      t7810: turn MB_REGEX check into a lazy prereq
      t/test-lib: silence EBUSY errors on Windows during test cleanup
      t/lib-git-p4: silence output when killing p4d and its watchdog
      t: let prove fail when parsing invalid TAP output

 .github/workflows/main.yml   |  2 +-
 .gitlab-ci.yml               | 23 +++++++++++++++--------
 ci/lib.sh                    |  2 +-
 t/lib-git-p4.sh              |  4 ++--
 t/t7527-builtin-fsmonitor.sh |  7 ++++---
 t/t7810-grep.sh              |  5 +++--
 t/test-lib.sh                | 10 ++++++++--
 7 files changed, 34 insertions(+), 19 deletions(-)

Range-diff versus v2:

-:  ---------- > 1:  5e817b102f gitlab-ci: rearrange Linux jobs to match GitHub's order
-:  ---------- > 2:  83646cc834 gitlab-ci: add missing Linux jobs
-:  ---------- > 3:  cca1567fbf ci: unify Linux images across GitLab and GitHub
1:  52abbd5280 = 4:  430bc51818 t7527: fix broken TAP output
-:  ---------- > 5:  78ef22df8d t7810: turn MB_REGEX check into a lazy prereq
2:  ea1f1eb466 = 6:  7bbaeff48c t/test-lib: silence EBUSY errors on Windows during test cleanup
3:  e97a515470 = 7:  abf2be09e6 t/lib-git-p4: silence output when killing p4d and its watchdog
4:  436d7d8cf3 = 8:  04367c34be t: let prove fail when parsing invalid TAP output

---
base-commit: 1666c1265231b0bc5f613fbbf3f0a9896cdef76e
change-id: 20260601-pks-t7527-fix-tap-output-105da1d73df0


^ permalink raw reply

* Re: git history feedback
From: Kristoffer Haugsbakk @ 2026-06-04  9:34 UTC (permalink / raw)
  To: Rasmus Villemoes, git; +Cc: Patrick Steinhardt
In-Reply-To: <87ecimhg8s.fsf@rasmusvillemoes.dk>

On Thu, Jun 4, 2026, at 10:17, Rasmus Villemoes wrote:
>[snip]
>
> So today I had occasion to put it to real use, and then I found two
> things I'd like to be able to do with it:
>
> When a commit needs to be split into three or more commits, it is a
> little cumbersome to do iteratively, since the new commit to split
> obviously has a new sha, so one first has to figure out what that new id
> is and then do another "git history split". For higher values of "three"
> that becomes rather tedious. So it would be nice if there was an
> iterative mode, which after splitting off the first commit would
> automatically start again with the new child commit.

For commit subject `anchor` I would do something like this.

1. `git history split :/anchor`
2. Split out the first commit with a new message; keep the `anchor`
   subject of the original
3. Repeat (1)

>[snip]

^ permalink raw reply

* Re: Mirror repositories for submodules
From: Simon Richter @ 2026-06-04  9:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Benson Muite, git
In-Reply-To: <20260604061605.GA3194609@coredump.intra.peff.net>

Hi,

On 6/4/26 3:16 PM, Jeff King wrote:

> Here's a thought experiment. What if you put the UUID into a URL, like:
>    repoid://123456789.git

Yes, that's the idea, except I would want to use a relative URL, like

     ../123456789.git

This could solve the "naive cloning" problem, because it creates an 
expectation that the submodules can be found on the same server, or in a 
nearby path.

I'm aware that this is *also* bad for decentralization, because it makes 
it easier to use one of the big forges where the repositories for 
often-used submodules are are already likely to be present, but it plays 
into our use case, where we want to share the repositories for 
often-used subprojects.

> Now, all of that said, do we still need uuids at all? If the canonical
> submodule name is https://github.com/git/git.git, then anybody can just
> rewrite that locally in the same way using url.*.insteadOf config.

Yes, but we'd then need a mechanism for a server to indicate "for 
cloning, you should use these 'insteadOf' settings, which is a massive 
can of worms from a security standpoint.

I also don't think these canonical URLs can ever be stable if they refer 
to infrastructure that is not under the control of the maintainer -- it 
would tie the project identity to the hosting provider, and increase the 
inertia to overcome for moves (such as the current exodus from github 
and gitlab towards codeberg).

> Which makes me wonder if I am missing something about the original
> request that started this thread. But it sounds to me like it is just
> asking for the existing URL-rewriting feature.

The original mail has a similar problem as we do in Debian, and as my 
employer has: CI jobs should exclusively talk to in-house 
infrastructure, because continuously cloning repositories for each build 
is bad for the environment.

The common goal is that a naive clone should get submodules from a local 
server, ideally without us having to write some tool to make an initial 
checkout, enumerate submodules, create insteadOf settings, clone first 
layer of submodules, enumerate second layer, ...

    Simon

^ permalink raw reply

* Re: [PATCH v2 2/4] doc: replay: improve config description
From: Patrick Steinhardt @ 2026-06-04  9:05 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Junio C Hamano, Siddharth Asthana, git
In-Reply-To: <fcc2cf52-cb10-4799-a4c9-eb5916187075@app.fastmail.com>

On Thu, Jun 04, 2026 at 08:31:57AM +0200, Kristoffer Haugsbakk wrote:
> On Thu, Jun 4, 2026, at 08:27, Patrick Steinhardt wrote:
> > On Wed, Jun 03, 2026 at 06:04:23PM +0200,
> > kristofferhaugsbakk@fastmail.com wrote:
> >> diff --git a/Documentation/git-replay.adoc b/Documentation/git-replay.adoc
> >> index f9ca2db2833..4de85088d6c 100644
> >> --- a/Documentation/git-replay.adoc
> >> +++ b/Documentation/git-replay.adoc
> >> @@ -211,6 +211,7 @@ to use bare commit IDs instead of branch names.
> >>
> >>  CONFIGURATION
> >>  -------------
> >> +:git-replay: 1
> >>  include::config/replay.adoc[]
> >
> > Not quite sure, but was this change supposed to be part of the preceding
> > commit, where you also added the include?
> 
> No, because the conditional is only being put to use now. That was the
> intention anyway. Maybe there is some reason to put it in the first
> commit?

Probably not. It just read funny, but I guess that it's my ignorance
about the adoc format that was also at play here.

Patrick

^ permalink raw reply

* Re: [PATCH v2 9/9] builtin/history: implement "drop" subcommand
From: Patrick Steinhardt @ 2026-06-04  9:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Pablo Sabater
In-Reply-To: <xmqqmrxbqir4.fsf@gitster.g>

On Thu, Jun 04, 2026 at 08:58:07AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > +static int update_worktree(struct repository *repo,
> > +			   const struct commit *old_head,
> > +			   const struct commit *new_head,
> > +			   bool dry_run)
> > +{
> > +	struct reset_head_opts opts = {
> > +		.oid_from = &old_head->object.oid,
> > +		.oid = &new_head->object.oid,
> > +		.flags = RESET_HEAD_SKIP_REF_UPDATES,
> > +	};
> > +	if (dry_run)
> > +		opts.flags |= RESET_HEAD_DRY_RUN;
> > +	return reset_head(repo, &opts);
> > +}
> > + ...
> > +	/*
> > +	 * If HEAD will move as a result of the rewrite then we'll have to
> > +	 * merge in the changes into the worktree and index. This merge can of
> > +	 * course conflict, which will cause the whole operation to abort.
> > +	 *
> > +	 * If we had already updated the refs at that point then we'd have an
> > +	 * inconsistent repository state. So we first perform a dry-run merge
> > +	 * here before updating refs.
> > +	 */
> > +	if (!dry_run && !is_bare_repository()) {
> > +		ret = find_head_tree_change(repo, &result, &old_head,
> > +					    &new_head, &head_moves);
> > +		if (ret < 0)
> > +			goto out;
> > +
> > +		if (head_moves && update_worktree(repo, old_head, new_head, true) < 0) {
> > +			ret = error(_("dropping this commit would "
> > +				      "overwrite local changes; aborting"));
> > +			goto out;
> > +		}
> > +	}
> 
> This block is skipped under --dry-run, but update_worktree is
> equipped to (and indeed run unconditionally here) run in the dry-run
> mode.  Does it mean that "git history drop --dry-run" that user runs
> to see which refs may be updated will not get warned about possible
> worktree conflicts that would prevent the real run from happening?
> Unless there is a compelling reason not to, I think --dry-run should
> be a close simulation of what would happen without it.

That's an oversight on my part indeed. We _should_ run this block with
dry-run, and it shows that we're definitely lacking test coverage here.
Will fix, thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v2 9/9] builtin/history: implement "drop" subcommand
From: Patrick Steinhardt @ 2026-06-04  9:02 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: git, Pablo Sabater, Junio C Hamano
In-Reply-To: <4b4672de-17cc-426f-8498-6384b1ad0d06@app.fastmail.com>

On Wed, Jun 03, 2026 at 09:04:33PM +0200, Kristoffer Haugsbakk wrote:
> On Wed, Jun 3, 2026, at 18:14, Patrick Steinhardt wrote:
> > diff --git a/Documentation/git-history.adoc
> > b/Documentation/git-history.adoc
> > index 2ba8121795..4eac732fd2 100644
> > --- a/Documentation/git-history.adoc
> > +++ b/Documentation/git-history.adoc
> > @@ -51,13 +52,28 @@ be stateful operations. The limitation can be
> > lifted once (if) Git learns about
> >  first-class conflicts.
> >
> >  When using `fixup` with `--empty=drop`, dropping the root commit is not yet
> > -supported.
> > +supported. Likewise, `drop` cannot remove the root commit or a merge commit.
> >
> >  COMMANDS
> >  --------
> >
> >  The following commands are available to rewrite history in different ways:
> >
> > +`drop <commit>`::
> > +	Remove the specified commit from the history. All descendants of the
> > +	commit are replayed directly onto its parent.
> > ++
> > +The root commit cannot be dropped as that may lead to edge cases where refs
> > +end up with no commits anymore. Merge commits cannot be dropped either; see
> > +LIMITATIONS.
> 
> Should section names be “bare” or quoted like "LIMITATIONS"?
> I don’t know.
> 
> Maybe add “above” since it’s a previous section.

Hm. I think I'd prefer to keep this as-is, as we already have
preexisting documentation in the same manpage that does it exactly like
the above.

> > ++
> > +If `HEAD` points at a commit that is to be rewritten, the index and working
> >[snip]
> > +Drop a commit
> > +~~~~~~~~~~~~~
> > +
> > +----------
> > +$ git log --oneline
> > +abc1234 (HEAD -> main) third
> > +def5678 second
> > +ghi9012 first
> > +
> > +$ git history drop def5678
> 
> I know this is only the most simple example. And I might be dragging in
> something beyond the scope of this example. But I recall one
> demonstration on the first git-history(1) series which used a lot of
> revision expressions and someone saying that they couldn’t imagine a
> workflow where this would be more interactive than bringing up the
> git-rebase(1) todo editor.
> 
> (I couldn’t find back to this right now.)

Yeah, I remember this discussion.

> Although it is slower in terms of machine cycles, the keyboard instinct
> for dropping a nearby commit might be to do `git rebase -i @~10`
> (sufficiently high number) and navigating quickly in the configured
> editor, deleting the line or using the keybind for `drop`. This example
> which by implication brings up the log in order to paste the abbreviated
> hash isn’t as ergonomic in comparison.
> 
> But using a revision expression like searching the subject with
> `main^{/second}`, while not quicker probably, does distinguish itself
> from git-rebase(1) by being a pretty fast ad hoc invocation that can be
> done in one command without futzing with some weird sed(1) editor in
> order to navigate to the `second` line and deleting it, or
> something. And that’s a small win in isolation, but it segues much more
> naturally into letting you script, say, dropping the last commit that
> starts with the subject `TEMP`.
> 
> Or maybe revision expressions is too much in this context?

No, I think that's actually a good idea to demonstrate how this can be
used without being too unergomonic.

Eventually I still have the idea in my mind to implement a TUI around
git-history(1) that allows to drive its several subcommands without
having to laboriously select the right commits. Ideally, you'd simply
select the commits you care about and then pick specific actions you
want to do on them, with a nice visual graph that updates after the
operation. But that's a bit into the future, I guess :)

> > diff --git a/t/t3454-history-drop.sh b/t/t3454-history-drop.sh
> > new file mode 100755
> > index 0000000000..37d8413e7e
> > --- /dev/null
> > +++ b/t/t3454-history-drop.sh
> > @@ -0,0 +1,513 @@
> > +#!/bin/sh
> > +
> > +test_description='tests for git-history drop subcommand'
> > +
> > +. ./test-lib.sh
> > +. "$TEST_DIRECTORY/lib-log-graph.sh"
> > +
> > +expect_graph () {
> > +	cat >expect &&
> > +	lib_test_cmp_graph --format=%s "$@"
> > +}
> > +
> > +expect_log () {
> > +	git log --format="%s" "$@" >actual &&
> > +	cat >expect &&
> > +	test_cmp expect actual
> > +}
> > +
> > +test_expect_success 'errors on missing commit argument' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +		test_commit initial &&
> > +		test_must_fail git history drop 2>err &&
> > +		test_grep "command expects a single revision" err
> 
> Why not `test_cmp` since it’s a fixed error?
> 
> Same for a few other tests like `errors on unknown revision`.

We tend to use test_grep for cases like this, even for static error
messages. I haven't really figured out myself when exactly we tend to
use one form over the other, to be honest.

> >[snip]
> > +test_expect_success 'errors with invalid --empty= value' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	test_commit -C repo initial &&
> > +	test_commit -C repo second &&
> > +	test_must_fail git -C repo history drop --empty=bogus HEAD 2>err &&
> > +	test_grep "unrecognized.*--empty.*bogus" err
> > +'
> 
> Style related I guess. Most tests here use a subshell but this one uses
> `git -C`? Why is that?

No good reason, will fix.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH v2 5/9] reset: introduce ability to skip reference updates
From: Patrick Steinhardt @ 2026-06-04  9:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Pablo Sabater
In-Reply-To: <xmqqqzmnqj1o.fsf@gitster.g>

On Thu, Jun 04, 2026 at 08:51:47AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > @@ -112,6 +113,9 @@ int reset_head(struct repository *r, const struct reset_head_opts *opts)
> >  	if (opts->branch_msg && !opts->branch)
> >  		BUG("branch reflog message given without a branch");
> >  
> > +	if (skip_ref_updates && (opts->branch || refs_only))
> > +		BUG("asked to perform ref updates and skip them at the same time");
> 
> ;-)  That's certainly a careful safety valve.
> 
> Would we also want to catch skip_ref_updates && update_orig_head
> being both set as a bogus request?

Yeah, I think that's a good idea.

Patrick

^ permalink raw reply

* Re: git history feedback
From: Patrick Steinhardt @ 2026-06-04  8:39 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git
In-Reply-To: <87ecimhg8s.fsf@rasmusvillemoes.dk>

On Thu, Jun 04, 2026 at 10:17:07AM +0200, Rasmus Villemoes wrote:
> Hi
> 
> As soon as I saw the announcement of 'git history', I knew that was
> something I was gonna use a lot. Especially the split functionality has
> always been somewhat of a hassle (at least for me) to do via an
> interactive rebase. I've played around with it a little, and it seems to
> work as it says on the tin.

Happy to hear!

> So today I had occasion to put it to real use, and then I found two
> things I'd like to be able to do with it:
> 
> When a commit needs to be split into three or more commits, it is a
> little cumbersome to do iteratively, since the new commit to split
> obviously has a new sha, so one first has to figure out what that new id
> is and then do another "git history split". For higher values of "three"
> that becomes rather tedious. So it would be nice if there was an
> iterative mode, which after splitting off the first commit would
> automatically start again with the new child commit.

That's fair, and also something that was discussed when initially
introducing the "split" subcommand. We don't have that mode right now,
but I think it would make sense to maybe add a new option that makes us
split repeatedly until all chunks have been exploded into separate
commits.

> If "git add -p" had an answer meaning "yes to this hunk and all
> following in this file and all remaining files as well", this could
> probably even be the default behaviour of "git history split", as it
> would just require that one extra answer to be given after the first
> commit is split off in order to keep the current behaviour. Otherwise,
> I'd also be happy to have "git history iter-split" or "git history split
> --iter" or any other spelling.

Yeah. From my perspective, having this available as an option would be
the best path forward. But I'm also open to alternatives as you suggest.

> The other thing I'd like is a sort of ultimate version of the above:
> What I needed in the concrete case at hand was actually to split two
> commit into n individual hunks each, then do an interactive rebase to combine
> those 2n commits to n commits (I had done changes "row-wise", but needed
> to change them to "column-wise"). For that, I would like to have had a
> completely automatic "git history atomize" that would split a commit
> into individual hunks, prefixing the commit subject with
> e.g. "[<filename> -- hunk #nn]". A subsequent 'git rebase -i' could then easily
> rearrange those and squash the related hunks.

It could easily be another option: `git history split --explode` for
example, which seems to be a common term for such an operation.

Regarding the subsequent rebase: I have one more patch series cooking
locally that implements `git history move` to move around commits, and I
did have the plan to eventually also implement `git history squash` to
squash commit A into commit B. But when handling many commits it might
even be easier to use interactive rebases instead.

Well, unless we eventually maybe even get something like a graphical
interface. I was playing around with a TUI interface for git-history(1)
that lets you move commits around without the hassle of the command
line. But that's certainly something that's still going to take a while
to materialize, if it ever does.

> Aside: are experimental commands eligible for teaching the completion
> logic about them? I.e., can we add a __git_history() to
> git-completion.bash? Aside from the obvious "let it know about existing
> subcommands", I'd love for "git history split <TAB>" to show the most
> recent ~20 (or something) commits in one-line format, stopping if
> there's a merge commit.

I just haven't gotten around to it yet. I'd certainly welcome the
addition of command line completion.

Thanks for your feedback!

Patrick

^ permalink raw reply

* Re: [PATCH 2/2] builtin/add: use die_for_required_opt() helper
From: Christian Couder @ 2026-06-04  8:27 UTC (permalink / raw)
  To: Siddharth Shrimali; +Cc: git, gitster, toon, jn.avila
In-Reply-To: <20260603111044.39116-3-r.siddharth.shrimali@gmail.com>

On Wed, Jun 3, 2026 at 1:11 PM Siddharth Shrimali
<r.siddharth.shrimali@gmail.com> wrote:
>
> Clean up manual option dependency checks by replacing explicit conditional
> blocks with the newly introduced die_for_required_opt() helper function.
>
> Specifically, simplify the prerequisite check logic for both
> '--ignore-missing' (which requires '--dry-run') and '--pathspec-file-nul'
> (which requires '--pathspec-from-file').

It's a good idea to use the new helper function for
'--pathspec-file-nul' requiring '--pathspec-from-file' because it
looks like this is tested a lot already:

$ git grep requires | grep 'the option'
t2026-checkout-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err
t2072-restore-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t3601-rm-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t3704-add-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t3909-stash-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err
t7107-reset-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&
t7526-commit-pathspec-file.sh:    test_grep -e "the option
.--pathspec-file-nul. requires .--pathspec-from-file." err &&

You could mention this in the commit message.

Also it might be worth squashing this patch into the previous one.

Thanks.

^ permalink raw reply

* git history feedback
From: Rasmus Villemoes @ 2026-06-04  8:17 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt

Hi

As soon as I saw the announcement of 'git history', I knew that was
something I was gonna use a lot. Especially the split functionality has
always been somewhat of a hassle (at least for me) to do via an
interactive rebase. I've played around with it a little, and it seems to
work as it says on the tin.

So today I had occasion to put it to real use, and then I found two
things I'd like to be able to do with it:

When a commit needs to be split into three or more commits, it is a
little cumbersome to do iteratively, since the new commit to split
obviously has a new sha, so one first has to figure out what that new id
is and then do another "git history split". For higher values of "three"
that becomes rather tedious. So it would be nice if there was an
iterative mode, which after splitting off the first commit would
automatically start again with the new child commit.

If "git add -p" had an answer meaning "yes to this hunk and all
following in this file and all remaining files as well", this could
probably even be the default behaviour of "git history split", as it
would just require that one extra answer to be given after the first
commit is split off in order to keep the current behaviour. Otherwise,
I'd also be happy to have "git history iter-split" or "git history split
--iter" or any other spelling.

The other thing I'd like is a sort of ultimate version of the above:
What I needed in the concrete case at hand was actually to split two
commit into n individual hunks each, then do an interactive rebase to combine
those 2n commits to n commits (I had done changes "row-wise", but needed
to change them to "column-wise"). For that, I would like to have had a
completely automatic "git history atomize" that would split a commit
into individual hunks, prefixing the commit subject with
e.g. "[<filename> -- hunk #nn]". A subsequent 'git rebase -i' could then easily
rearrange those and squash the related hunks.

Aside: are experimental commands eligible for teaching the completion
logic about them? I.e., can we add a __git_history() to
git-completion.bash? Aside from the obvious "let it know about existing
subcommands", I'd love for "git history split <TAB>" to show the most
recent ~20 (or something) commits in one-line format, stopping if
there's a merge commit.

Thanks,
Rasmus

^ permalink raw reply

* Re: [PATCH 1/2] parse-options: introduce die_for_required_opt()
From: Christian Couder @ 2026-06-04  8:10 UTC (permalink / raw)
  To: Siddharth Shrimali; +Cc: git, gitster, toon, jn.avila
In-Reply-To: <20260603111044.39116-2-r.siddharth.shrimali@gmail.com>

On Wed, Jun 3, 2026 at 1:11 PM Siddharth Shrimali
<r.siddharth.shrimali@gmail.com> wrote:
>
> Introduce a new helper function die_for_required_opt() to check if a
> given option is present without its required prerequisite option.
>
> This provides a centralized API for handling simple option dependencies
> (i.e., X requires Y), matching the style of the existing mutual-exclusion
> helpers like die_for_incompatible_opt{2,3,4}().
>
> Suggested-by: Christian Couder <christian.couder@gmail.com>

In general it's simpler for GSoC contributors to mention all your
mentors in "Mentored-by: ..." trailers in all your patches during your
GSoC, rather than keeping track of who helped you with each patch.

> Signed-off-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
> ---
>  parse-options.c | 7 +++++++
>  parse-options.h | 3 +++
>  2 files changed, 10 insertions(+)

I think it would be nice if the new function could actually be used in
a single *.c file. It would be even nicer if there was an existing
test that already checked that the dependent option needs the required
option. This way we would also already ensure that the new helper is
working properly.

^ permalink raw reply

* Re: [PATCH 1/2] parse-options: introduce die_for_required_opt()
From: Christian Couder @ 2026-06-04  8:00 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: git, Siddharth Shrimali, gitster, toon
In-Reply-To: <2412618.ElGaqSPkdT@piment-oiseau>

Hi,

On Wed, Jun 3, 2026 at 9:49 PM Jean-Noël AVILA <jn.avila@free.fr> wrote:

> To me, "die_for_required_opt" is a misnomer as the function does not die for
> an existing "required" condition, unlike the other functions such as
> die_for_incompatible_opt<n>.
>
> The names of the parameters do not indicate that the test is not symmetrical
> (not failing on XOR).
>
> Maybe something like "die_for_missing_opt(int tested_opt, const char
> *tested_opt_name, int required_opt, const char *required_opt_name)
>
> would make it more understandable.

Yeah, I agree it's better.

With "dependent_opt" instead of "tested_opt", I think it would be even better.

Thanks.

^ permalink raw reply

* [PATCH v3 8/8] setup: construct object database in `apply_repository_format()`
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

With the preceding changes we now always construct the repository's
object database before applying the repository format. Remove this
duplication by constructing it in `apply_repository_format()` instead.

Note that we create the object database _after_ having set up the
repository's hash algorithm, but _before_ setting the compat hash
algorithm. This is intentional:

  - Constructing the object database may require knowledge of its
    intended object format.

  - Setting up the compatibility hash requires the object database to be
    initialized already, because we immediately read the loose object
    map.

The first point is sensible, the second maybe a little less so. Ideally,
it should be the responsibility of the object database itself to
initialize any data structures required for the compatibility hash. But
this would require further changes, so this is kept as-is for now.

Further note that this requires us to move handling of the environment
variables GIT_OBJECT_DIRECTORY and GIT_ALTERNATE_OBJECT_DIRECTORIES into
the repository format, as well. This allows the caller more flexibility
around whether or not those environment variables are being honored, as
we want to respect them in "setup.c", but not in "repository.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c |  4 +---
 setup.c      | 45 +++++++++++++++++++++------------------------
 setup.h      | 10 ++++++++++
 3 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/repository.c b/repository.c
index 61dfbb8be6..187dd471c4 100644
--- a/repository.c
+++ b/repository.c
@@ -291,13 +291,11 @@ int repo_init(struct repository *repo,
 	if (read_repository_format_from_commondir(&format, repo->commondir))
 		goto error;
 
-	if (apply_repository_format(repo, &format, &err) < 0) {
+	if (apply_repository_format(repo, &format, 0, &err) < 0) {
 		warning("%s", err.buf);
 		goto error;
 	}
 
-	repo->objects = odb_new(repo, NULL, NULL);
-
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
diff --git a/setup.c b/setup.c
index 4a8d6230b1..513fc88749 100644
--- a/setup.c
+++ b/setup.c
@@ -1752,12 +1752,22 @@ enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
 
 int apply_repository_format(struct repository *repo,
 			    const struct repository_format *format,
+			    enum apply_repository_format_flags flags,
 			    struct strbuf *err)
 {
+	char *object_directory = NULL, *alternate_object_directories = NULL;
+
 	if (verify_repository_format(format, err) < 0)
 		return -1;
 
+	if (flags & APPLY_REPOSITORY_FORMAT_HONOR_ENV) {
+		object_directory = xstrdup_or_null(getenv(DB_ENVIRONMENT));
+		alternate_object_directories = xstrdup_or_null(getenv(ALTERNATE_DB_ENVIRONMENT));
+	}
+
 	repo_set_hash_algo(repo, format->hash_algo);
+	repo->objects = odb_new(repo, object_directory,
+				alternate_object_directories);
 	repo_set_compat_hash_algo(repo, format->compat_hash_algo);
 	repo_set_ref_storage_format(repo,
 				    format->ref_storage_format,
@@ -1773,6 +1783,8 @@ int apply_repository_format(struct repository *repo,
 	repo->repository_format_precious_objects =
 		format->precious_objects;
 
+	free(alternate_object_directories);
+	free(object_directory);
 	return 0;
 }
 
@@ -1785,7 +1797,8 @@ int apply_repository_format(struct repository *repo,
  * If successful and fmt is not NULL, fill fmt with data.
  */
 static void check_and_apply_repository_format(struct repository *repo,
-					      struct repository_format *fmt)
+					      struct repository_format *fmt,
+					      enum apply_repository_format_flags flags)
 {
 	struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
 	struct strbuf err = STRBUF_INIT;
@@ -1794,7 +1807,7 @@ static void check_and_apply_repository_format(struct repository *repo,
 		fmt = &repo_fmt;
 
 	check_repository_format_gently(repo_get_git_dir(repo), fmt, NULL);
-	if (apply_repository_format(repo, fmt, &err) < 0)
+	if (apply_repository_format(repo, fmt, flags, &err) < 0)
 		die("%s", err.buf);
 	startup_info->have_repository = 1;
 
@@ -1874,15 +1887,9 @@ const char *enter_repo(struct repository *repo, const char *path, unsigned flags
 	}
 
 	if (is_git_directory(".")) {
-		struct strvec to_free = STRVEC_INIT;
-
 		set_git_dir(repo, ".", 0);
-		repo->objects = odb_new(repo,
-					getenv_safe(&to_free, DB_ENVIRONMENT),
-					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-		check_and_apply_repository_format(repo, NULL);
-
-		strvec_clear(&to_free);
+		check_and_apply_repository_format(repo, NULL,
+						  APPLY_REPOSITORY_FORMAT_HONOR_ENV);
 		return path;
 	}
 
@@ -2034,8 +2041,6 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 	    startup_info->have_repository ||
 	    /* GIT_DIR_EXPLICIT */
 	    getenv(GIT_DIR_ENVIRONMENT)) {
-		struct strvec to_free = STRVEC_INIT;
-
 		if (!repo->gitdir) {
 			const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 			if (!gitdir)
@@ -2046,17 +2051,13 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 		if (startup_info->have_repository) {
 			struct strbuf err = STRBUF_INIT;
 
-			repo->objects = odb_new(repo,
-						getenv_safe(&to_free, DB_ENVIRONMENT),
-						getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-			if (apply_repository_format(repo, &repo_fmt, &err) < 0)
+			if (apply_repository_format(repo, &repo_fmt,
+						    APPLY_REPOSITORY_FORMAT_HONOR_ENV, &err) < 0)
 				die("%s", err.buf);
 
 			clear_repository_format(&repo_fmt);
 			strbuf_release(&err);
 		}
-
-		strvec_clear(&to_free);
 	}
 	/*
 	 * Since precompose_string_if_needed() needs to look at
@@ -2805,7 +2806,6 @@ int init_db(struct repository *repo,
 	int exist_ok = flags & INIT_DB_EXIST_OK;
 	char *original_git_dir = real_pathdup(git_dir, 1);
 	struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
-	struct strvec to_free = STRVEC_INIT;
 
 	if (real_git_dir) {
 		struct stat st;
@@ -2826,16 +2826,14 @@ int init_db(struct repository *repo,
 	}
 	startup_info->have_repository = 1;
 
-	repo->objects = odb_new(repo, getenv_safe(&to_free, DB_ENVIRONMENT),
-				getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-
 	/*
 	 * Check to see if the repository version is right.
 	 * Note that a newly created repository does not have
 	 * config file, so this will not fail.  What we are catching
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
-	check_and_apply_repository_format(repo, &repo_fmt);
+	check_and_apply_repository_format(repo, &repo_fmt,
+					  APPLY_REPOSITORY_FORMAT_HONOR_ENV);
 
 	repository_format_configure(repo, &repo_fmt, hash, ref_storage_format);
 
@@ -2892,7 +2890,6 @@ int init_db(struct repository *repo,
 	}
 
 	clear_repository_format(&repo_fmt);
-	strvec_clear(&to_free);
 	free(original_git_dir);
 	return 0;
 }
diff --git a/setup.h b/setup.h
index efbb82fdbf..19679fe78f 100644
--- a/setup.h
+++ b/setup.h
@@ -221,6 +221,15 @@ void clear_repository_format(struct repository_format *format);
 int verify_repository_format(const struct repository_format *format,
 			     struct strbuf *err);
 
+enum apply_repository_format_flags {
+	/*
+	 * Honor environment variables when applying the repository format to
+	 * the repository. For now, this only covers environment variables that
+	 * relate to the object database.
+	 */
+	APPLY_REPOSITORY_FORMAT_HONOR_ENV = (1 << 0),
+};
+
 /*
  * Apply the given repository format to the repo. This initializes extensions
  * and basic data structures required for normal operation. Returns 0 on
@@ -229,6 +238,7 @@ int verify_repository_format(const struct repository_format *format,
  */
 int apply_repository_format(struct repository *repo,
 			    const struct repository_format *format,
+			    enum apply_repository_format_flags flags,
 			    struct strbuf *err);
 
 const char *get_template_dir(const char *option_template);

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 7/8] repository: stop reading loose object map twice on repo init
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

When initializing a repository via `repo_init()` we end up reading the
loose object map twice:

  - `apply_repository_format()` calls `repo_set_compat_hash_algo()`,
    which in turn calls `repo_read_loose_object_map()` if we have a
    compatibility hash configured.

  - `repo_init()` calls `repo_read_loose_object_map()` directly a second
    time.

Drop the second read of the loose object map in `repo_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/repository.c b/repository.c
index 2c2395105f..61dfbb8be6 100644
--- a/repository.c
+++ b/repository.c
@@ -301,9 +301,6 @@ int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
-	if (repo->compat_hash_algo)
-		repo_read_loose_object_map(repo);
-
 	clear_repository_format(&format);
 	strbuf_release(&err);
 	return 0;

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 6/8] setup: stop initializing object database without repository
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

The function `setup_git_directory_gently()` is responsible for
discovering and setting up a Git repository based on various environment
variables and the current working directory. The result is thus a fully
usable Git repository.

One oddity of this function is that we may set up the object database
even in the case where we don't have a repository, namely in the case
where the `GIT_DIR_EXPLICIT` environment variable is set but points to a
non-existent repository. If so, we call `setup_git_env_internal()` with
the value of the environment variable so that the repository's Git
directory is configured, even if it points to a non-existent directory.

Historically though, this function didn't only configure the repository,
but also initialized the object database. We retained this behaviour
from a preceding commit, even though it really doesn't make much sense
in the first place -- there is no repository, so we don't have an object
database either. There seemingly isn't much of a reason to construct the
object database, as we typically won't try to read objects when we don't
have an object database.

There's one exception though: git-index-pack(1) may run outside of a
repository, which can be used to perform consistency checks for a
packfile. The code path is _almost_ working: we already know to call
`parse_object_buffer()`, which can read objects without an object
database being available. And that works for all object types except for
commits, because `parse_commit_buffer()` calls `parse_commit_graph()`,
and that function doesn't handle the case where we don't have an object
database.

Fix this instance to check for the object database instead of checking
for the Git directory having been initialized. With this fixed, we can
now stop constructing an object database completely.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 commit-graph.c | 4 ++--
 setup.c        | 7 +++----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 9abe62bd5a..0820cf5fb8 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -740,13 +740,13 @@ static struct commit_graph *prepare_commit_graph(struct repository *r)
 	struct odb_source *source;
 
 	/*
-	 * Early return if there is no git dir or if the commit graph is
+	 * Early return if there is no object database or if the commit graph is
 	 * disabled.
 	 *
 	 * This must come before the "already attempted?" check below, because
 	 * we want to disable even an already-loaded graph file.
 	 */
-	if (!r->gitdir || r->commit_graph_disabled)
+	if (!r->objects || r->commit_graph_disabled)
 		return NULL;
 
 	if (r->objects->commit_graph_attempted)
diff --git a/setup.c b/setup.c
index 0dc9fe4565..4a8d6230b1 100644
--- a/setup.c
+++ b/setup.c
@@ -2043,13 +2043,12 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 			setup_git_env_internal(repo, gitdir);
 		}
 
-		repo->objects = odb_new(repo,
-					getenv_safe(&to_free, DB_ENVIRONMENT),
-					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-
 		if (startup_info->have_repository) {
 			struct strbuf err = STRBUF_INIT;
 
+			repo->objects = odb_new(repo,
+						getenv_safe(&to_free, DB_ENVIRONMENT),
+						getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
 			if (apply_repository_format(repo, &repo_fmt, &err) < 0)
 				die("%s", err.buf);
 

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 5/8] setup: stop creating the object database in `setup_git_env()`
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

In the preceding commit we have stopped creating the object database in
`repo_set_gitdir()`. But the logic is still somewhat confusing as we
still end up creating it conditionally in `setup_git_dir()`, which is
called multiple times.

Drop the conditional logic and instead create the object database in all
places where we have discovered and configured a repository.

This leads to even more duplication than we already had in the preceding
commit, but an alert reader may notice that we now (almost) always call
`odb_new()` directly before having called `apply_repository_format()`.
The only exception to this is `setup_git_directory_gently()`, where we
also call the function when _not_ applying the repository format. This
will be fixed in the next commit, and once that's done we can then unify
creation of the object database into `apply_repository_format()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c | 37 ++++++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/setup.c b/setup.c
index 3bd3f6c592..0dc9fe4565 100644
--- a/setup.c
+++ b/setup.c
@@ -1035,8 +1035,7 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
 }
 
 static void setup_git_env_internal(struct repository *repo,
-				   const char *git_dir,
-				   bool skip_initializing_odb)
+				   const char *git_dir)
 {
 	char *git_replace_ref_base;
 	const char *shallow_file;
@@ -1053,10 +1052,6 @@ static void setup_git_env_internal(struct repository *repo,
 	repo_set_gitdir(repo, git_dir, &args);
 	strvec_clear(&to_free);
 
-	if (!skip_initializing_odb)
-		repo->objects = odb_new(repo, getenv_safe(&to_free, DB_ENVIRONMENT),
-					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
-
 	if (getenv(NO_REPLACE_OBJECTS_ENVIRONMENT))
 		disable_replace_refs();
 	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
@@ -1072,10 +1067,10 @@ static void setup_git_env_internal(struct repository *repo,
 		fetch_if_missing = 0;
 }
 
-static void set_git_dir_1(struct repository *repo, const char *path, bool skip_initializing_odb)
+static void set_git_dir_1(struct repository *repo, const char *path)
 {
 	xsetenv(GIT_DIR_ENVIRONMENT, path, 1);
-	setup_git_env_internal(repo, path, skip_initializing_odb);
+	setup_git_env_internal(repo, path);
 }
 
 static void update_relative_gitdir(const char *name UNUSED,
@@ -1089,7 +1084,7 @@ static void update_relative_gitdir(const char *name UNUSED,
 	trace_printf_key(&trace_setup_key,
 			 "setup: move $GIT_DIR to '%s'",
 			 path);
-	set_git_dir_1(repo, path, true);
+	set_git_dir_1(repo, path);
 	free(path);
 }
 
@@ -1102,7 +1097,7 @@ static void set_git_dir(struct repository *repo, const char *path, int make_real
 		path = realpath.buf;
 	}
 
-	set_git_dir_1(repo, path, false);
+	set_git_dir_1(repo, path);
 	if (!is_absolute_path(path))
 		chdir_notify_register(NULL, update_relative_gitdir, repo);
 
@@ -1879,8 +1874,15 @@ const char *enter_repo(struct repository *repo, const char *path, unsigned flags
 	}
 
 	if (is_git_directory(".")) {
+		struct strvec to_free = STRVEC_INIT;
+
 		set_git_dir(repo, ".", 0);
+		repo->objects = odb_new(repo,
+					getenv_safe(&to_free, DB_ENVIRONMENT),
+					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
 		check_and_apply_repository_format(repo, NULL);
+
+		strvec_clear(&to_free);
 		return path;
 	}
 
@@ -2032,13 +2034,19 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 	    startup_info->have_repository ||
 	    /* GIT_DIR_EXPLICIT */
 	    getenv(GIT_DIR_ENVIRONMENT)) {
+		struct strvec to_free = STRVEC_INIT;
+
 		if (!repo->gitdir) {
 			const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 			if (!gitdir)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
-			setup_git_env_internal(repo, gitdir, false);
+			setup_git_env_internal(repo, gitdir);
 		}
 
+		repo->objects = odb_new(repo,
+					getenv_safe(&to_free, DB_ENVIRONMENT),
+					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
+
 		if (startup_info->have_repository) {
 			struct strbuf err = STRBUF_INIT;
 
@@ -2048,6 +2056,8 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 			clear_repository_format(&repo_fmt);
 			strbuf_release(&err);
 		}
+
+		strvec_clear(&to_free);
 	}
 	/*
 	 * Since precompose_string_if_needed() needs to look at
@@ -2796,6 +2806,7 @@ int init_db(struct repository *repo,
 	int exist_ok = flags & INIT_DB_EXIST_OK;
 	char *original_git_dir = real_pathdup(git_dir, 1);
 	struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
+	struct strvec to_free = STRVEC_INIT;
 
 	if (real_git_dir) {
 		struct stat st;
@@ -2816,6 +2827,9 @@ int init_db(struct repository *repo,
 	}
 	startup_info->have_repository = 1;
 
+	repo->objects = odb_new(repo, getenv_safe(&to_free, DB_ENVIRONMENT),
+				getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
+
 	/*
 	 * Check to see if the repository version is right.
 	 * Note that a newly created repository does not have
@@ -2879,6 +2893,7 @@ int init_db(struct repository *repo,
 	}
 
 	clear_repository_format(&repo_fmt);
+	strvec_clear(&to_free);
 	free(original_git_dir);
 	return 0;
 }

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 4/8] repository: stop initializing the object database in `repo_set_gitdir()`
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

The function `repo_set_gitdir()` obviously sets the Git directory for a
given repository. Less obviously though, the function also configures a
couple of auxiliary settings.

One such thing is that we create the object database in this function.
This logic only happens conditionally though, as `set_git_dir()` may be
called multiple times during repository setup, and we don't want to
create the object database multiple times. This is somewhat tangled and
hard to follow.

Remove the logic from `repo_set_gitdir()` and instead initialize the
object database outside of it. This leads to some duplication right now,
but that duplication will be removed in a subsequent step where we will
start initializing the object database as part of applying the repo's
format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c | 8 ++------
 repository.h | 3 ---
 setup.c      | 7 ++++---
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/repository.c b/repository.c
index 58a13f7c4f..2c2395105f 100644
--- a/repository.c
+++ b/repository.c
@@ -181,12 +181,6 @@ void repo_set_gitdir(struct repository *repo,
 	free(old_gitdir);
 
 	repo_set_commondir(repo, o->commondir);
-
-	if (!repo->objects)
-		repo->objects = odb_new(repo, o->object_dir, o->alternate_db);
-	else if (!o->skip_initializing_odb)
-		BUG("cannot reinitialize an already-initialized object directory");
-
 	repo->disable_ref_updates = o->disable_ref_updates;
 
 	expand_base_dir(&repo->graft_file, o->graft_file,
@@ -302,6 +296,8 @@ int repo_init(struct repository *repo,
 		goto error;
 	}
 
+	repo->objects = odb_new(repo, NULL, NULL);
+
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
diff --git a/repository.h b/repository.h
index c3ec0f4b79..36e2db2633 100644
--- a/repository.h
+++ b/repository.h
@@ -221,12 +221,9 @@ const char *repo_get_work_tree(struct repository *repo);
  */
 struct set_gitdir_args {
 	const char *commondir;
-	const char *object_dir;
 	const char *graft_file;
 	const char *index_file;
-	const char *alternate_db;
 	bool disable_ref_updates;
-	bool skip_initializing_odb;
 };
 
 void repo_set_gitdir(struct repository *repo, const char *root,
diff --git a/setup.c b/setup.c
index c5015923f1..3bd3f6c592 100644
--- a/setup.c
+++ b/setup.c
@@ -1045,17 +1045,18 @@ static void setup_git_env_internal(struct repository *repo,
 	struct strvec to_free = STRVEC_INIT;
 
 	args.commondir = getenv_safe(&to_free, GIT_COMMON_DIR_ENVIRONMENT);
-	args.object_dir = getenv_safe(&to_free, DB_ENVIRONMENT);
 	args.graft_file = getenv_safe(&to_free, GRAFT_ENVIRONMENT);
 	args.index_file = getenv_safe(&to_free, INDEX_ENVIRONMENT);
-	args.alternate_db = getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT);
 	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
 		args.disable_ref_updates = true;
-	args.skip_initializing_odb = skip_initializing_odb;
 
 	repo_set_gitdir(repo, git_dir, &args);
 	strvec_clear(&to_free);
 
+	if (!skip_initializing_odb)
+		repo->objects = odb_new(repo, getenv_safe(&to_free, DB_ENVIRONMENT),
+					getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT));
+
 	if (getenv(NO_REPLACE_OBJECTS_ENVIRONMENT))
 		disable_replace_refs();
 	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 3/8] setup: deduplicate logic to apply repository format
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

After having discovered the repository format we then apply it to the
repository so that it knows to use the proper repository extensions. The
logic to apply the format is duplicated across three callsites, which
makes it rather painfull to add new extensions.

Introduce a new function `apply_repository_format()` that takes a repo
and applies a given format to it and adapt all callsites to use it.
This function is also the new caller of `verify_repository_format()` so
that we can ensure that we never apply an invalid repository format.
The verification we have in `read_and_verify_repository_format()` is
thus redundant now and dropped.

Rename `read_and_verify_repository_format()` accordingly. While at it,
also rename `check_repository_format()` to clarify that it doesn't only
_check_ the format, but that it also applies it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c | 31 +++++++-------------
 setup.c      | 93 ++++++++++++++++++++++++++++++++----------------------------
 setup.h      | 10 +++++++
 3 files changed, 71 insertions(+), 63 deletions(-)

diff --git a/repository.c b/repository.c
index db57b8308b..58a13f7c4f 100644
--- a/repository.c
+++ b/repository.c
@@ -262,8 +262,8 @@ void repo_set_worktree(struct repository *repo, const char *path)
 	trace2_def_repo(repo);
 }
 
-static int read_and_verify_repository_format(struct repository_format *format,
-					     const char *commondir)
+static int read_repository_format_from_commondir(struct repository_format *format,
+						 const char *commondir)
 {
 	int ret = 0;
 	struct strbuf sb = STRBUF_INIT;
@@ -272,11 +272,6 @@ static int read_and_verify_repository_format(struct repository_format *format,
 	read_repository_format(format, sb.buf);
 	strbuf_reset(&sb);
 
-	if (verify_repository_format(format, &sb) < 0) {
-		warning("%s", sb.buf);
-		ret = -1;
-	}
-
 	strbuf_release(&sb);
 	return ret;
 }
@@ -290,6 +285,8 @@ int repo_init(struct repository *repo,
 	      const char *worktree)
 {
 	struct repository_format format = REPOSITORY_FORMAT_INIT;
+	struct strbuf err = STRBUF_INIT;
+
 	memset(repo, 0, sizeof(*repo));
 
 	initialize_repository(repo);
@@ -297,21 +294,13 @@ int repo_init(struct repository *repo,
 	if (repo_init_gitdir(repo, gitdir))
 		goto error;
 
-	if (read_and_verify_repository_format(&format, repo->commondir))
+	if (read_repository_format_from_commondir(&format, repo->commondir))
 		goto error;
 
-	repo_set_hash_algo(repo, format.hash_algo);
-	repo_set_compat_hash_algo(repo, format.compat_hash_algo);
-	repo_set_ref_storage_format(repo, format.ref_storage_format,
-				    format.ref_storage_payload);
-	repo->repository_format_worktree_config = format.worktree_config;
-	repo->repository_format_relative_worktrees = format.relative_worktrees;
-	repo->repository_format_precious_objects = format.precious_objects;
-	repo->repository_format_submodule_path_cfg = format.submodule_path_cfg;
-
-	/* take ownership of format.partial_clone */
-	repo->repository_format_partial_clone = format.partial_clone;
-	format.partial_clone = NULL;
+	if (apply_repository_format(repo, &format, &err) < 0) {
+		warning("%s", err.buf);
+		goto error;
+	}
 
 	if (worktree)
 		repo_set_worktree(repo, worktree);
@@ -320,10 +309,12 @@ int repo_init(struct repository *repo,
 		repo_read_loose_object_map(repo);
 
 	clear_repository_format(&format);
+	strbuf_release(&err);
 	return 0;
 
 error:
 	clear_repository_format(&format);
+	strbuf_release(&err);
 	repo_clear(repo);
 	return -1;
 }
diff --git a/setup.c b/setup.c
index 252b443117..c5015923f1 100644
--- a/setup.c
+++ b/setup.c
@@ -750,8 +750,7 @@ static int check_repo_format(const char *var, const char *value,
 	return read_worktree_config(var, value, ctx, vdata);
 }
 
-static int check_repository_format_gently(struct repository *repo,
-					  const char *gitdir,
+static int check_repository_format_gently(const char *gitdir,
 					  struct repository_format *candidate,
 					  int *nongit_ok)
 {
@@ -765,7 +764,7 @@ static int check_repository_format_gently(struct repository *repo,
 	strbuf_release(&sb);
 
 	/*
-	 * For historical use of check_repository_format() in git-init,
+	 * For historical use of check_and_apply_repository_format() in git-init,
 	 * we treat a missing config as a silent "ok", even when nongit_ok
 	 * is unset.
 	 */
@@ -782,8 +781,6 @@ static int check_repository_format_gently(struct repository *repo,
 		die("%s", err.buf);
 	}
 
-	repo->repository_format_precious_objects = candidate->precious_objects;
-
 	string_list_clear(&candidate->unknown_extensions, 0);
 	string_list_clear(&candidate->v1_only_extensions, 0);
 
@@ -1140,7 +1137,7 @@ static const char *setup_explicit_git_dir(struct repository *repo,
 		die(_("not a git repository: '%s'"), gitdirenv);
 	}
 
-	if (check_repository_format_gently(repo, gitdirenv, repo_fmt, nongit_ok)) {
+	if (check_repository_format_gently(gitdirenv, repo_fmt, nongit_ok)) {
 		free(gitfile);
 		return NULL;
 	}
@@ -1217,7 +1214,7 @@ static const char *setup_discovered_git_dir(struct repository *repo,
 					    struct repository_format *repo_fmt,
 					    int *nongit_ok)
 {
-	if (check_repository_format_gently(repo, gitdir, repo_fmt, nongit_ok))
+	if (check_repository_format_gently(gitdir, repo_fmt, nongit_ok))
 		return NULL;
 
 	/* --work-tree is set without --git-dir; use discovered one */
@@ -1265,7 +1262,7 @@ static const char *setup_bare_git_dir(struct repository *repo,
 {
 	int root_len;
 
-	if (check_repository_format_gently(repo, ".", repo_fmt, nongit_ok))
+	if (check_repository_format_gently(".", repo_fmt, nongit_ok))
 		return NULL;
 
 	setenv(GIT_IMPLICIT_WORK_TREE_ENVIRONMENT, "0", 1);
@@ -1757,6 +1754,32 @@ enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
 	return result;
 }
 
+int apply_repository_format(struct repository *repo,
+			    const struct repository_format *format,
+			    struct strbuf *err)
+{
+	if (verify_repository_format(format, err) < 0)
+		return -1;
+
+	repo_set_hash_algo(repo, format->hash_algo);
+	repo_set_compat_hash_algo(repo, format->compat_hash_algo);
+	repo_set_ref_storage_format(repo,
+				    format->ref_storage_format,
+				    format->ref_storage_payload);
+	repo->repository_format_worktree_config =
+		format->worktree_config;
+	repo->repository_format_submodule_path_cfg =
+		format->submodule_path_cfg;
+	repo->repository_format_relative_worktrees =
+		format->relative_worktrees;
+	repo->repository_format_partial_clone =
+		xstrdup_or_null(format->partial_clone);
+	repo->repository_format_precious_objects =
+		format->precious_objects;
+
+	return 0;
+}
+
 /*
  * Check the repository format version in the path found in repo_get_git_dir(repo),
  * and die if it is a version we don't understand. Generally one would
@@ -1765,26 +1788,20 @@ enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
  *
  * If successful and fmt is not NULL, fill fmt with data.
  */
-static void check_repository_format(struct repository *repo, struct repository_format *fmt)
+static void check_and_apply_repository_format(struct repository *repo,
+					      struct repository_format *fmt)
 {
 	struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
+	struct strbuf err = STRBUF_INIT;
+
 	if (!fmt)
 		fmt = &repo_fmt;
-	check_repository_format_gently(repo, repo_get_git_dir(repo), fmt, NULL);
+
+	check_repository_format_gently(repo_get_git_dir(repo), fmt, NULL);
+	if (apply_repository_format(repo, fmt, &err) < 0)
+		die("%s", err.buf);
 	startup_info->have_repository = 1;
-	repo_set_hash_algo(repo, fmt->hash_algo);
-	repo_set_compat_hash_algo(repo, fmt->compat_hash_algo);
-	repo_set_ref_storage_format(repo,
-				    fmt->ref_storage_format,
-				    fmt->ref_storage_payload);
-	repo->repository_format_worktree_config =
-		fmt->worktree_config;
-	repo->repository_format_submodule_path_cfg =
-		fmt->submodule_path_cfg;
-	repo->repository_format_relative_worktrees =
-		fmt->relative_worktrees;
-	repo->repository_format_partial_clone =
-		xstrdup_or_null(fmt->partial_clone);
+
 	clear_repository_format(&repo_fmt);
 }
 
@@ -1862,7 +1879,7 @@ const char *enter_repo(struct repository *repo, const char *path, unsigned flags
 
 	if (is_git_directory(".")) {
 		set_git_dir(repo, ".", 0);
-		check_repository_format(repo, NULL);
+		check_and_apply_repository_format(repo, NULL);
 		return path;
 	}
 
@@ -2020,25 +2037,15 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
 			setup_git_env_internal(repo, gitdir, false);
 		}
+
 		if (startup_info->have_repository) {
-			repo_set_hash_algo(repo, repo_fmt.hash_algo);
-			repo_set_compat_hash_algo(repo,
-						  repo_fmt.compat_hash_algo);
-			repo_set_ref_storage_format(repo,
-						    repo_fmt.ref_storage_format,
-						    repo_fmt.ref_storage_payload);
-			repo->repository_format_worktree_config =
-				repo_fmt.worktree_config;
-			repo->repository_format_relative_worktrees =
-				repo_fmt.relative_worktrees;
-			repo->repository_format_submodule_path_cfg =
-				repo_fmt.submodule_path_cfg;
-			/* take ownership of repo_fmt.partial_clone */
-			repo->repository_format_partial_clone =
-				repo_fmt.partial_clone;
-			repo_fmt.partial_clone = NULL;
-			repo->repository_format_precious_objects =
-				repo_fmt.precious_objects;
+			struct strbuf err = STRBUF_INIT;
+
+			if (apply_repository_format(repo, &repo_fmt, &err) < 0)
+				die("%s", err.buf);
+
+			clear_repository_format(&repo_fmt);
+			strbuf_release(&err);
 		}
 	}
 	/*
@@ -2814,7 +2821,7 @@ int init_db(struct repository *repo,
 	 * config file, so this will not fail.  What we are catching
 	 * is an attempt to reinitialize new repository with an old tool.
 	 */
-	check_repository_format(repo, &repo_fmt);
+	check_and_apply_repository_format(repo, &repo_fmt);
 
 	repository_format_configure(repo, &repo_fmt, hash, ref_storage_format);
 
diff --git a/setup.h b/setup.h
index 9409326fe4..efbb82fdbf 100644
--- a/setup.h
+++ b/setup.h
@@ -221,6 +221,16 @@ void clear_repository_format(struct repository_format *format);
 int verify_repository_format(const struct repository_format *format,
 			     struct strbuf *err);
 
+/*
+ * Apply the given repository format to the repo. This initializes extensions
+ * and basic data structures required for normal operation. Returns 0 on
+ * success, a negative error code when the format is not valid as determined by
+ * `verify_repository_format()`.
+ */
+int apply_repository_format(struct repository *repo,
+			    const struct repository_format *format,
+			    struct strbuf *err);
+
 const char *get_template_dir(const char *option_template);
 
 #define INIT_DB_QUIET      (1 << 0)

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related

* [PATCH v3 2/8] setup: drop `setup_git_env()`
From: Patrick Steinhardt @ 2026-06-04  7:46 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Junio C Hamano, Karthik Nayak
In-Reply-To: <20260604-b4-pks-setup-centralize-odb-creation-v3-0-0691834f318a@pks.im>

The `setup_git_env()` function is a trivial wrapper around
`setup_git_env_internal()` and has a single call site only. Drop the
function.

While at it, drop stale documentation in "environment.h" that points to
this function, even though it hasn't been exposed to callers outside of
"setup.c" since 43ad1047a9 (setup: stop using `the_repository` in
`setup_git_env()`, 2026-03-27) anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 environment.h | 8 +-------
 refs.c        | 3 ++-
 setup.c       | 7 +------
 3 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/environment.h b/environment.h
index 9eb97b3869..ccfcf37bfb 100644
--- a/environment.h
+++ b/environment.h
@@ -130,13 +130,6 @@ void repo_config_values_init(struct repo_config_values *cfg);
  * `the_repository`. We should eventually get rid of these and make the
  * dependency on a repository explicit:
  *
- *   - `setup_git_env()` ideally shouldn't exist as it modifies global state,
- *     namely the environment. The current process shouldn't ever access that
- *     state via envvars though, but should instead consult a `struct
- *     repository`. When spawning new processes, we would ideally also pass a
- *     `struct repository` and then set up the environment variables for the
- *     child process, only.
- *
  *   - `have_git_dir()` should not have to exist at all. Instead, we should
  *     decide on whether or not we have a `struct repository`.
  *
@@ -147,6 +140,7 @@ void repo_config_values_init(struct repo_config_values *cfg);
  * Please do not add new global config variables here.
  */
 # ifdef USE_THE_REPOSITORY_VARIABLE
+
 /*
  * Returns true iff we have a configured git repository (either via
  * setup_git_directory, or in the environment via $GIT_DIR).
diff --git a/refs.c b/refs.c
index 0f3355d2ee..e7070eb743 100644
--- a/refs.c
+++ b/refs.c
@@ -126,7 +126,8 @@ struct ref_namespace_info ref_namespace[] = {
 		 * points to the content of another. Unlike the other
 		 * ref namespaces, this one can be changed by the
 		 * GIT_REPLACE_REF_BASE environment variable. This
-		 * .namespace value will be overwritten in setup_git_env().
+		 * .namespace value will be overwritten during repository
+		 * setup.
 		 */
 		.ref = "refs/replace/",
 		.decoration = DECORATION_GRAFTED,
diff --git a/setup.c b/setup.c
index d723306dfe..252b443117 100644
--- a/setup.c
+++ b/setup.c
@@ -1074,11 +1074,6 @@ static void setup_git_env_internal(struct repository *repo,
 		fetch_if_missing = 0;
 }
 
-static void setup_git_env(struct repository *repo, const char *git_dir)
-{
-	setup_git_env_internal(repo, git_dir, false);
-}
-
 static void set_git_dir_1(struct repository *repo, const char *path, bool skip_initializing_odb)
 {
 	xsetenv(GIT_DIR_ENVIRONMENT, path, 1);
@@ -2023,7 +2018,7 @@ const char *setup_git_directory_gently(struct repository *repo, int *nongit_ok)
 			const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 			if (!gitdir)
 				gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
-			setup_git_env(repo, gitdir);
+			setup_git_env_internal(repo, gitdir, false);
 		}
 		if (startup_info->have_repository) {
 			repo_set_hash_algo(repo, repo_fmt.hash_algo);

-- 
2.54.0.1064.gd145956f57.dirty


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox