public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
* [Question] check_repository_format_gently() is not side-effect-free
@ 2026-03-19 17:45 Tian Yuchen
  2026-03-19 18:07 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Tian Yuchen @ 2026-03-19 17:45 UTC (permalink / raw)
  To: Git

Hi Git community,

In setup.c:

> static const char *setup_explicit_git_dir(const char *gitdirenv,
> 					  struct strbuf *cwd,
> 					  struct repository_format *repo_fmt,
> 					  int *nongit_ok)
> {
> 	const char *work_tree_env = getenv(GIT_WORK_TREE_ENVIRONMENT);
> 	const char *worktree;
> 	char *gitfile;
> 	int offset;
> 
> 	if (PATH_MAX - 40 < strlen(gitdirenv))
> 		die(_("'$%s' too big"), GIT_DIR_ENVIRONMENT);
> 
> 	gitfile = (char*)read_gitfile(gitdirenv);
> 	if (gitfile) {
> 		gitfile = xstrdup(gitfile);
> 		gitdirenv = gitfile;
> 	}
> 
> 	if (!is_git_directory(gitdirenv)) {
> 		if (nongit_ok) {
> 			*nongit_ok = 1;
> 			free(gitfile);
> 			return NULL;
> 		}
> 		die(_("not a git repository: '%s'"), gitdirenv);
> 	}
> 

...

> 	if (check_repository_format_gently(gitdirenv, repo_fmt, nongit_ok)) {
> 		free(gitfile);
> 		return NULL;
> 	}

I've noticed that this function behaves rather strangely. If I am 
correct, The primary purpose of this function is to verify whether the 
Git binary understands the current repository format.

The implementation is as follows:

> static int check_repository_format_gently(const char *gitdir, struct repository_format *candidate, int *nongit_ok)
> {
> 	struct strbuf sb = STRBUF_INIT;
> 	struct strbuf err = STRBUF_INIT;
> 	int has_common;
> 
> 	has_common = get_common_dir(&sb, gitdir);
> 	strbuf_addstr(&sb, "/config");
> 	read_repository_format(candidate, sb.buf);
> 	strbuf_release(&sb);
> 
> 	/*
> 	 * For historical use of check_repository_format() in git-init,
> 	 * we treat a missing config as a silent "ok", even when nongit_ok
> 	 * is unset.
> 	 */
> 	if (candidate->version < 0)
> 		return 0;
> 
> 	if (verify_repository_format(candidate, &err) < 0) {
> 		if (nongit_ok) {
> 			warning("%s", err.buf);
> 			strbuf_release(&err);
> 			*nongit_ok = -1;
> 			return -1;
> 		}
> 		die("%s", err.buf);
> 	}
> 
> 	the_repository->repository_format_precious_objects = candidate->precious_objects;
> 
> 	string_list_clear(&candidate->unknown_extensions, 0);
> 	string_list_clear(&candidate->v1_only_extensions, 0);
> 
> 	if (candidate->worktree_config) {
> 		/*
> 		 * pick up core.bare and core.worktree from per-worktree
> 		 * config if present
> 		 */
> 		strbuf_addf(&sb, "%s/config.worktree", gitdir);
> 		git_config_from_file(read_worktree_config, sb.buf, candidate);
> 		strbuf_release(&sb);
> 		has_common = 0;
> 	}

...

> 	if (!has_common) {
> 		if (candidate->is_bare != -1) {
> 			is_bare_repository_cfg = candidate->is_bare;
> 			if (is_bare_repository_cfg == 1)
> 				inside_work_tree = -1;
> 		}
> 		if (candidate->work_tree) {
> 			free(git_work_tree_cfg);
> 			git_work_tree_cfg = xstrdup(candidate->work_tree);
> 			inside_work_tree = -1;
> 		}
> 	}
> 
> 	return 0;
> }

Obviously it applies the parsed results to the global environment. Since 
this function starts with 'check_' and ends with '_gently,' it's hard 
not to think of it as a side-effect-free diagnostic function.

git grep "^[a-z_]* check_[a-z_]*(" -- "*.c"

Using the command above, we can see that there are other functions in 
the Git source code that start with 'check_' but are not entirely 
side-effect-free. However, I believe that the two variables 
(is_bare_repository_cfg, git_work_tree_cfg) this function modifies are 
much more crucial: For example, if I want to handle both a bare 
repository and a normal repository within a single process, wouldn't 
problems arise soon?

I would love to hear if there are any historical reasons preventing us 
from stripping these global side-effects out of this function.

Regards,

Yuchen

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Question] check_repository_format_gently() is not side-effect-free
  2026-03-19 17:45 [Question] check_repository_format_gently() is not side-effect-free Tian Yuchen
@ 2026-03-19 18:07 ` Junio C Hamano
  2026-03-20  0:24   ` Tian Yuchen
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2026-03-19 18:07 UTC (permalink / raw)
  To: Tian Yuchen; +Cc: Git

Tian Yuchen <cat@malon.dev> writes:

The verb "check" does not imply side-effect-free.  By checking, each
of these functions tries to achieve something, and the way the
result of their work is conveyed back to the caller may not
necessarily be only by their return values.

The adverb "gently" in this codebase typically means "the variant
without gently signals problems by dying.  Instead of dying, return
to the caller with error code, so that the caller can decide to
die".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Question] check_repository_format_gently() is not side-effect-free
  2026-03-19 18:07 ` Junio C Hamano
@ 2026-03-20  0:24   ` Tian Yuchen
  2026-03-20  6:07     ` Patrick Steinhardt
  0 siblings, 1 reply; 6+ messages in thread
From: Tian Yuchen @ 2026-03-20  0:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git

On 3/20/26 02:07, Junio C Hamano wrote:
> The verb "check" does not imply side-effect-free.  By checking, each
> of these functions tries to achieve something, and the way the
> result of their work is conveyed back to the caller may not
> necessarily be only by their return values.
> 
> The adverb "gently" in this codebase typically means "the variant
> without gently signals problems by dying.  Instead of dying, return
> to the caller with error code, so that the caller can decide to
> die".

Ah, I see. I guess I took it too literally. Thank you for clarification!

Setting the semantics aside, the problem remains: I still think the 
setup method here isn't quite right. It creates a bottleneck for 
eventually handling multiple repositories in the same process without 
data races.

Do you think this is worth a patch?

Thanks,

Yuchen

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Question] check_repository_format_gently() is not side-effect-free
  2026-03-20  0:24   ` Tian Yuchen
@ 2026-03-20  6:07     ` Patrick Steinhardt
  2026-03-20 15:56       ` Junio C Hamano
  2026-03-20 16:26       ` Tian Yuchen
  0 siblings, 2 replies; 6+ messages in thread
From: Patrick Steinhardt @ 2026-03-20  6:07 UTC (permalink / raw)
  To: Tian Yuchen; +Cc: Junio C Hamano, Git

On Fri, Mar 20, 2026 at 08:24:09AM +0800, Tian Yuchen wrote:
> On 3/20/26 02:07, Junio C Hamano wrote:
> > The verb "check" does not imply side-effect-free.  By checking, each
> > of these functions tries to achieve something, and the way the
> > result of their work is conveyed back to the caller may not
> > necessarily be only by their return values.
> > 
> > The adverb "gently" in this codebase typically means "the variant
> > without gently signals problems by dying.  Instead of dying, return
> > to the caller with error code, so that the caller can decide to
> > die".
> 
> Ah, I see. I guess I took it too literally. Thank you for clarification!
> 
> Setting the semantics aside, the problem remains: I still think the setup
> method here isn't quite right. It creates a bottleneck for eventually
> handling multiple repositories in the same process without data races.
> 
> Do you think this is worth a patch?

Yes, I think that the whole of "setup.c" is something we will want to
refactor eventually so that it does not modify global state anymore. So
it's not only `check_repository_format_gently()`, but also lots of other
functionality in that file. The motivation is not only being able to set
up multiple repositories, but also making the code overall easier to
understand.

That being said, I'll give a small warning that it's probably
non-trivial to refactor this subystem :)

Thanks!

Patrick

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Question] check_repository_format_gently() is not side-effect-free
  2026-03-20  6:07     ` Patrick Steinhardt
@ 2026-03-20 15:56       ` Junio C Hamano
  2026-03-20 16:26       ` Tian Yuchen
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2026-03-20 15:56 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Tian Yuchen, Git

Patrick Steinhardt <ps@pks.im> writes:

> ... The motivation is not only being able to set
> up multiple repositories, but also making the code overall easier to
> understand.
>
> That being said, I'll give a small warning that it's probably
> non-trivial to refactor this subystem :)

Well said, and I agree 100%; thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Question] check_repository_format_gently() is not side-effect-free
  2026-03-20  6:07     ` Patrick Steinhardt
  2026-03-20 15:56       ` Junio C Hamano
@ 2026-03-20 16:26       ` Tian Yuchen
  1 sibling, 0 replies; 6+ messages in thread
From: Tian Yuchen @ 2026-03-20 16:26 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Junio C Hamano, Git

Hi Patrick,

On 3/20/26 14:07, Patrick Steinhardt wrote:

> Yes, I think that the whole of "setup.c" is something we will want to
> refactor eventually so that it does not modify global state anymore. So
> it's not only `check_repository_format_gently()`, but also lots of other
> functionality in that file. The motivation is not only being able to set
> up multiple repositories, but also making the code overall easier to
> understand.
> 
> That being said, I'll give a small warning that it's probably
> non-trivial to refactor this subystem 🙂

Thanks for the reply!

It’s true — setup.c seems utterly baffling to me. I thought I understood 
it before, but the more closely I look at it, the more I realize there 
are details everywhere that require careful attention.

I won't drop a break-the-world patch out of nowhere. I'll keep learning 
until I'm able to do so ;)

Thanks,

Yuchen

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-03-20 16:26 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-19 17:45 [Question] check_repository_format_gently() is not side-effect-free Tian Yuchen
2026-03-19 18:07 ` Junio C Hamano
2026-03-20  0:24   ` Tian Yuchen
2026-03-20  6:07     ` Patrick Steinhardt
2026-03-20 15:56       ` Junio C Hamano
2026-03-20 16:26       ` Tian Yuchen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox