From: Junio C Hamano <gitster@pobox.com>
To: Tian Yuchen <a3205153416@gmail.com>
Cc: git@vger.kernel.org, sandals@crustytoothpaste.net
Subject: Re: [PATCH v3] setup: fail if .git is not a file or directory
Date: Sun, 15 Feb 2026 00:41:09 -0800 [thread overview]
Message-ID: <xmqqfr72flga.fsf@gitster.g> (raw)
In-Reply-To: <20260214045247.118013-1-a3205153416@gmail.com> (Tian Yuchen's message of "Sat, 14 Feb 2026 12:52:47 +0800")
Tian Yuchen <a3205153416@gmail.com> writes:
> Currently, `setup_git_directory_gently_1()` checks if `.git` is a
> regular file (handling submodules/worktrees) or a directory. If it is
> neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply
> ignores the entity, continuing the discovery process in the parent
> directory.
>
> Failing instead of ignoring here makes it easier for the users to notice
> anomalies and take action, particularly if this non-file non-directory
> entity was created by mistake or by file system corruption.
OK.
> However, strictly enforcing 'lstat()' and 'S_ISREG()' breaks valid
> workflows where '.git' is a symlink pointing to a real git directory
> (e.g. created via 'ln -s').
This is true, but it is more or less irrelevant, isn't it?
The existing check to see .git is a regular file or a directory
already uses stat() and has no such problem. A mistaken use of
lstat() will only become relevant if we were to add an extra lstat()
call on top of what we already have, but that is not what we are
doing here, right?
> To ensure safety and correctness:
>
> 1. Differentiate between "missing file" and "is a directory". This
> removes the long standing NEEDSWORK comment in 'read_gitfile_gently()'.
This is more about telling "not a file" and "is a directory" apart,
isn't it? Missing file (STAT_FAILED) is distinct from NOT_A_FILE
and we check if it is a git directory only when we get the latter,
and ignore if the thing that is not a file is not a git directory.
> 2. Explicitly check 'st_mode' after 'stat()'. If the path resolves to a
> directory, return 'READ_GITFILE_ERR_IS_A_DIR' so the caller can try
> to handle it as a directory.
This lets us tell "not a file" and "is a directory" apart, which is
duplicate of the above #1, isn't it?
> 3. If the path exists but is neither a regular file nor a directory,
> return 'READ_GITFILE_ERR_NOT_A_FILE'.
Again, the same thing. I _think_ what we discussed as the
motivation behind introducing new error classes are:
* We are assuming "not a file" is more or less synonymous to "is a
directory", and ignoring a path that is neither a file nor a git
directory, which is what the NEEDSWORK comment in
setup_git_directory_gently_1() is about.
* We are assuing failure from stat(2) is more or less synonymous to
"nothing there", and blindly go uplevel, which is what the
NEEDSWORK comment in read_gitfile_error_die() is about.
So, we introduce IS_A_DIR (which is *not* an error) and together
with NOT_A_FILE (which was not an error, but now is an error), we
can tell "whoa, there is .git that is not a file or a directory"
reliably, to solve the first NEEDSWORK. We also introduce
STAT_ENOENT (which is *not* an error) and together with STAT_FAILED,
we can tell "ok, there is nothing there, so we go up one level and
try to see if there is .git there" (i.e., happy normal case) and
"whoa, there is .git we cannot tell what it is" case apart, which is
about the second NEEDSWORK.
> I have verified this with a test script covering:
> 1. Normal .git file
> 2. .git as a symlink to a directory
> 3. .git as a FIFO
> 4. .git as a symlink to a FIFO
> 5. .git with garbage content
>
> setup.c | 39 +++++++++++++++++++++++++++++----------
> setup.h | 3 +++
> 2 files changed, 32 insertions(+), 10 deletions(-)
There is no test addition here, though? What am I missing?
> diff --git a/setup.c b/setup.c
> index 3a6a048620..8681a8a9d1 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -911,6 +911,10 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir)
> die(_("no path in gitfile: %s"), path);
> case READ_GITFILE_ERR_NOT_A_REPO:
> die(_("not a git repository: %s"), dir);
> + case READ_GITFILE_ERR_STAT_ENOENT:
> + die(_("Not a git repository: %s"), path);
> + case READ_GITFILE_ERR_IS_A_DIR:
> + die(_("Not a git file (is a directory): %s"), path);
Hmph, isn't this backwards?
We used to treat STAT_FAILED as OK without dying in this function,
because we conflated "there is nothing there, so you should go one
level up and try again" happy case with all other stat(2) failure,
and that is why we introduced STAT_ENOENT here. ENOENT is the
*only* case among what used to be STAT_FAILED that we do *not* want
to die in this function. The same thing with NOT_A_FILE vs
IS_A_DIR. We used to treat the former as OK but the only case we
wanted to treat as OK was IS_A_DIR and all other cases, like FIFO,
we wanted to complain, no?
That is why the caller in read_gitfile_gently() before this patch
used to call this function regardless of what the error_code is.
Because the function above was updated to die in these two happy
cases, the caller (below, hunk at around line 1000) excludes these
two happy error codes, which it should not have to do.
> @@ -939,8 +943,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
> static struct strbuf realpath = STRBUF_INIT;
>
> if (stat(path, &st)) {
> - /* NEEDSWORK: discern between ENOENT vs other errors */
> - error_code = READ_GITFILE_ERR_STAT_FAILED;
> + if (errno == ENOENT)
> + error_code = READ_GITFILE_ERR_STAT_ENOENT;
> + else
> + error_code = READ_GITFILE_ERR_STAT_FAILED;
> + goto cleanup_return;
OK. We used to bundle any stat failure into STAT_FAILED, and
treated it as a single happy case, but by extracting STAT_ENOENT out
of various reasons stat(2) failed, we can single out ENOENT as the
only happy case and treat all other stat(2) failure as errors.
> + }
> + if (S_ISDIR(st.st_mode)) {
> + error_code = READ_GITFILE_ERR_IS_A_DIR;
> goto cleanup_return;
> }
> if (!S_ISREG(st.st_mode)) {
And we used to bundle anything other than S_ISREG() as a single
NOT_A_FILE class, and assumed that would be a directory and treated
it as a non-error. We now explicitly return IS_A_DIR so that the
caller can treat that as the only happy case among other NOT_A_FILE
cases that are suspicious.
> @@ -994,7 +1004,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
> cleanup_return:
> if (return_error_code)
> *return_error_code = error_code;
> - else if (error_code)
> + else if (error_code &&
> + error_code != READ_GITFILE_ERR_STAT_ENOENT &&
> + error_code != READ_GITFILE_ERR_IS_A_DIR)
> read_gitfile_error_die(error_code, path, dir);
This looks seriously wrong, and it is because it is trying to paper
over the misdesign of the updates made to read_gitfile_error_die(),
isn't it?
> @@ -1576,18 +1588,25 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
> if (offset > min_offset)
> strbuf_addch(dir, '/');
> strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT);
> - gitdirenv = read_gitfile_gently(dir->buf, die_on_error ?
> - NULL : &error_code);
> + gitdirenv = read_gitfile_gently(dir->buf, &error_code);
> if (!gitdirenv) {
> - if (die_on_error ||
> - error_code == READ_GITFILE_ERR_NOT_A_FILE) {
> - /* NEEDSWORK: fail if .git is not file nor dir */
> + if (error_code == READ_GITFILE_ERR_STAT_ENOENT ||
> + error_code == READ_GITFILE_ERR_IS_A_DIR) {
This smells seriously wrong.
IS_A_DIR leading to a call to is_git_directory() check makes sense,
but if we got STAT_ENOENT, what would we expect to find there by
making an is_git_directory() call?
> if (is_git_directory(dir->buf)) {
> gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;
> gitdir_path = xstrdup(dir->buf);
> }
> + } else if (error_code == READ_GITFILE_ERR_NOT_A_FILE) {
> + if (die_on_error)
> + die(_("Invalid %s: not a regular file or directory"), dir->buf);
> + else
> + return GIT_DIR_INVALID_GITFILE;
> + } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) {
Do you mean "!=" here, not "=="???
> + if (die_on_error)
> + read_gitfile_error_die(error_code, dir->buf, NULL);
> + else
> + return GIT_DIR_INVALID_GITFILE;
> + }
If we wanted to have an explicit STAT_NOENT check, it would make
sense to add it as the last "else if" arm here, with an empty body,
i.e.
} else if (error_code == READ_GITFILE_STAT_ENOENT) {
; /* nothing there, go up one level */
}
next prev parent reply other threads:[~2026-02-15 8:41 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-11 18:21 [RFC] setup: fail if .git is not a file or directory Tian Yuchen
2026-02-11 19:47 ` Junio C Hamano
2026-02-12 17:33 ` Tian Yuchen
2026-02-12 17:24 ` [PATCH v2] " Tian Yuchen
2026-02-12 20:59 ` Junio C Hamano
2026-02-13 16:37 ` Tian Yuchen
2026-02-14 4:52 ` [PATCH v3] " Tian Yuchen
2026-02-15 8:41 ` Junio C Hamano [this message]
2026-02-15 16:22 ` Tian Yuchen
2026-02-16 2:37 ` Junio C Hamano
2026-02-16 16:02 ` Tian Yuchen
2026-02-17 8:41 ` [PATCH v4] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen
2026-02-17 11:26 ` Karthik Nayak
2026-02-17 15:30 ` Tian Yuchen
2026-02-17 18:56 ` Karthik Nayak
2026-02-17 21:10 ` Junio C Hamano
2026-02-17 17:01 ` Junio C Hamano
2026-02-17 18:50 ` Karthik Nayak
2026-02-18 4:08 ` Tian Yuchen
2026-02-17 17:59 ` Karthik Nayak
2026-02-18 5:18 ` [PATCH v5 0/2] setup.c: v5 reroll Tian Yuchen
2026-02-18 5:18 ` [PATCH v5 1/2] setup: distingush ENOENT from other stat errors Tian Yuchen
2026-02-18 10:12 ` Karthik Nayak
2026-02-18 11:11 ` Tian Yuchen
2026-02-18 18:15 ` Junio C Hamano
2026-02-18 18:43 ` Junio C Hamano
2026-02-18 5:18 ` [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen
2026-02-18 10:27 ` Karthik Nayak
2026-02-18 11:20 ` Tian Yuchen
2026-02-18 18:25 ` Junio C Hamano
2026-02-19 5:11 ` Tian Yuchen
2026-02-15 17:08 ` [PATCH v3] setup: fail if .git is not a file or directory Tian Yuchen
2026-02-12 22:39 ` [RFC] " brian m. carlson
2026-02-12 22:45 ` Junio C Hamano
2026-02-12 23:03 ` brian m. carlson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqfr72flga.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=a3205153416@gmail.com \
--cc=git@vger.kernel.org \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox