* [RFC] setup: fail if .git is not a file or directory
@ 2026-02-11 18:21 Tian Yuchen
2026-02-11 19:47 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: Tian Yuchen @ 2026-02-11 18:21 UTC (permalink / raw)
To: git; +Cc: gitster
Currently, `setup_git_directory_gently_1()` checks if `.git` is a
regular file (handling submodules/worktrees) or a directory. If it is
neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply
ignores the entity, continuing the discovery process in the parent
directory.
This behavior can be very dangerous. If a user is inside a subdirectory
containing a melformed/broken `.git` entity, the Git will traverse up,
attach to a parent repository and might execute destructive commands.
I tried to resolve the NEEDSWORK by using `lstat()` to explicitly check
the entity's mode. If it is neither a regular file nor a directory, we
kill the discovery process.
But I still have questions:
1. Is failing hard the desired behavior here? Should skipping it and
continuing discovery be an option for the user, which might seem
more fault-tolerant?
2. Should we die() immediately here, or return GIT_DIR_INVALID_GITFILE
and let the caller decide?
Signed-off-by: Tian Yuchen <a3205153416@gmail.com>
---
setup.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/setup.c b/setup.c
index 3a6a048620..a1b56de67a 100644
--- a/setup.c
+++ b/setup.c
@@ -1581,7 +1581,17 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
if (!gitdirenv) {
if (die_on_error ||
error_code == READ_GITFILE_ERR_NOT_A_FILE) {
- /* NEEDSWORK: fail if .git is not file nor dir */
+ struct stat st;
+ if (!lstat(dir->buf, &st) &&
+ !S_ISREG(st.st_mode) &&
+ !S_ISDIR(st.st_mode)){
+
+ if (die_on_error)
+ die(_("Invalid %s: not a regular file or directory"), dir->buf);
+ else
+ return GIT_DIR_INVALID_GITFILE;
+ }
+
if (is_git_directory(dir->buf)) {
gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;
gitdir_path = xstrdup(dir->buf);
--
2.43.0
^ permalink raw reply related [flat|nested] 35+ messages in thread* Re: [RFC] setup: fail if .git is not a file or directory 2026-02-11 18:21 [RFC] setup: fail if .git is not a file or directory Tian Yuchen @ 2026-02-11 19:47 ` Junio C Hamano 2026-02-12 17:33 ` Tian Yuchen 2026-02-12 17:24 ` [PATCH v2] " Tian Yuchen 2026-02-12 22:39 ` [RFC] " brian m. carlson 2 siblings, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-11 19:47 UTC (permalink / raw) To: Tian Yuchen; +Cc: git Tian Yuchen <a3205153416@gmail.com> writes: > Currently, `setup_git_directory_gently_1()` checks if `.git` is a > regular file (handling submodules/worktrees) or a directory. If it is > neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply > ignores the entity, continuing the discovery process in the parent > directory. Thanks for noticing the needswork comment. > This behavior can be very dangerous. If a user is inside a subdirectory > containing a melformed/broken `.git` entity, the Git will traverse up, > attach to a parent repository and might execute destructive commands. Is it? If a malicious party can replace your .git file in your submodule with a device node or a fifo, it means they can write into your working tree of your top level superproject, so they have other and better things to use to attack you if they wanted to. I would say 'very dangerous' is an overstatement. If "git add" in a place you thought was in your submodule ends up adding the file to the superproject because of filesystem corruption making ".git" in your submodule to something strange, you'd have larger problem anyway--- would you trust your commits and trees recorded in such a corrupt filesystem? Having said all that. Failing instead of ignoring may make it easier for the end-users to notice anomalies and take corrective action, if this non-file non-directory filesystem entity was created by mistake or by file-system corruption. It would be an unnecessary breakage to those who deliberately named a fifo they use ".git", fully knowing well that it is a reserved name that "git add" and friends happily ignore, but I somehow find it unlikely. > But I still have questions: > 1. Is failing hard the desired behavior here? Should skipping it and > continuing discovery be an option for the user, which might seem > more fault-tolerant? > 2. Should we die() immediately here, or return GIT_DIR_INVALID_GITFILE > and let the caller decide? Determining if it makes sense to do so is 80% of the work needed to resolve a NEEDSWORK comment. The actual implementation is often much simpler than the work needed for it. ;-) > Signed-off-by: Tian Yuchen <a3205153416@gmail.com> > --- > setup.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/setup.c b/setup.c > index 3a6a048620..a1b56de67a 100644 > --- a/setup.c > +++ b/setup.c > @@ -1581,7 +1581,17 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, > if (!gitdirenv) { > if (die_on_error || > error_code == READ_GITFILE_ERR_NOT_A_FILE) { > - /* NEEDSWORK: fail if .git is not file nor dir */ > + struct stat st; > + if (!lstat(dir->buf, &st) && > + !S_ISREG(st.st_mode) && > + !S_ISDIR(st.st_mode)){ > + > + if (die_on_error) > + die(_("Invalid %s: not a regular file or directory"), dir->buf); > + else > + return GIT_DIR_INVALID_GITFILE; > + } I would have expected that the new code would come after the existing "if the thing is a .git directory, it is a happy outcome" code. In this code path, where we found ".git" that is not a file, the most likely reason is because we found a healthy git directory, so it is a sane thing to avoid extra lstat() in that normal codepath. Only after we see that the ".git" we have is neither a "gitdir:" file nor a git directory, we know we are dealing with a rare anomaly that we can afford to waste extra cycles. > if (is_git_directory(dir->buf)) { > gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > gitdir_path = xstrdup(dir->buf); ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] setup: fail if .git is not a file or directory 2026-02-11 19:47 ` Junio C Hamano @ 2026-02-12 17:33 ` Tian Yuchen 0 siblings, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-12 17:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Hi Junio, I've rewrote the code based on your suggestions. If there are further improvements to be made regarding coding style or readability(especially the style of comments), please let me know. Happy valentine's day(though it's tomorrow)! Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v2] setup: fail if .git is not a file or directory 2026-02-11 18:21 [RFC] setup: fail if .git is not a file or directory Tian Yuchen 2026-02-11 19:47 ` Junio C Hamano @ 2026-02-12 17:24 ` Tian Yuchen 2026-02-12 20:59 ` Junio C Hamano 2026-02-14 4:52 ` [PATCH v3] " Tian Yuchen 2026-02-12 22:39 ` [RFC] " brian m. carlson 2 siblings, 2 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-12 17:24 UTC (permalink / raw) To: git; +Cc: gitster Currently, `setup_git_directory_gently_1()` checks if `.git` is a regular file (handling submodules/worktrees) or a directory. If it is neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply ignores the entity, continuing the discovery process in the parent directory. Failing instead of ignoring here makes it easier for the users to notice anomalies and take action, particularly if this non-file non-directory entity was created by mistake or by file system corruption. Resolve the NEEDSWORK by using `lstat()` to check the entity's mode. To avoid performance penalties on the happy path, this check is only performed after `is_git_directory()` fails. Signed-off-by: Tian Yuchen <a3205153416@gmail.com> --- Change in v2: - Moved the `lstat` check into an `else` block after `is_git_directory()` to avoid any extra system call overhead on the normal happy path, as suggested by Junio. setup.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/setup.c b/setup.c index 3a6a048620..95658f2547 100644 --- a/setup.c +++ b/setup.c @@ -1580,11 +1580,23 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, NULL : &error_code); if (!gitdirenv) { if (die_on_error || - error_code == READ_GITFILE_ERR_NOT_A_FILE) { - /* NEEDSWORK: fail if .git is not file nor dir */ - if (is_git_directory(dir->buf)) { + error_code == READ_GITFILE_ERR_NOT_A_FILE) { + if (is_git_directory(dir->buf)){ gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; gitdir_path = xstrdup(dir->buf); + } else { + /* + * It is neither a gitfile nor a git directory. + */ + struct stat st; + if (!lstat(dir->buf, &st) && + !S_ISREG(st.st_mode) && + !S_ISDIR(st.st_mode)) { + if (die_on_error) + die(_("Invalid %s: not a regular file or directory"), dir->buf); + else + return GIT_DIR_INVALID_GITFILE; + } } } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) return GIT_DIR_INVALID_GITFILE; -- 2.43.0 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v2] setup: fail if .git is not a file or directory 2026-02-12 17:24 ` [PATCH v2] " Tian Yuchen @ 2026-02-12 20:59 ` Junio C Hamano 2026-02-13 16:37 ` Tian Yuchen 2026-02-14 4:52 ` [PATCH v3] " Tian Yuchen 1 sibling, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-12 20:59 UTC (permalink / raw) To: Tian Yuchen; +Cc: git Tian Yuchen <a3205153416@gmail.com> writes: > if (!gitdirenv) { > if (die_on_error || > - error_code == READ_GITFILE_ERR_NOT_A_FILE) { > + error_code == READ_GITFILE_ERR_NOT_A_FILE) { Why this indentation change? > - /* NEEDSWORK: fail if .git is not file nor dir */ > - if (is_git_directory(dir->buf)) { > + if (is_git_directory(dir->buf)){ Why this change (which is style violation that lack necessary SP between "){")? > gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > gitdir_path = xstrdup(dir->buf); > + } else { > + /* > + * It is neither a gitfile nor a git directory. > + */ > + struct stat st; > + if (!lstat(dir->buf, &st) && > + !S_ISREG(st.st_mode) && > + !S_ISDIR(st.st_mode)) { > + if (die_on_error) > + die(_("Invalid %s: not a regular file or directory"), dir->buf); > + else > + return GIT_DIR_INVALID_GITFILE; > + } > } Stepping back a bit, even though the NEEDSWORK comment is placed here, I am not sure if this is the bast place to do much of the necessary work. When gitdirenv is NULL and die_on_error is set, we would have died on any error other than STAT_FAILED (most often, this is because there wasn't .git at the path in the first place) or NOT_A_FILE (again, a happy case is that .git existed and it was a directory). We would have already died in read_gitfile_error_die() in all other cases. But these two error cases are not necessarily entirely happy and that is what the NEEDSWORK comment is about. So, if we wanted to tighten the error checking to help users diagnose problems in their filesystem, I wonder if it is a better approach to refine the set of READ_GITFILE_ERR_* error codes: - STAT_ENOENT (new) is returned when stat failed and we got ENOENT, and it is not a fatal error. - STAT_FAILED becomes a fatal error in read_gitfile_error_die(). - IS_A_DIR (new) is returned when stat succeeded and it is a directory. It is not a fatal error. - NOT_A_FILE becomes a fatal error in read_gitfile_error_die(). Existing callers of the two functions, read_gitfile_gently() and read_gitfile_error_die(), must be audited and adjusted appropriately, but once it is done, it would become much simpler, wouldn't it? Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v2] setup: fail if .git is not a file or directory 2026-02-12 20:59 ` Junio C Hamano @ 2026-02-13 16:37 ` Tian Yuchen 0 siblings, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-13 16:37 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On 2/13/26 04:59, Junio C Hamano wrote: > Why this indentation change? > >> - /* NEEDSWORK: fail if .git is not file nor dir */ >> - if (is_git_directory(dir->buf)) { >> + if (is_git_directory(dir->buf)){ > > Why this change (which is style violation that lack necessary SP > between "){")? Sorry for the typo. I will fix it in the next patch :( > Stepping back a bit, even though the NEEDSWORK comment is placed > here, I am not sure if this is the bast place to do much of the > necessary work. > > When gitdirenv is NULL and die_on_error is set, we would have died > on any error other than STAT_FAILED (most often, this is because > there wasn't .git at the path in the first place) or NOT_A_FILE > (again, a happy case is that .git existed and it was a directory). > We would have already died in read_gitfile_error_die() in all other > cases. But these two error cases are not necessarily entirely happy > and that is what the NEEDSWORK comment is about. > > So, if we wanted to tighten the error checking to help users > diagnose problems in their filesystem, I wonder if it is a better > approach to refine the set of READ_GITFILE_ERR_* error codes: > > - STAT_ENOENT (new) is returned when stat failed and we got ENOENT, > and it is not a fatal error. > > - STAT_FAILED becomes a fatal error in read_gitfile_error_die(). > > - IS_A_DIR (new) is returned when stat succeeded and it is a > directory. It is not a fatal error. > > - NOT_A_FILE becomes a fatal error in read_gitfile_error_die(). This approach makes sense to me. I will work on v3 patch that: 1. Refines 'read_gitfile_error' with 'STAT_ENOENT' and 'IS_A_DIR'. 2. Makes 'STAT_FAILED' and 'NOT_A_FILE' fatal by default in 'read_gitfile_error_die()' 3. Adjusts existing callers to handle these new codes. This will be a larger refactor, so it might take a bit of time to ensure I don't break other call sites. > Existing callers of the two functions, read_gitfile_gently() and > read_gitfile_error_die(), must be audited and adjusted > appropriately, but once it is done, it would become much simpler, > wouldn't it? Yes indeed. Thanks for guiding me toward a cleaner architecture. Buon san valentino, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v3] setup: fail if .git is not a file or directory 2026-02-12 17:24 ` [PATCH v2] " Tian Yuchen 2026-02-12 20:59 ` Junio C Hamano @ 2026-02-14 4:52 ` Tian Yuchen 2026-02-15 8:41 ` Junio C Hamano 1 sibling, 1 reply; 35+ messages in thread From: Tian Yuchen @ 2026-02-14 4:52 UTC (permalink / raw) To: git; +Cc: gitster, sandals Currently, `setup_git_directory_gently_1()` checks if `.git` is a regular file (handling submodules/worktrees) or a directory. If it is neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply ignores the entity, continuing the discovery process in the parent directory. Failing instead of ignoring here makes it easier for the users to notice anomalies and take action, particularly if this non-file non-directory entity was created by mistake or by file system corruption. However, strictly enforcing 'lstat()' and 'S_ISREG()' breaks valid workflows where '.git' is a symlink pointing to a real git directory (e.g. created via 'ln -s'). To ensure safety and correctness: 1. Differentiate between "missing file" and "is a directory". This removes the long standing NEEDSWORK comment in 'read_gitfile_gently()'. 2. Explicitly check 'st_mode' after 'stat()'. If the path resolves to a directory, return 'READ_GITFILE_ERR_IS_A_DIR' so the caller can try to handle it as a directory. 3. If the path exists but is neither a regular file nor a directory, return 'READ_GITFILE_ERR_NOT_A_FILE'. Update 'setup_git_directory_gently_1()' to aborts setup immeditaely upon encountering a malicous '.git' file. Signed-off-by: Tian Yuchen <a3205153416@gmail.com> --- I have verified this with a test script covering: 1. Normal .git file 2. .git as a symlink to a directory 3. .git as a FIFO 4. .git as a symlink to a FIFO 5. .git with garbage content setup.c | 39 +++++++++++++++++++++++++++++---------- setup.h | 3 +++ 2 files changed, 32 insertions(+), 10 deletions(-) diff --git a/setup.c b/setup.c index 3a6a048620..8681a8a9d1 100644 --- a/setup.c +++ b/setup.c @@ -911,6 +911,10 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir) die(_("no path in gitfile: %s"), path); case READ_GITFILE_ERR_NOT_A_REPO: die(_("not a git repository: %s"), dir); + case READ_GITFILE_ERR_STAT_ENOENT: + die(_("Not a git repository: %s"), path); + case READ_GITFILE_ERR_IS_A_DIR: + die(_("Not a git file (is a directory): %s"), path); default: BUG("unknown error code"); } @@ -939,8 +943,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) static struct strbuf realpath = STRBUF_INIT; if (stat(path, &st)) { - /* NEEDSWORK: discern between ENOENT vs other errors */ - error_code = READ_GITFILE_ERR_STAT_FAILED; + if (errno == ENOENT) + error_code = READ_GITFILE_ERR_STAT_ENOENT; + else + error_code = READ_GITFILE_ERR_STAT_FAILED; + goto cleanup_return; + } + if (S_ISDIR(st.st_mode)) { + error_code = READ_GITFILE_ERR_IS_A_DIR; goto cleanup_return; } if (!S_ISREG(st.st_mode)) { @@ -994,7 +1004,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) cleanup_return: if (return_error_code) *return_error_code = error_code; - else if (error_code) + else if (error_code && + error_code != READ_GITFILE_ERR_STAT_ENOENT && + error_code != READ_GITFILE_ERR_IS_A_DIR) read_gitfile_error_die(error_code, path, dir); free(buf); @@ -1576,18 +1588,25 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, if (offset > min_offset) strbuf_addch(dir, '/'); strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); - gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? - NULL : &error_code); + gitdirenv = read_gitfile_gently(dir->buf, &error_code); if (!gitdirenv) { - if (die_on_error || - error_code == READ_GITFILE_ERR_NOT_A_FILE) { - /* NEEDSWORK: fail if .git is not file nor dir */ + if (error_code == READ_GITFILE_ERR_STAT_ENOENT || + error_code == READ_GITFILE_ERR_IS_A_DIR) { if (is_git_directory(dir->buf)) { gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; gitdir_path = xstrdup(dir->buf); } - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) - return GIT_DIR_INVALID_GITFILE; + } else if (error_code == READ_GITFILE_ERR_NOT_A_FILE) { + if (die_on_error) + die(_("Invalid %s: not a regular file or directory"), dir->buf); + else + return GIT_DIR_INVALID_GITFILE; + } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) { + if (die_on_error) + read_gitfile_error_die(error_code, dir->buf, NULL); + else + return GIT_DIR_INVALID_GITFILE; + } } else gitfile = xstrdup(dir->buf); /* diff --git a/setup.h b/setup.h index d55dcc6608..0271cc8f93 100644 --- a/setup.h +++ b/setup.h @@ -36,6 +36,9 @@ int is_nonbare_repository_dir(struct strbuf *path); #define READ_GITFILE_ERR_NO_PATH 6 #define READ_GITFILE_ERR_NOT_A_REPO 7 #define READ_GITFILE_ERR_TOO_LARGE 8 +#define READ_GITFILE_ERR_STAT_ENOENT 9 +#define READ_GITFILE_ERR_IS_A_DIR 10 + void read_gitfile_error_die(int error_code, const char *path, const char *dir); const char *read_gitfile_gently(const char *path, int *return_error_code); #define read_gitfile(path) read_gitfile_gently((path), NULL) -- 2.43.0 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v3] setup: fail if .git is not a file or directory 2026-02-14 4:52 ` [PATCH v3] " Tian Yuchen @ 2026-02-15 8:41 ` Junio C Hamano 2026-02-15 16:22 ` Tian Yuchen 2026-02-15 17:08 ` [PATCH v3] setup: fail if .git is not a file or directory Tian Yuchen 0 siblings, 2 replies; 35+ messages in thread From: Junio C Hamano @ 2026-02-15 8:41 UTC (permalink / raw) To: Tian Yuchen; +Cc: git, sandals Tian Yuchen <a3205153416@gmail.com> writes: > Currently, `setup_git_directory_gently_1()` checks if `.git` is a > regular file (handling submodules/worktrees) or a directory. If it is > neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply > ignores the entity, continuing the discovery process in the parent > directory. > > Failing instead of ignoring here makes it easier for the users to notice > anomalies and take action, particularly if this non-file non-directory > entity was created by mistake or by file system corruption. OK. > However, strictly enforcing 'lstat()' and 'S_ISREG()' breaks valid > workflows where '.git' is a symlink pointing to a real git directory > (e.g. created via 'ln -s'). This is true, but it is more or less irrelevant, isn't it? The existing check to see .git is a regular file or a directory already uses stat() and has no such problem. A mistaken use of lstat() will only become relevant if we were to add an extra lstat() call on top of what we already have, but that is not what we are doing here, right? > To ensure safety and correctness: > > 1. Differentiate between "missing file" and "is a directory". This > removes the long standing NEEDSWORK comment in 'read_gitfile_gently()'. This is more about telling "not a file" and "is a directory" apart, isn't it? Missing file (STAT_FAILED) is distinct from NOT_A_FILE and we check if it is a git directory only when we get the latter, and ignore if the thing that is not a file is not a git directory. > 2. Explicitly check 'st_mode' after 'stat()'. If the path resolves to a > directory, return 'READ_GITFILE_ERR_IS_A_DIR' so the caller can try > to handle it as a directory. This lets us tell "not a file" and "is a directory" apart, which is duplicate of the above #1, isn't it? > 3. If the path exists but is neither a regular file nor a directory, > return 'READ_GITFILE_ERR_NOT_A_FILE'. Again, the same thing. I _think_ what we discussed as the motivation behind introducing new error classes are: * We are assuming "not a file" is more or less synonymous to "is a directory", and ignoring a path that is neither a file nor a git directory, which is what the NEEDSWORK comment in setup_git_directory_gently_1() is about. * We are assuing failure from stat(2) is more or less synonymous to "nothing there", and blindly go uplevel, which is what the NEEDSWORK comment in read_gitfile_error_die() is about. So, we introduce IS_A_DIR (which is *not* an error) and together with NOT_A_FILE (which was not an error, but now is an error), we can tell "whoa, there is .git that is not a file or a directory" reliably, to solve the first NEEDSWORK. We also introduce STAT_ENOENT (which is *not* an error) and together with STAT_FAILED, we can tell "ok, there is nothing there, so we go up one level and try to see if there is .git there" (i.e., happy normal case) and "whoa, there is .git we cannot tell what it is" case apart, which is about the second NEEDSWORK. > I have verified this with a test script covering: > 1. Normal .git file > 2. .git as a symlink to a directory > 3. .git as a FIFO > 4. .git as a symlink to a FIFO > 5. .git with garbage content > > setup.c | 39 +++++++++++++++++++++++++++++---------- > setup.h | 3 +++ > 2 files changed, 32 insertions(+), 10 deletions(-) There is no test addition here, though? What am I missing? > diff --git a/setup.c b/setup.c > index 3a6a048620..8681a8a9d1 100644 > --- a/setup.c > +++ b/setup.c > @@ -911,6 +911,10 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir) > die(_("no path in gitfile: %s"), path); > case READ_GITFILE_ERR_NOT_A_REPO: > die(_("not a git repository: %s"), dir); > + case READ_GITFILE_ERR_STAT_ENOENT: > + die(_("Not a git repository: %s"), path); > + case READ_GITFILE_ERR_IS_A_DIR: > + die(_("Not a git file (is a directory): %s"), path); Hmph, isn't this backwards? We used to treat STAT_FAILED as OK without dying in this function, because we conflated "there is nothing there, so you should go one level up and try again" happy case with all other stat(2) failure, and that is why we introduced STAT_ENOENT here. ENOENT is the *only* case among what used to be STAT_FAILED that we do *not* want to die in this function. The same thing with NOT_A_FILE vs IS_A_DIR. We used to treat the former as OK but the only case we wanted to treat as OK was IS_A_DIR and all other cases, like FIFO, we wanted to complain, no? That is why the caller in read_gitfile_gently() before this patch used to call this function regardless of what the error_code is. Because the function above was updated to die in these two happy cases, the caller (below, hunk at around line 1000) excludes these two happy error codes, which it should not have to do. > @@ -939,8 +943,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) > static struct strbuf realpath = STRBUF_INIT; > > if (stat(path, &st)) { > - /* NEEDSWORK: discern between ENOENT vs other errors */ > - error_code = READ_GITFILE_ERR_STAT_FAILED; > + if (errno == ENOENT) > + error_code = READ_GITFILE_ERR_STAT_ENOENT; > + else > + error_code = READ_GITFILE_ERR_STAT_FAILED; > + goto cleanup_return; OK. We used to bundle any stat failure into STAT_FAILED, and treated it as a single happy case, but by extracting STAT_ENOENT out of various reasons stat(2) failed, we can single out ENOENT as the only happy case and treat all other stat(2) failure as errors. > + } > + if (S_ISDIR(st.st_mode)) { > + error_code = READ_GITFILE_ERR_IS_A_DIR; > goto cleanup_return; > } > if (!S_ISREG(st.st_mode)) { And we used to bundle anything other than S_ISREG() as a single NOT_A_FILE class, and assumed that would be a directory and treated it as a non-error. We now explicitly return IS_A_DIR so that the caller can treat that as the only happy case among other NOT_A_FILE cases that are suspicious. > @@ -994,7 +1004,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) > cleanup_return: > if (return_error_code) > *return_error_code = error_code; > - else if (error_code) > + else if (error_code && > + error_code != READ_GITFILE_ERR_STAT_ENOENT && > + error_code != READ_GITFILE_ERR_IS_A_DIR) > read_gitfile_error_die(error_code, path, dir); This looks seriously wrong, and it is because it is trying to paper over the misdesign of the updates made to read_gitfile_error_die(), isn't it? > @@ -1576,18 +1588,25 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, > if (offset > min_offset) > strbuf_addch(dir, '/'); > strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); > - gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? > - NULL : &error_code); > + gitdirenv = read_gitfile_gently(dir->buf, &error_code); > if (!gitdirenv) { > - if (die_on_error || > - error_code == READ_GITFILE_ERR_NOT_A_FILE) { > - /* NEEDSWORK: fail if .git is not file nor dir */ > + if (error_code == READ_GITFILE_ERR_STAT_ENOENT || > + error_code == READ_GITFILE_ERR_IS_A_DIR) { This smells seriously wrong. IS_A_DIR leading to a call to is_git_directory() check makes sense, but if we got STAT_ENOENT, what would we expect to find there by making an is_git_directory() call? > if (is_git_directory(dir->buf)) { > gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > gitdir_path = xstrdup(dir->buf); > } > + } else if (error_code == READ_GITFILE_ERR_NOT_A_FILE) { > + if (die_on_error) > + die(_("Invalid %s: not a regular file or directory"), dir->buf); > + else > + return GIT_DIR_INVALID_GITFILE; > + } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) { Do you mean "!=" here, not "=="??? > + if (die_on_error) > + read_gitfile_error_die(error_code, dir->buf, NULL); > + else > + return GIT_DIR_INVALID_GITFILE; > + } If we wanted to have an explicit STAT_NOENT check, it would make sense to add it as the last "else if" arm here, with an empty body, i.e. } else if (error_code == READ_GITFILE_STAT_ENOENT) { ; /* nothing there, go up one level */ } ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3] setup: fail if .git is not a file or directory 2026-02-15 8:41 ` Junio C Hamano @ 2026-02-15 16:22 ` Tian Yuchen 2026-02-16 2:37 ` Junio C Hamano 2026-02-15 17:08 ` [PATCH v3] setup: fail if .git is not a file or directory Tian Yuchen 1 sibling, 1 reply; 35+ messages in thread From: Tian Yuchen @ 2026-02-15 16:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, sandals Hi Junio, I admit the code logic is indeed full of flaws - the vast majority of your suggestions make more sense than the current code. I do lack experience handling this kind of call chain, but I'll keep learning #^# Given the extensive changes required, I won't address each point you raised individually. Instead, I'll rewrite the code directly based on your proposed solution. Please reserve your review for the next patch. >> I have verified this with a test script covering: >> 1. Normal .git file >> 2. .git as a symlink to a directory >> 3. .git as a FIFO >> 4. .git as a symlink to a FIFO >> 5. .git with garbage content >> >> setup.c | 39 +++++++++++++++++++++++++++++---------- >> setup.h | 3 +++ >> 2 files changed, 32 insertions(+), 10 deletions(-) >There is no test addition here, though? What am I missing? Sorry, I didn't express myself clearly. I meant I tested it myself but never add a test script. Test script will be included in the next patch. On the other hand, if I understand correctly, state flows should be categorized as follows: 1. Nothing there (ENOENT) ---> ignore and go up one level 2. Directory (IS_A_DIR) ---> check is_git_directory 3. NOT_A_FILE ---> die 4. *REAL* error (READ_FAILED, INVALID_FORMAT) ---> die And I mixed 1 and 2 and covered 4 in an obscure way (!= STAT_FAILED). I don't think this code is "unrunable" but indeed the logic flow is GARBAGE. I'll fix it. Thank you for your advice, Regards, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3] setup: fail if .git is not a file or directory 2026-02-15 16:22 ` Tian Yuchen @ 2026-02-16 2:37 ` Junio C Hamano 2026-02-16 16:02 ` Tian Yuchen 0 siblings, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-16 2:37 UTC (permalink / raw) To: Tian Yuchen; +Cc: git, sandals Tian Yuchen <a3205153416@gmail.com> writes: > Sorry, I didn't express myself clearly. I meant I tested it myself but > never add a test script. Test script will be included in the next patch. I see. Thanks. > On the other hand, if I understand correctly, state flows should be > categorized as follows: > > 1. Nothing there (ENOENT) ---> ignore and go up one level > 2. Directory (IS_A_DIR) ---> check is_git_directory > 3. NOT_A_FILE ---> die > 4. *REAL* error (READ_FAILED, INVALID_FORMAT) ---> die > > And I mixed 1 and 2 and covered 4 in an obscure way (!= STAT_FAILED). I > don't think this code is "unrunable" but indeed the logic flow is > GARBAGE. I'll fix it. The above 4-bullet list makes sense to me. It makes me wonder what the current code does and more importantly what we want to do when we find a directory and is_git_directory() says that it is *not* a valid one. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3] setup: fail if .git is not a file or directory 2026-02-16 2:37 ` Junio C Hamano @ 2026-02-16 16:02 ` Tian Yuchen 2026-02-17 8:41 ` [PATCH v4] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 0 siblings, 1 reply; 35+ messages in thread From: Tian Yuchen @ 2026-02-16 16:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On 2/16/26 10:37, Junio C Hamano wrote: > Tian Yuchen <a3205153416@gmail.com> writes: > >> Sorry, I didn't express myself clearly. I meant I tested it myself but >> never add a test script. Test script will be included in the next patch. > > I see. Thanks. > I am currently working on it. However, this is my first time writing test scripts, so I can' guarantee they'll be perfect right away. Please feel free to leave your suggestions when the time comes. >> On the other hand, if I understand correctly, state flows should be >> categorized as follows: >> >> 1. Nothing there (ENOENT) ---> ignore and go up one level >> 2. Directory (IS_A_DIR) ---> check is_git_directory >> 3. NOT_A_FILE ---> die >> 4. *REAL* error (READ_FAILED, INVALID_FORMAT) ---> die >> >> And I mixed 1 and 2 and covered 4 in an obscure way (!= STAT_FAILED). I >> don't think this code is "unrunable" but indeed the logic flow is >> GARBAGE. I'll fix it. > > The above 4-bullet list makes sense to me. It makes me wonder what > the current code does and more importantly what we want to do when > we find a directory and is_git_directory() says that it is *not* a > valid one. > > Thanks. > If we find a directory but 'is_git_directory()' reports it as invalid, I believe we should treat it the same way as 'ENOENT', isn't it? This ensures that a random empty directory named '.git' doesn't stop the discovery process. So, the logic for case 2 becomes: - Found a directory; - Check 'is_git_directory'; - a. Valid? -----> Stop, we found it; - b. Invalid? -----> Ignore, continue the loop. Regards, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-16 16:02 ` Tian Yuchen @ 2026-02-17 8:41 ` Tian Yuchen 2026-02-17 11:26 ` Karthik Nayak ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-17 8:41 UTC (permalink / raw) To: git; +Cc: gitster Strictly enforcing 'lstat()' and 'S_ISREG()' on '.git' prevents valid workflows where '.git' is a symbolic link pointing to a real git directory (e.g. created via 'ln -s'). Refactor 'setup_git_directory_gently_1()' to use 'stat()' instead of 'lstat()'. This allows the filesystem to automatically resolve symbolic links. To ensure safety and correctness, the logic flow is updated to: 1. Ignore 'ENOENT' (file missing). 2. Check 'IS_A_DIR' cases via 'is_git_directory()'. 3. Explicitly reject 'NOT_A_FILE' cases (FIFOs or sockets). Add a new test script t/t0009-setup-security.sh which verifies: - Valid .git symlinks to real directories are accepted. - .git as a named pipe (FIFO) is rejected. - .git as a symlink to a named pipe is rejected. - .git with garbage content is rejected. - Empty .git directories are ignored. Signed-off-by: Tian Yuchen <a3205153416@gmail.com> --- setup.c | 39 ++++++++++++++------- setup.h | 2 ++ t/t0009-setup-security.sh | 72 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 101 insertions(+), 12 deletions(-) create mode 100755 t/t0009-setup-security.sh diff --git a/setup.c b/setup.c index 3a6a048620..269aa9faaa 100644 --- a/setup.c +++ b/setup.c @@ -939,8 +939,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) static struct strbuf realpath = STRBUF_INIT; if (stat(path, &st)) { - /* NEEDSWORK: discern between ENOENT vs other errors */ - error_code = READ_GITFILE_ERR_STAT_FAILED; + if (errno == ENOENT) + error_code = READ_GITFILE_ERR_STAT_ENOENT; + else + error_code = READ_GITFILE_ERR_STAT_FAILED; + goto cleanup_return; + } + if (S_ISDIR(st.st_mode)) { + error_code = READ_GITFILE_ERR_IS_A_DIR; goto cleanup_return; } if (!S_ISREG(st.st_mode)) { @@ -994,7 +1000,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) cleanup_return: if (return_error_code) *return_error_code = error_code; - else if (error_code) + else if (error_code && + error_code != READ_GITFILE_ERR_STAT_ENOENT && + error_code != READ_GITFILE_ERR_IS_A_DIR) read_gitfile_error_die(error_code, path, dir); free(buf); @@ -1576,20 +1584,27 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, if (offset > min_offset) strbuf_addch(dir, '/'); strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); - gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? - NULL : &error_code); + gitdirenv = read_gitfile_gently(dir->buf, &error_code); if (!gitdirenv) { - if (die_on_error || - error_code == READ_GITFILE_ERR_NOT_A_FILE) { - /* NEEDSWORK: fail if .git is not file nor dir */ + if (error_code == READ_GITFILE_ERR_STAT_ENOENT) { + ; + } else if (error_code == READ_GITFILE_ERR_IS_A_DIR) { if (is_git_directory(dir->buf)) { gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; gitdir_path = xstrdup(dir->buf); } - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) - return GIT_DIR_INVALID_GITFILE; - } else - gitfile = xstrdup(dir->buf); + } else if (error_code == READ_GITFILE_ERR_NOT_A_FILE) { + if (die_on_error) + die(_("Invalid %s: not a regular file or directory"), dir->buf); + else + return GIT_DIR_INVALID_GITFILE; + } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) { + if (die_on_error) + read_gitfile_error_die(error_code, dir->buf, NULL); + else + return GIT_DIR_INVALID_GITFILE; + } + } /* * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT * to check that directory for a repository. diff --git a/setup.h b/setup.h index d55dcc6608..c23629cb4f 100644 --- a/setup.h +++ b/setup.h @@ -36,6 +36,8 @@ int is_nonbare_repository_dir(struct strbuf *path); #define READ_GITFILE_ERR_NO_PATH 6 #define READ_GITFILE_ERR_NOT_A_REPO 7 #define READ_GITFILE_ERR_TOO_LARGE 8 +#define READ_GITFILE_ERR_STAT_ENOENT 9 +#define READ_GITFILE_ERR_IS_A_DIR 10 void read_gitfile_error_die(int error_code, const char *path, const char *dir); const char *read_gitfile_gently(const char *path, int *return_error_code); #define read_gitfile(path) read_gitfile_gently((path), NULL) diff --git a/t/t0009-setup-security.sh b/t/t0009-setup-security.sh new file mode 100755 index 0000000000..72c5232147 --- /dev/null +++ b/t/t0009-setup-security.sh @@ -0,0 +1,72 @@ +#!/bin/sh + +test_description='setup: validation of .git file/directory types + +Verify that setup_git_directory() correctly handles: +1. Valid .git directories (including symlinks to them). +2. Invalid .git files (FIFOs, sockets) by erroring out. +3. Invalid .git files (garbage) by erroring out. +' + +. ./test-lib.sh + +test_expect_success 'setup: create parent git repository' ' + git init parent && + test_commit -C parent "root-commit" +' + +test_expect_success SYMLINKS 'setup: .git as a symlink to a directory is valid' ' + mkdir -p parent/link-to-dir && + ( + cd parent/link-to-dir && + git init real-repo && + ln -s real-repo/.git .git && + git rev-parse --git-dir >actual && + echo .git >expect && + test_cmp expect actual + ) +' + +test_expect_success PIPE 'setup: .git as a FIFO (named pipe) is rejected' ' + mkdir -p parent/fifo && + ( + cd parent/fifo && + mkfifo .git && + test_must_fail git rev-parse --git-dir 2>stderr && + grep "not a regular file" stderr + ) +' + +test_expect_success SYMLINKS,PIPE 'setup: .git as a symlink to a FIFO is rejected' ' + mkdir -p parent/symlink-fifo && + ( + cd parent/symlink-fifo && + mkfifo target-fifo && + ln -s target-fifo .git && + test_must_fail git rev-parse --git-dir 2>stderr && + grep "not a regular file" stderr + ) +' + +test_expect_success 'setup: .git with garbage content is rejected' ' + mkdir -p parent/garbage && + ( + cd parent/garbage && + echo "garbage" >.git && + test_must_fail git rev-parse --git-dir 2>stderr && + grep "invalid gitfile format" stderr + ) +' + +test_expect_success 'setup: .git as an empty directory is ignored' ' + mkdir -p parent/empty-dir && + ( + cd parent/empty-dir && + mkdir .git && + git rev-parse --git-dir >actual && + echo "$TRASH_DIRECTORY/parent/.git" >expect && + test_cmp expect actual + ) +' + +test_done -- 2.43.0 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 8:41 ` [PATCH v4] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen @ 2026-02-17 11:26 ` Karthik Nayak 2026-02-17 15:30 ` Tian Yuchen 2026-02-17 17:01 ` Junio C Hamano 2026-02-17 17:59 ` Karthik Nayak 2026-02-18 5:18 ` [PATCH v5 0/2] setup.c: v5 reroll Tian Yuchen 2 siblings, 2 replies; 35+ messages in thread From: Karthik Nayak @ 2026-02-17 11:26 UTC (permalink / raw) To: Tian Yuchen, git; +Cc: gitster [-- Attachment #1: Type: text/plain, Size: 4834 bytes --] Tian Yuchen <a3205153416@gmail.com> writes: > Strictly enforcing 'lstat()' and 'S_ISREG()' on '.git' prevents valid > workflows where '.git' is a symbolic link pointing to a real git > directory (e.g. created via 'ln -s'). > > Refactor 'setup_git_directory_gently_1()' to use 'stat()' instead of > 'lstat()'. This allows the filesystem to automatically resolve symbolic > links. > > To ensure safety and correctness, the logic flow is updated to: > > 1. Ignore 'ENOENT' (file missing). > 2. Check 'IS_A_DIR' cases via 'is_git_directory()'. > 3. Explicitly reject 'NOT_A_FILE' cases (FIFOs or sockets). > Small nit, it would have been a bit nicer to separate these out into individual commits with tests added per commit. > Add a new test script t/t0009-setup-security.sh which verifies: > Wouldn't something like 't0009-git-dir-validation.sh' be a better name? > - Valid .git symlinks to real directories are accepted. > - .git as a named pipe (FIFO) is rejected. > - .git as a symlink to a named pipe is rejected. > - .git with garbage content is rejected. > - Empty .git directories are ignored. > > Signed-off-by: Tian Yuchen <a3205153416@gmail.com> > --- > setup.c | 39 ++++++++++++++------- > setup.h | 2 ++ > t/t0009-setup-security.sh | 72 +++++++++++++++++++++++++++++++++++++++ > 3 files changed, 101 insertions(+), 12 deletions(-) > create mode 100755 t/t0009-setup-security.sh > > diff --git a/setup.c b/setup.c > index 3a6a048620..269aa9faaa 100644 > --- a/setup.c > +++ b/setup.c > @@ -939,8 +939,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) > static struct strbuf realpath = STRBUF_INIT; > > if (stat(path, &st)) { > - /* NEEDSWORK: discern between ENOENT vs other errors */ > - error_code = READ_GITFILE_ERR_STAT_FAILED; > + if (errno == ENOENT) > + error_code = READ_GITFILE_ERR_STAT_ENOENT; > + else > + error_code = READ_GITFILE_ERR_STAT_FAILED; > + goto cleanup_return; So this differentiates between 'stat()' failing and the path not existing. Ok. > + } > + if (S_ISDIR(st.st_mode)) { > + error_code = READ_GITFILE_ERR_IS_A_DIR; > goto cleanup_return; > } Okay so if the '.git' file is a directory, we set the appropriate error. So this error code is new and we introduce it in this patch. > if (!S_ISREG(st.st_mode)) { > @@ -994,7 +1000,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) > cleanup_return: > if (return_error_code) > *return_error_code = error_code; > - else if (error_code) > + else if (error_code && > + error_code != READ_GITFILE_ERR_STAT_ENOENT && > + error_code != READ_GITFILE_ERR_IS_A_DIR) > read_gitfile_error_die(error_code, path, dir); > I understand the exclusion here (they are non-fatal flows), but wouldn't it more make sense to add these two exclusions within `read_gitfile_error_die()` which already has two such exclusions? By separating this out, it gets really confusing. > free(buf); > @@ -1576,20 +1584,27 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, > if (offset > min_offset) > strbuf_addch(dir, '/'); > strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); > - gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? > - NULL : &error_code); > + gitdirenv = read_gitfile_gently(dir->buf, &error_code); Okay so we unconditionally read the error into errorcode, quick question that comes to mind: Wouldn't this break the previous flow for when `die_on_error = 1`? Where `read_gitfile_error_die()` would've been called? > if (!gitdirenv) { > - if (die_on_error || > - error_code == READ_GITFILE_ERR_NOT_A_FILE) { > - /* NEEDSWORK: fail if .git is not file nor dir */ > + if (error_code == READ_GITFILE_ERR_STAT_ENOENT) { > + ; > + } else if (error_code == READ_GITFILE_ERR_IS_A_DIR) { > if (is_git_directory(dir->buf)) { > gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > gitdir_path = xstrdup(dir->buf); > } > - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) > - return GIT_DIR_INVALID_GITFILE; > - } else > - gitfile = xstrdup(dir->buf); > + } else if (error_code == READ_GITFILE_ERR_NOT_A_FILE) { > + if (die_on_error) > + die(_("Invalid %s: not a regular file or directory"), dir->buf); > + else > + return GIT_DIR_INVALID_GITFILE; > + } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) { > + if (die_on_error) > + read_gitfile_error_die(error_code, dir->buf, NULL); Well seems a bit convoluted as we explicitly skip this in our call to `read_gitfile_gently()` to then call it ourselves. > + else > + return GIT_DIR_INVALID_GITFILE; > + } > + } > /* > * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT > * to check that directory for a repository. [snip] [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 11:26 ` Karthik Nayak @ 2026-02-17 15:30 ` Tian Yuchen 2026-02-17 18:56 ` Karthik Nayak 2026-02-17 17:01 ` Junio C Hamano 1 sibling, 1 reply; 35+ messages in thread From: Tian Yuchen @ 2026-02-17 15:30 UTC (permalink / raw) To: Karthik Nayak, git; +Cc: gitster Hi Karthik, Thanks for the review! > Small nit, it would have been a bit nicer to separate these out into > individual commits with tests added per commit. Since this is a security fix involving logic changes, I kept the tests and code together to ensure the commit is self-contained. I hope keeping them together is acceptable here! > Wouldn't something like 't0009-git-dir-validation.sh' be a better name? Indeed a much better name. Will rename it in the next reroll. > I understand the exclusion here (they are non-fatal flows), but wouldn't > it more make sense to add these two exclusions within > `read_gitfile_error_die()` which already has two such exclusions? By > separating this out, it gets really confusing. I actually implemented exactly that in previous patches (handling these exclusions inside 'read_gitfile_error_die'), but Junio pointed out that: >> diff --git a/setup.c b/setup.c >> index 3a6a048620..8681a8a9d1 100644 >> --- a/setup.c >> +++ b/setup.c >> @@ -911,6 +911,10 @@ void read_gitfile_error_die(int error_code, const char *path, const char *dir) >> die(_("no path in gitfile: %s"), path); >> case READ_GITFILE_ERR_NOT_A_REPO: >> die(_("not a git repository: %s"), dir); >> + case READ_GITFILE_ERR_STAT_ENOENT: >> + die(_("Not a git repository: %s"), path); >> + case READ_GITFILE_ERR_IS_A_DIR: >> + die(_("Not a git file (is a directory): %s"), path); > > Hmph, isn't this backwards? > > We used to treat STAT_FAILED as OK without dying in this function, > because we conflated "there is nothing there, so you should go one > level up and try again" happy case with all other stat(2) failure, > and that is why we introduced STAT_ENOENT here. ENOENT is the > *only* case among what used to be STAT_FAILED that we do *not* want > to die in this function. The same thing with NOT_A_FILE vs > IS_A_DIR. We used to treat the former as OK but the only case we > wanted to treat as OK was IS_A_DIR and all other cases, like FIFO, > we wanted to complain, no? In other word, ENOENT and IS_A_DIR cases are *VALID* states during the discovery process, not *ERRORS* that need to be suppressed in a "die" function. Therefore, we moved the decision-making logic up to the caller. This allows 'setup_git_directory_gently_1' to decide: ENOENT -> Continue search IS_A_DIR -> Check dir NOT_A_FILE -> Die Other -> Call 'read_gitfile_error_die()' *REAL ERROR* > Okay so we unconditionally read the error into errorcode, quick question > that comes to mind: Wouldn't this break the previous flow for when > `die_on_error = 1`? Where `read_gitfile_error_die()` would've been > called? It does change the flow, but intentionally, by passing &error_code (making it non-NULL), we prevent 'read_gitfile_gently' from automatically dying. We must do it because if it encounters a "garbage file", we now want to capture that error code and verify it in the caller. But more importantly, if it encounters ENOENT (which is now a distinct error code), we definitely do not want it to die, nor do we want to treat it as a fatal error. It does look a bit verbose, but it makes the state transitions explicit in 'setup_git_directory_gently_1'. I believe we are on the right track! Thanks again for the feedback. Regards, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 15:30 ` Tian Yuchen @ 2026-02-17 18:56 ` Karthik Nayak 2026-02-17 21:10 ` Junio C Hamano 0 siblings, 1 reply; 35+ messages in thread From: Karthik Nayak @ 2026-02-17 18:56 UTC (permalink / raw) To: Tian Yuchen, git; +Cc: gitster [-- Attachment #1: Type: text/plain, Size: 4618 bytes --] Tian Yuchen <a3205153416@gmail.com> writes: > Hi Karthik, > > Thanks for the review! > >> Small nit, it would have been a bit nicer to separate these out into >> individual commits with tests added per commit. > > Since this is a security fix involving logic changes, I kept the tests > and code together to ensure the commit is self-contained. I hope keeping > them together is acceptable here! > My indication wasn't separate the tests out into an individual commit. But rather to highlight that this one commit is doing multiple things and it would be nice to split it out and each commit could tackle an individual problem with tests included. >> Wouldn't something like 't0009-git-dir-validation.sh' be a better name? > > Indeed a much better name. Will rename it in the next reroll. > >> I understand the exclusion here (they are non-fatal flows), but wouldn't >> it more make sense to add these two exclusions within >> `read_gitfile_error_die()` which already has two such exclusions? By >> separating this out, it gets really confusing. > > I actually implemented exactly that in previous patches (handling these > exclusions inside 'read_gitfile_error_die'), but Junio pointed out that: > > >> diff --git a/setup.c b/setup.c > >> index 3a6a048620..8681a8a9d1 100644 > >> --- a/setup.c > >> +++ b/setup.c > >> @@ -911,6 +911,10 @@ void read_gitfile_error_die(int error_code, > const char *path, const char *dir) > >> die(_("no path in gitfile: %s"), path); > >> case READ_GITFILE_ERR_NOT_A_REPO: > >> die(_("not a git repository: %s"), dir); > >> + case READ_GITFILE_ERR_STAT_ENOENT: > >> + die(_("Not a git repository: %s"), path); > >> + case READ_GITFILE_ERR_IS_A_DIR: > >> + die(_("Not a git file (is a directory): %s"), path); > > > > Hmph, isn't this backwards? > > > > We used to treat STAT_FAILED as OK without dying in this function, > > because we conflated "there is nothing there, so you should go one > > level up and try again" happy case with all other stat(2) failure, > > and that is why we introduced STAT_ENOENT here. ENOENT is the > > *only* case among what used to be STAT_FAILED that we do *not* want > > to die in this function. The same thing with NOT_A_FILE vs > > IS_A_DIR. We used to treat the former as OK but the only case we > > wanted to treat as OK was IS_A_DIR and all other cases, like FIFO, > > we wanted to complain, no? > > In other word, ENOENT and IS_A_DIR cases are *VALID* states during the > discovery process, not *ERRORS* that need to be suppressed in a "die" > function. Therefore, we moved the decision-making logic up to the > caller. This allows 'setup_git_directory_gently_1' to decide: > > ENOENT -> Continue search > IS_A_DIR -> Check dir > NOT_A_FILE -> Die > Other -> Call 'read_gitfile_error_die()' *REAL ERROR* > My understanding was Junio was suggesting that what you're doing is the inverse of what is expected. In short (on top of your patch), something like: diff --git a/setup.c b/setup.c index c3dd6a4197..7edf921564 100644 --- a/setup.c +++ b/setup.c @@ -898,10 +898,14 @@ int verify_repository_format(const struct repository_format *format, void read_gitfile_error_die(int error_code, const char *path, const char *dir) { switch (error_code) { - case READ_GITFILE_ERR_STAT_FAILED: - case READ_GITFILE_ERR_NOT_A_FILE: + case READ_GITFILE_ERR_STAT_ENOENT: + case READ_GITFILE_ERR_IS_A_DIR: /* non-fatal; follow return path */ break; + case READ_GITFILE_ERR_STAT_FAILED: + die(_("stat failed to run correctly")) + case READ_GITFILE_ERR_NOT_A_FILE: + die(_("unsupported file type")) case READ_GITFILE_ERR_OPEN_FAILED: die_errno(_("error opening '%s'"), path); case READ_GITFILE_ERR_TOO_LARGE: >> Okay so we unconditionally read the error into errorcode, quick question >> that comes to mind: Wouldn't this break the previous flow for when >> `die_on_error = 1`? Where `read_gitfile_error_die()` would've been >> called? > > It does change the flow, but intentionally, by passing &error_code > (making it non-NULL), we prevent 'read_gitfile_gently' from > automatically dying. > > We must do it because if it encounters a "garbage file", we now want to > capture that error code and verify it in the caller. But more > importantly, if it encounters ENOENT (which is now a distinct error > code), we definitely do not want it to die, nor do we want to treat it > as a fatal error. > > It does look a bit verbose, but it makes the state transitions explicit > in 'setup_git_directory_gently_1'. I believe we are on the right track! > > Thanks again for the feedback. > > Regards, > > Yuchen [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 18:56 ` Karthik Nayak @ 2026-02-17 21:10 ` Junio C Hamano 0 siblings, 0 replies; 35+ messages in thread From: Junio C Hamano @ 2026-02-17 21:10 UTC (permalink / raw) To: Karthik Nayak; +Cc: Tian Yuchen, git Karthik Nayak <karthik.188@gmail.com> writes: > My indication wasn't separate the tests out into an individual commit. > But rather to highlight that this one commit is doing multiple things > and it would be nice to split it out and each commit could tackle an > individual problem with tests included. Like treating "all stat failures silently ignored, instead of treating ENOENT specifically" and "a non-directory non-file filesystem entity being silently ignored" two separate issues, so at least two patches, and possibly if we need a preliminary clean-up to make these two changes, yet another one? And each patch focuses on one thing it addresses, with its own test? That makes sense. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 11:26 ` Karthik Nayak 2026-02-17 15:30 ` Tian Yuchen @ 2026-02-17 17:01 ` Junio C Hamano 2026-02-17 18:50 ` Karthik Nayak 1 sibling, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-17 17:01 UTC (permalink / raw) To: Karthik Nayak; +Cc: Tian Yuchen, git Karthik Nayak <karthik.188@gmail.com> writes: >> @@ -994,7 +1000,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) >> cleanup_return: >> if (return_error_code) >> *return_error_code = error_code; >> - else if (error_code) >> + else if (error_code && >> + error_code != READ_GITFILE_ERR_STAT_ENOENT && >> + error_code != READ_GITFILE_ERR_IS_A_DIR) >> read_gitfile_error_die(error_code, path, dir); >> > > I understand the exclusion here (they are non-fatal flows), but wouldn't > it more make sense to add these two exclusions within > `read_gitfile_error_die()` which already has two such exclusions? By > separating this out, it gets really confusing. Absolutely. The point of this change, IIUC, is that these two existing exclusions were too broad. stat() can fail for many reasons, but because we did not differenciate ENOENT (which we *are* happy to see and do not want to consider an error) from all other error cases (which we may have been better off if we diagnosed them as error), we pretended both ENOENT and all other stat() failures were happy case and "case ERR_STAT_FAILED:" covered both. To fix this, the patch splits stat() failures into two, ERR_STAT_ENOENT is the happy case we should have been returning without dying from read_gitfile_error_die(), and ERR_STAT_FAILED is the rest that we should have been dying there but in order to return from there without dying when we got ENOENT, we were not dying there. Now we have a separate ERR_STAT_ENOENT, read_gitfile_error_die() can (and should) die when we see ERR_STAT_FAILED, and it can (and should) return to us when we see ERR_STAT_ENOENT as a happy case. The story is exactly the same between ERR_NOT_A_FILE (which had been non-error only because we wanted to treat a directory as OK, but we can make it an error) and ERR_IS_A DIR (which is new, and is an OK case). The above exception on the caller's side you quoted is a complete opposite from that line of reasoning, and that is why it is confusing. If there are other callers of read_gitfile_error_die() and different semantics, such a "now we die on every possible errors" may also be a valid position to take, *but* then it does not make sense unless this patch makes read_gitfile_error_die() to die on ERR_STAT_FAILED and ERR_NOT_A_FILE. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 17:01 ` Junio C Hamano @ 2026-02-17 18:50 ` Karthik Nayak 2026-02-18 4:08 ` Tian Yuchen 0 siblings, 1 reply; 35+ messages in thread From: Karthik Nayak @ 2026-02-17 18:50 UTC (permalink / raw) To: Junio C Hamano; +Cc: Tian Yuchen, git [-- Attachment #1: Type: text/plain, Size: 2842 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Karthik Nayak <karthik.188@gmail.com> writes: > >>> @@ -994,7 +1000,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) >>> cleanup_return: >>> if (return_error_code) >>> *return_error_code = error_code; >>> - else if (error_code) >>> + else if (error_code && >>> + error_code != READ_GITFILE_ERR_STAT_ENOENT && >>> + error_code != READ_GITFILE_ERR_IS_A_DIR) >>> read_gitfile_error_die(error_code, path, dir); >>> >> >> I understand the exclusion here (they are non-fatal flows), but wouldn't >> it more make sense to add these two exclusions within >> `read_gitfile_error_die()` which already has two such exclusions? By >> separating this out, it gets really confusing. > > Absolutely. The point of this change, IIUC, is that these two > existing exclusions were too broad. stat() can fail for many > reasons, but because we did not differenciate ENOENT (which we *are* > happy to see and do not want to consider an error) from all other > error cases (which we may have been better off if we diagnosed them > as error), we pretended both ENOENT and all other stat() failures > were happy case and "case ERR_STAT_FAILED:" covered both. Yeah, so this is the situation before the patch, and I'm in agreement. > To fix > this, the patch splits stat() failures into two, ERR_STAT_ENOENT is > the happy case we should have been returning without dying from > read_gitfile_error_die(), and ERR_STAT_FAILED is the rest that we > should have been dying there but in order to return from there > without dying when we got ENOENT, we were not dying there. Now we > have a separate ERR_STAT_ENOENT, read_gitfile_error_die() can (and > should) die when we see ERR_STAT_FAILED, and it can (and should) > return to us when we see ERR_STAT_ENOENT as a happy case. Okay, this is what I was expecting too, historically `read_gitfile_error_die()` treated ERR_STAT_FAILED as the non-fatal path which made sense. But now that we have ERR_STAT_ENOENT. It should treat the latter as the non-fatal path and the former as an actual issue. > The story > is exactly the same between ERR_NOT_A_FILE (which had been non-error > only because we wanted to treat a directory as OK, but we can make > it an error) and ERR_IS_A DIR (which is new, and is an OK case). > Yup makes sense. > The above exception on the caller's side you quoted is a complete > opposite from that line of reasoning, and that is why it is > confusing. > > If there are other callers of read_gitfile_error_die() and different > semantics, such a "now we die on every possible errors" may also be > a valid position to take, *but* then it does not make sense unless > this patch makes read_gitfile_error_die() to die on ERR_STAT_FAILED > and ERR_NOT_A_FILE. Exactly! Thanks for clearing it out. Karthik [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 18:50 ` Karthik Nayak @ 2026-02-18 4:08 ` Tian Yuchen 0 siblings, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-18 4:08 UTC (permalink / raw) To: Karthik Nayak, Junio C Hamano; +Cc: git On 2/18/26 02:50, Karthik Nayak wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> Karthik Nayak <karthik.188@gmail.com> writes: >> >>>> @@ -994,7 +1000,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) >>>> cleanup_return: >>>> if (return_error_code) >>>> *return_error_code = error_code; >>>> - else if (error_code) >>>> + else if (error_code && >>>> + error_code != READ_GITFILE_ERR_STAT_ENOENT && >>>> + error_code != READ_GITFILE_ERR_IS_A_DIR) >>>> read_gitfile_error_die(error_code, path, dir); >>>> >>> >>> I understand the exclusion here (they are non-fatal flows), but wouldn't >>> it more make sense to add these two exclusions within >>> `read_gitfile_error_die()` which already has two such exclusions? By >>> separating this out, it gets really confusing. >> >> Absolutely. The point of this change, IIUC, is that these two >> existing exclusions were too broad. stat() can fail for many >> reasons, but because we did not differenciate ENOENT (which we *are* >> happy to see and do not want to consider an error) from all other >> error cases (which we may have been better off if we diagnosed them >> as error), we pretended both ENOENT and all other stat() failures >> were happy case and "case ERR_STAT_FAILED:" covered both. > > Yeah, so this is the situation before the patch, and I'm in agreement. > >> To fix >> this, the patch splits stat() failures into two, ERR_STAT_ENOENT is >> the happy case we should have been returning without dying from >> read_gitfile_error_die(), and ERR_STAT_FAILED is the rest that we >> should have been dying there but in order to return from there >> without dying when we got ENOENT, we were not dying there. Now we >> have a separate ERR_STAT_ENOENT, read_gitfile_error_die() can (and >> should) die when we see ERR_STAT_FAILED, and it can (and should) >> return to us when we see ERR_STAT_ENOENT as a happy case. > > Okay, this is what I was expecting too, historically > `read_gitfile_error_die()` treated ERR_STAT_FAILED as the non-fatal path > which made sense. But now that we have ERR_STAT_ENOENT. It should treat > the latter as the non-fatal path and the former as an actual issue. > >> The story >> is exactly the same between ERR_NOT_A_FILE (which had been non-error >> only because we wanted to treat a directory as OK, but we can make >> it an error) and ERR_IS_A DIR (which is new, and is an OK case). >> > > Yup makes sense. > >> The above exception on the caller's side you quoted is a complete >> opposite from that line of reasoning, and that is why it is >> confusing. >> >> If there are other callers of read_gitfile_error_die() and different >> semantics, such a "now we die on every possible errors" may also be >> a valid position to take, *but* then it does not make sense unless >> this patch makes read_gitfile_error_die() to die on ERR_STAT_FAILED >> and ERR_NOT_A_FILE. > > Exactly! Thanks for clearing it out. > > Karthik I don't think leaving it up yo the caller to determine "which are error cases" is as confusing as you suggest. But honestly, your approach is indeed much clearer and more readable. I'll split this into two commits and send them shortly. Regards, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] setup: allow cwd/.git to be a symlink to a directory 2026-02-17 8:41 ` [PATCH v4] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 2026-02-17 11:26 ` Karthik Nayak @ 2026-02-17 17:59 ` Karthik Nayak 2026-02-18 5:18 ` [PATCH v5 0/2] setup.c: v5 reroll Tian Yuchen 2 siblings, 0 replies; 35+ messages in thread From: Karthik Nayak @ 2026-02-17 17:59 UTC (permalink / raw) To: Tian Yuchen, git; +Cc: gitster [-- Attachment #1: Type: text/plain, Size: 1458 bytes --] Tian Yuchen <a3205153416@gmail.com> writes: > Strictly enforcing 'lstat()' and 'S_ISREG()' on '.git' prevents valid > workflows where '.git' is a symbolic link pointing to a real git > directory (e.g. created via 'ln -s'). > > Refactor 'setup_git_directory_gently_1()' to use 'stat()' instead of > 'lstat()'. This allows the filesystem to automatically resolve symbolic > links. > > To ensure safety and correctness, the logic flow is updated to: > > 1. Ignore 'ENOENT' (file missing). > 2. Check 'IS_A_DIR' cases via 'is_git_directory()'. > 3. Explicitly reject 'NOT_A_FILE' cases (FIFOs or sockets). > > Add a new test script t/t0009-setup-security.sh which verifies: > > - Valid .git symlinks to real directories are accepted. > - .git as a named pipe (FIFO) is rejected. > - .git as a symlink to a named pipe is rejected. > - .git with garbage content is rejected. > - Empty .git directories are ignored. > > Signed-off-by: Tian Yuchen <a3205153416@gmail.com> > --- > setup.c | 39 ++++++++++++++------- > setup.h | 2 ++ > t/t0009-setup-security.sh | 72 +++++++++++++++++++++++++++++++++++++++ > 3 files changed, 101 insertions(+), 12 deletions(-) > create mode 100755 t/t0009-setup-security.sh > I also missed this in my review, but the test needs to be added to 'meson.build', without which meson would fail. This is caught by our CI too, if you run the CI on GitLab/GitHub you should see the issue. Karthik [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v5 0/2] setup.c: v5 reroll 2026-02-17 8:41 ` [PATCH v4] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 2026-02-17 11:26 ` Karthik Nayak 2026-02-17 17:59 ` Karthik Nayak @ 2026-02-18 5:18 ` Tian Yuchen 2026-02-18 5:18 ` [PATCH v5 1/2] setup: distingush ENOENT from other stat errors Tian Yuchen 2026-02-18 5:18 ` [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 2 siblings, 2 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-18 5:18 UTC (permalink / raw) To: git; +Cc: gitster, karthik.188 Hi Karthik, Junio, Here is the v5 patch. Bases on the feedback, I have split the changed into two separate patches to isolate the refactoring from the logic change. Patch 1/2: Refactors read_gitfile_* functions to correctly distinguish ENOENT from other stat() failures. Patch 2/2: Switches lstat() to stat() to support symlinks, while adding checks for benign and fatal cases. Also includes the new test script t0009-git-dir-validation.sh (name has been changed as suggested by Karthik). Changes since v4: 1. Split into 2 commits. 2. read_gitfile_error_die() now handles happy/fatal cases internally. 3. Renamed the test script. 4. Updates t/meson.build. If there are any further issues, please feel free to bring them to my attention. Thanks for guiding me towards this cleaner structure! Regards, Yuchen --- Tian Yuchen (2): setup: distingush ENOENT from other stat errors setup: allow cwd/.git to be a symlink to a directory setup.c | 35 ++++++++++------- setup.h | 2 + t/meson.build | 1 + t/t0009-git-dir-validation.sh | 72 +++++++++++++++++++++++++++++++++++ 4 files changed, 96 insertions(+), 14 deletions(-) create mode 100755 t/t0009-git-dir-validation.sh -- 2.43.0 ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v5 1/2] setup: distingush ENOENT from other stat errors 2026-02-18 5:18 ` [PATCH v5 0/2] setup.c: v5 reroll Tian Yuchen @ 2026-02-18 5:18 ` Tian Yuchen 2026-02-18 10:12 ` Karthik Nayak 2026-02-18 18:15 ` Junio C Hamano 2026-02-18 5:18 ` [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 1 sibling, 2 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-18 5:18 UTC (permalink / raw) To: git; +Cc: gitster, karthik.188 Currently, 'read_gitfile_gently()' treats all 'stat()' failures as generic errors. This prevents distinguishing between a missing file and real errors like permission denied (fatal). Introduce 'READ_GITFILE_ERR_STAT_ENOENT' and 'READ_GITFILE_ERR_IS_A_DIR'. Updata 'read_gitfile_error_die()' to handle these cases: 1. Return for happy cases ('ENOENT', 'IS_A_DIR'); 2. Die for fatal cases ('STAT_FAILED', 'NOT_A_FILE'); This prepares for error handling in the next commit. Signed-off-by: Tian Yuchen <a3205153416@gmail.com> --- setup.c | 17 +++++++++++++---- setup.h | 2 ++ 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/setup.c b/setup.c index c8336eb20e..0ca129623e 100644 --- a/setup.c +++ b/setup.c @@ -897,10 +897,13 @@ int verify_repository_format(const struct repository_format *format, void read_gitfile_error_die(int error_code, const char *path, const char *dir) { switch (error_code) { + case READ_GITFILE_ERR_STAT_ENOENT: + case READ_GITFILE_ERR_IS_A_DIR: + return; case READ_GITFILE_ERR_STAT_FAILED: + die(_("error reading %s"), path); case READ_GITFILE_ERR_NOT_A_FILE: - /* non-fatal; follow return path */ - break; + die(_("invalid %s: not a regular file"), path); case READ_GITFILE_ERR_OPEN_FAILED: die_errno(_("error opening '%s'"), path); case READ_GITFILE_ERR_TOO_LARGE: @@ -941,8 +944,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) static struct strbuf realpath = STRBUF_INIT; if (stat(path, &st)) { - /* NEEDSWORK: discern between ENOENT vs other errors */ - error_code = READ_GITFILE_ERR_STAT_FAILED; + if (errno == ENOENT) + error_code = READ_GITFILE_ERR_STAT_ENOENT; + else + error_code = READ_GITFILE_ERR_STAT_FAILED; + goto cleanup_return; + } + if (S_ISDIR(st.st_mode)) { + error_code = READ_GITFILE_ERR_IS_A_DIR; goto cleanup_return; } if (!S_ISREG(st.st_mode)) { diff --git a/setup.h b/setup.h index 0738dec244..ed4b13f061 100644 --- a/setup.h +++ b/setup.h @@ -36,6 +36,8 @@ int is_nonbare_repository_dir(struct strbuf *path); #define READ_GITFILE_ERR_NO_PATH 6 #define READ_GITFILE_ERR_NOT_A_REPO 7 #define READ_GITFILE_ERR_TOO_LARGE 8 +#define READ_GITFILE_ERR_STAT_ENOENT 9 +#define READ_GITFILE_ERR_IS_A_DIR 10 void read_gitfile_error_die(int error_code, const char *path, const char *dir); const char *read_gitfile_gently(const char *path, int *return_error_code); #define read_gitfile(path) read_gitfile_gently((path), NULL) -- 2.43.0 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v5 1/2] setup: distingush ENOENT from other stat errors 2026-02-18 5:18 ` [PATCH v5 1/2] setup: distingush ENOENT from other stat errors Tian Yuchen @ 2026-02-18 10:12 ` Karthik Nayak 2026-02-18 11:11 ` Tian Yuchen 2026-02-18 18:15 ` Junio C Hamano 1 sibling, 1 reply; 35+ messages in thread From: Karthik Nayak @ 2026-02-18 10:12 UTC (permalink / raw) To: Tian Yuchen, git; +Cc: gitster [-- Attachment #1: Type: text/plain, Size: 3627 bytes --] Tian Yuchen <a3205153416@gmail.com> writes: > Currently, 'read_gitfile_gently()' treats all 'stat()' failures as > generic errors. This prevents distinguishing between a missing file and > real errors like permission denied (fatal). Correct. > Introduce 'READ_GITFILE_ERR_STAT_ENOENT' and 'READ_GITFILE_ERR_IS_A_DIR'. Okay, to clarify we're making two distinct changes: 1. Split the 'stat()' error ERR_STAT_FAILED into ERR_STAT_FAILED and ERR_STAT_ENOENT. This is what your previous paragraph described. It would be nice if you could also explain this. 2. We introduce ERR_IS_A_DIR for when the .git path is a directory. We don't talk about this at all. > Updata 'read_gitfile_error_die()' to handle these cases: > > 1. Return for happy cases ('ENOENT', 'IS_A_DIR'); > 2. Die for fatal cases ('STAT_FAILED', 'NOT_A_FILE'); > Nit: It would be nice to explain on the decision behind picking errors for the happy vs fatal path. > This prepares for error handling in the next commit. > > Signed-off-by: Tian Yuchen <a3205153416@gmail.com> > --- > setup.c | 17 +++++++++++++---- > setup.h | 2 ++ > 2 files changed, 15 insertions(+), 4 deletions(-) > > diff --git a/setup.c b/setup.c > index c8336eb20e..0ca129623e 100644 > --- a/setup.c > +++ b/setup.c > @@ -897,10 +897,13 @@ int verify_repository_format(const struct repository_format *format, > void read_gitfile_error_die(int error_code, const char *path, const char *dir) > { > switch (error_code) { > + case READ_GITFILE_ERR_STAT_ENOENT: > + case READ_GITFILE_ERR_IS_A_DIR: > + return; Nit: For this function it should be okay to return early. But I was expecting a break here, since it was using 'break' before, ideally we shouldn't change it unless there is a reason to. > case READ_GITFILE_ERR_STAT_FAILED: > + die(_("error reading %s"), path); > case READ_GITFILE_ERR_NOT_A_FILE: > - /* non-fatal; follow return path */ > - break; > + die(_("invalid %s: not a regular file"), path); Would it make more sense to do 'not a regular file: %s'? > case READ_GITFILE_ERR_OPEN_FAILED: > die_errno(_("error opening '%s'"), path); > case READ_GITFILE_ERR_TOO_LARGE: > @@ -941,8 +944,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) > static struct strbuf realpath = STRBUF_INIT; > > if (stat(path, &st)) { > - /* NEEDSWORK: discern between ENOENT vs other errors */ > - error_code = READ_GITFILE_ERR_STAT_FAILED; > + if (errno == ENOENT) > + error_code = READ_GITFILE_ERR_STAT_ENOENT; > + else > + error_code = READ_GITFILE_ERR_STAT_FAILED; > + goto cleanup_return; > + } > + if (S_ISDIR(st.st_mode)) { > + error_code = READ_GITFILE_ERR_IS_A_DIR; > goto cleanup_return; > } > if (!S_ISREG(st.st_mode)) { > diff --git a/setup.h b/setup.h > index 0738dec244..ed4b13f061 100644 > --- a/setup.h > +++ b/setup.h > @@ -36,6 +36,8 @@ int is_nonbare_repository_dir(struct strbuf *path); > #define READ_GITFILE_ERR_NO_PATH 6 > #define READ_GITFILE_ERR_NOT_A_REPO 7 > #define READ_GITFILE_ERR_TOO_LARGE 8 > +#define READ_GITFILE_ERR_STAT_ENOENT 9 > +#define READ_GITFILE_ERR_IS_A_DIR 10 > void read_gitfile_error_die(int error_code, const char *path, const char *dir); > const char *read_gitfile_gently(const char *path, int *return_error_code); > #define read_gitfile(path) read_gitfile_gently((path), NULL) > -- > 2.43.0 So why didn't we add the tests here for the changes made? Nit: I would even go further to even separate this into two commits: 1. Split 'stat()' error into ERR_STAT_FAILED and ERR_STAT_ENOENT. 2. Introduce 'READ_GITFILE_ERR_IS_A_DIR'. But I'll leave that to you. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 1/2] setup: distingush ENOENT from other stat errors 2026-02-18 10:12 ` Karthik Nayak @ 2026-02-18 11:11 ` Tian Yuchen 0 siblings, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-18 11:11 UTC (permalink / raw) To: Karthik Nayak, git; +Cc: gitster Hi Karthik, Thanks for the review! > Nit: For this function it should be okay to return early. But I was > expecting a break here, since it was using 'break' before, ideally we > shouldn't change it unless there is a reason to. > Would it make more sense to do 'not a regular file: %s'? Points taken. > So why didn't we add the tests here for the changes made? I think the reason is that this commit is a refactoring of the internel error handling. The actual logic change that triggers these new paths happens in the next commit (2/2). Without the logic change in the next patch, the system behaves identically to before, doesn't it? > Nit: I would even go further to even separate this into two commits: > 1. Split 'stat()' error into ERR_STAT_FAILED and ERR_STAT_ENOENT. > 2. Introduce 'READ_GITFILE_ERR_IS_A_DIR'. > > But I'll leave that to you. I appreciate the suggestion. But since the changes are relatively small and closely related, I'll stick to keeping them in this single patch to avoid excessive fragmentation. Regards, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 1/2] setup: distingush ENOENT from other stat errors 2026-02-18 5:18 ` [PATCH v5 1/2] setup: distingush ENOENT from other stat errors Tian Yuchen 2026-02-18 10:12 ` Karthik Nayak @ 2026-02-18 18:15 ` Junio C Hamano 2026-02-18 18:43 ` Junio C Hamano 1 sibling, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-18 18:15 UTC (permalink / raw) To: Tian Yuchen; +Cc: git, karthik.188 Tian Yuchen <a3205153416@gmail.com> writes: > Currently, 'read_gitfile_gently()' treats all 'stat()' failures as > generic errors. This prevents distinguishing between a missing file and > real errors like permission denied (fatal). The above plan makes sense---you would split stat() error into two different classes, start returning ERR_STAT_NOENT in addition to ERR_STAT_FAILED, have the caller act on the new ERR_STAT_NOENT and adjust the way it acts on ERR_STAT_FAILED, and if possible add tests to make sure we react to failures from stat in an appropriate way (but how? --- it is where my "if possible" comes from). So I would expect that the other patch would be to split ERR_NOT_A_FILE and add ERR_IS_A_DIR, have the caller act on the new ERR_IS_A_DIR and adjust the way it acts on ERR_NOT_A_FILE. But then the proposed log message below says that in addition to NOENT, it also deals with IS_A_DIR. I do not mind doing these two in the same patch, but I do prefer to see each patch to be complete. If a callee is changed and starts returning different return values, the callers must be also adjusted to react to these new return values. Looking at the preimage of [v5 2/2], we stil check if dir->buf is a plain vanilla ".git" directory only when read_gitfile_gently() returns ERR_NOT_A_FILE with [v5 1/2] applied, but in the new world order with this patch applied, shouldn't the caller deal with ".git" when it gets ERR_IS_A_DIR and not ERR_NOT_A_FILE? After applying this patch but before applying [v5 2/2], we lose the ability to use plain vanilla ".git"? That is the kind of thing I meant by each patch to be complete. So, I do not understand what the splitting the topic into two along this axis is trying to achieve. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 1/2] setup: distingush ENOENT from other stat errors 2026-02-18 18:15 ` Junio C Hamano @ 2026-02-18 18:43 ` Junio C Hamano 0 siblings, 0 replies; 35+ messages in thread From: Junio C Hamano @ 2026-02-18 18:43 UTC (permalink / raw) To: Tian Yuchen; +Cc: git, karthik.188 Junio C Hamano <gitster@pobox.com> writes: > Tian Yuchen <a3205153416@gmail.com> writes: > >> Currently, 'read_gitfile_gently()' treats all 'stat()' failures as >> generic errors. This prevents distinguishing between a missing file and >> real errors like permission denied (fatal). > > The above plan makes sense---you would split stat() error into two > different classes, start returning ERR_STAT_NOENT in addition to > ERR_STAT_FAILED, have the caller act on the new ERR_STAT_NOENT and > adjust the way it acts on ERR_STAT_FAILED, and if possible add tests > to make sure we react to failures from stat in an appropriate way > (but how? --- it is where my "if possible" comes from). So I would > expect that the other patch would be to split ERR_NOT_A_FILE and add > ERR_IS_A_DIR, have the caller act on the new ERR_IS_A_DIR and adjust > the way it acts on ERR_NOT_A_FILE. I forgot to say one thing. When changing the external interface for these service functions like read_gitfile_gently() and read_gitfile_error_die(), we need to make sure the change will not break _other_ callers of them, outside our main focus area. The latter, for example, has a caller in submodule.c and we need to make sure that the existing code is reacting to the updated definition of what ERR_STAT_FAILED and ERR_NOT_A_FILE mean (and if not, adjust it). read_gitfile_gently() is used more widely outside setup.c and we need to audit these callers, too. ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory 2026-02-18 5:18 ` [PATCH v5 0/2] setup.c: v5 reroll Tian Yuchen 2026-02-18 5:18 ` [PATCH v5 1/2] setup: distingush ENOENT from other stat errors Tian Yuchen @ 2026-02-18 5:18 ` Tian Yuchen 2026-02-18 10:27 ` Karthik Nayak 2026-02-18 18:25 ` Junio C Hamano 1 sibling, 2 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-18 5:18 UTC (permalink / raw) To: git; +Cc: gitster, karthik.188 Strictly enforcing 'lstat()' prevents valid '.git' symlinks. Switch 'setup_git_directory_gently_1()' to use 'stat()' to allow filesystem resolution. Calling the refactored 'read_gitfile_error_die()' to ensure safety: 1. Happy cases ('ENOENT', 'IS_A_DIR') are ignored automatically; 2. Invalid types (like FIFOs and sockets) trigger 'die()' via 'NOT_A_FILE'. Add 't/t0009-git-dir-validation.sh' to verify symlink support and FIFO rejection, and register it in 't/meson.build'. Signed-off-by: Tian Yuchen <a3205153416@gmail.com> --- setup.c | 18 ++++----- t/meson.build | 1 + t/t0009-git-dir-validation.sh | 72 +++++++++++++++++++++++++++++++++++ 3 files changed, 81 insertions(+), 10 deletions(-) create mode 100755 t/t0009-git-dir-validation.sh diff --git a/setup.c b/setup.c index 0ca129623e..6e6068e5eb 100644 --- a/setup.c +++ b/setup.c @@ -1590,17 +1590,15 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? NULL : &error_code); if (!gitdirenv) { - if (die_on_error || - error_code == READ_GITFILE_ERR_NOT_A_FILE) { - /* NEEDSWORK: fail if .git is not file nor dir */ - if (is_git_directory(dir->buf)) { - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; - gitdir_path = xstrdup(dir->buf); - } - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) - return GIT_DIR_INVALID_GITFILE; - } else + if (error_code) + read_gitfile_error_die(error_code, dir->buf, NULL); + if (is_git_directory(dir->buf)) { + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; + gitdir_path = xstrdup(dir->buf); + } + } else { gitfile = xstrdup(dir->buf); + } /* * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT * to check that directory for a repository. diff --git a/t/meson.build b/t/meson.build index f80e366cff..c4afaacee5 100644 --- a/t/meson.build +++ b/t/meson.build @@ -80,6 +80,7 @@ integration_tests = [ 't0006-date.sh', 't0007-git-var.sh', 't0008-ignores.sh', + 't0009-git-dir-validation.sh', 't0010-racy-git.sh', 't0012-help.sh', 't0013-sha1dc.sh', diff --git a/t/t0009-git-dir-validation.sh b/t/t0009-git-dir-validation.sh new file mode 100755 index 0000000000..9b3925c85f --- /dev/null +++ b/t/t0009-git-dir-validation.sh @@ -0,0 +1,72 @@ +#!/bin/sh + +test_description='setup: validation of .git file/directory types + +Verify that setup_git_directory() correctly handles: +1. Valid .git directories (including symlinks to them). +2. Invalid .git files (FIFOs, sockets) by erroring out. +3. Invalid .git files (garbage) by erroring out. +' + +. ./test-lib.sh + +test_expect_success 'setup: create parent git repository' ' + git init parent && + test_commit -C parent "root-commit" +' + +test_expect_success SYMLINKS 'setup: .git as a symlink to a directory is valid' ' + mkdir -p parent/link-to-dir && + ( + cd parent/link-to-dir && + git init real-repo && + ln -s real-repo/.git .git && + git rev-parse --git-dir >actual && + echo .git >expect && + test_cmp expect actual + ) +' + +test_expect_success PIPE 'setup: .git as a FIFO (named pipe) is rejected' ' + mkdir -p parent/fifo-trap && + ( + cd parent/fifo-trap && + mkfifo .git && + test_must_fail git rev-parse --git-dir 2>stderr && + grep "not a regular file" stderr + ) +' + +test_expect_success SYMLINKS,PIPE 'setup: .git as a symlink to a FIFO is rejected' ' + mkdir -p parent/symlink-fifo-trap && + ( + cd parent/symlink-fifo-trap && + mkfifo target-fifo && + ln -s target-fifo .git && + test_must_fail git rev-parse --git-dir 2>stderr && + grep "not a regular file" stderr + ) +' + +test_expect_success 'setup: .git with garbage content is rejected' ' + mkdir -p parent/garbage-trap && + ( + cd parent/garbage-trap && + echo "garbage" >.git && + test_must_fail git rev-parse --git-dir 2>stderr && + grep "invalid gitfile format" stderr + ) +' + +test_expect_success 'setup: .git as an empty directory is ignored' ' + mkdir -p parent/empty-dir && + ( + cd parent/empty-dir && + mkdir .git && + git rev-parse --git-dir >actual && + echo "$TRASH_DIRECTORY/parent/.git" >expect && + test_cmp expect actual + ) +' + +test_done -- 2.43.0 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory 2026-02-18 5:18 ` [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen @ 2026-02-18 10:27 ` Karthik Nayak 2026-02-18 11:20 ` Tian Yuchen 2026-02-18 18:25 ` Junio C Hamano 1 sibling, 1 reply; 35+ messages in thread From: Karthik Nayak @ 2026-02-18 10:27 UTC (permalink / raw) To: Tian Yuchen, git; +Cc: gitster [-- Attachment #1: Type: text/plain, Size: 2413 bytes --] Tian Yuchen <a3205153416@gmail.com> writes: > Strictly enforcing 'lstat()' prevents valid '.git' symlinks. > > Switch 'setup_git_directory_gently_1()' to use 'stat()' to allow > filesystem resolution. But we don't really do this no? We were calling `read_gitfile_gently()` before and continue to do so, so there was no change regards to calling `stat()` here. Or am I missing something? > > Calling the refactored 'read_gitfile_error_die()' to ensure safety: > > 1. Happy cases ('ENOENT', 'IS_A_DIR') are ignored automatically; > 2. Invalid types (like FIFOs and sockets) trigger 'die()' via > 'NOT_A_FILE'. > > Add 't/t0009-git-dir-validation.sh' to verify symlink support and FIFO > rejection, and register it in 't/meson.build'. > > Signed-off-by: Tian Yuchen <a3205153416@gmail.com> > --- > setup.c | 18 ++++----- > t/meson.build | 1 + > t/t0009-git-dir-validation.sh | 72 +++++++++++++++++++++++++++++++++++ > 3 files changed, 81 insertions(+), 10 deletions(-) > create mode 100755 t/t0009-git-dir-validation.sh > > diff --git a/setup.c b/setup.c > index 0ca129623e..6e6068e5eb 100644 > --- a/setup.c > +++ b/setup.c > @@ -1590,17 +1590,15 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, > gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? > NULL : &error_code); > if (!gitdirenv) { > - if (die_on_error || > - error_code == READ_GITFILE_ERR_NOT_A_FILE) { Earlier if die_on_error was false and we got any other error, let's say READ_GITFILE_ERR_INVALID_FORMAT. Then we'd return GIT_DIR_INVALID_GITFILE from here. > - /* NEEDSWORK: fail if .git is not file nor dir */ > - if (is_git_directory(dir->buf)) { > - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > - gitdir_path = xstrdup(dir->buf); > - } > - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) > - return GIT_DIR_INVALID_GITFILE; > - } else > + if (error_code) > + read_gitfile_error_die(error_code, dir->buf, NULL); > + if (is_git_directory(dir->buf)) { > + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > + gitdir_path = xstrdup(dir->buf); > + } But now we'd die. Correct? Doesn't that change the expected flow? > + } else { > gitfile = xstrdup(dir->buf); > + } > /* > * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT > * to check that directory for a repository. [snip] [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory 2026-02-18 10:27 ` Karthik Nayak @ 2026-02-18 11:20 ` Tian Yuchen 0 siblings, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-18 11:20 UTC (permalink / raw) To: Karthik Nayak, git; +Cc: gitster Hi Karthik, > But we don't really do this no? We were calling `read_gitfile_gently()` > before and continue to do so, so there was no change regards to calling > `stat()` here. Or am I missing something? Oops, It seems I mixed up the changes in previous patches. I did make a mistake. > But now we'd die. Correct? Doesn't that change the expected flow? Yes, this is a regression I missed. If 'die_on_error' is false, encountering an error like 'READ_GITFILE_ERR_INVALID_FORMAT' should return 'GIT_DIR_INVALID_GITFILE' rather than dying. So in v6 I will ensure that we only delegate to 'read_gitfile_error_die()' when: 1. It is a happy case we want to ignore 2. It is a security case we MUST die on; 3. 'die_on_error' is tru Otherwise, we should fall back to returning the error code as before. Thank you for your time, Regards, Yuchen >> + } else { >> gitfile = xstrdup(dir->buf); >> + } >> /* >> * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT >> * to check that directory for a repository. > > [snip] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory 2026-02-18 5:18 ` [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 2026-02-18 10:27 ` Karthik Nayak @ 2026-02-18 18:25 ` Junio C Hamano 2026-02-19 5:11 ` Tian Yuchen 1 sibling, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-18 18:25 UTC (permalink / raw) To: Tian Yuchen; +Cc: git, karthik.188 Tian Yuchen <a3205153416@gmail.com> writes: > Strictly enforcing 'lstat()' prevents valid '.git' symlinks. But nobody sane would propose running one more lstat() anyway, so how is that relevant? > if (!gitdirenv) { > - if (die_on_error || > - error_code == READ_GITFILE_ERR_NOT_A_FILE) { > - /* NEEDSWORK: fail if .git is not file nor dir */ > - if (is_git_directory(dir->buf)) { > - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > - gitdir_path = xstrdup(dir->buf); > - } > - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) > - return GIT_DIR_INVALID_GITFILE; The _intent_ of the original code was to * do is_git_directory() thing to deal with a plain vanilla ".git" directory when read_gitfile_gently thing said "we found a directory" (NOT_A_FILE is overly coarse, which is what we are correcting in this topic, but the _intent_ was to do the is_git_directory() thing when we know it is a directory). * return INVALID_GITFILE on any error, but do not return when the reason why read_gitfile_gently thing failed was because there is no ".git" there (again, STAT_FAILED is overly coarse, which is what we are correcting in this topic, but the _intent_ was to return INVALID thing when we not the failure is not due to ENOENT). Note that returning INVALID_GITFILE is done when die_on_error is not set. > - } else > + if (error_code) > + read_gitfile_error_die(error_code, dir->buf, NULL); Should this be unconditional? If our caller did not ask us to die upon an error with die_on_error, what happens? The original I think returned INVALID_GITFILE for the caller to deal with. > + if (is_git_directory(dir->buf)) { Should this be unconditional? If the thing is a directory, the original would have given us NOT_A_FILE but now it would give us IS_A_DIR. And that is the only case original wanted to call is_git_directory() no? > + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > + gitdir_path = xstrdup(dir->buf); > + } > + } else { > gitfile = xstrdup(dir->buf); > + } > /* > * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT > * to check that directory for a repository. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory 2026-02-18 18:25 ` Junio C Hamano @ 2026-02-19 5:11 ` Tian Yuchen 0 siblings, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-19 5:11 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, karthik.188 On 2/19/26 02:25, Junio C Hamano wrote: > Tian Yuchen <a3205153416@gmail.com> writes: > >> Strictly enforcing 'lstat()' prevents valid '.git' symlinks. > > But nobody sane would propose running one more lstat() anyway, so > how is that relevant? > >> if (!gitdirenv) { >> - if (die_on_error || >> - error_code == READ_GITFILE_ERR_NOT_A_FILE) { >> - /* NEEDSWORK: fail if .git is not file nor dir */ >> - if (is_git_directory(dir->buf)) { >> - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; >> - gitdir_path = xstrdup(dir->buf); >> - } >> - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) >> - return GIT_DIR_INVALID_GITFILE; > > The _intent_ of the original code was to > > * do is_git_directory() thing to deal with a plain vanilla > ".git" directory when read_gitfile_gently thing said "we found > a directory" (NOT_A_FILE is overly coarse, which is what we > are correcting in this topic, but the _intent_ was to do the > is_git_directory() thing when we know it is a directory). > > * return INVALID_GITFILE on any error, but do not return when > the reason why read_gitfile_gently thing failed was because > there is no ".git" there (again, STAT_FAILED is overly coarse, > which is what we are correcting in this topic, but the > _intent_ was to return INVALID thing when we not the failure > is not due to ENOENT). Note that returning INVALID_GITFILE is > done when die_on_error is not set. > >> - } else >> + if (error_code) >> + read_gitfile_error_die(error_code, dir->buf, NULL); > > Should this be unconditional? If our caller did not ask us to die > upon an error with die_on_error, what happens? The original I think > returned INVALID_GITFILE for the caller to deal with. > >> + if (is_git_directory(dir->buf)) { > > Should this be unconditional? If the thing is a directory, the > original would have given us NOT_A_FILE but now it would give us > IS_A_DIR. And that is the only case original wanted to call > is_git_directory() no? > >> + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; >> + gitdir_path = xstrdup(dir->buf); >> + } >> + } else { >> gitfile = xstrdup(dir->buf); >> + } >> /* >> * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT >> * to check that directory for a repository. Thanks for the review. I have already sent out v6 yesterday. Here is the link: https://lore.kernel.org/git/20260218124638.176936-1-a3205153416@gmail.com/ Based on your feedback and the changes that have been made on v6, it seems that the tasks that need to be completed are as follows: - Conditional 'is_git_directory' check: restrict the 'is_git_directory()' check to only run when we explicitly get 'READ_GITFILE_ERR_IS_A_DIR'. It makes no sense to check it for other error types. - Squash two patches into one single commit, as you suggested. (Actually, I'm a bit confused—are you saying to “make both patches standalone executable” or to “merge the two patches directly”? Either way, I'll go ahead and send the v7 patches first.) - Rephrase the commit message to describe the lstat() limitation, rather than saying 'we switch to stat()' The holiday is over, and my efficiency in sending patches and replying to emails may be somewhat reduced. Please bear with me. Regards, Yuchen ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3] setup: fail if .git is not a file or directory 2026-02-15 8:41 ` Junio C Hamano 2026-02-15 16:22 ` Tian Yuchen @ 2026-02-15 17:08 ` Tian Yuchen 1 sibling, 0 replies; 35+ messages in thread From: Tian Yuchen @ 2026-02-15 17:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, sandals diff --git a/setup.c b/setup.c index 3a6a048620..269aa9faaa 100644 --- a/setup.c +++ b/setup.c @@ -939,8 +939,14 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) static struct strbuf realpath = STRBUF_INIT; if (stat(path, &st)) { - /* NEEDSWORK: discern between ENOENT vs other errors */ - error_code = READ_GITFILE_ERR_STAT_FAILED; + if (errno == ENOENT) + error_code = READ_GITFILE_ERR_STAT_ENOENT; + else + error_code = READ_GITFILE_ERR_STAT_FAILED; + goto cleanup_return; + } + if (S_ISDIR(st.st_mode)) { + error_code = READ_GITFILE_ERR_IS_A_DIR; goto cleanup_return; } if (!S_ISREG(st.st_mode)) { @@ -994,7 +1000,9 @@ const char *read_gitfile_gently(const char *path, int *return_error_code) cleanup_return: if (return_error_code) *return_error_code = error_code; - else if (error_code) + else if (error_code && + error_code != READ_GITFILE_ERR_STAT_ENOENT && + error_code != READ_GITFILE_ERR_IS_A_DIR) read_gitfile_error_die(error_code, path, dir); free(buf); @@ -1576,20 +1584,27 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, if (offset > min_offset) strbuf_addch(dir, '/'); strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); - gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? - NULL : &error_code); + gitdirenv = read_gitfile_gently(dir->buf, &error_code); if (!gitdirenv) { - if (die_on_error || - error_code == READ_GITFILE_ERR_NOT_A_FILE) { - /* NEEDSWORK: fail if .git is not file nor dir */ + if (error_code == READ_GITFILE_ERR_STAT_ENOENT) { + ; + } else if (error_code == READ_GITFILE_ERR_IS_A_DIR) { if (is_git_directory(dir->buf)) { gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; gitdir_path = xstrdup(dir->buf); } - } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) - return GIT_DIR_INVALID_GITFILE; - } else - gitfile = xstrdup(dir->buf); + } else if (error_code == READ_GITFILE_ERR_NOT_A_FILE) { + if (die_on_error) + die(_("Invalid %s: not a regular file or directory"), dir->buf); + else + return GIT_DIR_INVALID_GITFILE; + } else if (error_code != READ_GITFILE_ERR_STAT_FAILED) { + if (die_on_error) + read_gitfile_error_die(error_code, dir->buf, NULL); + else + return GIT_DIR_INVALID_GITFILE; + } + } /* * Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT * to check that directory for a repository. diff --git a/setup.h b/setup.h index d55dcc6608..0271cc8f93 100644 --- a/setup.h +++ b/setup.h @@ -36,6 +36,9 @@ int is_nonbare_repository_dir(struct strbuf *path); #define READ_GITFILE_ERR_NO_PATH 6 #define READ_GITFILE_ERR_NOT_A_REPO 7 #define READ_GITFILE_ERR_TOO_LARGE 8 +#define READ_GITFILE_ERR_STAT_ENOENT 9 +#define READ_GITFILE_ERR_IS_A_DIR 10 + Still working on it. Looks clearer, right? I just wanna confirm whether I'm on the right track. Notice that I didn't change the following part: (though you mentioned earlier) > cleanup_return: > if (return_error_code) > *return_error_code = error_code; >- else if (error_code) >+ else if (error_code && >+ error_code != READ_GITFILE_ERR_STAT_ENOENT && >+ error_code != READ_GITFILE_ERR_IS_A_DIR) > read_gitfile_error_die(error_code, path, dir); My understanding is that we haven't distinguished between “unexpected” (ENOENT, IS_A_DIR) and “errors” (NOT_A_FILE, format error) in the code using a categorical approach. Instead, we differentiate them solely based on whether they are listed in `read_gitfile_error_die()` (if I haven't missed anything). Consequently, retaining this code here seems to be the easiest and most readable approach to ensure that it doesn't return prematurely. However, I believe this check shouldn't reside here. That's why I'm replying to you again. (By the way, I've always been a bit curious about how your name is pronounced :/) Regards, Yuchen ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [RFC] setup: fail if .git is not a file or directory 2026-02-11 18:21 [RFC] setup: fail if .git is not a file or directory Tian Yuchen 2026-02-11 19:47 ` Junio C Hamano 2026-02-12 17:24 ` [PATCH v2] " Tian Yuchen @ 2026-02-12 22:39 ` brian m. carlson 2026-02-12 22:45 ` Junio C Hamano 2 siblings, 1 reply; 35+ messages in thread From: brian m. carlson @ 2026-02-12 22:39 UTC (permalink / raw) To: Tian Yuchen; +Cc: git, gitster [-- Attachment #1: Type: text/plain, Size: 1372 bytes --] On 2026-02-11 at 18:21:22, Tian Yuchen wrote: > Currently, `setup_git_directory_gently_1()` checks if `.git` is a > regular file (handling submodules/worktrees) or a directory. If it is > neither (e.g., a FIFO), the code hits a NEEDSWORK comment and simply > ignores the entity, continuing the discovery process in the parent > directory. > > This behavior can be very dangerous. If a user is inside a subdirectory > containing a melformed/broken `.git` entity, the Git will traverse up, > attach to a parent repository and might execute destructive commands. > > I tried to resolve the NEEDSWORK by using `lstat()` to explicitly check > the entity's mode. If it is neither a regular file nor a directory, we > kill the discovery process. We used to allow symlinks as well. That was used instead of gitfiles for submodules at one point, I believe, and there may still be some people using that. A brief test indicates that that functionality still works, so if we make a change here, we should be sure to accept symlinks as well. In general, we should allow people to use symlinks wherever they can use a file or directory unless we can definitively prove that there's a clear security or functionality problem that cannot be avoided. Git was originally written for Unix, after all. -- brian m. carlson (they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] setup: fail if .git is not a file or directory 2026-02-12 22:39 ` [RFC] " brian m. carlson @ 2026-02-12 22:45 ` Junio C Hamano 2026-02-12 23:03 ` brian m. carlson 0 siblings, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2026-02-12 22:45 UTC (permalink / raw) To: brian m. carlson; +Cc: Tian Yuchen, git "brian m. carlson" <sandals@crustytoothpaste.net> writes: > We used to allow symlinks as well. That was used instead of gitfiles > for submodules at one point, I believe, and there may still be some > people using that. A brief test indicates that that functionality still > works, so if we make a change here, we should be sure to accept symlinks > as well. In my review, I outlined a way to avoid this extra lstat(), and instead reuse the result of stat() used in read_gitfile_gently() already; the check in that function being stat() is exactly because we want to follow such a symbolic link. > In general, we should allow people to use symlinks wherever they can use > a file or directory unless we can definitively prove that there's a > clear security or functionality problem that cannot be avoided. Git was > originally written for Unix, after all. Is this also an obvlique reference to a separate potential security issue, I wonder. It reminds me that I need to see if I have to ping the thread again. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] setup: fail if .git is not a file or directory 2026-02-12 22:45 ` Junio C Hamano @ 2026-02-12 23:03 ` brian m. carlson 0 siblings, 0 replies; 35+ messages in thread From: brian m. carlson @ 2026-02-12 23:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: Tian Yuchen, git [-- Attachment #1: Type: text/plain, Size: 1035 bytes --] On 2026-02-12 at 22:45:05, Junio C Hamano wrote: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > > > In general, we should allow people to use symlinks wherever they can use > > a file or directory unless we can definitively prove that there's a > > clear security or functionality problem that cannot be avoided. Git was > > originally written for Unix, after all. > > Is this also an obvlique reference to a separate potential security > issue, I wonder. It reminds me that I need to see if I have to ping > the thread again. It was intended to be a reference to the situation we had with, I believe, `.gitmodules` or `.gitignore` or another file of that sort, where symlinks were very much broken in that context. I merely mentioned security issues for completeness. I believe it was bb6832d552 ("fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW", 2021-05-03) that introduced that change and was what I was thinking of. -- brian m. carlson (they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2026-02-19 5:12 UTC | newest] Thread overview: 35+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-02-11 18:21 [RFC] setup: fail if .git is not a file or directory Tian Yuchen 2026-02-11 19:47 ` Junio C Hamano 2026-02-12 17:33 ` Tian Yuchen 2026-02-12 17:24 ` [PATCH v2] " Tian Yuchen 2026-02-12 20:59 ` Junio C Hamano 2026-02-13 16:37 ` Tian Yuchen 2026-02-14 4:52 ` [PATCH v3] " Tian Yuchen 2026-02-15 8:41 ` Junio C Hamano 2026-02-15 16:22 ` Tian Yuchen 2026-02-16 2:37 ` Junio C Hamano 2026-02-16 16:02 ` Tian Yuchen 2026-02-17 8:41 ` [PATCH v4] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 2026-02-17 11:26 ` Karthik Nayak 2026-02-17 15:30 ` Tian Yuchen 2026-02-17 18:56 ` Karthik Nayak 2026-02-17 21:10 ` Junio C Hamano 2026-02-17 17:01 ` Junio C Hamano 2026-02-17 18:50 ` Karthik Nayak 2026-02-18 4:08 ` Tian Yuchen 2026-02-17 17:59 ` Karthik Nayak 2026-02-18 5:18 ` [PATCH v5 0/2] setup.c: v5 reroll Tian Yuchen 2026-02-18 5:18 ` [PATCH v5 1/2] setup: distingush ENOENT from other stat errors Tian Yuchen 2026-02-18 10:12 ` Karthik Nayak 2026-02-18 11:11 ` Tian Yuchen 2026-02-18 18:15 ` Junio C Hamano 2026-02-18 18:43 ` Junio C Hamano 2026-02-18 5:18 ` [PATCH v5 2/2] setup: allow cwd/.git to be a symlink to a directory Tian Yuchen 2026-02-18 10:27 ` Karthik Nayak 2026-02-18 11:20 ` Tian Yuchen 2026-02-18 18:25 ` Junio C Hamano 2026-02-19 5:11 ` Tian Yuchen 2026-02-15 17:08 ` [PATCH v3] setup: fail if .git is not a file or directory Tian Yuchen 2026-02-12 22:39 ` [RFC] " brian m. carlson 2026-02-12 22:45 ` Junio C Hamano 2026-02-12 23:03 ` brian m. carlson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox