git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support
@ 2025-12-16 15:33 Johannes Schindelin via GitGitGadget
  2025-12-16 15:33 ` [PATCH 1/5] mingw: do resolve symlinks in `getcwd()` Johannes Schindelin via GitGitGadget
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-12-16 15:33 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

After preparing Git's test suite for the upcoming support for symlinks on
Windows, this patch series touches up a couple of code paths that might not
seem to be related at first, but need to be adjusted for the symlink support
to work as expected.

This is based on js/test-symlink-windows.

Johannes Schindelin (2):
  mingw: do resolve symlinks in `getcwd()`
  init: do parse _all_ core.* settings early

Karsten Blees (3):
  strbuf_readlink(): avoid calling `readlink()` twice in corner-cases
  strbuf_readlink(): support link targets that exceed PATH_MAX
  trim_last_path_component(): avoid hard-coding the directory separator

 compat/mingw.c | 18 +++++++-----------
 environment.c  |  4 ++--
 environment.h  |  2 ++
 lockfile.c     |  4 ++--
 setup.c        |  2 +-
 strbuf.c       | 10 ++++------
 6 files changed, 18 insertions(+), 22 deletions(-)


base-commit: 77dfd223aa5b180d69cb2da54f6a7859fb94e131
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2017%2Fdscho%2Flast-preparations-before-mingw-symlinks-support-next-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2017/dscho/last-preparations-before-mingw-symlinks-support-next-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2017
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] mingw: do resolve symlinks in `getcwd()`
  2025-12-16 15:33 [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support Johannes Schindelin via GitGitGadget
@ 2025-12-16 15:33 ` Johannes Schindelin via GitGitGadget
  2025-12-17 14:44   ` Patrick Steinhardt
  2025-12-16 15:33 ` [PATCH 2/5] init: do parse _all_ core.* settings early Johannes Schindelin via GitGitGadget
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-12-16 15:33 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

As pointed out in https://github.com/git-for-windows/git/issues/1676,
the `git rev-parse --is-inside-work-tree` command currently fails when
the current directory's path contains symbolic links.

The underlying reason for this bug is that `getcwd()` is supposed to
resolve symbolic links, but our `mingw_getcwd()` implementation did not.

We do have all the building blocks for that, though: the
`GetFinalPathByHandleW()` function will resolve symbolic links. However,
we only called that function if `GetLongPathNameW()` failed, for
historical reasons: the latter function was supported for a long time,
but the former API function was introduced only with Windows Vista, and
we used to support also Windows XP. With that support having been
dropped, we are free to call the symbolic link-resolving function right
away.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index ba1b7b6dd1..7215b127cc 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1251,18 +1251,16 @@ char *mingw_getcwd(char *pointer, int len)
 {
 	wchar_t cwd[MAX_PATH], wpointer[MAX_PATH];
 	DWORD ret = GetCurrentDirectoryW(ARRAY_SIZE(cwd), cwd);
+	HANDLE hnd;
 
 	if (!ret || ret >= ARRAY_SIZE(cwd)) {
 		errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError());
 		return NULL;
 	}
-	ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer));
-	if (!ret && GetLastError() == ERROR_ACCESS_DENIED) {
-		HANDLE hnd = CreateFileW(cwd, 0,
-			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
-			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
-		if (hnd == INVALID_HANDLE_VALUE)
-			return NULL;
+	hnd = CreateFileW(cwd, 0,
+			  FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
+			  OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
+	if (hnd != INVALID_HANDLE_VALUE) {
 		ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0);
 		CloseHandle(hnd);
 		if (!ret || ret >= ARRAY_SIZE(wpointer))
@@ -1271,13 +1269,11 @@ char *mingw_getcwd(char *pointer, int len)
 			return NULL;
 		return pointer;
 	}
-	if (!ret || ret >= ARRAY_SIZE(wpointer))
-		return NULL;
-	if (GetFileAttributesW(wpointer) == INVALID_FILE_ATTRIBUTES) {
+	if (GetFileAttributesW(cwd) == INVALID_FILE_ATTRIBUTES) {
 		errno = ENOENT;
 		return NULL;
 	}
-	if (xwcstoutf(pointer, wpointer, len) < 0)
+	if (xwcstoutf(pointer, cwd, len) < 0)
 		return NULL;
 	convert_slashes(pointer);
 	return pointer;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] init: do parse _all_ core.* settings early
  2025-12-16 15:33 [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support Johannes Schindelin via GitGitGadget
  2025-12-16 15:33 ` [PATCH 1/5] mingw: do resolve symlinks in `getcwd()` Johannes Schindelin via GitGitGadget
@ 2025-12-16 15:33 ` Johannes Schindelin via GitGitGadget
  2025-12-17 14:44   ` Patrick Steinhardt
  2025-12-16 15:33 ` [PATCH 3/5] strbuf_readlink(): avoid calling `readlink()` twice in corner-cases Karsten Blees via GitGitGadget
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2025-12-16 15:33 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In Git for Windows, `has_symlinks` is set to 0 by default. Therefore, we
need to parse the config setting `core.symlinks` to know if it has been
set to `true`. In `git init`, we must do that before copying the
templates because they might contain symbolic links.

Even if the support for symbolic links on Windows has not made it to
upstream Git yet, we really should make sure that all the `core.*`
settings are parsed before proceeding, as they might very well change
the behavior of `git init` in a way the user intended.

This fixes https://github.com/git-for-windows/git/issues/3414

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 environment.c | 4 ++--
 environment.h | 2 ++
 setup.c       | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/environment.c b/environment.c
index a770b5921d..b65b85a01f 100644
--- a/environment.c
+++ b/environment.c
@@ -324,8 +324,8 @@ next_name:
 	return (current & ~negative) | positive;
 }
 
-static int git_default_core_config(const char *var, const char *value,
-				   const struct config_context *ctx, void *cb)
+int git_default_core_config(const char *var, const char *value,
+			    const struct config_context *ctx, void *cb)
 {
 	/* This needs a better name */
 	if (!strcmp(var, "core.filemode")) {
diff --git a/environment.h b/environment.h
index 51898c99cd..e61f843fdb 100644
--- a/environment.h
+++ b/environment.h
@@ -106,6 +106,8 @@ const char *strip_namespace(const char *namespaced_ref);
 
 int git_default_config(const char *, const char *,
 		       const struct config_context *, void *);
+int git_default_core_config(const char *var, const char *value,
+			    const struct config_context *ctx, void *cb);
 
 /*
  * TODO: All the below state either explicitly or implicitly relies on
diff --git a/setup.c b/setup.c
index 7086741e6c..42e4e7a690 100644
--- a/setup.c
+++ b/setup.c
@@ -2611,7 +2611,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
 	 * have set up the repository format such that we can evaluate
 	 * includeIf conditions correctly in the case of re-initialization.
 	 */
-	repo_config(the_repository, platform_core_config, NULL);
+	repo_config(the_repository, git_default_core_config, NULL);
 
 	safe_create_dir(the_repository, git_dir, 0);
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/5] strbuf_readlink(): avoid calling `readlink()` twice in corner-cases
  2025-12-16 15:33 [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support Johannes Schindelin via GitGitGadget
  2025-12-16 15:33 ` [PATCH 1/5] mingw: do resolve symlinks in `getcwd()` Johannes Schindelin via GitGitGadget
  2025-12-16 15:33 ` [PATCH 2/5] init: do parse _all_ core.* settings early Johannes Schindelin via GitGitGadget
@ 2025-12-16 15:33 ` Karsten Blees via GitGitGadget
  2025-12-16 15:33 ` [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX Karsten Blees via GitGitGadget
  2025-12-16 15:33 ` [PATCH 5/5] trim_last_path_component(): avoid hard-coding the directory separator Karsten Blees via GitGitGadget
  4 siblings, 0 replies; 13+ messages in thread
From: Karsten Blees via GitGitGadget @ 2025-12-16 15:33 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Karsten Blees

From: Karsten Blees <blees@dcon.de>

The `strbuf_readlink()` function calls `readlink()`` twice if the hint
argument specifies the exact size of the link target (e.g. by passing
stat.st_size as returned by `lstat()`). This is necessary because
`readlink(..., hint) == hint` could mean that the buffer was too small.

Use `hint + 1` as buffer size to prevent this.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 strbuf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 6c3851a7f8..44a8f6a554 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -578,12 +578,12 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 	while (hint < STRBUF_MAXLINK) {
 		ssize_t len;
 
-		strbuf_grow(sb, hint);
-		len = readlink(path, sb->buf, hint);
+		strbuf_grow(sb, hint + 1);
+		len = readlink(path, sb->buf, hint + 1);
 		if (len < 0) {
 			if (errno != ERANGE)
 				break;
-		} else if (len < hint) {
+		} else if (len <= hint) {
 			strbuf_setlen(sb, len);
 			return 0;
 		}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX
  2025-12-16 15:33 [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2025-12-16 15:33 ` [PATCH 3/5] strbuf_readlink(): avoid calling `readlink()` twice in corner-cases Karsten Blees via GitGitGadget
@ 2025-12-16 15:33 ` Karsten Blees via GitGitGadget
  2025-12-17 14:44   ` Patrick Steinhardt
  2025-12-17 23:39   ` Junio C Hamano
  2025-12-16 15:33 ` [PATCH 5/5] trim_last_path_component(): avoid hard-coding the directory separator Karsten Blees via GitGitGadget
  4 siblings, 2 replies; 13+ messages in thread
From: Karsten Blees via GitGitGadget @ 2025-12-16 15:33 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Karsten Blees

From: Karsten Blees <blees@dcon.de>

The `strbuf_readlink()` function refuses to read link targets that
exceed PATH_MAX (even if a sufficient size was specified by the caller).

As some platforms (*cough* Windows *cough*) support longer paths, remove
this restriction (similar to `strbuf_getcwd()`).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 strbuf.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 44a8f6a554..fa4e30f112 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
 	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
 }
 
-#define STRBUF_MAXLINK (2*PATH_MAX)
-
 int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 {
 	size_t oldalloc = sb->alloc;
@@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 	if (hint < 32)
 		hint = 32;
 
-	while (hint < STRBUF_MAXLINK) {
+	for (;;) {
 		ssize_t len;
 
 		strbuf_grow(sb, hint + 1);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/5] trim_last_path_component(): avoid hard-coding the directory separator
  2025-12-16 15:33 [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support Johannes Schindelin via GitGitGadget
                   ` (3 preceding siblings ...)
  2025-12-16 15:33 ` [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX Karsten Blees via GitGitGadget
@ 2025-12-16 15:33 ` Karsten Blees via GitGitGadget
  4 siblings, 0 replies; 13+ messages in thread
From: Karsten Blees via GitGitGadget @ 2025-12-16 15:33 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Karsten Blees

From: Karsten Blees <blees@dcon.de>

Currently, this function hard-codes the directory separator as the
forward slash.

However, on Windows the backslash character is valid, too. And we want
to call this function in the upcoming support for symlinks on Windows
with the symlink targets (which naturally use the canonical directory
separator on Windows, which is _not_ the forward slash).

Prepare that function to be useful also in that context.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 lockfile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lockfile.c b/lockfile.c
index 1d5ed01682..67082a9caa 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -19,14 +19,14 @@ static void trim_last_path_component(struct strbuf *path)
 	int i = path->len;
 
 	/* back up past trailing slashes, if any */
-	while (i && path->buf[i - 1] == '/')
+	while (i && is_dir_sep(path->buf[i - 1]))
 		i--;
 
 	/*
 	 * then go backwards until a slash, or the beginning of the
 	 * string
 	 */
-	while (i && path->buf[i - 1] != '/')
+	while (i && !is_dir_sep(path->buf[i - 1]))
 		i--;
 
 	strbuf_setlen(path, i);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] mingw: do resolve symlinks in `getcwd()`
  2025-12-16 15:33 ` [PATCH 1/5] mingw: do resolve symlinks in `getcwd()` Johannes Schindelin via GitGitGadget
@ 2025-12-17 14:44   ` Patrick Steinhardt
  0 siblings, 0 replies; 13+ messages in thread
From: Patrick Steinhardt @ 2025-12-17 14:44 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

On Tue, Dec 16, 2025 at 03:33:45PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/compat/mingw.c b/compat/mingw.c
> index ba1b7b6dd1..7215b127cc 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -1251,18 +1251,16 @@ char *mingw_getcwd(char *pointer, int len)
>  {
>  	wchar_t cwd[MAX_PATH], wpointer[MAX_PATH];
>  	DWORD ret = GetCurrentDirectoryW(ARRAY_SIZE(cwd), cwd);
> +	HANDLE hnd;
>  
>  	if (!ret || ret >= ARRAY_SIZE(cwd)) {
>  		errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError());
>  		return NULL;
>  	}
> -	ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer));
> -	if (!ret && GetLastError() == ERROR_ACCESS_DENIED) {
> -		HANDLE hnd = CreateFileW(cwd, 0,
> -			FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
> -			OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
> -		if (hnd == INVALID_HANDLE_VALUE)
> -			return NULL;
> +	hnd = CreateFileW(cwd, 0,
> +			  FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL,
> +			  OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
> +	if (hnd != INVALID_HANDLE_VALUE) {
>  		ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0);
>  		CloseHandle(hnd);
>  		if (!ret || ret >= ARRAY_SIZE(wpointer))

Okay. Due to the change we now also try calling `GetFileAttributesW()`
in case `CreateFileW()` fails, which wasn't the case before. But I'd
consider that to be a win -- if we cannot figure out the final path
name, then we can at least return the unresolved current working
directory.

Patrick

> @@ -1271,13 +1269,11 @@ char *mingw_getcwd(char *pointer, int len)
>  			return NULL;
>  		return pointer;
>  	}
> -	if (!ret || ret >= ARRAY_SIZE(wpointer))
> -		return NULL;
> -	if (GetFileAttributesW(wpointer) == INVALID_FILE_ATTRIBUTES) {
> +	if (GetFileAttributesW(cwd) == INVALID_FILE_ATTRIBUTES) {
>  		errno = ENOENT;
>  		return NULL;
>  	}
> -	if (xwcstoutf(pointer, wpointer, len) < 0)
> +	if (xwcstoutf(pointer, cwd, len) < 0)
>  		return NULL;
>  	convert_slashes(pointer);
>  	return pointer;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/5] init: do parse _all_ core.* settings early
  2025-12-16 15:33 ` [PATCH 2/5] init: do parse _all_ core.* settings early Johannes Schindelin via GitGitGadget
@ 2025-12-17 14:44   ` Patrick Steinhardt
  0 siblings, 0 replies; 13+ messages in thread
From: Patrick Steinhardt @ 2025-12-17 14:44 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

On Tue, Dec 16, 2025 at 03:33:46PM +0000, Johannes Schindelin via GitGitGadget wrote:
> diff --git a/setup.c b/setup.c
> index 7086741e6c..42e4e7a690 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -2611,7 +2611,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
>  	 * have set up the repository format such that we can evaluate
>  	 * includeIf conditions correctly in the case of re-initialization.
>  	 */
> -	repo_config(the_repository, platform_core_config, NULL);
> +	repo_config(the_repository, git_default_core_config, NULL);
>  
>  	safe_create_dir(the_repository, git_dir, 0);

Two lines further down we call `create_default_files()`, and there we
end up calling `repo_config(the_repository, git_default_config, NULL)`
as one of the first things. We do so after copying templates though, so
indeed this comes too late.

We also cannot really merge these two calls: we need to re-parse the
configuration after having copied over the template, as the template may
contain a gitconfig file itself.

Furthermore, `git_default_core_config()` already knows to call
`platform_core_config()`, as well. So we're not losing any of that
information, either.

All to say that this change makes sense to me and should be safe, as we
don't end up parsing _more_ configuration keys, we only parse a subset
of it a bit earlier.

Patrick

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX
  2025-12-16 15:33 ` [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX Karsten Blees via GitGitGadget
@ 2025-12-17 14:44   ` Patrick Steinhardt
  2025-12-19  8:50     ` Johannes Schindelin
  2025-12-17 23:39   ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2025-12-17 14:44 UTC (permalink / raw)
  To: Karsten Blees via GitGitGadget; +Cc: git, Johannes Schindelin, Karsten Blees

On Tue, Dec 16, 2025 at 03:33:48PM +0000, Karsten Blees via GitGitGadget wrote:
> diff --git a/strbuf.c b/strbuf.c
> index 44a8f6a554..fa4e30f112 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
>  	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
>  }
>  
> -#define STRBUF_MAXLINK (2*PATH_MAX)
> -
>  int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
>  {
>  	size_t oldalloc = sb->alloc;
> @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
>  	if (hint < 32)
>  		hint = 32;
>  
> -	while (hint < STRBUF_MAXLINK) {
> +	for (;;) {
>  		ssize_t len;
>  
>  		strbuf_grow(sb, hint + 1);

This makes me wonder whether we have a better way to figure out the
actual size of the buffer that we ultimately need to allocate. But
reading through readlink(3p) doesn't indicate anything, and I'm not sure
whether we can always rely on lstat(3p) to return the correct size for
symlink contents on all platforms.

One thing that _is_ noted though is that calling the function with a
buffer size larger than SSIZE_MAX is implementation-defined. It does
make me a bit uneasy in that light to grow indefinitely.

Which makes me wonder whether Windows has a limit for the symlink
contents that we could enforce in theory so that we can reasonably turn
this into a bounded loop again?

Patrick

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX
  2025-12-16 15:33 ` [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX Karsten Blees via GitGitGadget
  2025-12-17 14:44   ` Patrick Steinhardt
@ 2025-12-17 23:39   ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2025-12-17 23:39 UTC (permalink / raw)
  To: Karsten Blees via GitGitGadget; +Cc: git, Johannes Schindelin, Karsten Blees

"Karsten Blees via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Karsten Blees <blees@dcon.de>
>
> The `strbuf_readlink()` function refuses to read link targets that
> exceed PATH_MAX (even if a sufficient size was specified by the caller).
>
> As some platforms (*cough* Windows *cough*) support longer paths, remove
> this restriction (similar to `strbuf_getcwd()`).
>
> Signed-off-by: Karsten Blees <blees@dcon.de>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  strbuf.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

We've been bitten before by platforms that sets PATH_MAX too low
(i.e., lower than what they comfortably support), so this is a
welcome change.

> diff --git a/strbuf.c b/strbuf.c
> index 44a8f6a554..fa4e30f112 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
>  	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
>  }
>  
> -#define STRBUF_MAXLINK (2*PATH_MAX)
> -
>  int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
>  {
>  	size_t oldalloc = sb->alloc;
> @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
>  	if (hint < 32)
>  		hint = 32;
>  
> -	while (hint < STRBUF_MAXLINK) {
> +	for (;;) {
>  		ssize_t len;
>  
>  		strbuf_grow(sb, hint + 1);

I briefly wondered if this would cause us loop infinitely on a truly
broken platform, where readlink() somehow keeps returning negative,
but we only retry when we got ERANGE (which can be seen several
lines below the postimage of hte patch), so we should be safe.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX
  2025-12-17 14:44   ` Patrick Steinhardt
@ 2025-12-19  8:50     ` Johannes Schindelin
  2025-12-19 11:51       ` Patrick Steinhardt
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2025-12-19  8:50 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Karsten Blees via GitGitGadget, git, Karsten Blees

Hi Patrick,

On Wed, 17 Dec 2025, Patrick Steinhardt wrote:

> On Tue, Dec 16, 2025 at 03:33:48PM +0000, Karsten Blees via GitGitGadget wrote:
> > diff --git a/strbuf.c b/strbuf.c
> > index 44a8f6a554..fa4e30f112 100644
> > --- a/strbuf.c
> > +++ b/strbuf.c
> > @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
> >  	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
> >  }
> >  
> > -#define STRBUF_MAXLINK (2*PATH_MAX)
> > -
> >  int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> >  {
> >  	size_t oldalloc = sb->alloc;
> > @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> >  	if (hint < 32)
> >  		hint = 32;
> >  
> > -	while (hint < STRBUF_MAXLINK) {
> > +	for (;;) {
> >  		ssize_t len;
> >  
> >  		strbuf_grow(sb, hint + 1);
> 
> This makes me wonder whether we have a better way to figure out the
> actual size of the buffer that we ultimately need to allocate. But
> reading through readlink(3p) doesn't indicate anything, and I'm not sure
> whether we can always rely on lstat(3p) to return the correct size for
> symlink contents on all platforms.
> 
> One thing that _is_ noted though is that calling the function with a
> buffer size larger than SSIZE_MAX is implementation-defined. It does
> make me a bit uneasy in that light to grow indefinitely.
> 
> Which makes me wonder whether Windows has a limit for the symlink
> contents that we could enforce in theory so that we can reasonably turn
> this into a bounded loop again?

https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
suggests that the maximum permissible target path should be 32,768. But
that's not _quite_ correct, as
`../t/../Documentation/RelNotes/../../README.md` is a perfectly valid (if
awkward) symlink target.

Still, I would say that 32,768 would make for a fine (still insanely high,
but not so high as to allow malicious symlinks to cause memory problems)
limit.

Sound good?
Johannes

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX
  2025-12-19  8:50     ` Johannes Schindelin
@ 2025-12-19 11:51       ` Patrick Steinhardt
  2025-12-30  5:00         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2025-12-19 11:51 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Karsten Blees via GitGitGadget, git, Karsten Blees

On Fri, Dec 19, 2025 at 09:50:15AM +0100, Johannes Schindelin wrote:
> Hi Patrick,
> 
> On Wed, 17 Dec 2025, Patrick Steinhardt wrote:
> 
> > On Tue, Dec 16, 2025 at 03:33:48PM +0000, Karsten Blees via GitGitGadget wrote:
> > > diff --git a/strbuf.c b/strbuf.c
> > > index 44a8f6a554..fa4e30f112 100644
> > > --- a/strbuf.c
> > > +++ b/strbuf.c
> > > @@ -566,8 +566,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
> > >  	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
> > >  }
> > >  
> > > -#define STRBUF_MAXLINK (2*PATH_MAX)
> > > -
> > >  int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> > >  {
> > >  	size_t oldalloc = sb->alloc;
> > > @@ -575,7 +573,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> > >  	if (hint < 32)
> > >  		hint = 32;
> > >  
> > > -	while (hint < STRBUF_MAXLINK) {
> > > +	for (;;) {
> > >  		ssize_t len;
> > >  
> > >  		strbuf_grow(sb, hint + 1);
> > 
> > This makes me wonder whether we have a better way to figure out the
> > actual size of the buffer that we ultimately need to allocate. But
> > reading through readlink(3p) doesn't indicate anything, and I'm not sure
> > whether we can always rely on lstat(3p) to return the correct size for
> > symlink contents on all platforms.
> > 
> > One thing that _is_ noted though is that calling the function with a
> > buffer size larger than SSIZE_MAX is implementation-defined. It does
> > make me a bit uneasy in that light to grow indefinitely.
> > 
> > Which makes me wonder whether Windows has a limit for the symlink
> > contents that we could enforce in theory so that we can reasonably turn
> > this into a bounded loop again?
> 
> https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
> suggests that the maximum permissible target path should be 32,768. But
> that's not _quite_ correct, as
> `../t/../Documentation/RelNotes/../../README.md` is a perfectly valid (if
> awkward) symlink target.
> 
> Still, I would say that 32,768 would make for a fine (still insanely high,
> but not so high as to allow malicious symlinks to cause memory problems)
> limit.
> 
> Sound good?
> Johannes

Sounds good to me, thanks!

Patrick

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX
  2025-12-19 11:51       ` Patrick Steinhardt
@ 2025-12-30  5:00         ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2025-12-30  5:00 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Johannes Schindelin, Karsten Blees via GitGitGadget, git,
	Karsten Blees

Patrick Steinhardt <ps@pks.im> writes:

>> > This makes me wonder whether we have a better way to figure out the
>> > actual size of the buffer that we ultimately need to allocate. But
>> > reading through readlink(3p) doesn't indicate anything, and I'm not sure
>> > whether we can always rely on lstat(3p) to return the correct size for
>> > symlink contents on all platforms.
>> > 
>> > One thing that _is_ noted though is that calling the function with a
>> > buffer size larger than SSIZE_MAX is implementation-defined. It does
>> > make me a bit uneasy in that light to grow indefinitely.
>> > 
>> > Which makes me wonder whether Windows has a limit for the symlink
>> > contents that we could enforce in theory so that we can reasonably turn
>> > this into a bounded loop again?
>> 
>> https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
>> suggests that the maximum permissible target path should be 32,768. But
>> that's not _quite_ correct, as
>> `../t/../Documentation/RelNotes/../../README.md` is a perfectly valid (if
>> awkward) symlink target.
>> 
>> Still, I would say that 32,768 would make for a fine (still insanely high,
>> but not so high as to allow malicious symlinks to cause memory problems)
>> limit.
>> 
>> Sound good?
>> Johannes
>
> Sounds good to me, thanks!

As this is a generic codepath in strbuf.c, platforms that do not
honor Microsoft's promise cited above can break the assumption made
here by going beyond 32k, no?

I am OK if this infinite loop had our own "we are growing the buffer
very long and still getting not-enough-buf error; let's give up"
termination condition.

IOW, a simpler alternative may be

---- >8 ----
Subject: strbuf_readlink(): do not trust PATH_MAX

We have been bitten before by platforms that sets PATH_MAX way too
low, far below the length of paths they comfortably support.  The
strbuf_readlink() limits the link targets to PATH_MAX, which is a
code path that is broken by such platforms.

Raise the limit to 32kB, which matches the limit of a
platform with such a problem [*].

 * https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation

 strbuf.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git c/strbuf.c w/strbuf.c
index 7fb7d12ac0..1c7659bcd2 100644
--- c/strbuf.c
+++ w/strbuf.c
@@ -566,7 +566,11 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f)
 	return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0;
 }
 
-#define STRBUF_MAXLINK (2*PATH_MAX)
+/*
+ * Do not use PATH_MAX, as some platforms sets it too low;
+ * 32kB matches what Windows has as the real limit for a pathnname.
+ */
+#define STRBUF_MAXLINK (2 * (1 << 15))
 
 int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
 {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-12-30  5:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-16 15:33 [PATCH 0/5] Last preparations before upstreaming Git for Windows' symlink support Johannes Schindelin via GitGitGadget
2025-12-16 15:33 ` [PATCH 1/5] mingw: do resolve symlinks in `getcwd()` Johannes Schindelin via GitGitGadget
2025-12-17 14:44   ` Patrick Steinhardt
2025-12-16 15:33 ` [PATCH 2/5] init: do parse _all_ core.* settings early Johannes Schindelin via GitGitGadget
2025-12-17 14:44   ` Patrick Steinhardt
2025-12-16 15:33 ` [PATCH 3/5] strbuf_readlink(): avoid calling `readlink()` twice in corner-cases Karsten Blees via GitGitGadget
2025-12-16 15:33 ` [PATCH 4/5] strbuf_readlink(): support link targets that exceed PATH_MAX Karsten Blees via GitGitGadget
2025-12-17 14:44   ` Patrick Steinhardt
2025-12-19  8:50     ` Johannes Schindelin
2025-12-19 11:51       ` Patrick Steinhardt
2025-12-30  5:00         ` Junio C Hamano
2025-12-17 23:39   ` Junio C Hamano
2025-12-16 15:33 ` [PATCH 5/5] trim_last_path_component(): avoid hard-coding the directory separator Karsten Blees via GitGitGadget

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).