git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Symlink resolutions: limits and return modes
@ 2024-06-17  9:03 Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17  9:03 ` [PATCH 1/2] cat-file: configurable number of symlink resolutions Miguel Ángel Pastor Olivar via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Miguel Ángel Pastor Olivar via GitGitGadget @ 2024-06-17  9:03 UTC (permalink / raw)
  To: git; +Cc: Miguel Ángel Pastor Olivar

It can be useful to limit the number of symlink resolutions performed while
looking for a tree entry. The goal is to provide the ability to resolve up
to a particular depth, instead of reaching the end of the link chain.

In addition, I would like to extend the symlink resolution process and
provide the ability to return the object found at the designated depth
instead of returning an error.

The current code already provides a limit to the maximum number of
resolutions that can be performed, and something similar to this is returned
to the caller:

loop SP <size> LF
<object> LF


With these patches, we are looking to return the actual information of the
object where the resolution stopped. Something similar to:

<oid> blob <size>\nndata\n


Miguel Ángel Pastor Olivar (2):
  cat-file: configurable number of symlink resolutions
  cat-file: configurable "best effort mode" for symlink resolution

 Documentation/config/core.txt | 19 +++++++++++++++
 config.c                      | 18 ++++++++++++++
 environment.c                 |  3 +++
 environment.h                 |  8 ++++++
 t/t1006-cat-file.sh           | 46 +++++++++++++++++++++++++++++++++++
 tree-walk.c                   | 18 +++++++++++++-
 6 files changed, 111 insertions(+), 1 deletion(-)


base-commit: d63586cb314731c851f28e14fc8012988467e2da
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1751%2Fmigue%2Fmigue%2Ffollow-symlinks-max-depth-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1751/migue/migue/follow-symlinks-max-depth-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1751
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] cat-file: configurable number of symlink resolutions
  2024-06-17  9:03 [PATCH 0/2] Symlink resolutions: limits and return modes Miguel Ángel Pastor Olivar via GitGitGadget
@ 2024-06-17  9:03 ` Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17 19:33   ` Junio C Hamano
  2024-06-17  9:03 ` [PATCH 2/2] cat-file: configurable "best effort mode" for symlink resolution Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17 19:33 ` [PATCH 0/2] Symlink resolutions: limits and return modes Junio C Hamano
  2 siblings, 1 reply; 6+ messages in thread
From: Miguel Ángel Pastor Olivar via GitGitGadget @ 2024-06-17  9:03 UTC (permalink / raw)
  To: git; +Cc: Miguel Ángel Pastor Olivar, Miguel Ángel Pastor Olivar

From: =?UTF-8?q?Miguel=20=C3=81ngel=20Pastor=20Olivar?= <migue@github.com>

Sometimes, it can be useful to limit the number of symlink resolutions
performed while looking for a tree entry.

The goal is to provide the ability to resolve up to a particular depth,
instead of reaching the end of the link chain.

The current code already provides a limit to the maximum number of
resolutions that can be performed
(GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS). This patch introduces a new
config setting to make the previous property configurable. No logical
changes are introduced in this patch

Signed-off-by: Miguel Ángel Pastor Olivar <migue@github.com>
---
 Documentation/config/core.txt |  5 +++++
 config.c                      |  5 +++++
 environment.c                 |  1 +
 environment.h                 |  1 +
 t/t1006-cat-file.sh           | 17 +++++++++++++++++
 tree-walk.c                   |  7 ++++++-
 6 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 93d65e1dfd2..ca2d1eede52 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -757,3 +757,8 @@ core.maxTreeDepth::
 	tree (e.g., "a/b/cde/f" has a depth of 4). This is a fail-safe
 	to allow Git to abort cleanly, and should not generally need to
 	be adjusted. The default is 4096.
+
+core.maxSymlinkDepth::
+	The maximum number of symlinks Git is willing to resolve while
+	looking for a tree entry.
+	The default is GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS.
\ No newline at end of file
diff --git a/config.c b/config.c
index abce05b7744..d69e9a3ae6b 100644
--- a/config.c
+++ b/config.c
@@ -1682,6 +1682,11 @@ static int git_default_core_config(const char *var, const char *value,
 		return 0;
 	}
 
+	if (!strcmp(var, "core.maxsymlinkdepth")) {
+		max_symlink_depth = git_config_int(var, value, ctx->kvi);
+		return 0;
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return platform_core_config(var, value, ctx, cb);
 }
diff --git a/environment.c b/environment.c
index 701d5151354..6d7a5001eb1 100644
--- a/environment.c
+++ b/environment.c
@@ -95,6 +95,7 @@ int max_allowed_tree_depth =
 #else
 	2048;
 #endif
+int max_symlink_depth = -1;
 
 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
diff --git a/environment.h b/environment.h
index e9f01d4d11c..ea39c2887b1 100644
--- a/environment.h
+++ b/environment.h
@@ -141,6 +141,7 @@ extern size_t delta_base_cache_limit;
 extern unsigned long big_file_threshold;
 extern unsigned long pack_size_limit_cfg;
 extern int max_allowed_tree_depth;
+extern int max_symlink_depth;
 
 /*
  * Accessors for the core.sharedrepository config which lazy-load the value
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index e12b2219721..fd7ab1d1eff 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -878,6 +878,9 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' '
 test_expect_success 'prep for symlink tests' '
 	echo_without_newline "$hello_content" >morx &&
 	test_ln_s_add morx same-dir-link &&
+	test_ln_s_add same-dir-link link-to-symlink-1 &&
+	test_ln_s_add link-to-symlink-1 link-to-symlink-2 &&
+	test_ln_s_add link-to-symlink-2 link-to-symlink-3 &&
 	test_ln_s_add dir link-to-dir &&
 	test_ln_s_add ../fleem out-of-repo-link &&
 	test_ln_s_add .. out-of-repo-link-dir &&
@@ -1096,6 +1099,20 @@ test_expect_success 'git cat-file --batch --follow-symlink returns correct sha a
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch --follow-symlink stop resolving symlinks' '
+	printf "loop 22\nHEAD:link-to-symlink-3\n">expect &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=1 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=2 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=3 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	oid=$(git rev-parse HEAD:morx) &&
+	printf "${oid} blob 11\nHello World\n" >expect &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=4 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'cat-file --batch-all-objects shows all objects' '
 	# make new repos so we know the full set of objects; we will
 	# also make sure that there are some packed and some loose
diff --git a/tree-walk.c b/tree-walk.c
index 6565d9ad993..3ec2302309e 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -664,7 +664,12 @@ enum get_oid_result get_tree_entry_follow_symlinks(struct repository *r,
 	struct object_id current_tree_oid;
 	struct strbuf namebuf = STRBUF_INIT;
 	struct tree_desc t;
-	int follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
+	int follows_remaining =
+		max_symlink_depth > -1 &&
+				max_symlink_depth <=
+					GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS ?
+			max_symlink_depth :
+			GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
 
 	init_tree_desc(&t, NULL, NULL, 0UL);
 	strbuf_addstr(&namebuf, name);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] cat-file: configurable "best effort mode" for symlink resolution
  2024-06-17  9:03 [PATCH 0/2] Symlink resolutions: limits and return modes Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17  9:03 ` [PATCH 1/2] cat-file: configurable number of symlink resolutions Miguel Ángel Pastor Olivar via GitGitGadget
@ 2024-06-17  9:03 ` Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17 19:33   ` Junio C Hamano
  2024-06-17 19:33 ` [PATCH 0/2] Symlink resolutions: limits and return modes Junio C Hamano
  2 siblings, 1 reply; 6+ messages in thread
From: Miguel Ángel Pastor Olivar via GitGitGadget @ 2024-06-17  9:03 UTC (permalink / raw)
  To: git; +Cc: Miguel Ángel Pastor Olivar, Miguel Ángel Pastor Olivar

From: =?UTF-8?q?Miguel=20=C3=81ngel=20Pastor=20Olivar?= <migue@github.com>

This patch introduces a new "best effort mode" where the object found at
resolution step N is returned. If we've reached the end of the chain, the
returned object will be the file at the end of the chain, however, if, after
n resolutions we haven't reached the end of the chain, the returned object
will represent a symlink

The goal is to extend the symlink resolution process and provide the ability
to return the object found at the designated depth instead of returning an
error.

The current code already provides a limit to the maximum number of
resolutions that can be performed and something similar to this is returned
back to the caller:

loop SP <size> LF <object> LF

With the new config setting we are looking to return the actual information
of the object where the resolution stopped. Something similar to:

<oid> blob <size>\ndata\n

Signed-off-by: Miguel Ángel Pastor Olivar <migue@github.com>
---
 Documentation/config/core.txt | 16 +++++++++++++++-
 config.c                      | 13 +++++++++++++
 environment.c                 |  2 ++
 environment.h                 |  7 +++++++
 t/t1006-cat-file.sh           | 29 +++++++++++++++++++++++++++++
 tree-walk.c                   | 11 +++++++++++
 6 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index ca2d1eede52..706f316c89e 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -761,4 +761,18 @@ core.maxTreeDepth::
 core.maxSymlinkDepth::
 	The maximum number of symlinks Git is willing to resolve while
 	looking for a tree entry.
-	The default is GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS.
\ No newline at end of file
+	The default is GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS.
+
+core.symlinkResolutionMode::
+	The result returned by the symlink resolution process when
+	core.maxSymlinkDepth is reached. When set to "error"
+	`
+	loop SP <size> LF
+	<object> LF
+	` is returned.
+	If `best-effort` is set, the resolution process will return
+	something like:
+	`
+	<oid> blob <size> 120000\nname\n
+	`
+	The default is "error".
\ No newline at end of file
diff --git a/config.c b/config.c
index d69e9a3ae6b..fa753565e68 100644
--- a/config.c
+++ b/config.c
@@ -1687,6 +1687,19 @@ static int git_default_core_config(const char *var, const char *value,
 		return 0;
 	}
 
+	if (!strcmp(var, "core.symlinkresolutionmode")) {
+		if (!value)
+			symlink_resolution_mode = SYMLINK_RESOLUTION_MODE_ERROR;
+		if (!strcmp(value, "error"))
+			symlink_resolution_mode = SYMLINK_RESOLUTION_MODE_ERROR;
+		else if (!strcmp(value, "best-effort"))
+			symlink_resolution_mode =
+				SYMLINK_RESOLUTION_MODE_BEST_EFFORT;
+		else
+			warning(_("ignoring unknown core.symlinkresolutionmode value '%s'"),
+				value);
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return platform_core_config(var, value, ctx, cb);
 }
diff --git a/environment.c b/environment.c
index 6d7a5001eb1..a497331f2bc 100644
--- a/environment.c
+++ b/environment.c
@@ -96,6 +96,8 @@ int max_allowed_tree_depth =
 	2048;
 #endif
 int max_symlink_depth = -1;
+enum symlink_resolution_mode symlink_resolution_mode =
+	SYMLINK_RESOLUTION_MODE_ERROR;
 
 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
diff --git a/environment.h b/environment.h
index ea39c2887b1..5a6eebb061b 100644
--- a/environment.h
+++ b/environment.h
@@ -143,6 +143,13 @@ extern unsigned long pack_size_limit_cfg;
 extern int max_allowed_tree_depth;
 extern int max_symlink_depth;
 
+enum symlink_resolution_mode {
+	SYMLINK_RESOLUTION_MODE_ERROR = 0,
+	SYMLINK_RESOLUTION_MODE_BEST_EFFORT
+};
+
+extern enum symlink_resolution_mode symlink_resolution_mode;
+
 /*
  * Accessors for the core.sharedrepository config which lazy-load the value
  * from the config (if not already set). The "reset" function can be
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index fd7ab1d1eff..c1d807a0d7f 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -1113,6 +1113,35 @@ test_expect_success 'git cat-file --batch --follow-symlink stop resolving symlin
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch --follow-symlink stop resolving symlink at designated depth with error mode config' '
+	printf "loop 22\nHEAD:link-to-symlink-3\n">expect &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=1 -c core.symlinkresolutionmode=error cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=2 -c core.symlinkresolutionmode=error cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=3 -c core.symlinkresolutionmode=error cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	oid=$(git rev-parse HEAD:morx) &&
+	printf "${oid} blob 11\nHello World\n" >expect &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=4 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'git cat-file --batch --follow-symlink return object info at designated depth' '
+	oid=$(git rev-parse HEAD:link-to-symlink-1) &&
+	printf "${oid} blob 13\nsame-dir-link\n" >expect &&
+	printf 'HEAD:link-to-symlink-1' | git -c core.maxsymlinkdepth=1  -c core.symlinkresolutionmode=best-effort cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks >actual &&
+	test_cmp expect actual &&
+	oid=$(git rev-parse HEAD:same-dir-link) &&
+	printf "${oid} blob 4\nmorx\n" > expect &&
+	printf 'HEAD:link-to-symlink-1' | git -c core.maxsymlinkdepth=2  -c core.symlinkresolutionmode=best-effort cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks >actual &&
+	test_cmp expect actual &&
+	oid=$(git rev-parse HEAD:morx) &&
+	printf "${oid} blob 11\nHello World\n" > expect &&
+	printf 'HEAD:link-to-symlink-1' | git -c core.maxsymlinkdepth=3  -c core.symlinkresolutionmode=best-effort cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'cat-file --batch-all-objects shows all objects' '
 	# make new repos so we know the full set of objects; we will
 	# also make sure that there are some packed and some loose
diff --git a/tree-walk.c b/tree-walk.c
index 3ec2302309e..ee861fd6351 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -821,6 +821,17 @@ enum get_oid_result get_tree_entry_follow_symlinks(struct repository *r,
 			contents_start = contents;
 
 			parent = &parents[parents_nr - 1];
+
+			if (follows_remaining == 0 &&
+			    symlink_resolution_mode ==
+				    SYMLINK_RESOLUTION_MODE_BEST_EFFORT) {
+				strbuf_addstr(result_path, contents);
+				oidcpy(result, &current_tree_oid);
+				free(contents);
+				retval = FOUND;
+				goto done;
+			}
+
 			init_tree_desc(&t, &parent->oid, parent->tree, parent->size);
 			strbuf_splice(&namebuf, 0, len,
 				      contents_start, link_len);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] Symlink resolutions: limits and return modes
  2024-06-17  9:03 [PATCH 0/2] Symlink resolutions: limits and return modes Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17  9:03 ` [PATCH 1/2] cat-file: configurable number of symlink resolutions Miguel Ángel Pastor Olivar via GitGitGadget
  2024-06-17  9:03 ` [PATCH 2/2] cat-file: configurable "best effort mode" for symlink resolution Miguel Ángel Pastor Olivar via GitGitGadget
@ 2024-06-17 19:33 ` Junio C Hamano
  2 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2024-06-17 19:33 UTC (permalink / raw)
  To: Miguel Ángel Pastor Olivar via GitGitGadget
  Cc: git, Miguel Ángel Pastor Olivar

"Miguel Ángel Pastor Olivar via GitGitGadget"
<gitgitgadget@gmail.com> writes:

> The current code already provides a limit to the maximum number of
> resolutions that can be performed, and something similar to this is returned
> to the caller:
>
> loop SP <size> LF
> <object> LF
>
>
> With these patches, we are looking to return the actual information of the
> object where the resolution stopped. Something similar to:
>
> <oid> blob <size>\nndata\n

Just a random and idle thought, but is it all that interesting to
learn only about the object at the horizon?

If recursive resolutions are limited to say 3 levels, I wonder if it
is beneficial to give full record from each iteration without losing
information, e.g., saying "A points at B which in turn points at C,
and I stopped there but C is still not the final thing", instead of
saying "I followed links and C was the last one I saw after I
repeated for the maximum number of times the configuration allows me
to".

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] cat-file: configurable number of symlink resolutions
  2024-06-17  9:03 ` [PATCH 1/2] cat-file: configurable number of symlink resolutions Miguel Ángel Pastor Olivar via GitGitGadget
@ 2024-06-17 19:33   ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2024-06-17 19:33 UTC (permalink / raw)
  To: Miguel Ángel Pastor Olivar via GitGitGadget
  Cc: git, Miguel Ángel Pastor Olivar,
	Miguel Ángel Pastor Olivar

"Miguel Ángel Pastor Olivar via GitGitGadget"
<gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index 93d65e1dfd2..ca2d1eede52 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -757,3 +757,8 @@ core.maxTreeDepth::
>  	tree (e.g., "a/b/cde/f" has a depth of 4). This is a fail-safe
>  	to allow Git to abort cleanly, and should not generally need to
>  	be adjusted. The default is 4096.
> +
> +core.maxSymlinkDepth::
> +	The maximum number of symlinks Git is willing to resolve while
> +	looking for a tree entry.
> +	The default is GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS.
> \ No newline at end of file

Style: please do not end our text files with an incomplete line.

Regarding the patch contents, this is an end-user facing document.
How would they learn what the actual value is?

Is there a "valid range" the users are allowed to set this value to?
If so, what is the range?  What do users get when they set it outside
the allowed range?  Do they get warned?  Do they get die()?  Is the
value silently ignored?

If there is no upper limit for the "valid range", how does a user
set it to "infinity", and what's the downside of doing so?  What
happens when the user sets it to 0, or a negative value, if there is
no lower limit for the "valid range"?  The questions in this
paragraph your updated documentation text do not have to answer if
your "valid range" does have both upper and lower limit, but the
documentation must answer questions in the previous paragraph.

> diff --git a/config.c b/config.c
> index abce05b7744..d69e9a3ae6b 100644
> --- a/config.c
> +++ b/config.c
> @@ -1682,6 +1682,11 @@ static int git_default_core_config(const char *var, const char *value,
>  		return 0;
>  	}
>  
> +	if (!strcmp(var, "core.maxsymlinkdepth")) {
> +		max_symlink_depth = git_config_int(var, value, ctx->kvi);
> +		return 0;
> +	}
> +
>  	/* Add other config variables here and to Documentation/config.txt. */
>  	return platform_core_config(var, value, ctx, cb);
>  }
> diff --git a/environment.c b/environment.c
> index 701d5151354..6d7a5001eb1 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -95,6 +95,7 @@ int max_allowed_tree_depth =
>  #else
>  	2048;
>  #endif
> +int max_symlink_depth = -1;

Why set it to -1 here, instead of initializing it to the
GET_TREE_ENTRY_FOLLOW_SYMLINKS?  By introducing a configuration
variable (which by the way I am not convinced is necessarily a good
idea to begin with), you are surfacing that built-in default value
as a more prominent thing, not hidden away in a little corner of
tree-walk.c implementation detail.  If you do define a "valid range
of values", the code that parses core.maxsymlinkdepth in config.c
may want to learn what the value of GET_TREE_ENTRY_FOLLOW_SYMLINKS
is, which means the symbol may need to be visible in some common
header file anyway.

By the way, this is not a new problem this patch introduces, as the
default GET_TREE_ENTRY_FOLLOW_SYMLINKS came from 275721c2
(tree-walk: learn get_tree_entry_follow_symlinks, 2015-05-20), but I
wonder if the default number should somehow be aligned with the
other upper limit, SYMREF_MAXDEPTH for a symbolic ref pointing at
another symbolic ref pointing at yet another ...

> +test_expect_success 'git cat-file --batch --follow-symlink stop resolving symlinks' '
> +	printf "loop 22\nHEAD:link-to-symlink-3\n">expect &&
> +	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=1 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&

Style: a redirection operator needs a single SP before it and no SP
between it and its target, i.e.

	printf "loop 22..." >expect &&
	printf "HEAD:link ..." |
        git ... cat-file ... >actual &&

Also fold overly long line after "|" pipeline.

> diff --git a/tree-walk.c b/tree-walk.c
> index 6565d9ad993..3ec2302309e 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -664,7 +664,12 @@ enum get_oid_result get_tree_entry_follow_symlinks(struct repository *r,
>  	struct object_id current_tree_oid;
>  	struct strbuf namebuf = STRBUF_INIT;
>  	struct tree_desc t;
> -	int follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
> +	int follows_remaining =
> +		max_symlink_depth > -1 &&
> +				max_symlink_depth <=
> +					GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS ?
> +			max_symlink_depth :
> +			GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;

Strange indentation.

If you range-limit at the place the configuration was parsed, you do
not have to do any of this here, but if you insist hiding
GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS from others (yet still use
it in the end-user facing documentation???), then

	int follows_remaining =
		(-1 < max_symlink_depth &&
		 max_symlink_depth <= GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS)
		? max_symlink_depth
		: GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;

or perhaps a lot easier to read form, i.e.

	int follows_remaining = max_symlink_depth;

        if (follows_remaining < -1 ||
            GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS < follows_remaining)
		follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;

>  	init_tree_desc(&t, NULL, NULL, 0UL);
>  	strbuf_addstr(&namebuf, name);


Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] cat-file: configurable "best effort mode" for symlink resolution
  2024-06-17  9:03 ` [PATCH 2/2] cat-file: configurable "best effort mode" for symlink resolution Miguel Ángel Pastor Olivar via GitGitGadget
@ 2024-06-17 19:33   ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2024-06-17 19:33 UTC (permalink / raw)
  To: Miguel Ángel Pastor Olivar via GitGitGadget
  Cc: git, Miguel Ángel Pastor Olivar,
	Miguel Ángel Pastor Olivar

"Miguel Ángel Pastor Olivar via GitGitGadget"
<gitgitgadget@gmail.com> writes:

> From: =?UTF-8?q?Miguel=20=C3=81ngel=20Pastor=20Olivar?= <migue@github.com>
>
> This patch introduces a new "best effort mode" where the object found at
> resolution step N is returned. If we've reached the end of the chain, the
> returned object will be the file at the end of the chain, however, if, after
> n resolutions we haven't reached the end of the chain, the returned object
> will represent a symlink
>
> The goal is to extend the symlink resolution process and provide the ability
> to return the object found at the designated depth instead of returning an
> error.
>
> The current code already provides a limit to the maximum number of
> resolutions that can be performed and something similar to this is returned
> back to the caller:
>
> loop SP <size> LF <object> LF
>
> With the new config setting we are looking to return the actual information
> of the object where the resolution stopped. Something similar to:
>
> <oid> blob <size>\ndata\n

I do not think this should be a configuration variable at all.
Either a command line option, or even better yet would be an
in-stream instruction ("flip into the 'tell me the last symlink
you saw before you gave up' mode"), is understandable though, given
that this is strictly for the "batch" mode.

For that matter, it is dubious that the previous one that added a
configuration variable to lower the symlink recursion limit is a
good idea.  It does not affect anything but "cat-file --batch" and
an in-stream instruction, e.g. "in this session, do not resolve more
than 3 levels", sounds like a much better fit to what this wants to
do.  That way, it will be a lot better isolated from unrelated code
paths.  It might even make sense not to introduce the new
max_symlink_depth global variable, but pass it through as a new
member in "struct object_context" given to get_oid_with_context(),
which in turn is passed as a new parameter to
get_tree_entry_follow_symlinks() function.

So, I am supportive to solving the problem this series attempts to
solve, but I am not on board with the design this series took.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-06-17 19:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-17  9:03 [PATCH 0/2] Symlink resolutions: limits and return modes Miguel Ángel Pastor Olivar via GitGitGadget
2024-06-17  9:03 ` [PATCH 1/2] cat-file: configurable number of symlink resolutions Miguel Ángel Pastor Olivar via GitGitGadget
2024-06-17 19:33   ` Junio C Hamano
2024-06-17  9:03 ` [PATCH 2/2] cat-file: configurable "best effort mode" for symlink resolution Miguel Ángel Pastor Olivar via GitGitGadget
2024-06-17 19:33   ` Junio C Hamano
2024-06-17 19:33 ` [PATCH 0/2] Symlink resolutions: limits and return modes Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).