public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/16] Auto-configure advertised remotes via URL whitelist
@ 2026-03-23  8:05 Christian Couder
  2026-03-23  8:05 ` [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
                   ` (16 more replies)
  0 siblings, 17 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder

Currently, the "promisor-remote" protocol capability allows a server
to advertise promisor remotes (and their tokens/filters), but the
client's `promisor.acceptFromServer` mechanism requires these remotes
to already exist in the config.

This is a significant burden for users and administrators who have to
pre-configure remotes.

This patch series improves on this by introducing a new
`promisor.acceptFromServerUrl` config option, which provides an
additive, URL-based security whitelist.

Multiple `promisor.acceptFromServerUrl` config options can be provided
in different config files. Each one should contain a URL glob pattern
which can optionally be prefixed with a remote name in the
"[<name>=]<pattern>" format.

With this new config option:

 - The server can update fields (like tokens) for known remotes,
   provided their URL matches the whitelist, even if
   `acceptFromServer` is set to `None`.

 - Unknown remotes advertised by the server can be automatically
   configured on the client if their URL matches the whitelist.

 - If there is no `<name>` prefix before the glob pattern matched, the
   auto-configured remote is named using the
   "promisor-auto-<sanitized-url>" format. So the same auto-configured
   remote config entry will be reused for the same URL.

 - If a `<name>` prefix is provided, it will be used for the
   auto-configured remote config entry.

 - If the chosen name (auto-generated or prefixed) already exists but
   points to a different URL, overwriting the existing config is
   prevented by appending a numeric suffix (e.g., -1, -2) to the name
   and auto-configuring using that name.

 - The server's originally advertised name is always saved in the
   `remote.<name>.advertisedAs` config variable of the auto-configured
   remote for tracing and debugging.

 - To honor the server's recommendation, promisor_remote_get_direct()
   is updated to try accepted remotes first before falling back to
   other configured promisor remotes. This ensures auto-configured
   remotes are preferred over other remotes especially the
   partial-clone origin.

Security considerations:

 - Advertised URLs are routed through url_normalize() before matching
   against the user's glob patterns to prevent percent-encoding, case
   variation, or path-traversal (../) bypasses.

 - Auto-generated remote names are sanitized (non-alphanumeric
   characters are replaced with '-' and prefixed with
   'promisor-auto-'). This guarantees safe config section names and
   prevents a server from maliciously overwriting standard remotes
   (like origin).

 - The documentation explains in detail how to use secure glob
   patterns in `promisor.acceptFromServerUrl`.

High level description of the patches
=====================================

 - Patch 1/16 ("promisor-remote: try accepted remotes before others in
   get_direct()"):

   Fixes promisor_remote_get_direct() to prioritize accepted
   remotes. This could be a separate fix, but is needed towards the
   end of the series.

 - Patches 2-3/16 ("urlmatch:*"):

   Exposes and adapts helpers in the urlmatch API.

 - Patches 4-11/16 ("promisor-remote:*"):

   Big refactoring of filter_promisor_remote() and
   should_accept_remote(). This keeps `struct promisor_info` instances
   alive longer to anticipate possible state-desync bugs, decouples
   the server's advertised name from the local config name, and
   sanitizes control flow without changing the existing behavior.

 - Patch 12/16 ("t5710:*"):

   Cleans up how "file://" URIs are managed in the test script to
   prepare for URI normalization later in the series and avoid issues
   on Windows.

 - Patches 13-15/16 ("promisor-remote:*"):

   The core feature. Introduces the parsing machinery, adds the
   additive whitelist for known remotes (with url_normalize()
   security), and finally implements the auto-creation and collision
   resolution for unknown remotes.

 - Patch 16/16 ("doc: promisor: improve acceptFromServer entry"):

   Cleans up and modernizes the existing `promisor.acceptFromServer`
   documentation.

CI tests
========

They all pass, see:

https://github.com/chriscool/git/actions/runs/23350745268


Christian Couder (16):
  promisor-remote: try accepted remotes before others in get_direct()
  urlmatch: change 'allow_globs' arg to bool
  urlmatch: add url_is_valid_pattern() helper
  promisor-remote: clarify that a remote is ignored
  promisor-remote: refactor has_control_char()
  promisor-remote: refactor accept_from_server()
  promisor-remote: keep accepted promisor_info structs alive
  promisor-remote: remove the 'accepted' strvec
  promisor-remote: add 'local_name' to 'struct promisor_info'
  promisor-remote: pass config entry to all_fields_match() directly
  promisor-remote: refactor should_accept_remote() control flow
  t5710: use proper file:// URIs for absolute paths
  promisor-remote: introduce promisor.acceptFromServerUrl
  promisor-remote: trust known remotes matching acceptFromServerUrl
  promisor-remote: auto-configure unknown remotes
  doc: promisor: improve acceptFromServer entry

 Documentation/config/promisor.adoc    | 118 +++++-
 Documentation/config/remote.adoc      |   9 +
 Documentation/gitprotocol-v2.adoc     |   9 +-
 promisor-remote.c                     | 532 +++++++++++++++++++++-----
 t/t5710-promisor-remote-capability.sh | 250 ++++++++++--
 urlmatch.c                            |  18 +-
 urlmatch.h                            |  11 +
 7 files changed, 802 insertions(+), 145 deletions(-)

-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct()
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:20   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 02/16] urlmatch: change 'allow_globs' arg to bool Christian Couder
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

When a server advertises promisor remotes and the client accepts some
of them, those remotes carry the server's intent: 'fetch missing
objects preferably from here', and the client agrees with that for the
remotes it accepts.

However promisor_remote_get_direct() actually iterates over all
promisor remotes in list order, which is the order they appear in the
config files (except perhaps for the one appearing in the
`extensions.partialClone` config variable which is tried last).

This means an existing, but not accepted, promisor remote, could be
tried before the accepted ones, which does not reflect the intent of
the agreement between client and server.

If the client doesn't care about what the server suggests, it should
accept nothing and rely on its remotes as they are already configured.

To better reflect the agreement between client and server, let's make
promisor_remote_get_direct() try the accepted promisor remotes before
the non-accepted ones.

Concretely, let's extract a try_promisor_remotes() helper and call it
twice from promisor_remote_get_direct():

- first with an `accepted_only=true` argument to try only the accepted
  remotes,
- then with `accepted_only=false` to fall back to any remaining remote.

Ensuring that accepted remotes are preferred will be even more
important if in the future a mechanism is developed to allow the
client to auto-configure remotes that the server advertises. This will
in particular avoid fetching from the server (which is already
configured as a promisor remote) before trying the auto-configured
remotes, as these new remotes would likely appear at the end of the
config file, and as the server might not appear in the
`extensions.partialClone` config variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 46 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 33 insertions(+), 13 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 96fa215b06..3f8aeee787 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -268,11 +268,37 @@ static int remove_fetched_oids(struct repository *repo,
 	return remaining_nr;
 }
 
+static int try_promisor_remotes(struct repository *repo,
+				struct object_id **remaining_oids,
+				int *remaining_nr, int *to_free,
+				bool accepted_only)
+{
+	struct promisor_remote *r = repo->promisor_remote_config->promisors;
+
+	for (; r; r = r->next) {
+		if (accepted_only && !r->accepted)
+			continue;
+		if (!accepted_only && r->accepted)
+			continue;
+		if (fetch_objects(repo, r->name, *remaining_oids, *remaining_nr) < 0) {
+			if (*remaining_nr == 1)
+				continue;
+			*remaining_nr = remove_fetched_oids(repo, remaining_oids,
+							    *remaining_nr, *to_free);
+			if (*remaining_nr) {
+				*to_free = 1;
+				continue;
+			}
+		}
+		return 1; /* all fetched */
+	}
+	return 0;
+}
+
 void promisor_remote_get_direct(struct repository *repo,
 				const struct object_id *oids,
 				int oid_nr)
 {
-	struct promisor_remote *r;
 	struct object_id *remaining_oids = (struct object_id *)oids;
 	int remaining_nr = oid_nr;
 	int to_free = 0;
@@ -283,19 +309,13 @@ void promisor_remote_get_direct(struct repository *repo,
 
 	promisor_remote_init(repo);
 
-	for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
-		if (fetch_objects(repo, r->name, remaining_oids, remaining_nr) < 0) {
-			if (remaining_nr == 1)
-				continue;
-			remaining_nr = remove_fetched_oids(repo, &remaining_oids,
-							 remaining_nr, to_free);
-			if (remaining_nr) {
-				to_free = 1;
-				continue;
-			}
-		}
+	/* Try accepted remotes first (those the server told us to use) */
+	if (try_promisor_remotes(repo, &remaining_oids, &remaining_nr,
+				 &to_free, true))
+		goto all_fetched;
+	if (try_promisor_remotes(repo, &remaining_oids, &remaining_nr,
+				 &to_free, false))
 		goto all_fetched;
-	}
 
 	for (i = 0; i < remaining_nr; i++) {
 		if (is_promisor_object(repo, &remaining_oids[i]))
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 02/16] urlmatch: change 'allow_globs' arg to bool
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
  2026-03-23  8:05 ` [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-23  8:05 ` [PATCH 03/16] urlmatch: add url_is_valid_pattern() helper Christian Couder
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

The last argument of url_normalize_1() is `char allow_globs` but it is
used as a boolean, not as a char.

Let's convert it to a `bool`, and while at it convert the two calls to
url_normalize_1() so they pass 'true' or 'false' instead of '1' or '0'.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 urlmatch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index eea8300489..989bc7eb8b 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -111,7 +111,7 @@ static int match_host(const struct url_info *url_info,
 	return (!url_len && !pat_len);
 }
 
-static char *url_normalize_1(const char *url, struct url_info *out_info, char allow_globs)
+static char *url_normalize_1(const char *url, struct url_info *out_info, bool allow_globs)
 {
 	/*
 	 * Normalize NUL-terminated url using the following rules:
@@ -437,7 +437,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 
 char *url_normalize(const char *url, struct url_info *out_info)
 {
-	return url_normalize_1(url, out_info, 0);
+	return url_normalize_1(url, out_info, false);
 }
 
 static size_t url_match_prefix(const char *url,
@@ -577,7 +577,7 @@ int urlmatch_config_entry(const char *var, const char *value,
 		struct url_info norm_info;
 
 		config_url = xmemdupz(key, dot - key);
-		norm_url = url_normalize_1(config_url, &norm_info, 1);
+		norm_url = url_normalize_1(config_url, &norm_info, true);
 		if (norm_url)
 			retval = match_urls(url, &norm_info, &matched);
 		else if (collect->fallback_match_fn)
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 03/16] urlmatch: add url_is_valid_pattern() helper
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
  2026-03-23  8:05 ` [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
  2026-03-23  8:05 ` [PATCH 02/16] urlmatch: change 'allow_globs' arg to bool Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:20   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 04/16] promisor-remote: clarify that a remote is ignored Christian Couder
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In a following commit, we will need to check if a URL that might have
glob patterns looks valid, so let's export a dedicated helper function
for that purpose.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 urlmatch.c | 12 ++++++++++++
 urlmatch.h | 11 +++++++++++
 2 files changed, 23 insertions(+)

diff --git a/urlmatch.c b/urlmatch.c
index 989bc7eb8b..a8cb6c3bee 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -440,6 +440,18 @@ char *url_normalize(const char *url, struct url_info *out_info)
 	return url_normalize_1(url, out_info, false);
 }
 
+bool url_is_valid_pattern(const char *url)
+{
+	char *normalized = url_normalize_1(url, NULL, true);
+
+	if (normalized) {
+		free(normalized);
+		return true;
+	}
+
+	return false;
+}
+
 static size_t url_match_prefix(const char *url,
 			       const char *url_prefix,
 			       size_t url_prefix_len)
diff --git a/urlmatch.h b/urlmatch.h
index 5ba85cea13..4e01422a02 100644
--- a/urlmatch.h
+++ b/urlmatch.h
@@ -36,6 +36,17 @@ struct url_info {
 
 char *url_normalize(const char *, struct url_info *);
 
+/*
+ * Return 'true' if the string looks like a valid URL or a valid URL pattern
+ * (allowing '*' globs), 'false' otherwise.
+ *
+ * This is NOT a URL validation function.  Full URL validation is NOT
+ * performed.  Some invalid host names are passed through this function
+ * undetected.  However, most all other problems that make a URL invalid
+ * will be detected (including a missing host for non file: URLs).
+ */
+bool url_is_valid_pattern(const char *url);
+
 struct urlmatch_item {
 	size_t hostmatch_len;
 	size_t pathmatch_len;
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 04/16] promisor-remote: clarify that a remote is ignored
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (2 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 03/16] urlmatch: add url_is_valid_pattern() helper Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:20   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 05/16] promisor-remote: refactor has_control_char() Christian Couder
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In should_accept_remote() when a remote is ignored, we might tell users
why it is ignored in a warning, but we don't tell them that the remote
is actually ignored.

Let's clarify that, so users have a better idea of what's actually
happening.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 3f8aeee787..f5c4d41155 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -660,15 +660,16 @@ static int should_accept_remote(enum accept_promisor accept,
 		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
 
 	if (!remote_url || !*remote_url) {
-		warning(_("no or empty URL advertised for remote '%s'"), remote_name);
+		warning(_("no or empty URL advertised for remote '%s', "
+			  "ignoring this remote"), remote_name);
 		return 0;
 	}
 
 	if (!strcmp(p->url, remote_url))
 		return all_fields_match(advertised, config_info, 0);
 
-	warning(_("known remote named '%s' but with URL '%s' instead of '%s'"),
-		remote_name, p->url, remote_url);
+	warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
+		  "ignoring this remote"), remote_name, p->url, remote_url);
 
 	return 0;
 }
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 05/16] promisor-remote: refactor has_control_char()
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (3 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 04/16] promisor-remote: clarify that a remote is ignored Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-23  8:05 ` [PATCH 06/16] promisor-remote: refactor accept_from_server() Christian Couder
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In a following commit we are going to check if some strings contain
control character, so let's refactor the logic to do that in a new
has_control_char() helper function.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index f5c4d41155..eda38223b9 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -632,6 +632,14 @@ static int all_fields_match(struct promisor_info *advertised,
 	return 1;
 }
 
+static bool has_control_char(const char *s)
+{
+	for (const char *c = s; *c; c++)
+		if (iscntrl(*c))
+			return true;
+	return false;
+}
+
 static int should_accept_remote(enum accept_promisor accept,
 				struct promisor_info *advertised,
 				struct string_list *config_info)
@@ -762,18 +770,14 @@ static bool valid_filter(const char *filter, const char *remote_name)
 	return !res;
 }
 
-/* Check that a token doesn't contain any control character */
 static bool valid_token(const char *token, const char *remote_name)
 {
-	const char *c = token;
-
-	for (; *c; c++)
-		if (iscntrl(*c)) {
-			warning(_("invalid token '%s' for remote '%s' "
-				  "will not be stored"),
-				token, remote_name);
-			return false;
-		}
+	if (has_control_char(token)) {
+		warning(_("invalid token '%s' for remote '%s' "
+			  "will not be stored"),
+			token, remote_name);
+		return false;
+	}
 
 	return true;
 }
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 06/16] promisor-remote: refactor accept_from_server()
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (4 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 05/16] promisor-remote: refactor has_control_char() Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-23  8:05 ` [PATCH 07/16] promisor-remote: keep accepted promisor_info structs alive Christian Couder
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In a following commit, we are going to add more logic to
filter_promisor_remote() which is already doing a lot of things.

Let's alleviate that by moving the logic that checks and validates the
value of the `promisor.acceptFromServer` config variable into its own
accept_from_server() helper function.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index eda38223b9..3116d14d14 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -852,20 +852,12 @@ static bool promisor_store_advertised_fields(struct promisor_info *advertised,
 	return reload_config;
 }
 
-static void filter_promisor_remote(struct repository *repo,
-				   struct strvec *accepted,
-				   const char *info)
+static enum accept_promisor accept_from_server(struct repository *repo)
 {
 	const char *accept_str;
 	enum accept_promisor accept = ACCEPT_NONE;
-	struct string_list config_info = STRING_LIST_INIT_NODUP;
-	struct string_list remote_info = STRING_LIST_INIT_DUP;
-	struct store_info *store_info = NULL;
-	struct string_list_item *item;
-	bool reload_config = false;
-	struct string_list accepted_filters = STRING_LIST_INIT_DUP;
 
-	if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
+	if (!repo_config_get_string_tmp(repo, "promisor.acceptfromserver", &accept_str)) {
 		if (!*accept_str || !strcasecmp("None", accept_str))
 			accept = ACCEPT_NONE;
 		else if (!strcasecmp("KnownUrl", accept_str))
@@ -879,6 +871,21 @@ static void filter_promisor_remote(struct repository *repo,
 				accept_str, "promisor.acceptfromserver");
 	}
 
+	return accept;
+}
+
+static void filter_promisor_remote(struct repository *repo,
+				   struct strvec *accepted,
+				   const char *info)
+{
+	struct string_list config_info = STRING_LIST_INIT_NODUP;
+	struct string_list remote_info = STRING_LIST_INIT_DUP;
+	struct store_info *store_info = NULL;
+	struct string_list_item *item;
+	bool reload_config = false;
+	struct string_list accepted_filters = STRING_LIST_INIT_DUP;
+	enum accept_promisor accept = accept_from_server(repo);
+
 	if (accept == ACCEPT_NONE)
 		return;
 
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 07/16] promisor-remote: keep accepted promisor_info structs alive
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (5 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 06/16] promisor-remote: refactor accept_from_server() Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 08/16] promisor-remote: remove the 'accepted' strvec Christian Couder
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In filter_promisor_remote(), the instances of `struct promisor_info`
for accepted remotes are dismantled into separate parallel data
structures (the 'accepted' strvec for server names, and
'accepted_filters' for filter strings) and then immediately freed.

Instead, let's keep these instances on an 'accepted_remotes' list.

This way the post-loop phase can iterate a single list to build the
protocol reply, apply advertised filters, and mark remotes as
accepted, rather than iterating three separate structures.

This refactoring also prepares for a following commit that will add a
'local_name' member to 'struct promisor_info'. Since struct instances
stay alive, downstream code will be able to simply read both names
from them rather than needing yet another parallel strvec.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 42 +++++++++++++++++-------------------------
 1 file changed, 17 insertions(+), 25 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 3116d14d14..34b4ca5806 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -880,10 +880,10 @@ static void filter_promisor_remote(struct repository *repo,
 {
 	struct string_list config_info = STRING_LIST_INIT_NODUP;
 	struct string_list remote_info = STRING_LIST_INIT_DUP;
+	struct string_list accepted_remotes = STRING_LIST_INIT_NODUP;
 	struct store_info *store_info = NULL;
 	struct string_list_item *item;
 	bool reload_config = false;
-	struct string_list accepted_filters = STRING_LIST_INIT_DUP;
 	enum accept_promisor accept = accept_from_server(repo);
 
 	if (accept == ACCEPT_NONE)
@@ -912,17 +912,10 @@ static void filter_promisor_remote(struct repository *repo,
 			if (promisor_store_advertised_fields(advertised, store_info))
 				reload_config = true;
 
-			strvec_push(accepted, advertised->name);
-
-			/* Capture advertised filters for accepted remotes */
-			if (advertised->filter) {
-				struct string_list_item *i;
-				i = string_list_append(&accepted_filters, advertised->name);
-				i->util = xstrdup(advertised->filter);
-			}
+			string_list_append(&accepted_remotes, advertised->name)->util = advertised;
+		} else {
+			promisor_info_free(advertised);
 		}
-
-		promisor_info_free(advertised);
 	}
 
 	promisor_info_list_clear(&config_info);
@@ -932,24 +925,23 @@ static void filter_promisor_remote(struct repository *repo,
 	if (reload_config)
 		repo_promisor_remote_reinit(repo);
 
-	/* Apply accepted remote filters to the stable repo state */
-	for_each_string_list_item(item, &accepted_filters) {
-		struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
-		if (r) {
-			free(r->advertised_filter);
-			r->advertised_filter = item->util;
-			item->util = NULL;
-		}
-	}
+	/* Apply accepted remotes to the stable repo state */
+	for_each_string_list_item(item, &accepted_remotes) {
+		struct promisor_info *info = item->util;
+		struct promisor_remote *r = repo_promisor_remote_find(repo, info->name);
 
-	string_list_clear(&accepted_filters, 1);
+		strvec_push(accepted, info->name);
 
-	/* Mark the remotes as accepted in the repository state */
-	for (size_t i = 0; i < accepted->nr; i++) {
-		struct promisor_remote *r = repo_promisor_remote_find(repo, accepted->v[i]);
-		if (r)
+		if (r) {
 			r->accepted = 1;
+			if (info->filter) {
+				free(r->advertised_filter);
+				r->advertised_filter = xstrdup(info->filter);
+			}
+		}
 	}
+
+	promisor_info_list_clear(&accepted_remotes);
 }
 
 void promisor_remote_reply(const char *info, char **accepted_out)
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 08/16] promisor-remote: remove the 'accepted' strvec
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (6 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 07/16] promisor-remote: keep accepted promisor_info structs alive Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 09/16] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In a previous commit, filter_promisor_remote() was refactored to keep
accepted 'struct promisor_info' instances alive instead of dismantling
them into separate parallel data structures.

Let's go one step further and replace the 'struct strvec *accepted'
argument passed to filter_promisor_remote() with a
'struct string_list *accepted_remotes' argument.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 34b4ca5806..bdfc5e7608 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -875,12 +875,11 @@ static enum accept_promisor accept_from_server(struct repository *repo)
 }
 
 static void filter_promisor_remote(struct repository *repo,
-				   struct strvec *accepted,
+				   struct string_list *accepted_remotes,
 				   const char *info)
 {
 	struct string_list config_info = STRING_LIST_INIT_NODUP;
 	struct string_list remote_info = STRING_LIST_INIT_DUP;
-	struct string_list accepted_remotes = STRING_LIST_INIT_NODUP;
 	struct store_info *store_info = NULL;
 	struct string_list_item *item;
 	bool reload_config = false;
@@ -912,7 +911,7 @@ static void filter_promisor_remote(struct repository *repo,
 			if (promisor_store_advertised_fields(advertised, store_info))
 				reload_config = true;
 
-			string_list_append(&accepted_remotes, advertised->name)->util = advertised;
+			string_list_append(accepted_remotes, advertised->name)->util = advertised;
 		} else {
 			promisor_info_free(advertised);
 		}
@@ -926,12 +925,10 @@ static void filter_promisor_remote(struct repository *repo,
 		repo_promisor_remote_reinit(repo);
 
 	/* Apply accepted remotes to the stable repo state */
-	for_each_string_list_item(item, &accepted_remotes) {
+	for_each_string_list_item(item, accepted_remotes) {
 		struct promisor_info *info = item->util;
 		struct promisor_remote *r = repo_promisor_remote_find(repo, info->name);
 
-		strvec_push(accepted, info->name);
-
 		if (r) {
 			r->accepted = 1;
 			if (info->filter) {
@@ -940,23 +937,23 @@ static void filter_promisor_remote(struct repository *repo,
 			}
 		}
 	}
-
-	promisor_info_list_clear(&accepted_remotes);
 }
 
 void promisor_remote_reply(const char *info, char **accepted_out)
 {
-	struct strvec accepted = STRVEC_INIT;
+	struct string_list accepted_remotes = STRING_LIST_INIT_NODUP;
 
-	filter_promisor_remote(the_repository, &accepted, info);
+	filter_promisor_remote(the_repository, &accepted_remotes, info);
 
 	if (accepted_out) {
-		if (accepted.nr) {
+		if (accepted_remotes.nr) {
 			struct strbuf reply = STRBUF_INIT;
-			for (size_t i = 0; i < accepted.nr; i++) {
-				if (i)
+			struct string_list_item *item;
+
+			for_each_string_list_item(item, &accepted_remotes) {
+				if (reply.len)
 					strbuf_addch(&reply, ';');
-				strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
+				strbuf_addstr_urlencode(&reply, item->string, allow_unsanitized);
 			}
 			*accepted_out = strbuf_detach(&reply, NULL);
 		} else {
@@ -964,7 +961,7 @@ void promisor_remote_reply(const char *info, char **accepted_out)
 		}
 	}
 
-	strvec_clear(&accepted);
+	promisor_info_list_clear(&accepted_remotes);
 }
 
 void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes)
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 09/16] promisor-remote: add 'local_name' to 'struct promisor_info'
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (7 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 08/16] promisor-remote: remove the 'accepted' strvec Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 10/16] promisor-remote: pass config entry to all_fields_match() directly Christian Couder
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In a following commit, we will store promisor remote information under
a remote name different than the one the server advertised.

To prepare for this change, let's add a new 'char* local_name' member
to 'struct promisor_info', and let's update the related functions.

While at it, let's also add a small promisor_info_internal_name()
helper that returns `local_name` when set, `name` otherwise, and let's
use this small helper in promisor_store_advertised_fields() and in the
post-loop of filter_promisor_remote() so that lookups against the local
repo configuration use the right name.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index bdfc5e7608..da347fa2dc 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -434,15 +434,19 @@ static struct string_list *fields_stored(void)
 
 /*
  * Struct for promisor remotes involved in the "promisor-remote"
- * protocol capability.
+ * protocol capability:
  *
- * Except for "name", each <member> in this struct and its <value>
- * should correspond (either on the client side or on the server side)
- * to a "remote.<name>.<member>" config variable set to <value> where
- * "<name>" is a promisor remote name.
+ * - "name" is the name the server advertised.
+ * - "local_name" is the name we use locally (may be auto-generated).
+ *
+ * Except for "name" and "local_name", each <member> in this struct
+ * and its <value> should correspond (either on the client side or on
+ * the server side) to a "remote.<name>.<member>" config variable set
+ * to <value> where "<name>" is a promisor remote name.
  */
 struct promisor_info {
 	const char *name;
+	const char *local_name;
 	const char *url;
 	const char *filter;
 	const char *token;
@@ -451,6 +455,7 @@ struct promisor_info {
 static void promisor_info_free(struct promisor_info *p)
 {
 	free((char *)p->name);
+	free((char *)p->local_name);
 	free((char *)p->url);
 	free((char *)p->filter);
 	free((char *)p->token);
@@ -464,6 +469,11 @@ static void promisor_info_list_clear(struct string_list *list)
 	string_list_clear(list, 0);
 }
 
+static const char *promisor_info_internal_name(struct promisor_info *p)
+{
+	return p->local_name ? p->local_name : p->name;
+}
+
 static void set_one_field(struct promisor_info *p,
 			  const char *field, const char *value)
 {
@@ -819,7 +829,7 @@ static bool promisor_store_advertised_fields(struct promisor_info *advertised,
 {
 	struct promisor_info *p;
 	struct string_list_item *item;
-	const char *remote_name = advertised->name;
+	const char *remote_name = promisor_info_internal_name(advertised);
 	bool reload_config = false;
 
 	if (!(store_info->store_filter || store_info->store_token))
@@ -927,7 +937,8 @@ static void filter_promisor_remote(struct repository *repo,
 	/* Apply accepted remotes to the stable repo state */
 	for_each_string_list_item(item, accepted_remotes) {
 		struct promisor_info *info = item->util;
-		struct promisor_remote *r = repo_promisor_remote_find(repo, info->name);
+		const char *local = promisor_info_internal_name(info);
+		struct promisor_remote *r = repo_promisor_remote_find(repo, local);
 
 		if (r) {
 			r->accepted = 1;
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 10/16] promisor-remote: pass config entry to all_fields_match() directly
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (8 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 09/16] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 11/16] promisor-remote: refactor should_accept_remote() control flow Christian Couder
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

The `in_list == 0` path of all_fields_match() re-looks up the
remote in config_info by advertised->name, even though every
caller in should_accept_remote() has already performed this
lookup and holds the result in 'p'.

To avoid this useless work, let's replace the `int in_list`
parameter with a `struct promisor_info *config_entry` pointer:

 - When NULL (ACCEPT_ALL mode): scan the whole `config_info` list, as
   the old `in_list == 1` path did.

 - When non-NULL: match against that single config entry directly,
   avoiding the redundant string_list_lookup() call.

This removes the hidden dependency on `advertised->name` inside
all_fields_match(), which would be wrong in the following commits when
auto-configured remotes will be implemented as the local config name
may differ from the server's advertised name.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index da347fa2dc..8f2c1280c3 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -598,9 +598,18 @@ static int match_field_against_config(const char *field, const char *value,
 	return 0;
 }
 
+/*
+ * Check that the advertised fields match the local configuration.
+ *
+ * When 'config_entry' is NULL (ACCEPT_ALL mode), every checked field
+ * must match at least one remote in 'config_info'.
+ *
+ * When 'config_entry' points to a specific remote's config, the
+ * checked fields are compared against that single remote only.
+ */
 static int all_fields_match(struct promisor_info *advertised,
 			    struct string_list *config_info,
-			    int in_list)
+			    struct promisor_info *config_entry)
 {
 	struct string_list *fields = fields_checked();
 	struct string_list_item *item_checked;
@@ -609,7 +618,6 @@ static int all_fields_match(struct promisor_info *advertised,
 		int match = 0;
 		const char *field = item_checked->string;
 		const char *value = NULL;
-		struct string_list_item *item;
 
 		if (!strcasecmp(field, promisor_field_filter))
 			value = advertised->filter;
@@ -619,7 +627,11 @@ static int all_fields_match(struct promisor_info *advertised,
 		if (!value)
 			return 0;
 
-		if (in_list) {
+		if (config_entry) {
+			match = match_field_against_config(field, value,
+							   config_entry);
+		} else {
+			struct string_list_item *item;
 			for_each_string_list_item(item, config_info) {
 				struct promisor_info *p = item->util;
 				if (match_field_against_config(field, value, p)) {
@@ -627,12 +639,6 @@ static int all_fields_match(struct promisor_info *advertised,
 					break;
 				}
 			}
-		} else {
-			item = string_list_lookup(config_info, advertised->name);
-			if (item) {
-				struct promisor_info *p = item->util;
-				match = match_field_against_config(field, value, p);
-			}
 		}
 
 		if (!match)
@@ -660,7 +666,7 @@ static int should_accept_remote(enum accept_promisor accept,
 	const char *remote_url = advertised->url;
 
 	if (accept == ACCEPT_ALL)
-		return all_fields_match(advertised, config_info, 1);
+		return all_fields_match(advertised, config_info, NULL);
 
 	/* Get config info for that promisor remote */
 	item = string_list_lookup(config_info, remote_name);
@@ -672,7 +678,7 @@ static int should_accept_remote(enum accept_promisor accept,
 	p = item->util;
 
 	if (accept == ACCEPT_KNOWN_NAME)
-		return all_fields_match(advertised, config_info, 0);
+		return all_fields_match(advertised, config_info, p);
 
 	if (accept != ACCEPT_KNOWN_URL)
 		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
@@ -684,7 +690,7 @@ static int should_accept_remote(enum accept_promisor accept,
 	}
 
 	if (!strcmp(p->url, remote_url))
-		return all_fields_match(advertised, config_info, 0);
+		return all_fields_match(advertised, config_info, p);
 
 	warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
 		  "ignoring this remote"), remote_name, p->url, remote_url);
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 11/16] promisor-remote: refactor should_accept_remote() control flow
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (9 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 10/16] promisor-remote: pass config entry to all_fields_match() directly Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 12/16] t5710: use proper file:// URIs for absolute paths Christian Couder
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In following commits, we are going to add URL-based acceptance logic
into should_accept_remote().

To prepare for the upcoming changes, let's restructure the control flow
in should_accept_remote().

Concretely, let's:

 - Move the empty-URL check to the very top of the function, so that
   every acceptance mode, instead of only ACCEPT_KNOWN_URL, rejects
   remotes with a missing or blank URL early.

 - Invert the URL comparison: instead of returning on match and
   warning on mismatch, return early on mismatch and let the match
   case fall through. This opens a single exit path at the bottom of
   the function for following commits to extend.

Anyway the server shouldn't send any empty URL in the first place, so
this shouldn't change any behavior in practice.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/promisor-remote.c b/promisor-remote.c
index 8f2c1280c3..c2f0eb7223 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -665,6 +665,12 @@ static int should_accept_remote(enum accept_promisor accept,
 	const char *remote_name = advertised->name;
 	const char *remote_url = advertised->url;
 
+	if (!remote_url || !*remote_url) {
+		warning(_("no or empty URL advertised for remote '%s', "
+			  "ignoring this remote"), remote_name);
+		return 0;
+	}
+
 	if (accept == ACCEPT_ALL)
 		return all_fields_match(advertised, config_info, NULL);
 
@@ -683,19 +689,14 @@ static int should_accept_remote(enum accept_promisor accept,
 	if (accept != ACCEPT_KNOWN_URL)
 		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
 
-	if (!remote_url || !*remote_url) {
-		warning(_("no or empty URL advertised for remote '%s', "
-			  "ignoring this remote"), remote_name);
+	if (strcmp(p->url, remote_url)) {
+		warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
+			  "ignoring this remote"),
+			remote_name, p->url, remote_url);
 		return 0;
 	}
 
-	if (!strcmp(p->url, remote_url))
-		return all_fields_match(advertised, config_info, p);
-
-	warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
-		  "ignoring this remote"), remote_name, p->url, remote_url);
-
-	return 0;
+	return all_fields_match(advertised, config_info, p);
 }
 
 static int skip_field_name_prefix(const char *elem, const char *field_name, const char **value)
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 12/16] t5710: use proper file:// URIs for absolute paths
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (10 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 11/16] promisor-remote: refactor should_accept_remote() control flow Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 13/16] promisor-remote: introduce promisor.acceptFromServerUrl Christian Couder
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

In t5710, we frequently construct local file URIs using `file://$(pwd)`.
On Unix-like systems, $(pwd) returns an absolute path starting with a
slash (e.g., `/tmp/repo`), resulting in a valid 3-slash URI with an
empty host (`file:///tmp/repo`).

However, on Windows, $(pwd) returns a path starting with a drive
letter (e.g., `D:/a/repo`). This results in a 2-slash URI
(`file://D:/a/repo`). Standard URI parsers misinterpret this format,
treating `D:` as the host rather than part of the absolute path.

While this hasn't broken the existing tests (because the old
`promisor.acceptFromServer` logic relies entirely on strict `strcmp`
without normalizing the URLs), it will break future commits that pass
these URLs through `url_normalize()`.

To future-proof the tests and ensure cross-platform URI compliance,
let's introduce a $PWD_URL helper that explicitly guarantees a leading
slash for the path component, ensuring valid 3-slash `file:///` URIs on
all operating systems.

While at it, let's also introduce $ENCODED_PWD_URL to handle spaces in
directory paths (which is needed for URL glob pattern matching).

Then let's replace all instances of `file://$(pwd)` with $PWD_URL across
the test script, and let's simplify the `ENCODED_URL` constructions in
the `sendFields` and `checkFields` tests to use $ENCODED_PWD_URL
directly.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t5710-promisor-remote-capability.sh | 55 ++++++++++++++++-----------
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 357822c01a..c7a484228f 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -76,6 +76,17 @@ copy_to_lop () {
 	cp "$path" "$path2"
 }
 
+# On Windows, 'pwd' returns a path like 'D:/foo/bar'. Prepend '/' to turn
+# it into '/D:/foo/bar', which is what git expects in file:// URLs on Windows.
+# On Unix, the path already starts with '/', so this is a no-op.
+pwd_path=$(pwd)
+case "$pwd_path" in
+[a-zA-Z]:*) pwd_path="/$pwd_path" ;;
+esac
+PWD_URL="file://$pwd_path"
+# Same as PWD_URL but with spaces percent-encoded, for use in URL patterns.
+ENCODED_PWD_URL="file://$(echo "$pwd_path" | sed "s/ /%20/g")"
+
 test_expect_success "setup for testing promisor remote advertisement" '
 	# Create another bare repo called "lop" (for Large Object Promisor)
 	git init --bare lop &&
@@ -88,7 +99,7 @@ test_expect_success "setup for testing promisor remote advertisement" '
 	initialize_server 1 "$oid" &&
 
 	# Configure lop as promisor remote for server
-	git -C server remote add lop "file://$(pwd)/lop" &&
+	git -C server remote add lop "$PWD_URL/lop" &&
 	git -C server config remote.lop.promisor true &&
 
 	git -C lop config uploadpack.allowFilter true &&
@@ -104,7 +115,7 @@ test_expect_success "clone with promisor.advertise set to 'true'" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=All \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -119,7 +130,7 @@ test_expect_success "clone with promisor.advertise set to 'false'" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=All \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -137,7 +148,7 @@ test_expect_success "clone with promisor.acceptfromserver set to 'None'" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=None \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -156,8 +167,8 @@ test_expect_success "init + fetch with promisor.advertise set to 'true'" '
 	git -C client init &&
 	git -C client config remote.lop.promisor true &&
 	git -C client config remote.lop.fetch "+refs/heads/*:refs/remotes/lop/*" &&
-	git -C client config remote.lop.url "file://$(pwd)/lop" &&
-	git -C client config remote.server.url "file://$(pwd)/server" &&
+	git -C client config remote.lop.url "$PWD_URL/lop" &&
+	git -C client config remote.server.url "$PWD_URL/server" &&
 	git -C client config remote.server.fetch "+refs/heads/*:refs/remotes/server/*" &&
 	git -C client config promisor.acceptfromserver All &&
 	GIT_NO_LAZY_FETCH=0 git -C client fetch --filter="blob:limit=5k" server &&
@@ -173,7 +184,7 @@ test_expect_success "clone with promisor.acceptfromserver set to 'KnownName'" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=KnownName \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -188,7 +199,7 @@ test_expect_success "clone with 'KnownName' and different remote names" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.serverTwo.promisor=true \
 		-c remote.serverTwo.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.serverTwo.url="file://$(pwd)/lop" \
+		-c remote.serverTwo.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=KnownName \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -225,7 +236,7 @@ test_expect_success "clone with promisor.acceptfromserver set to 'KnownUrl'" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=KnownUrl \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -242,7 +253,7 @@ test_expect_success "clone with 'KnownUrl' and different remote urls" '
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/serverTwo" \
+		-c remote.lop.url="$PWD_URL/serverTwo" \
 		-c promisor.acceptfromserver=KnownUrl \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -257,7 +268,7 @@ test_expect_success "clone with 'KnownUrl' and url not configured on the server"
 	git -C server config promisor.advertise true &&
 	test_when_finished "rm -rf client" &&
 
-	test_when_finished "git -C server config set remote.lop.url \"file://$(pwd)/lop\"" &&
+	test_when_finished "git -C server config set remote.lop.url \"$PWD_URL/lop\"" &&
 	git -C server config unset remote.lop.url &&
 
 	# Clone from server to create a client
@@ -266,7 +277,7 @@ test_expect_success "clone with 'KnownUrl' and url not configured on the server"
 	# missing, so the remote name will be used instead which will fail.
 	test_must_fail env GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=KnownUrl \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -278,7 +289,7 @@ test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
 	git -C server config promisor.advertise true &&
 	test_when_finished "rm -rf client" &&
 
-	test_when_finished "git -C server config set remote.lop.url \"file://$(pwd)/lop\"" &&
+	test_when_finished "git -C server config set remote.lop.url \"$PWD_URL/lop\"" &&
 	git -C server config set remote.lop.url "" &&
 
 	# Clone from server to create a client
@@ -287,7 +298,7 @@ test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
 	# so the remote name will be used instead which will fail.
 	test_must_fail env GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=KnownUrl \
 		--no-local --filter="blob:limit=5k" server client &&
 
@@ -311,13 +322,12 @@ test_expect_success "clone with promisor.sendFields" '
 	GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
 		-c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=All \
 		--no-local --filter="blob:limit=5k" server client &&
 
 	# Check that fields are properly transmitted
-	ENCODED_URL=$(echo "file://$(pwd)/lop" | sed -e "s/ /%20/g") &&
-	PR1="name=lop,url=$ENCODED_URL,partialCloneFilter=blob:none" &&
+	PR1="name=lop,url=$ENCODED_PWD_URL/lop,partialCloneFilter=blob:none" &&
 	PR2="name=otherLop,url=https://invalid.invalid,partialCloneFilter=blob:limit=10k,token=fooBar" &&
 	test_grep "clone< promisor-remote=$PR1;$PR2" trace &&
 	test_grep "clone> promisor-remote=lop;otherLop" trace &&
@@ -342,15 +352,14 @@ test_expect_success "clone with promisor.checkFields" '
 	GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
 		-c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c remote.lop.partialCloneFilter="blob:none" \
 		-c promisor.acceptfromserver=All \
 		-c promisor.checkFields=partialcloneFilter \
 		--no-local --filter="blob:limit=5k" server client &&
 
 	# Check that fields are properly transmitted
-	ENCODED_URL=$(echo "file://$(pwd)/lop" | sed -e "s/ /%20/g") &&
-	PR1="name=lop,url=$ENCODED_URL,partialCloneFilter=blob:none" &&
+	PR1="name=lop,url=$ENCODED_PWD_URL/lop,partialCloneFilter=blob:none" &&
 	PR2="name=otherLop,url=https://invalid.invalid,partialCloneFilter=blob:limit=10k,token=fooBar" &&
 	test_grep "clone< promisor-remote=$PR1;$PR2" trace &&
 	test_grep "clone> promisor-remote=lop" trace &&
@@ -380,7 +389,7 @@ test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
 	GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
 		-c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c remote.lop.token="fooYYY" \
 		-c remote.lop.partialCloneFilter="blob:none" \
 		-c promisor.acceptfromserver=All \
@@ -432,7 +441,7 @@ test_expect_success "clone and fetch with --filter=auto" '
 
 	GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
 		-c remote.lop.promisor=true \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=All \
 		--no-local --filter=auto server client 2>err &&
 
@@ -489,7 +498,7 @@ test_expect_success "clone with promisor.advertise set to 'true' but don't delet
 	# Clone from server to create a client
 	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
 		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-		-c remote.lop.url="file://$(pwd)/lop" \
+		-c remote.lop.url="$PWD_URL/lop" \
 		-c promisor.acceptfromserver=All \
 		--no-local --filter="blob:limit=5k" server client &&
 
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 13/16] promisor-remote: introduce promisor.acceptFromServerUrl
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (11 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 12/16] t5710: use proper file:// URIs for absolute paths Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

The "promisor-remote" protocol capability allows servers to advertise
promisor remotes, but doesn't allow these remotes to be automatically
configured on the client.

Let's introduce a new `promisor.acceptFromServerUrl` config variable
which contains a glob pattern, so that advertised remotes with a URL
matching that pattern will be automatically configured.

The glob pattern can optionally be prefixed with a remote name which
will be used as the name of the new local remote.

For now though, let's only introduce the functions to read and validate
the glob patterns and the optional prefixes.

Checking if the URLs of the advertised remotes match the glob patterns
and taking the appropriate action is left for a following commit.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 promisor-remote.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/promisor-remote.c b/promisor-remote.c
index c2f0eb7223..4cb18e1a6a 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -12,6 +12,7 @@
 #include "packfile.h"
 #include "environment.h"
 #include "url.h"
+#include "urlmatch.h"
 #include "version.h"
 
 struct promisor_remote_config {
@@ -656,6 +657,76 @@ static bool has_control_char(const char *s)
 	return false;
 }
 
+struct allowed_url {
+	char *remote_name;
+	char *url_pattern;
+};
+
+static struct allowed_url *valid_accept_url(const char *url)
+{
+	char *dup, *p;
+	struct allowed_url *allowed;
+
+	if (!url)
+		return NULL;
+
+	dup = xstrdup(url);
+	p = strchr(dup, '=');
+	if (p) {
+		*p = '\0';
+		if (!valid_remote_name(dup)) {
+			warning(_("invalid remote name '%s' before '=' sign "
+				  "in '%s' from promisor.acceptFromServerUrl config"),
+				dup, url);
+			free(dup);
+			return NULL;
+		}
+		p++;
+	} else {
+		p = dup;
+	}
+
+	if (has_control_char(p) || !url_is_valid_pattern(p)) {
+		warning(_("invalid url pattern '%s' "
+			  "in '%s' from promisor.acceptFromServerUrl config"), p, url);
+		free(dup);
+		return NULL;
+	}
+
+	allowed = xmalloc(sizeof(*allowed));
+	allowed->remote_name = (p == dup) ? NULL : dup;
+	allowed->url_pattern = p;
+
+	return allowed;
+}
+
+static struct string_list *accept_from_server_url(struct repository *repo)
+{
+	static struct string_list accept_urls = STRING_LIST_INIT_DUP;
+	static int initialized;
+	const struct string_list *config_urls;
+
+	if (initialized)
+		return &accept_urls;
+
+	initialized = 1;
+
+	if (!repo_config_get_string_multi(repo, "promisor.acceptfromserverurl", &config_urls)) {
+		struct string_list_item *item;
+
+		for_each_string_list_item(item, config_urls) {
+			struct allowed_url *allowed = valid_accept_url(item->string);
+			if (allowed) {
+				struct string_list_item *new;
+				new = string_list_append(&accept_urls, item->string);
+				new->util = allowed;
+			}
+		}
+	}
+
+	return &accept_urls;
+}
+
 static int should_accept_remote(enum accept_promisor accept,
 				struct promisor_info *advertised,
 				struct string_list *config_info)
@@ -901,6 +972,8 @@ static void filter_promisor_remote(struct repository *repo,
 	struct string_list_item *item;
 	bool reload_config = false;
 	enum accept_promisor accept = accept_from_server(repo);
+	/* Pre-load and validate the acceptFromServerUrl config */
+	(void)accept_from_server_url(repo);
 
 	if (accept == ACCEPT_NONE)
 		return;
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (12 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 13/16] promisor-remote: introduce promisor.acceptFromServerUrl Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-23 18:54   ` Junio C Hamano
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 15/16] promisor-remote: auto-configure unknown remotes Christian Couder
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

A previous commit introduced the `promisor.acceptFromServerUrl` config
variable along with the machinery to parse and validate the URL glob
patterns and optional remote name prefixes it contains. However, these
URL patterns are not yet tied into the client's acceptance logic.

When a promisor remote is already configured locally, its fields (like
authentication tokens) may occasionally need to be refreshed by the
server. If `promisor.acceptFromServer` is set to the secure default
("None"), these updates are rejected, potentially causing future
fetches to fail.

To enable such targeted updates for trusted URLs, let's use the URL
patterns from `promisor.acceptFromServerUrl` as an additional URL
based whitelist.

Concretely, let's check the advertised URLs against the URL glob
patterns by introducing a new small helper function called
url_matches_accept_list(), which iterates over the glob patterns and
returns the first matching allowed_url entry (or NULL).

(Before matching, the advertised URL is passed through url_normalize()
so that case variations in the scheme/host, percent-encoding tricks,
and ".." path segments cannot bypass the whitelist.)

Let's then use this helper at the tail of should_accept_remote() so
that, when `accept == ACCEPT_NONE`, a known remote whose URL matches
the whitelist is still accepted.

To prepare for this new logic, let's also:

 - Add an 'accept_urls' parameter to should_accept_remote().

 - Replace the BUG() guard in the ACCEPT_KNOWN_URL case with an
   explicit 'if (accept == ACCEPT_KNOWN_URL) return' and a new
   BUG() guard in the ACCEPT_NONE case, so url_matches_accept_list()
   is only called in the ACCEPT_NONE case.

 - Call accept_from_server_url() from filter_promisor_remote()
   and relax its early return so that the function is entered when
   `accept_urls` has entries even if `accept == ACCEPT_NONE`.

Let's then properly document `promisor.acceptFromServerUrl` in
"promisor.adoc" as an additive security whitelist for known remotes,
including the URL normalization behavior, and let's mention it in
"gitprotocol-v2.adoc".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config/promisor.adoc    | 46 +++++++++++++++++
 Documentation/gitprotocol-v2.adoc     |  9 ++--
 promisor-remote.c                     | 50 +++++++++++++++---
 t/t5710-promisor-remote-capability.sh | 73 +++++++++++++++++++++++++++
 4 files changed, 166 insertions(+), 12 deletions(-)

diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index b0fa43b839..6f5442cd65 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -51,6 +51,52 @@ promisor.acceptFromServer::
 	to "fetch" and "clone" requests from the client. Name and URL
 	comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
 
+promisor.acceptFromServerUrl::
+	A glob pattern to specify which URLs advertised by a server
+	are considered trusted by the client. This option acts as an
+	additive security whitelist that works in conjunction with
+	`promisor.acceptFromServer`.
++
+This option can appear multiple times in config files. An advertised
+URL will be accepted if it matches _ANY_ glob pattern specified by
+this option in _ANY_ config file read by Git.
++
+Be _VERY_ careful with these glob patterns, as it can be a big
+security hole to allow any advertised remote to be auto-configured!
+To minimize security risks, follow these guidelines:
++
+1. Start with a secure protocol scheme, like `https://` or `ssh://`.
++
+2. Only allow domain names or paths where you control and trust _ALL_
+   the content. Be especially careful with shared hosting platforms
+   like `github.com` or `gitlab.com`. A broad pattern like
+   `https://gitlab.com/*` is dangerous because it trusts every
+   repository on the entire platform. Always restrict such patterns to
+   your specific organization or namespace (e.g.,
+   `https://gitlab.com/your-org/*`).
++
+3. Don't use globs (`*`) in the domain name. For example
+   `https://cdn.example.com/*` is much safer than
+   `https://*.example.com/*`, because the latter matches
+   `https://evil-hacker.net/fake.example.com/repo`.
++
+4. Make sure to have a `/` at the end of the domain name (or the end
+   of specific directories). For example `https://cdn.example.com/*`
+   is much safer than `https://cdn.example.com*`, because the latter
+   matches `https://cdn.example.com.hacker.net/repo`.
++
+Before matching, the advertised URL is normalized: the scheme and
+host are lowercased, percent-encoded characters are decoded where
+possible, and path segments like `..` are resolved.  Glob patterns
+are matched against this normalized URL as-is, so patterns should
+be written in normalized form (e.g., lowercase scheme and host).
++
+Even if `promisor.acceptFromServer` is set to `None` (the default),
+Git will still accept field updates (like tokens) for known remotes,
+provided their URLs match a pattern in
+`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
+details on the protocol.
+
 promisor.checkFields::
 	A comma or space separated list of additional remote related
 	field names. A client checks if the values of these fields
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index f985cb4c47..175358bba2 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -862,10 +862,11 @@ the server advertised, the client shouldn't advertise the
 
 On the server side, the "promisor.advertise" and "promisor.sendFields"
 configuration options can be used to control what it advertises. On
-the client side, the "promisor.acceptFromServer" configuration option
-can be used to control what it accepts, and the "promisor.storeFields"
-option, to control what it stores. See the documentation of these
-configuration options in linkgit:git-config[1] for more information.
+the client side, the "promisor.acceptFromServer" and
+"promisor.acceptFromServerUrl" configuration options can be used to
+control what it accepts, and the "promisor.storeFields" option, to
+control what it stores. See the documentation of these configuration
+options in linkgit:git-config[1] for more information.
 
 Note that in the future it would be nice if the "promisor-remote"
 protocol capability could be used by the server, when responding to
diff --git a/promisor-remote.c b/promisor-remote.c
index 4cb18e1a6a..210e6950af 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -14,6 +14,7 @@
 #include "url.h"
 #include "urlmatch.h"
 #include "version.h"
+#include "wildmatch.h"
 
 struct promisor_remote_config {
 	struct promisor_remote *promisors;
@@ -727,8 +728,31 @@ static struct string_list *accept_from_server_url(struct repository *repo)
 	return &accept_urls;
 }
 
+static struct allowed_url *url_matches_accept_list(
+		struct string_list *accept_urls, const char *url)
+{
+	struct string_list_item *item;
+	char *normalized = url_normalize(url, NULL);
+
+	if (!normalized)
+		return NULL;
+
+	for_each_string_list_item(item, accept_urls) {
+		struct allowed_url *allowed = item->util;
+
+		if (!wildmatch(allowed->url_pattern, normalized, 0)) {
+			free(normalized);
+			return allowed;
+		}
+	}
+
+	free(normalized);
+	return NULL;
+}
+
 static int should_accept_remote(enum accept_promisor accept,
 				struct promisor_info *advertised,
+				struct string_list *accept_urls,
 				struct string_list *config_info)
 {
 	struct promisor_info *p;
@@ -757,9 +781,6 @@ static int should_accept_remote(enum accept_promisor accept,
 	if (accept == ACCEPT_KNOWN_NAME)
 		return all_fields_match(advertised, config_info, p);
 
-	if (accept != ACCEPT_KNOWN_URL)
-		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
-
 	if (strcmp(p->url, remote_url)) {
 		warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
 			  "ignoring this remote"),
@@ -767,7 +788,21 @@ static int should_accept_remote(enum accept_promisor accept,
 		return 0;
 	}
 
-	return all_fields_match(advertised, config_info, p);
+	if (accept == ACCEPT_KNOWN_URL)
+		return all_fields_match(advertised, config_info, p);
+
+	if (accept != ACCEPT_NONE)
+		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
+
+	/*
+	 * Even if accept == ACCEPT_NONE, we MUST trust this known
+	 * remote to update its token or other such fields if its URL
+	 * matches the acceptFromServerUrl whitelist!
+	 */
+	if (url_matches_accept_list(accept_urls, remote_url))
+		return all_fields_match(advertised, config_info, p);
+
+	return 0;
 }
 
 static int skip_field_name_prefix(const char *elem, const char *field_name, const char **value)
@@ -972,10 +1007,9 @@ static void filter_promisor_remote(struct repository *repo,
 	struct string_list_item *item;
 	bool reload_config = false;
 	enum accept_promisor accept = accept_from_server(repo);
-	/* Pre-load and validate the acceptFromServerUrl config */
-	(void)accept_from_server_url(repo);
+	struct string_list *accept_urls = accept_from_server_url(repo);
 
-	if (accept == ACCEPT_NONE)
+	if (accept == ACCEPT_NONE && !accept_urls->nr)
 		return;
 
 	/* Parse remote info received */
@@ -995,7 +1029,7 @@ static void filter_promisor_remote(struct repository *repo,
 			string_list_sort(&config_info);
 		}
 
-		if (should_accept_remote(accept, advertised, &config_info)) {
+		if (should_accept_remote(accept, advertised, accept_urls, &config_info)) {
 			if (!store_info)
 				store_info = store_info_new(repo);
 			if (promisor_store_advertised_fields(advertised, store_info))
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index c7a484228f..4ebad14af5 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -306,6 +306,77 @@ test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
 	check_missing_objects server 1 "$oid"
 '
 
+test_expect_success "clone with 'None' but URL whitelisted" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+		-c remote.lop.url="$PWD_URL/lop" \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with 'None' but URL not in whitelist" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+		-c remote.lop.url="$PWD_URL/lop" \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="https://example.com/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that the largest object is not missing on the server
+	check_missing_objects server 0 "" &&
+
+	# Reinitialize server so that the largest object is missing again
+	initialize_server 1 "$oid"
+'
+
+test_expect_success "clone with 'None' but URL whitelisted in one pattern out of two" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+		-c remote.lop.url="$PWD_URL/lop" \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="https://example.com/*" \
+		-c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with 'None', URL whitelisted, but client has different URL" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	# The client configures "lop" with a different URL (serverTwo) than
+	# what the server advertises (lop). Even though the advertised URL
+	# matches the whitelist, the remote is rejected because the
+	# configured URL does not match the advertised one.
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+		-c remote.lop.url="$PWD_URL/serverTwo" \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that the largest object is not missing on the server
+	check_missing_objects server 0 "" &&
+
+	# Reinitialize server so that the largest object is missing again
+	initialize_server 1 "$oid"
+'
+
 test_expect_success "clone with promisor.sendFields" '
 	git -C server config promisor.advertise true &&
 	test_when_finished "rm -rf client" &&
@@ -573,4 +644,6 @@ test_expect_success "subsequent fetch from a client when promisor.advertise is f
 	check_missing_objects server 1 "$oid"
 '
 
+
+
 test_done
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 15/16] promisor-remote: auto-configure unknown remotes
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (13 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  2026-03-23  8:05 ` [PATCH 16/16] doc: promisor: improve acceptFromServer entry Christian Couder
  2026-03-26 12:21 ` [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Patrick Steinhardt
  16 siblings, 1 reply; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

Previous commits have introduced the `promisor.acceptFromServerUrl`
config variable to whitelist some URLs advertised by a server through
the "promisor-remote" protocol capability.

However the new `promisor.acceptFromServerUrl` mechanism, like the old
`promisor.acceptFromServer` mechanism, still requires a remote to
already exist in the client's local configuration before it can be
accepted. This places a significant manual burden on users to
pre-configure these remotes, and creates friction for administrators
who have to troubleshoot or manually provision these setups for their
teams.

To eliminate this burden, let's automatically create a new `[remote]`
section in the client's config when a server advertises an unknown
remote whose URL matches a `promisor.acceptFromServerUrl` glob pattern.

Concretely, let's add four helpers:

 - sanitize_remote_name(): turn an arbitrary URL-derived string into a
   valid remote name by replacing non-alphanumeric characters,
   collapsing runs of '-', and prepending "promisor-auto-".

 - promisor_remote_name_from_url(): normalize the URL and extract
   host+port+path to build a human-readable base name, then pass it
   through sanitize_remote_name().

 - configure_auto_promisor_remote(): write the remote.*.url,
   remote.*.promisor and remote.*.advertisedAs keys to the repo
   config.

 - handle_matching_allowed_url(): pick the final name (user-supplied
   alias or auto-generated), handle collisions by appending "-1",
   "-2", etc., then call configure_auto_promisor_remote().

Let's also add should_accept_new_remote_url() which reuses the
url_matches_accept_list() helper introduced in a previous commit to
find a matching pattern, then delegates to handle_matching_allowed_url()
to create the remote.

And then let's call should_accept_new_remote_url() from the '!item'
(unknown remote) branch of should_accept_remote(), setting
`reload_config` so that the newly-written config is picked up.

Finally let's document all that by:

 - expanding the `promisor.acceptFromServerUrl` entry to describe
   auto-creation, the optional "name=" prefix syntax, the
   "promisor-auto-*" generation rules, and numeric-suffix collision
   handling, and by
 - adding a "remote.<name>.advertisedAs" entry to "remote.adoc".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config/promisor.adoc    |  35 ++++-
 Documentation/config/remote.adoc      |   9 ++
 promisor-remote.c                     | 202 +++++++++++++++++++++++++-
 t/t5710-promisor-remote-capability.sh | 122 ++++++++++++++++
 4 files changed, 355 insertions(+), 13 deletions(-)

diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index 6f5442cd65..d8e5f4a6dc 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -53,9 +53,11 @@ promisor.acceptFromServer::
 
 promisor.acceptFromServerUrl::
 	A glob pattern to specify which URLs advertised by a server
-	are considered trusted by the client. This option acts as an
-	additive security whitelist that works in conjunction with
-	`promisor.acceptFromServer`.
+	are allowed to be auto-configured (created and persisted) on
+	the client side. Unlike `promisor.acceptFromServer`, which
+	only accepts already configured remotes, a match against this
+	option instructs Git to write a new `[remote "<name>"]`
+	section to the client's configuration.
 +
 This option can appear multiple times in config files. An advertised
 URL will be accepted if it matches _ANY_ glob pattern specified by
@@ -91,11 +93,28 @@ possible, and path segments like `..` are resolved.  Glob patterns
 are matched against this normalized URL as-is, so patterns should
 be written in normalized form (e.g., lowercase scheme and host).
 +
-Even if `promisor.acceptFromServer` is set to `None` (the default),
-Git will still accept field updates (like tokens) for known remotes,
-provided their URLs match a pattern in
-`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
-details on the protocol.
+The glob pattern can optionally be prefixed with a remote name and an
+equals sign (e.g., `cdn=https://cdn.example.com/*`). If such a prefix
+is provided, accepted remotes will be saved under that name. If no
+such prefix is provided, a safe remote name will be automatically
+generated by sanitizing the URL and prefixing it with
+`promisor-auto-`. If a remote with the chosen name already exists but
+points to a different URL, Git will append a numeric suffix (e.g.,
+`-1`, `-2`) to the name to prevent overwriting existing
+configurations. You should make sure that this doesn't happen often
+though, as remotes will be rejected if the numeric suffix increases
+too much. In all cases, the original name advertised by the server is
+recorded in the `remote.<name>.advertisedAs` configuration variable
+for tracing and debugging purposes.
++
+Note that this option acts as an additive security whitelist. It works
+in conjunction with `promisor.acceptFromServer` (see the documentation
+of that option for the implications of accepting a promisor
+remote). Even if `promisor.acceptFromServer` is set to `None` (the
+default), Git will still automatically configure new remotes, and
+accept field updates (like tokens) for known remotes, provided their
+URLs match a pattern in `promisor.acceptFromServerUrl`. See
+linkgit:gitprotocol-v2[5] for details on the protocol.
 
 promisor.checkFields::
 	A comma or space separated list of additional remote related
diff --git a/Documentation/config/remote.adoc b/Documentation/config/remote.adoc
index 91e46f66f5..6e2bbdf457 100644
--- a/Documentation/config/remote.adoc
+++ b/Documentation/config/remote.adoc
@@ -91,6 +91,15 @@ remote.<name>.promisor::
 	When set to true, this remote will be used to fetch promisor
 	objects.
 
+remote.<name>.advertisedAs::
+	When a promisor remote is automatically configured using
+	information advertised by a server through the
+	`promisor-remote` protocol capability (see
+	`promisor.acceptFromServerUrl`), the server's originally
+	advertised name is saved in this variable. This is for
+	information, tracing and debugging purposes. Users should not
+	typically modify or create such configuration entries.
+
 remote.<name>.partialclonefilter::
 	The filter that will be applied when fetching from this	promisor remote.
 	Changing or clearing this value will only affect fetches for new commits.
diff --git a/promisor-remote.c b/promisor-remote.c
index 210e6950af..5321a9c4bf 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -750,10 +750,197 @@ static struct allowed_url *url_matches_accept_list(
 	return NULL;
 }
 
-static int should_accept_remote(enum accept_promisor accept,
+/*
+ * Sanitize the buffer to make it a valid remote name coming from the
+ * server by:
+ *
+ * - replacing any non alphanumeric character with a '-'
+ * - stripping any leading '-',
+ * - condensing multiple '-' into one,
+ * - prepending "promisor-auto-",
+ * - validating the result.
+ */
+static int sanitize_remote_name(struct strbuf *buf, const char *url)
+{
+	char prev = '-';
+	for (size_t i = 0; i < buf->len; ) {
+		if (!isalnum(buf->buf[i]))
+			buf->buf[i] = '-';
+		if (prev == '-' && buf->buf[i] == '-') {
+			strbuf_remove(buf, i, 1);
+		} else {
+			prev = buf->buf[i];
+			i++;
+		}
+	}
+
+	strbuf_strip_suffix(buf, "-");
+
+	if (!buf->len) {
+		warning(_("couldn't generate a valid remote name from "
+			  "advertised url '%s', ignoring this remote"), url);
+		return -1;
+	}
+
+	strbuf_insertstr(buf, 0, "promisor-auto-");
+
+	if (!valid_remote_name(buf->buf)) {
+		warning(_("generated remote name '%s' from advertised url '%s' "
+			  "is invalid, ignoring this remote"), buf->buf, url);
+		return -1;
+	}
+
+	return 0;
+}
+
+static char *promisor_remote_name_from_url(const char *url)
+{
+	struct url_info url_info = { 0 };
+	char *normalized = url_normalize(url, &url_info);
+	struct strbuf buf = STRBUF_INIT;
+
+	if (!normalized) {
+		warning(_("couldn't normalize advertised url '%s', "
+			  "ignoring this remote"), url);
+		return NULL;
+	}
+
+	if (url_info.host_len) {
+		strbuf_add(&buf, normalized + url_info.host_off, url_info.host_len);
+		strbuf_addch(&buf, '-');
+	}
+
+	if (url_info.port_len) {
+		strbuf_add(&buf, normalized + url_info.port_off, url_info.port_len);
+		strbuf_addch(&buf, '-');
+	}
+
+	if (url_info.path_len) {
+		strbuf_add(&buf, normalized + url_info.path_off, url_info.path_len);
+		strbuf_trim_trailing_dir_sep(&buf);
+		strbuf_strip_suffix(&buf, ".git");
+	}
+
+	free(normalized);
+
+	if (sanitize_remote_name(&buf, url)) {
+		strbuf_release(&buf);
+		return NULL;
+	}
+
+	return strbuf_detach(&buf, NULL);
+}
+
+static void configure_auto_promisor_remote(struct repository *repo,
+					   const char *name,
+					   const char *url,
+					   const char *advertised_as,
+					   bool reuse)
+{
+	char *key;
+
+	if (!reuse) {
+		fprintf(stderr, _("Auto-creating promisor remote '%s' for URL '%s'\n"),
+			name, url);
+
+		key = xstrfmt("remote.%s.url", name);
+		repo_config_set_gently(repo, key, url);
+		free(key);
+	}
+
+	/* NB: when reusing, this promotes an existing non-promisor remote */
+	key = xstrfmt("remote.%s.promisor", name);
+	repo_config_set_gently(repo, key, "true");
+	free(key);
+
+	if (advertised_as) {
+		key = xstrfmt("remote.%s.advertisedAs", name);
+		repo_config_set_gently(repo, key, advertised_as);
+		free(key);
+	}
+}
+
+#define MAX_REMOTES_WITH_SIMILAR_NAMES 20
+
+/* Return the allocated local name, or NULL on failure */
+static char *handle_matching_allowed_url(struct repository *repo,
+					 char *allowed_name,
+					 const char *remote_url,
+					 const char *remote_name)
+{
+	char *name;
+	char *basename = allowed_name ?
+		xstrdup(allowed_name) :
+		promisor_remote_name_from_url(remote_url);
+	int i = 0;
+	bool reuse = false;
+
+	if (!basename)
+		return NULL;
+
+	name = xstrdup(basename);
+
+	while (i < MAX_REMOTES_WITH_SIMILAR_NAMES) {
+		char *url_key = xstrfmt("remote.%s.url", name);
+		const char *existing_url;
+		int exists = !repo_config_get_string_tmp(repo, url_key, &existing_url);
+
+		free(url_key);
+
+		if (!exists)
+			break; /* Free to use */
+
+		if (!strcmp(existing_url, remote_url)) {
+			reuse = true;
+			break; /* Same URL, so safe to reuse */
+		}
+
+		i++;
+		free(name);
+		name = xstrfmt("%s-%d", basename, i);
+	}
+
+	if (i < MAX_REMOTES_WITH_SIMILAR_NAMES) {
+		configure_auto_promisor_remote(repo, name,
+					       remote_url, remote_name,
+					       reuse);
+	} else {
+		warning(_("too many remotes accepted with name like '%s-X', "
+			  "ignoring this remote"), basename);
+		FREE_AND_NULL(name);
+	}
+
+	free(basename);
+	return name;
+}
+
+static int should_accept_new_remote_url(struct repository *repo,
+					struct string_list *accept_urls,
+					struct promisor_info *advertised)
+{
+	struct allowed_url *allowed = url_matches_accept_list(accept_urls,
+							     advertised->url);
+	if (allowed) {
+		char *name = handle_matching_allowed_url(repo,
+							 allowed->remote_name,
+							 advertised->url,
+							 advertised->name);
+		if (name) {
+			free((char *)advertised->local_name);
+			advertised->local_name = name;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+static int should_accept_remote(struct repository *repo,
+				enum accept_promisor accept,
 				struct promisor_info *advertised,
 				struct string_list *accept_urls,
-				struct string_list *config_info)
+				struct string_list *config_info,
+				bool *reload_config)
 {
 	struct promisor_info *p;
 	struct string_list_item *item;
@@ -772,9 +959,13 @@ static int should_accept_remote(enum accept_promisor accept,
 	/* Get config info for that promisor remote */
 	item = string_list_lookup(config_info, remote_name);
 
-	if (!item)
+	if (!item) {
 		/* We don't know about that remote */
-		return 0;
+		int res = should_accept_new_remote_url(repo, accept_urls, advertised);
+		if (res)
+			*reload_config = true;
+		return res;
+	}
 
 	p = item->util;
 
@@ -1029,7 +1220,8 @@ static void filter_promisor_remote(struct repository *repo,
 			string_list_sort(&config_info);
 		}
 
-		if (should_accept_remote(accept, advertised, accept_urls, &config_info)) {
+		if (should_accept_remote(repo, accept, advertised, accept_urls,
+					 &config_info, &reload_config)) {
 			if (!store_info)
 				store_info = store_info_new(repo);
 			if (promisor_store_advertised_fields(advertised, store_info))
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 4ebad14af5..7d8df05fc7 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -377,6 +377,128 @@ test_expect_success "clone with 'None', URL whitelisted, but client has differen
 	initialize_server 1 "$oid"
 '
 
+test_expect_success "clone with URL whitelisted and no remote already configured" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that a remote has been auto-created with the right fields.
+	# The remote is identified by "remote.<name>.advertisedAs" == "lop".
+	FULL_NAME=$(git -C client config --name-only --get-regexp "remote\..*\.advertisedas" "^lop$") &&
+	REMOTE_NAME=$(echo "$FULL_NAME" | sed "s/remote\.\(.*\)\.advertisedas/\1/") &&
+
+	# Check ".url" and ".promisor" values
+	printf "%s\n" "$PWD_URL/lop" "true" >expect &&
+	git -C client config "remote.$REMOTE_NAME.url" >actual &&
+	git -C client config "remote.$REMOTE_NAME.promisor" >>actual &&
+	test_cmp expect actual &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with named URL whitelisted and no pre-configured remote" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that a remote has been auto-created with the right "cdn" name and fields.
+	printf "%s\n" "$PWD_URL/lop" "true" "lop" >expect &&
+	git -C client config "remote.cdn.url" >actual &&
+	git -C client config "remote.cdn.promisor" >>actual &&
+	git -C client config "remote.cdn.advertisedAs" >>actual &&
+	test_cmp expect actual &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with URL whitelisted but colliding name" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.cdn.promisor=true \
+		-c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
+		-c remote.cdn.url="https://example.com/cdn" \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that a remote has been auto-created with the right "cdn-1" name and fields.
+	printf "%s\n" "$PWD_URL/lop" "true" "lop" >expect &&
+	git -C client config "remote.cdn-1.url" >actual &&
+	git -C client config "remote.cdn-1.promisor" >>actual &&
+	git -C client config "remote.cdn-1.advertisedAs" >>actual &&
+	test_cmp expect actual &&
+
+	# Check that the original "cdn" remote was not overwritten.
+	printf "%s\n" "https://example.com/cdn" "true" >expect &&
+	git -C client config "remote.cdn.url" >actual &&
+	git -C client config "remote.cdn.promisor" >>actual &&
+	test_cmp expect actual &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with URL whitelisted and reusable remote" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	GIT_NO_LAZY_FETCH=0 git clone \
+		-c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
+		-c remote.cdn.url="$PWD_URL/lop" \
+		-c promisor.acceptfromserver=None \
+		-c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
+		--no-local --filter="blob:limit=5k" server client &&
+
+	# Check that the existing "cdn" remote has been properly updated.
+	printf "%s\n" "$PWD_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
+	git -C client config "remote.cdn.url" >actual &&
+	git -C client config "remote.cdn.promisor" >>actual &&
+	git -C client config "remote.cdn.advertisedAs" >>actual &&
+	git -C client config "remote.cdn.fetch" >>actual &&
+	test_cmp expect actual &&
+
+	# Check that no new "cdn-1" remote has been created.
+	test_must_fail git -C client config "remote.cdn-1.url" &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
+	git -C server config promisor.advertise true &&
+	test_when_finished "rm -rf client" &&
+
+	# As "bad name" contains a space, which is not a valid remote name,
+	# the pattern should be rejected with a warning and no remote created.
+	GIT_NO_LAZY_FETCH=0 git clone \
+		-c promisor.acceptfromserver=None \
+		-c "promisor.acceptFromServerUrl=bad name=https://example.com/*" \
+		--no-local --filter="blob:limit=5k" server client 2>err &&
+
+	# Check that a warning was emitted
+	test_grep "invalid remote name '\''bad name'\''" err &&
+
+	# Check that no remote was auto-created
+	test_must_fail git -C client config --get-regexp "remote\..*\.advertisedas" &&
+
+	# Check that the largest object is not missing on the server
+	check_missing_objects server 0 "" &&
+
+	# Reinitialize server so that the largest object is missing again
+	initialize_server 1 "$oid"
+'
+
 test_expect_success "clone with promisor.sendFields" '
 	git -C server config promisor.advertise true &&
 	test_when_finished "rm -rf client" &&
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 16/16] doc: promisor: improve acceptFromServer entry
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (14 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 15/16] promisor-remote: auto-configure unknown remotes Christian Couder
@ 2026-03-23  8:05 ` Christian Couder
  2026-03-26 12:21 ` [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Patrick Steinhardt
  16 siblings, 0 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-23  8:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder, Christian Couder

The entry for the `promisor.acceptFromServer` in
"Documentation/config/promisor.adoc" has a number of issues:

- it's not clear if new remotes and URLs can be created,
- it looks like a big block of text,
- it's not easy to see all the options,
- it's not easy to see which option is the default one,
- for "knownName", it says "advertised by the client" instead of
  "advertised by the server",
- it doesn't refer to the new related `acceptFromServerUrl`
  option.

Let's address all these issues by rewording large parts of it
and using bullet points for the different options.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config/promisor.adoc | 53 ++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index d8e5f4a6dc..1d64e7f1d4 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -32,24 +32,41 @@ variable is set to "true", and the "name" and "url" fields are always
 advertised regardless of this setting.
 
 promisor.acceptFromServer::
-	If set to "all", a client will accept all the promisor remotes
-	a server might advertise using the "promisor-remote"
-	capability. If set to "knownName" the client will accept
-	promisor remotes which are already configured on the client
-	and have the same name as those advertised by the client. This
-	is not very secure, but could be used in a corporate setup
-	where servers and clients are trusted to not switch name and
-	URLs. If set to "knownUrl", the client will accept promisor
-	remotes which have both the same name and the same URL
-	configured on the client as the name and URL advertised by the
-	server. This is more secure than "all" or "knownName", so it
-	should be used if possible instead of those options. Default
-	is "none", which means no promisor remote advertised by a
-	server will be accepted. By accepting a promisor remote, the
-	client agrees that the server might omit objects that are
-	lazily fetchable from this promisor remote from its responses
-	to "fetch" and "clone" requests from the client. Name and URL
-	comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
+	Controls which promisor remotes advertised by a server (using the
+	"promisor-remote" protocol capability) a client will accept. By
+	accepting a promisor remote, the client agrees that the server
+	might omit objects that are lazily fetchable from this promisor
+	remote from its responses to "fetch" and "clone" requests.
++
+Note that this option does not cause new remotes to be automatically
+created in the client's configuration. It only allows remotes which
+are somehow already configured to be trusted for the current
+operation, or their fields to be updated (if `promisor.storeFields` is
+set and the remote already exists locally). To allow Git to
+automatically create and persist new remotes from server
+advertisements, use `promisor.acceptFromServerUrl`.
++
+The available options are:
++
+* `none` (default): No promisor remote advertised by a server will be
+  accepted.
++
+* `knownUrl`: The client will accept promisor remotes that are already
+  configured on the client and have both the same name and the same URL
+  as advertised by the server. This is more secure than `all` or
+  `knownName`, and should be used if possible instead of those options.
++
+* `knownName`: The client will accept promisor remotes that are already
+  configured on the client and have the same name as those advertised
+  by the server. This is not very secure, but could be used in a corporate
+  setup where servers and clients are trusted to not switch names and URLs.
++
+* `all`: The client will accept all the promisor remotes a server might
+  advertise. This is the least secure option and should only be used in
+  fully trusted environments.
++
+Name and URL comparisons are case-sensitive. See linkgit:gitprotocol-v2[5]
+for protocol details.
 
 promisor.acceptFromServerUrl::
 	A glob pattern to specify which URLs advertised by a server
-- 
2.53.0.625.g20f70b52bb


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl
  2026-03-23  8:05 ` [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
@ 2026-03-23 18:54   ` Junio C Hamano
  2026-03-23 23:47     ` Junio C Hamano
  2026-03-27 12:17     ` Christian Couder
  2026-03-26 12:21   ` Patrick Steinhardt
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2026-03-23 18:54 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
> index b0fa43b839..6f5442cd65 100644
> --- a/Documentation/config/promisor.adoc
> +++ b/Documentation/config/promisor.adoc
> @@ -51,6 +51,52 @@ promisor.acceptFromServer::
>  	to "fetch" and "clone" requests from the client. Name and URL
>  	comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
>  
> +promisor.acceptFromServerUrl::
> +	A glob pattern to specify which URLs advertised by a server
> +	are considered trusted by the client. This option acts as an
> +	additive security whitelist that works in conjunction with
> +	`promisor.acceptFromServer`.

Between the first sentence and the second one, I think there needs
to be an explanation on what "trusted" means in this context.  Is it
trusted so that the URL can feed random configuration variable=value
pairs for the client to blindly apply?  Or is it trusted to do very
limited things that other remotes can do, and if so what are these
limited things?  Without knowing that, the end-users cannot assess
the security implications of setting this option.

I am guessing that the client would behave as if the existing
promisor.acceptFromServer configuration variable were set to "all"
when talking with a remote whose URL matches one of the patterns
listed?

By the way, some people may suggest "white" -> "allow".


> +1. Start with a secure protocol scheme, like `https://` or `ssh://`.

Is there a practical reason why people would want to use schemes
other than the above two?  This sounds like something a small amount
of code can easily enforce.

> +2. Only allow domain names or paths where you control and trust _ALL_
> +   the content.

Obviously.

> +3. Don't use globs (`*`) in the domain name. For example
> +   `https://cdn.example.com/*` is much safer than
> +   `https://*.example.com/*`, because the latter matches
> +   `https://evil-hacker.net/fake.example.com/repo`.

Is there a practical use case where allowing '*' to match anything
that contains a slash '/' is useful?

> +4. Make sure to have a `/` at the end of the domain name (or the end
> +   of specific directories). For example `https://cdn.example.com/*`
> +   is much safer than `https://cdn.example.com*`, because the latter
> +   matches `https://cdn.example.com.hacker.net/repo`.

Ditto.  The above two points sound like excuses to keep sloppy
asterisk matching logic.  Yes, retroactively tightening rules always
have risk to break existing deployments, but if existing code paths
of urlmatch do not have any good reason to allow '*' to match a
string that contains a slash '/', perhaps there is no fallout.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl
  2026-03-23 18:54   ` Junio C Hamano
@ 2026-03-23 23:47     ` Junio C Hamano
  2026-03-27 12:17     ` Christian Couder
  1 sibling, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2026-03-23 23:47 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder

Junio C Hamano <gitster@pobox.com> writes:

> Obviously.
>
>> +3. Don't use globs (`*`) in the domain name. For example
>> +   `https://cdn.example.com/*` is much safer than
>> +   `https://*.example.com/*`, because the latter matches
>> +   `https://evil-hacker.net/fake.example.com/repo`.
>
> Is there a practical use case where allowing '*' to match anything
> that contains a slash '/' is useful?
>
>> +4. Make sure to have a `/` at the end of the domain name (or the end
>> +   of specific directories). For example `https://cdn.example.com/*`
>> +   is much safer than `https://cdn.example.com*`, because the latter
>> +   matches `https://cdn.example.com.hacker.net/repo`.
>
> Ditto.  The above two points sound like excuses to keep sloppy
> asterisk matching logic.  Yes, retroactively tightening rules always
> have risk to break existing deployments, but if existing code paths
> of urlmatch do not have any good reason to allow '*' to match a
> string that contains a slash '/', perhaps there is no fallout.

I probably should caution the readers not to take the above too
literally.  Forbidding an asterisk '*' glob not to match '/'
anywhere in urlmatch will obviusly break existing deployments that
does this

    [http "https://example.com/*"]
	var = val

and expects it to catch any URL pointing into the site.

But I still do think the matcher should be more intelligent than the
current implementation to avoid pitfalls like #3 and #4 above.

Perhaps if an additional rule says that '*' after the scheme:// part
before the first '/' in the pattern, e.g.,

    https://*.example.com/
    https://*.example.com
    https://example.com*

unlike '*' that appear anywhere else, never matches a substring that
contains a slash '/' in it, it would cover plausible mistakes that
the above #3 and #4 are trying to catch, without hurting any real
world use case?




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct()
  2026-03-23  8:05 ` [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
@ 2026-03-26 12:20   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:20 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:04AM +0100, Christian Couder wrote:
> When a server advertises promisor remotes and the client accepts some
> of them, those remotes carry the server's intent: 'fetch missing
> objects preferably from here', and the client agrees with that for the
> remotes it accepts.
> 
> However promisor_remote_get_direct() actually iterates over all
> promisor remotes in list order, which is the order they appear in the
> config files (except perhaps for the one appearing in the
> `extensions.partialClone` config variable which is tried last).
> 
> This means an existing, but not accepted, promisor remote, could be
> tried before the accepted ones, which does not reflect the intent of
> the agreement between client and server.
> 
> If the client doesn't care about what the server suggests, it should
> accept nothing and rely on its remotes as they are already configured.
> 
> To better reflect the agreement between client and server, let's make
> promisor_remote_get_direct() try the accepted promisor remotes before
> the non-accepted ones.

Interesting, and it feels sensible to me. Is it documented anywhere that
the ordering of announced remotes is actually important?

> Concretely, let's extract a try_promisor_remotes() helper and call it
> twice from promisor_remote_get_direct():
> 
> - first with an `accepted_only=true` argument to try only the accepted
>   remotes,
> - then with `accepted_only=false` to fall back to any remaining remote.
> 
> Ensuring that accepted remotes are preferred will be even more
> important if in the future a mechanism is developed to allow the
> client to auto-configure remotes that the server advertises. This will
> in particular avoid fetching from the server (which is already
> configured as a promisor remote) before trying the auto-configured
> remotes, as these new remotes would likely appear at the end of the
> config file, and as the server might not appear in the
> `extensions.partialClone` config variable.

Not quite sure I correctly understand this paragraph. Is the idea that
in the future, we might not even store announced promisors in the config
at all but simply use whatever the server announces on any given fetch?

> diff --git a/promisor-remote.c b/promisor-remote.c
> index 96fa215b06..3f8aeee787 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -268,11 +268,37 @@ static int remove_fetched_oids(struct repository *repo,
>  	return remaining_nr;
>  }
>  
> +static int try_promisor_remotes(struct repository *repo,
> +				struct object_id **remaining_oids,
> +				int *remaining_nr, int *to_free,
> +				bool accepted_only)
> +{
> +	struct promisor_remote *r = repo->promisor_remote_config->promisors;
> +
> +	for (; r; r = r->next) {
> +		if (accepted_only && !r->accepted)
> +			continue;
> +		if (!accepted_only && r->accepted)
> +			continue;

This can be simplified to `if (!accepted_only != !r->accepted)`.

Also, can we maybe add a test for this to verify that we use the correct
ordering now?

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 03/16] urlmatch: add url_is_valid_pattern() helper
  2026-03-23  8:05 ` [PATCH 03/16] urlmatch: add url_is_valid_pattern() helper Christian Couder
@ 2026-03-26 12:20   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:20 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:06AM +0100, Christian Couder wrote:
> diff --git a/urlmatch.c b/urlmatch.c
> index 989bc7eb8b..a8cb6c3bee 100644
> --- a/urlmatch.c
> +++ b/urlmatch.c
> @@ -440,6 +440,18 @@ char *url_normalize(const char *url, struct url_info *out_info)
>  	return url_normalize_1(url, out_info, false);
>  }
>  
> +bool url_is_valid_pattern(const char *url)
> +{
> +	char *normalized = url_normalize_1(url, NULL, true);
> +
> +	if (normalized) {
> +		free(normalized);
> +		return true;
> +	}
> +
> +	return false;
> +}

We could simplify this implementation to:

	bool url_is_valid_pattern(const char *url)
	{
		char *normalized = url_normalize_1(url, NULL, true);
		free(normalized);
		return !!normalized;
	}

> diff --git a/urlmatch.h b/urlmatch.h
> index 5ba85cea13..4e01422a02 100644
> --- a/urlmatch.h
> +++ b/urlmatch.h
> @@ -36,6 +36,17 @@ struct url_info {
>  
>  char *url_normalize(const char *, struct url_info *);
>  
> +/*
> + * Return 'true' if the string looks like a valid URL or a valid URL pattern
> + * (allowing '*' globs), 'false' otherwise.
> + *
> + * This is NOT a URL validation function.  Full URL validation is NOT
> + * performed.  Some invalid host names are passed through this function
> + * undetected.  However, most all other problems that make a URL invalid
> + * will be detected (including a missing host for non file: URLs).
> + */
> +bool url_is_valid_pattern(const char *url);

Okay, this comment is basically copied from `url_normalize_1()`, which
makes sense.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 04/16] promisor-remote: clarify that a remote is ignored
  2026-03-23  8:05 ` [PATCH 04/16] promisor-remote: clarify that a remote is ignored Christian Couder
@ 2026-03-26 12:20   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:20 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:07AM +0100, Christian Couder wrote:
> In should_accept_remote() when a remote is ignored, we might tell users
> why it is ignored in a warning, but we don't tell them that the remote
> is actually ignored.
> 
> Let's clarify that, so users have a better idea of what's actually
> happening.

Interesting that this doesn't result in any test changes. Don't we have
test coverage for `should_accept_remote()`?

> diff --git a/promisor-remote.c b/promisor-remote.c
> index 3f8aeee787..f5c4d41155 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -660,15 +660,16 @@ static int should_accept_remote(enum accept_promisor accept,
>  		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
>  
>  	if (!remote_url || !*remote_url) {
> -		warning(_("no or empty URL advertised for remote '%s'"), remote_name);
> +		warning(_("no or empty URL advertised for remote '%s', "
> +			  "ignoring this remote"), remote_name);
>  		return 0;
>  	}
>  
>  	if (!strcmp(p->url, remote_url))
>  		return all_fields_match(advertised, config_info, 0);
>  
> -	warning(_("known remote named '%s' but with URL '%s' instead of '%s'"),
> -		remote_name, p->url, remote_url);
> +	warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
> +		  "ignoring this remote"), remote_name, p->url, remote_url);
>  
>  	return 0;
>  }

The change itself seems sensible to me.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 07/16] promisor-remote: keep accepted promisor_info structs alive
  2026-03-23  8:05 ` [PATCH 07/16] promisor-remote: keep accepted promisor_info structs alive Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:10AM +0100, Christian Couder wrote:
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 3116d14d14..34b4ca5806 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -912,17 +912,10 @@ static void filter_promisor_remote(struct repository *repo,
>  			if (promisor_store_advertised_fields(advertised, store_info))
>  				reload_config = true;
>  
> -			strvec_push(accepted, advertised->name);

Okay, here we used to store the name of the accepted remote...

> -			/* Capture advertised filters for accepted remotes */
> -			if (advertised->filter) {
> -				struct string_list_item *i;
> -				i = string_list_append(&accepted_filters, advertised->name);
> -				i->util = xstrdup(advertised->filter);
> -			}

... and here we used to store the filter, if any.

> +			string_list_append(&accepted_remotes, advertised->name)->util = advertised;

Instead, we now store the whole remote, ...

> +		} else {
> +			promisor_info_free(advertised);

... unless we don't want to accept it, in which case we free it now.
Makes sense.

> @@ -932,24 +925,23 @@ static void filter_promisor_remote(struct repository *repo,
>  	if (reload_config)
>  		repo_promisor_remote_reinit(repo);
>  
> -	/* Apply accepted remote filters to the stable repo state */
> -	for_each_string_list_item(item, &accepted_filters) {
> -		struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
> -		if (r) {
> -			free(r->advertised_filter);
> -			r->advertised_filter = item->util;
> -			item->util = NULL;
> -		}
> -	}
> +	/* Apply accepted remotes to the stable repo state */
> +	for_each_string_list_item(item, &accepted_remotes) {
> +		struct promisor_info *info = item->util;
> +		struct promisor_remote *r = repo_promisor_remote_find(repo, info->name);

And instead of iterating through the accepted filters we now iterate
through the accepted remotes and take their respective filters from the
promisor remote info.

> -	string_list_clear(&accepted_filters, 1);
> +		strvec_push(accepted, info->name);
>  
> -	/* Mark the remotes as accepted in the repository state */
> -	for (size_t i = 0; i < accepted->nr; i++) {
> -		struct promisor_remote *r = repo_promisor_remote_find(repo, accepted->v[i]);
> -		if (r)
> +		if (r) {
>  			r->accepted = 1;
> +			if (info->filter) {
> +				free(r->advertised_filter);
> +				r->advertised_filter = xstrdup(info->filter);
> +			}
> +		}
>  	}
> +
> +	promisor_info_list_clear(&accepted_remotes);
>  }

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 08/16] promisor-remote: remove the 'accepted' strvec
  2026-03-23  8:05 ` [PATCH 08/16] promisor-remote: remove the 'accepted' strvec Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:11AM +0100, Christian Couder wrote:
> In a previous commit, filter_promisor_remote() was refactored to keep
> accepted 'struct promisor_info' instances alive instead of dismantling
> them into separate parallel data structures.
> 
> Let's go one step further and replace the 'struct strvec *accepted'
> argument passed to filter_promisor_remote() with a
> 'struct string_list *accepted_remotes' argument.

Right, this is indeed something I was wondering about given that we now
effectively stored the remote name twice.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 09/16] promisor-remote: add 'local_name' to 'struct promisor_info'
  2026-03-23  8:05 ` [PATCH 09/16] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:12AM +0100, Christian Couder wrote:
> In a following commit, we will store promisor remote information under
> a remote name different than the one the server advertised.
> 
> To prepare for this change, let's add a new 'char* local_name' member

Micronit: s/char* local_name/char *local_name/

> diff --git a/promisor-remote.c b/promisor-remote.c
> index bdfc5e7608..da347fa2dc 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -434,15 +434,19 @@ static struct string_list *fields_stored(void)
>  
>  /*
>   * Struct for promisor remotes involved in the "promisor-remote"
> - * protocol capability.
> + * protocol capability:
>   *
> - * Except for "name", each <member> in this struct and its <value>
> - * should correspond (either on the client side or on the server side)
> - * to a "remote.<name>.<member>" config variable set to <value> where
> - * "<name>" is a promisor remote name.
> + * - "name" is the name the server advertised.
> + * - "local_name" is the name we use locally (may be auto-generated).
> + *
> + * Except for "name" and "local_name", each <member> in this struct
> + * and its <value> should correspond (either on the client side or on
> + * the server side) to a "remote.<name>.<member>" config variable set
> + * to <value> where "<name>" is a promisor remote name.
>   */
>  struct promisor_info {
>  	const char *name;
> +	const char *local_name;
>  	const char *url;
>  	const char *filter;
>  	const char *token;

I think it would be easier to follow if the struct-level comment applied
to the general description of the struct, and individual members would
then have their own comments describing their intent.

> @@ -464,6 +469,11 @@ static void promisor_info_list_clear(struct string_list *list)
>  	string_list_clear(list, 0);
>  }
>  
> +static const char *promisor_info_internal_name(struct promisor_info *p)
> +{
> +	return p->local_name ? p->local_name : p->name;
> +}
> +
>  static void set_one_field(struct promisor_info *p,
>  			  const char *field, const char *value)
>  {
> @@ -819,7 +829,7 @@ static bool promisor_store_advertised_fields(struct promisor_info *advertised,
>  {
>  	struct promisor_info *p;
>  	struct string_list_item *item;
> -	const char *remote_name = advertised->name;
> +	const char *remote_name = promisor_info_internal_name(advertised);
>  	bool reload_config = false;
>  
>  	if (!(store_info->store_filter || store_info->store_token))
> @@ -927,7 +937,8 @@ static void filter_promisor_remote(struct repository *repo,
>  	/* Apply accepted remotes to the stable repo state */
>  	for_each_string_list_item(item, accepted_remotes) {
>  		struct promisor_info *info = item->util;
> -		struct promisor_remote *r = repo_promisor_remote_find(repo, info->name);
> +		const char *local = promisor_info_internal_name(info);
> +		struct promisor_remote *r = repo_promisor_remote_find(repo, local);
>  
>  		if (r) {
>  			r->accepted = 1;

Okay. These hunks are essentially a no-op for now given that we don't
yet store a local name.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 10/16] promisor-remote: pass config entry to all_fields_match() directly
  2026-03-23  8:05 ` [PATCH 10/16] promisor-remote: pass config entry to all_fields_match() directly Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:13AM +0100, Christian Couder wrote:
> The `in_list == 0` path of all_fields_match() re-looks up the

This reads a bit weird. How about "looks up the remote in config_info by
advertised->name repeatedly" instead?

> diff --git a/promisor-remote.c b/promisor-remote.c
> index da347fa2dc..8f2c1280c3 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -619,7 +627,11 @@ static int all_fields_match(struct promisor_info *advertised,
>  		if (!value)
>  			return 0;
>  
> -		if (in_list) {
> +		if (config_entry) {
> +			match = match_field_against_config(field, value,
> +							   config_entry);
> +		} else {
> +			struct string_list_item *item;
>  			for_each_string_list_item(item, config_info) {
>  				struct promisor_info *p = item->util;
>  				if (match_field_against_config(field, value, p)) {
> @@ -627,12 +639,6 @@ static int all_fields_match(struct promisor_info *advertised,
>  					break;
>  				}
>  			}
> -		} else {
> -			item = string_list_lookup(config_info, advertised->name);
> -			if (item) {
> -				struct promisor_info *p = item->util;
> -				match = match_field_against_config(field, value, p);
> -			}
>  		}
>  
>  		if (!match)

Okay, the logic is reversed now, which makes sense as we now pass `NULL`
instead of `1`, and the promisor info instead of `0`.

The change itself makes sense, but other than that I have a very hard
time understanding these two functions. I think they would strongly
benefit from some comments explaining what's going on, what the input is
and what we're trying to do. Of course that doesn't have to be part of
this commit here, but I would appreciate a preparatory commit that helps
guide the reader a bit.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 11/16] promisor-remote: refactor should_accept_remote() control flow
  2026-03-23  8:05 ` [PATCH 11/16] promisor-remote: refactor should_accept_remote() control flow Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:14AM +0100, Christian Couder wrote:
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 8f2c1280c3..c2f0eb7223 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -665,6 +665,12 @@ static int should_accept_remote(enum accept_promisor accept,
>  	const char *remote_name = advertised->name;
>  	const char *remote_url = advertised->url;
>  
> +	if (!remote_url || !*remote_url) {
> +		warning(_("no or empty URL advertised for remote '%s', "
> +			  "ignoring this remote"), remote_name);
> +		return 0;
> +	}
> +
>  	if (accept == ACCEPT_ALL)
>  		return all_fields_match(advertised, config_info, NULL);
>  

You mention that it shouldn't change behaviour in well-defined cases
where the remote sends non-empty URLs. But does it change behaviour in
ill-defined cases where the remote sends empty ones?

In other words, does this fix a bug that can be hit in the real world?

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 12/16] t5710: use proper file:// URIs for absolute paths
  2026-03-23  8:05 ` [PATCH 12/16] t5710: use proper file:// URIs for absolute paths Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:15AM +0100, Christian Couder wrote:
> diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
> index 357822c01a..c7a484228f 100755
> --- a/t/t5710-promisor-remote-capability.sh
> +++ b/t/t5710-promisor-remote-capability.sh
> @@ -76,6 +76,17 @@ copy_to_lop () {
>  	cp "$path" "$path2"
>  }
>  
> +# On Windows, 'pwd' returns a path like 'D:/foo/bar'. Prepend '/' to turn
> +# it into '/D:/foo/bar', which is what git expects in file:// URLs on Windows.
> +# On Unix, the path already starts with '/', so this is a no-op.
> +pwd_path=$(pwd)
> +case "$pwd_path" in
> +[a-zA-Z]:*) pwd_path="/$pwd_path" ;;
> +esac
> +PWD_URL="file://$pwd_path"
> +# Same as PWD_URL but with spaces percent-encoded, for use in URL patterns.
> +ENCODED_PWD_URL="file://$(echo "$pwd_path" | sed "s/ /%20/g")"

Can't we do this unconditionally on all platforms? "file:////foobar"
should be valid on Unix systems, too, shouldn't it?

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 13/16] promisor-remote: introduce promisor.acceptFromServerUrl
  2026-03-23  8:05 ` [PATCH 13/16] promisor-remote: introduce promisor.acceptFromServerUrl Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:16AM +0100, Christian Couder wrote:
> diff --git a/promisor-remote.c b/promisor-remote.c
> index c2f0eb7223..4cb18e1a6a 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
[snip]
> +static struct string_list *accept_from_server_url(struct repository *repo)
> +{
> +	static struct string_list accept_urls = STRING_LIST_INIT_DUP;
> +	static int initialized;
> +	const struct string_list *config_urls;
> +
> +	if (initialized)
> +		return &accept_urls;
> +
> +	initialized = 1;
> +
> +	if (!repo_config_get_string_multi(repo, "promisor.acceptfromserverurl", &config_urls)) {
> +		struct string_list_item *item;
> +
> +		for_each_string_list_item(item, config_urls) {
> +			struct allowed_url *allowed = valid_accept_url(item->string);
> +			if (allowed) {
> +				struct string_list_item *new;
> +				new = string_list_append(&accept_urls, item->string);
> +				new->util = allowed;
> +			}
> +		}
> +	}
> +
> +	return &accept_urls;
> +}

I'm still not much of a fan of us getting more and more function-local
static variables. It just feels wrong to me, and like we're accruing
technical debt. I also doubt that the performance overhead of storing
this on the stack with proper lifecycle management will matter at all
given that we're in a context where we talk with a remote anyway. The
handful of allocations really shouldn't matter in that context.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl
  2026-03-23  8:05 ` [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
  2026-03-23 18:54   ` Junio C Hamano
@ 2026-03-26 12:21   ` Patrick Steinhardt
  1 sibling, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:17AM +0100, Christian Couder wrote:
> A previous commit introduced the `promisor.acceptFromServerUrl` config
> variable along with the machinery to parse and validate the URL glob
> patterns and optional remote name prefixes it contains. However, these
> URL patterns are not yet tied into the client's acceptance logic.
> 
> When a promisor remote is already configured locally, its fields (like
> authentication tokens) may occasionally need to be refreshed by the
> server. If `promisor.acceptFromServer` is set to the secure default
> ("None"), these updates are rejected, potentially causing future
> fetches to fail.

This only talks about already-configured remotes, and so I was naturally
wondering what about not-yet-configured remotes. But looking ahead a bit
shows that this is implemented in the next commit. Good.

> To enable such targeted updates for trusted URLs, let's use the URL
> patterns from `promisor.acceptFromServerUrl` as an additional URL
> based whitelist.
> 
> Concretely, let's check the advertised URLs against the URL glob
> patterns by introducing a new small helper function called
> url_matches_accept_list(), which iterates over the glob patterns and
> returns the first matching allowed_url entry (or NULL).
> 
> (Before matching, the advertised URL is passed through url_normalize()
> so that case variations in the scheme/host, percent-encoding tricks,
> and ".." path segments cannot bypass the whitelist.)
> 
> Let's then use this helper at the tail of should_accept_remote() so
> that, when `accept == ACCEPT_NONE`, a known remote whose URL matches
> the whitelist is still accepted.
> 
> To prepare for this new logic, let's also:
> 
>  - Add an 'accept_urls' parameter to should_accept_remote().
> 
>  - Replace the BUG() guard in the ACCEPT_KNOWN_URL case with an
>    explicit 'if (accept == ACCEPT_KNOWN_URL) return' and a new
>    BUG() guard in the ACCEPT_NONE case, so url_matches_accept_list()
>    is only called in the ACCEPT_NONE case.
> 
>  - Call accept_from_server_url() from filter_promisor_remote()
>    and relax its early return so that the function is entered when
>    `accept_urls` has entries even if `accept == ACCEPT_NONE`.
> 
> Let's then properly document `promisor.acceptFromServerUrl` in
> "promisor.adoc" as an additive security whitelist for known remotes,
> including the URL normalization behavior, and let's mention it in
> "gitprotocol-v2.adoc".

I feel like the description is steering a bit too strongly into the
direction of a step-by-step instruction of how to implement the change
rather than explaining what's done and why it's done this way.

> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  Documentation/config/promisor.adoc    | 46 +++++++++++++++++
>  Documentation/gitprotocol-v2.adoc     |  9 ++--
>  promisor-remote.c                     | 50 +++++++++++++++---
>  t/t5710-promisor-remote-capability.sh | 73 +++++++++++++++++++++++++++
>  4 files changed, 166 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
> index b0fa43b839..6f5442cd65 100644
> --- a/Documentation/config/promisor.adoc
> +++ b/Documentation/config/promisor.adoc
> @@ -51,6 +51,52 @@ promisor.acceptFromServer::
>  	to "fetch" and "clone" requests from the client. Name and URL
>  	comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
>  
> +promisor.acceptFromServerUrl::
> +	A glob pattern to specify which URLs advertised by a server
> +	are considered trusted by the client. This option acts as an
> +	additive security whitelist that works in conjunction with
> +	`promisor.acceptFromServer`.
> ++
> +This option can appear multiple times in config files. An advertised
> +URL will be accepted if it matches _ANY_ glob pattern specified by
> +this option in _ANY_ config file read by Git.
> ++
> +Be _VERY_ careful with these glob patterns, as it can be a big
> +security hole to allow any advertised remote to be auto-configured!
> +To minimize security risks, follow these guidelines:
> ++
> +1. Start with a secure protocol scheme, like `https://` or `ssh://`.
> ++
> +2. Only allow domain names or paths where you control and trust _ALL_
> +   the content. Be especially careful with shared hosting platforms
> +   like `github.com` or `gitlab.com`. A broad pattern like
> +   `https://gitlab.com/*` is dangerous because it trusts every
> +   repository on the entire platform. Always restrict such patterns to
> +   your specific organization or namespace (e.g.,
> +   `https://gitlab.com/your-org/*`).
> ++
> +3. Don't use globs (`*`) in the domain name. For example
> +   `https://cdn.example.com/*` is much safer than
> +   `https://*.example.com/*`, because the latter matches
> +   `https://evil-hacker.net/fake.example.com/repo`.
> ++
> +4. Make sure to have a `/` at the end of the domain name (or the end
> +   of specific directories). For example `https://cdn.example.com/*`
> +   is much safer than `https://cdn.example.com*`, because the latter
> +   matches `https://cdn.example.com.hacker.net/repo`.
> ++
> +Before matching, the advertised URL is normalized: the scheme and
> +host are lowercased, percent-encoded characters are decoded where
> +possible, and path segments like `..` are resolved.  Glob patterns
> +are matched against this normalized URL as-is, so patterns should
> +be written in normalized form (e.g., lowercase scheme and host).
> ++
> +Even if `promisor.acceptFromServer` is set to `None` (the default),
> +Git will still accept field updates (like tokens) for known remotes,
> +provided their URLs match a pattern in
> +`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
> +details on the protocol.

Given that there's a bunch to process here, would it make sense to give
users an example for how to do it properly?

I also wonder why we require a new config entry instead of extending
`promisor.acceptFromRemote` to have for example a new "url:https://..."
setting. Are there cases where you would ever want to use the new
URL-based schema with a different setting than "all"?

I guess the case with "none" is exactly that, where you may auto-update
configured remotes. But I wonder whether it would be more sensible to
split out behaviour of accepting and updating promisors into separate
configuration variables. These are ultimately different concerns, and
the interaction as layed out in this commit is somewhat non-obvious to
me.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 15/16] promisor-remote: auto-configure unknown remotes
  2026-03-23  8:05 ` [PATCH 15/16] promisor-remote: auto-configure unknown remotes Christian Couder
@ 2026-03-26 12:21   ` Patrick Steinhardt
  0 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
	Christian Couder

On Mon, Mar 23, 2026 at 09:05:18AM +0100, Christian Couder wrote:
> Previous commits have introduced the `promisor.acceptFromServerUrl`
> config variable to whitelist some URLs advertised by a server through
> the "promisor-remote" protocol capability.
> 
> However the new `promisor.acceptFromServerUrl` mechanism, like the old
> `promisor.acceptFromServer` mechanism, still requires a remote to
> already exist in the client's local configuration before it can be
> accepted. This places a significant manual burden on users to
> pre-configure these remotes, and creates friction for administrators
> who have to troubleshoot or manually provision these setups for their
> teams.
> 
> To eliminate this burden, let's automatically create a new `[remote]`
> section in the client's config when a server advertises an unknown
> remote whose URL matches a `promisor.acceptFromServerUrl` glob pattern.

Would it make sense to extend git-clone(1) to have a command line option
that basically does this as a one-shot? Something like `git clone
--accept-promisors=url:https://gitlab.com/*`? I assume that many users
may not want to keep on updating their configured promisors all the
time.

Furthermore, this here reconfirms my thought on the previous commit that
it would make sense to detangle accepting promisors, storing them in the
configuration and updating them automatically. These are all different
things:

  - Accepting promisors is basically an ongoing runtime thing where you
    start to use announced promisors even though they are not configured
    at all.

  - Storing promisors is typically a one-time thing that you'd want to
    do when creating a new repository.

  - Updating promisors automatically is probably something you want to
    do on an ongoing basis when you have stored promisors.

We're currently putting all of these use cases into the same bag, but
they have very different characteristics. I guess the most common use
case will eventually be to never auto-accept promisors, store them at
clone time, and keep them updated whenever they change. This cannot be
expressed with "promisor.acceptFromServerUrl" as far as I understand.

So ideally, we'd have:

  - "acceptFromServer" to configure ongoing runtime behaviour.

  - "storeFromServer" and a one-time command-line option for
    git-clone(1) to configure which remotes to persist.

  - "updateFromServer" to update stored promisors.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/16] Auto-configure advertised remotes via URL whitelist
  2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
                   ` (15 preceding siblings ...)
  2026-03-23  8:05 ` [PATCH 16/16] doc: promisor: improve acceptFromServer entry Christian Couder
@ 2026-03-26 12:21 ` Patrick Steinhardt
  16 siblings, 0 replies; 33+ messages in thread
From: Patrick Steinhardt @ 2026-03-26 12:21 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren

On Mon, Mar 23, 2026 at 09:05:03AM +0100, Christian Couder wrote:
> High level description of the patches
> =====================================
> 
>  - Patch 1/16 ("promisor-remote: try accepted remotes before others in
>    get_direct()"):
> 
>    Fixes promisor_remote_get_direct() to prioritize accepted
>    remotes. This could be a separate fix, but is needed towards the
>    end of the series.
> 
>  - Patches 2-3/16 ("urlmatch:*"):
> 
>    Exposes and adapts helpers in the urlmatch API.
> 
>  - Patches 4-11/16 ("promisor-remote:*"):
> 
>    Big refactoring of filter_promisor_remote() and
>    should_accept_remote(). This keeps `struct promisor_info` instances
>    alive longer to anticipate possible state-desync bugs, decouples
>    the server's advertised name from the local config name, and
>    sanitizes control flow without changing the existing behavior.
> 
>  - Patch 12/16 ("t5710:*"):
> 
>    Cleans up how "file://" URIs are managed in the test script to
>    prepare for URI normalization later in the series and avoid issues
>    on Windows.
> 
>  - Patches 13-15/16 ("promisor-remote:*"):
> 
>    The core feature. Introduces the parsing machinery, adds the
>    additive whitelist for known remotes (with url_normalize()
>    security), and finally implements the auto-creation and collision
>    resolution for unknown remotes.
> 
>  - Patch 16/16 ("doc: promisor: improve acceptFromServer entry"):
> 
>    Cleans up and modernizes the existing `promisor.acceptFromServer`
>    documentation.

I wonder whether it would make sense to split up this series into two.
The first 12 patches and parts of 16 are all sensible improvements that
can land independent of the patches that introduce the new logic. And
given that I expect some discussion around the new logic itself, I
expect that these refactorings can land way faster on their own.

It would also help reduce the review load a bit if one then ultimately
only has to review three patches for the new feature.

Patrick

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl
  2026-03-23 18:54   ` Junio C Hamano
  2026-03-23 23:47     ` Junio C Hamano
@ 2026-03-27 12:17     ` Christian Couder
  1 sibling, 0 replies; 33+ messages in thread
From: Christian Couder @ 2026-03-27 12:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
	Elijah Newren, Christian Couder

On Mon, Mar 23, 2026 at 7:54 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
> > index b0fa43b839..6f5442cd65 100644
> > --- a/Documentation/config/promisor.adoc
> > +++ b/Documentation/config/promisor.adoc
> > @@ -51,6 +51,52 @@ promisor.acceptFromServer::
> >       to "fetch" and "clone" requests from the client. Name and URL
> >       comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
> >
> > +promisor.acceptFromServerUrl::
> > +     A glob pattern to specify which URLs advertised by a server
> > +     are considered trusted by the client. This option acts as an
> > +     additive security whitelist that works in conjunction with
> > +     `promisor.acceptFromServer`.
>
> Between the first sentence and the second one, I think there needs
> to be an explanation on what "trusted" means in this context.  Is it
> trusted so that the URL can feed random configuration variable=value
> pairs for the client to blindly apply?  Or is it trusted to do very
> limited things that other remotes can do, and if so what are these
> limited things?  Without knowing that, the end-users cannot assess
> the security implications of setting this option.

Yeah, in the current version, the following is used, which is more explicit:

    A glob pattern to specify which server-advertised URLs a
    client is allowed to act on. When a URL matches, the client
    will accept the advertised remote as a promisor remote and may
    automatically accept field updates (such as authentication
    tokens) from the server, even if `promisor.acceptFromServer`
    is set to `none` (the default).

> I am guessing that the client would behave as if the existing
> promisor.acceptFromServer configuration variable were set to "all"
> when talking with a remote whose URL matches one of the patterns
> listed?

Yes, that's the idea.

> By the way, some people may suggest "white" -> "allow".

Right, I have changed all the "whitelist" instances with "allowlist".

> > +1. Start with a secure protocol scheme, like `https://` or `ssh://`.
>
> Is there a practical reason why people would want to use schemes
> other than the above two?  This sounds like something a small amount
> of code can easily enforce.

The main issue is that some remote helper schemes might be secure,
while it might be difficult to maintain a hardcoded allowlist of them.
(Different implementations might exist out there with the same scheme
name but different security properties.)

Also file:// and http:// for example might be OK in some corporate setups.

We could add yet another config variable (or env variable) for an
allowlist of schemes (on top of `https://` and `ssh://` which would be
the only ones accepted by default), but maybe we can do that in a
future patch series.

[...]

> >> +3. Don't use globs (`*`) in the domain name. For example
> >> +   `https://cdn.example.com/*` is much safer than
> >> +   `https://*.example.com/*`, because the latter matches
> >> +   `https://evil-hacker.net/fake.example.com/repo`.
> >
> > Is there a practical use case where allowing '*' to match anything
> > that contains a slash '/' is useful?
> >
> >> +4. Make sure to have a `/` at the end of the domain name (or the end
> >> +   of specific directories). For example `https://cdn.example.com/*`
> >> +   is much safer than `https://cdn.example.com*`, because the latter
> >> +   matches `https://cdn.example.com.hacker.net/repo`.
> >
> > Ditto.  The above two points sound like excuses to keep sloppy
> > asterisk matching logic.  Yes, retroactively tightening rules always
> > have risk to break existing deployments, but if existing code paths
> > of urlmatch do not have any good reason to allow '*' to match a
> > string that contains a slash '/', perhaps there is no fallout.
>
> I probably should caution the readers not to take the above too
> literally.  Forbidding an asterisk '*' glob not to match '/'
> anywhere in urlmatch will obviusly break existing deployments that
> does this
>
>     [http "https://example.com/*"]
>         var = val
>
> and expects it to catch any URL pointing into the site.

Yeah, the main reason for allowing an asterisk '*' glob to match '/'
is to allow something like:

git config set --global promisor.acceptFromServerUrl "https://my-org.com/*"

to be all what is needed for most internal work in many random orgs.

> But I still do think the matcher should be more intelligent than the
> current implementation to avoid pitfalls like #3 and #4 above.
>
> Perhaps if an additional rule says that '*' after the scheme:// part
> before the first '/' in the pattern, e.g.,
>
>     https://*.example.com/
>     https://*.example.com
>     https://example.com*
>
> unlike '*' that appear anywhere else, never matches a substring that
> contains a slash '/' in it, it would cover plausible mistakes that
> the above #3 and #4 are trying to catch, without hurting any real
> world use case?

I agree that it is better security wise, so I have implemented it in
the current version. Now the scheme and port parts must match exactly
while * match any sequence of characters within the host and path
parts but cannot cross part boundaries.

It's not a panacea though. Users still have to be very careful.

The current documentation looks like this:

--------------------

promisor.acceptFromServerUrl::
    A glob pattern to specify which server-advertised URLs a
    client is allowed to act on. When a URL matches, the client
    will accept the advertised remote as a promisor remote and may
    automatically accept field updates (such as authentication
    tokens) from the server, even if `promisor.acceptFromServer`
    is set to `none` (the default).
+
This option can appear multiple times in config files. An advertised
URL will be accepted if it matches _ANY_ glob pattern specified by
this option in _ANY_ config file read by Git.
+
Be _VERY_ careful with these patterns: `*` matches any sequence of
characters within the 'host' and 'path' parts of a URL (but cannot
cross part boundaries). An overly broad pattern is a major security
risk, as a matching URL allows a server to update fields (such as
authentication tokens) on known remotes without further confirmation.
To minimize security risks, follow these guidelines:
+
1. Start with a secure protocol scheme, like `https://` or `ssh://`.
+
2. Only allow domain names or paths where you control and trust _ALL_
   the content. Be especially careful with shared hosting platforms
   like `github.com` or `gitlab.com`. A broad pattern like
   `https://gitlab.com/*` is dangerous because it trusts every
   repository on the entire platform. Always restrict such patterns to
   your specific organization or namespace (e.g.,
   `https://gitlab.com/your-org/*`).
+
3. Never use globs at the end of domain names. For example,
   `https://cdn.your-org.com/*` might be safe, but
   `https://cdn.your-org.com*/*` is a major security risk because
   the latter matches `https://cdn.your-org.com.hacker.net/repo`.
+
4. Be careful using globs at the beginning of domain names. While the
   code ensures a `*` in the host cannot cross into the path, a
   pattern like `https://*.example.com/*` will still match any
   subdomain. This is extremely dangerous on shared hosting platforms
   (e.g., `https://*.github.io/*` trusts every user's site on the
   entire platform).
+
Before matching, both the advertised URL and the pattern are
normalized: the scheme and host are lowercased, percent-encoded
characters are decoded where possible, and path segments like `..`
are resolved. The port must also match exactly (e.g.,
`https://example.com:8080/*` will not match a URL advertised on
port 9999).
+
For the security implications of accepting a promisor remote, see the
documentation of `promisor.acceptFromServer`. For details on the
protocol, see linkgit:gitprotocol-v2[5].

--------------------

I will work on the suggestions Patrick made before sending the v2
(which I may split into 2 patch series as Patrick suggested).

Thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2026-03-27 12:18 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-23  8:05 [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Christian Couder
2026-03-23  8:05 ` [PATCH 01/16] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
2026-03-26 12:20   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 02/16] urlmatch: change 'allow_globs' arg to bool Christian Couder
2026-03-23  8:05 ` [PATCH 03/16] urlmatch: add url_is_valid_pattern() helper Christian Couder
2026-03-26 12:20   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 04/16] promisor-remote: clarify that a remote is ignored Christian Couder
2026-03-26 12:20   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 05/16] promisor-remote: refactor has_control_char() Christian Couder
2026-03-23  8:05 ` [PATCH 06/16] promisor-remote: refactor accept_from_server() Christian Couder
2026-03-23  8:05 ` [PATCH 07/16] promisor-remote: keep accepted promisor_info structs alive Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 08/16] promisor-remote: remove the 'accepted' strvec Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 09/16] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 10/16] promisor-remote: pass config entry to all_fields_match() directly Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 11/16] promisor-remote: refactor should_accept_remote() control flow Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 12/16] t5710: use proper file:// URIs for absolute paths Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 13/16] promisor-remote: introduce promisor.acceptFromServerUrl Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 14/16] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
2026-03-23 18:54   ` Junio C Hamano
2026-03-23 23:47     ` Junio C Hamano
2026-03-27 12:17     ` Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 15/16] promisor-remote: auto-configure unknown remotes Christian Couder
2026-03-26 12:21   ` Patrick Steinhardt
2026-03-23  8:05 ` [PATCH 16/16] doc: promisor: improve acceptFromServer entry Christian Couder
2026-03-26 12:21 ` [PATCH 00/16] Auto-configure advertised remotes via URL whitelist Patrick Steinhardt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox