* [PATCH 1/9] promisor-remote: refactor initialising field lists
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2025-12-23 11:11 ` [PATCH 2/9] promisor-remote: allow a client to store fields Christian Couder
` (9 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
In "promisor-remote.c", the fields_sent() and fields_checked()
functions serve similar purposes and contain a small amount of
duplicated code.
As we are going to add a similar function in a following commit,
let's refactor this common code into a new initialize_fields_list()
function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/promisor-remote.c b/promisor-remote.c
index 77ebf537e2..5d8151cedb 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -375,18 +375,24 @@ static char *fields_from_config(struct string_list *fields_list, const char *con
return fields;
}
+static struct string_list *initialize_fields_list(struct string_list *fields_list, int *initialized,
+ const char *config_key)
+{
+ if (!*initialized) {
+ fields_list->cmp = strcasecmp;
+ fields_from_config(fields_list, config_key);
+ *initialized = 1;
+ }
+
+ return fields_list;
+}
+
static struct string_list *fields_sent(void)
{
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.sendFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.sendFields");
}
static struct string_list *fields_checked(void)
@@ -394,13 +400,7 @@ static struct string_list *fields_checked(void)
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.checkFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
/*
--
2.52.0.319.gfcaffa7898
* [PATCH 2/9] promisor-remote: allow a client to store fields
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2025-12-23 11:11 ` [PATCH 1/9] promisor-remote: refactor initialising field lists Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2026-01-07 10:05 ` Patrick Steinhardt
2025-12-23 11:11 ` [PATCH 3/9] clone: make filter_options local to cmd_clone() Christian Couder
` (8 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Another previous commit, c213820c51 (promisor-remote: allow a client
to check fields, 2025-09-08), has made it possible for a client to
decide if it accepts a promisor remote advertised by a server based
on these additional fields.
Often though, it would be interesting for the client to just store in
its configuration files these additional fields passed by the server,
so that it can use them when needed.
For example if a token is necessary to access a promisor remote, that
token could be updated frequently only on the server side and then
passed to all the clients through the "promisor-remote" capability,
avoiding the need to update it on all the clients manually.
Storing the token on the client side makes sure that the token is
available when the client needs to access the promisor remotes for a
lazy fetch.
In the same way, if it appears that it's better to use a different
filter to access a promisor remote, it could be helpful if the client
could automatically use it.
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
Like "promisor.checkFields" and "promisor.sendFields", it should
contain a comma or space separated list of field names. Only the
"partialCloneFilter" and "token" field names are supported for now.
When a server advertises a promisor remote, for example "foo", along
with for example "token=XXXXX" to a client, and on the client side
"promisor.storeFields" contains "token", then the client will store
XXXXX for the "remote.foo.token" variable in its configuration file
and reload its configuration so it can immediately use this new
configuration variable.
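
As a rough illustration of what "contains" means here, the following is a minimal standalone sketch of looking up a field name in a comma or space separated list, comparing case-insensitively (the patch sets the list's comparison function to strcasecmp). The helper name field_listed() is hypothetical and this is not the code Git uses, which goes through its string_list API.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/*
 * Hypothetical helper, not Git's implementation: report whether
 * `name` appears in `list`, a comma or space separated list of
 * field names, comparing case-insensitively.
 */
static bool field_listed(const char *list, const char *name)
{
	const char *delims = ", ";
	size_t name_len = strlen(name);
	const char *p = list;

	while (*p) {
		size_t len;

		p += strspn(p, delims);		/* skip separators */
		len = strcspn(p, delims);	/* length of the next field name */
		if (len == name_len && !strncasecmp(p, name, len))
			return true;
		p += len;
	}
	return false;
}
```

With this sketch, a list such as "partialcloneFilter" matches the field name "partialCloneFilter", while "tokens" does not match "token" because whole names are compared.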
A message is emitted on stderr to warn users when the config is
changed.
Note that even if "promisor.acceptFromServer" is set to "all", a
promisor remote has to be already configured on the client side for
some of its config to be changed. In any case no new remote is
configured and no new URL is stored.
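
The rule above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the patch itself: should_store() and make_key() are hypothetical names, the validation of filters and tokens is left out, and the "configured" flag stands in for the lookup of the remote in the client's existing configuration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, simplified model of the rule described above; it is
 * not Git's code. `configured` says whether the remote already exists
 * in the client config, `current` is the client's existing value (or
 * NULL), and `advertised` is the value sent by the server (or NULL).
 */
static bool should_store(bool configured, const char *current,
			 const char *advertised)
{
	if (!configured || !advertised)
		return false;	/* never create a remote, never store an absent value */
	return !current || strcmp(current, advertised) != 0;
}

/* Build the "remote.<name>.<field>" key; returns false on truncation. */
static bool make_key(char *buf, size_t size, const char *remote,
		     const char *field)
{
	int n = snprintf(buf, size, "remote.%s.%s", remote, field);

	return n >= 0 && (size_t)n < size;
}
```

In this model an unchanged value is never rewritten, so no message would be printed for it, and an unknown remote is left alone even when the server advertises fields for it.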
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/config/promisor.adoc | 33 ++++++
Documentation/gitprotocol-v2.adoc | 12 ++-
promisor-remote.c | 148 +++++++++++++++++++++++++-
t/t5710-promisor-remote-capability.sh | 49 +++++++++
4 files changed, 236 insertions(+), 6 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index 93e5e0d9b5..b0fa43b839 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -89,3 +89,36 @@ variable. The fields are checked only if the
`promisor.acceptFromServer` config variable is not set to "None". If
set to "None", this config variable has no effect. See
linkgit:gitprotocol-v2[5].
+
+promisor.storeFields::
+ A comma or space separated list of additional remote related
+ field names. If a client accepts an advertised remote, the
+ client will store the values associated with these field names
+ taken from the remote advertisement into its configuration,
+ and then reload its remote configuration. Currently,
+ "partialCloneFilter" and "token" are the only supported field
+ names.
++
+For example if a server advertises "partialCloneFilter=blob:limit=20k"
+for remote "foo", and that remote is accepted, then "blob:limit=20k"
+will be stored for the "remote.foo.partialCloneFilter" configuration
+variable.
++
+If the new field value from an advertised remote is the same as the
+existing field value for that remote on the client side, then no
+change is made to the client configuration though.
++
+When a new value is stored, a message is printed to standard error to
+let users know about this.
++
+Note that for security reasons, if the remote is not already
+configured on the client side, nothing will be stored for that
+remote. In any case, no new remote will be created and no URL will be
+stored.
++
+Before storing a partial clone filter, it's parsed to check it's
+valid. If it's not, a warning is emitted and it's not stored.
++
+Before storing a token, a check is performed to ensure it contains no
+control character. If the check fails, a warning is emitted and it's
+not stored.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index c7db103299..d93dd279ea 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -826,9 +826,10 @@ are case-sensitive and MUST be transmitted exactly as specified
above. Clients MUST ignore fields they don't recognize to allow for
future protocol extensions.
-For now, the client can only use information transmitted through these
-fields to decide if it accepts the advertised promisor remote. In the
-future that information might be used for other purposes though.
+The client can use information transmitted through these fields to
+decide if it accepts the advertised promisor remote. Also, the client
+can be configured to store the values of these fields (see
+"promisor.storeFields" in linkgit:git-config[1]).
Field values MUST be urlencoded.
@@ -856,8 +857,9 @@ the server advertised, the client shouldn't advertise the
On the server side, the "promisor.advertise" and "promisor.sendFields"
configuration options can be used to control what it advertises. On
the client side, the "promisor.acceptFromServer" configuration option
-can be used to control what it accepts. See the documentation of these
-configuration options for more information.
+can be used to control what it accepts, and the "promisor.storeFields"
+option, to control what it stores. See the documentation of these
+configuration options in linkgit:git-config[1] for more information.
Note that in the future it would be nice if the "promisor-remote"
protocol capability could be used by the server, when responding to
diff --git a/promisor-remote.c b/promisor-remote.c
index 5d8151cedb..8d6d2d7b76 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -403,6 +403,14 @@ static struct string_list *fields_checked(void)
return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
+static struct string_list *fields_stored(void)
+{
+ static struct string_list fields_list = STRING_LIST_INIT_NODUP;
+ static int initialized;
+
+ return initialize_fields_list(&fields_list, &initialized, "promisor.storeFields");
+}
+
/*
* Struct for promisor remotes involved in the "promisor-remote"
* protocol capability.
@@ -692,6 +700,132 @@ static struct promisor_info *parse_one_advertised_remote(const char *remote_info
return info;
}
+static bool store_one_field(struct repository *repo, const char *remote_name,
+ const char *field_name, const char *field_key,
+ const char *advertised, const char *current)
+{
+ if (advertised && (!current || strcmp(current, advertised))) {
+ char *key = xstrfmt("remote.%s.%s", remote_name, field_key);
+
+ fprintf(stderr, _("Storing new %s from server for remote '%s'.\n"
+ " '%s' -> '%s'\n"),
+ field_name, remote_name,
+ current ? current : "",
+ advertised);
+
+ repo_config_set_worktree_gently(repo, key, advertised);
+ free(key);
+
+ return true;
+ }
+
+ return false;
+}
+
+/* Check that a filter is valid by parsing it */
+static bool valid_filter(const char *filter, const char *remote_name)
+{
+ struct list_objects_filter_options filter_opts = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ int res = gently_parse_list_objects_filter(&filter_opts, filter, &err);
+
+ if (res)
+ warning(_("invalid filter '%s' for remote '%s' "
+ "will not be stored: %s"),
+ filter, remote_name, err.buf);
+
+ list_objects_filter_release(&filter_opts);
+ strbuf_release(&err);
+
+ return !res;
+}
+
+/* Check that a token doesn't contain any control character */
+static bool valid_token(const char *token, const char *remote_name)
+{
+ const char *c = token;
+
+ for (; *c; c++)
+ if (iscntrl(*c)) {
+ warning(_("invalid token '%s' for remote '%s' "
+ "will not be stored"),
+ token, remote_name);
+ return false;
+ }
+
+ return true;
+}
+
+struct store_info {
+ struct repository *repo;
+ struct string_list config_info;
+ bool store_filter;
+ bool store_token;
+};
+
+static struct store_info *new_store_info(struct repository *repo)
+{
+ struct string_list *fields_to_store = fields_stored();
+ struct store_info *s = xmalloc(sizeof(*s));
+
+ s->repo = repo;
+
+ string_list_init_nodup(&s->config_info);
+ promisor_config_info_list(repo, &s->config_info, fields_to_store);
+ string_list_sort(&s->config_info);
+
+ s->store_filter = !!string_list_lookup(fields_to_store, promisor_field_filter);
+ s->store_token = !!string_list_lookup(fields_to_store, promisor_field_token);
+
+ return s;
+}
+
+static void free_store_info(struct store_info *s)
+{
+ if (s) {
+ promisor_info_list_clear(&s->config_info);
+ free(s);
+ }
+}
+
+static bool promisor_store_advertised_fields(struct promisor_info *advertised,
+ struct store_info *store_info)
+{
+ struct promisor_info *p;
+ struct string_list_item *item;
+ const char *remote_name = advertised->name;
+ bool reload_config = false;
+
+ if (!(store_info->store_filter || store_info->store_token))
+ return false;
+
+ /*
+ * Get existing config info for the advertised promisor
+ * remote. This ensures the remote is already configured on
+ * the client side.
+ */
+ item = string_list_lookup(&store_info->config_info, remote_name);
+
+ if (!item)
+ return false;
+
+ p = item->util;
+
+ if (store_info->store_filter && advertised->filter &&
+ valid_filter(advertised->filter, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "filter", promisor_field_filter,
+ advertised->filter, p->filter);
+
+ if (store_info->store_token && advertised->token &&
+ valid_token(advertised->token, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "token", promisor_field_token,
+ advertised->token, p->token);
+
+ return reload_config;
+}
+
static void filter_promisor_remote(struct repository *repo,
struct strvec *accepted,
const char *info)
@@ -700,7 +834,9 @@ static void filter_promisor_remote(struct repository *repo,
enum accept_promisor accept = ACCEPT_NONE;
struct string_list config_info = STRING_LIST_INIT_NODUP;
struct string_list remote_info = STRING_LIST_INIT_DUP;
+ struct store_info *store_info = NULL;
struct string_list_item *item;
+ bool reload_config = false;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -736,14 +872,24 @@ static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
- if (should_accept_remote(accept, advertised, &config_info))
+ if (should_accept_remote(accept, advertised, &config_info)) {
+ if (!store_info)
+ store_info = new_store_info(repo);
+ if (promisor_store_advertised_fields(advertised, store_info))
+ reload_config = true;
+
strvec_push(accepted, advertised->name);
+ }
promisor_info_free(advertised);
}
promisor_info_list_clear(&config_info);
string_list_clear(&remote_info, 0);
+ free_store_info(store_info);
+
+ if (reload_config)
+ repo_promisor_remote_reinit(repo);
}
char *promisor_remote_reply(const char *info)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 023735d6a8..a726af214a 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -360,6 +360,55 @@ test_expect_success "clone with promisor.checkFields" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ git -C server remote add otherLop "https://invalid.invalid" &&
+ git -C server config remote.otherLop.token "fooBar" &&
+ git -C server config remote.otherLop.stuff "baz" &&
+ git -C server config remote.otherLop.partialCloneFilter "blob:limit=10k" &&
+ test_when_finished "git -C server remote remove otherLop" &&
+
+ git -C server config remote.lop.token "fooXXX" &&
+ git -C server config remote.lop.partialCloneFilter "blob:limit=8k" &&
+
+ test_config -C server promisor.sendFields "partialCloneFilter, token" &&
+ test_when_finished "rm trace" &&
+
+ # Clone from server to create a client
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c remote.lop.token="fooYYY" \
+ -c remote.lop.partialCloneFilter="blob:none" \
+ -c promisor.acceptfromserver=All \
+ -c promisor.storeFields=partialcloneFilter \
+ --no-local --filter="blob:limit=5k" server client 2>err &&
+
+ # Check that the filter from the server is stored
+ echo "blob:limit=8k" >expected &&
+ git -C client config remote.lop.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that user is notified when the filter is stored
+ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
+ test_grep "'\''blob:none'\'' -> '\''blob:limit=8k'\''" err &&
+
+ # Check that the token from the server is NOT stored
+ echo "fooYYY" >expected &&
+ git -C client config remote.lop.token >actual &&
+ test_cmp expected actual &&
+ test_grep ! "Storing new token from server" err &&
+
+ # Check that the filter for an unknown remote is NOT stored
+ test_must_fail git -C client config remote.otherLop.partialCloneFilter >actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
--
2.52.0.319.gfcaffa7898
* Re: [PATCH 2/9] promisor-remote: allow a client to store fields
2025-12-23 11:11 ` [PATCH 2/9] promisor-remote: allow a client to store fields Christian Couder
@ 2026-01-07 10:05 ` Patrick Steinhardt
2026-02-04 10:20 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-01-07 10:05 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Tue, Dec 23, 2025 at 12:11:06PM +0100, Christian Couder wrote:
> A previous commit allowed a server to pass additional fields through
> the "promisor-remote" protocol capability after the "name" and "url"
> fields, specifically the "partialCloneFilter" and "token" fields.
>
> Another previous commit, c213820c51 (promisor-remote: allow a client
> to check fields, 2025-09-08), has made it possible for a client to
> decide if it accepts a promisor remote advertised by a server based
> on these additional fields.
>
> Often though, it would be interesting for the client to just store in
> its configuration files these additional fields passed by the server,
> so that it can use them when needed.
>
> For example if a token is necessary to access a promisor remote, that
> token could be updated frequently only on the server side and then
> passed to all the clients through the "promisor-remote" capability,
> avoiding the need to update it on all the clients manually.
>
> Storing the token on the client side makes sure that the token is
> available when the client needs to access the promisor remotes for a
> lazy fetch.
I guess another use case is that a client performs a fresh clone and
doesn't know anything about the remote's promisors yet, right? In that
case, the client may want to tell git-clone(1) to accept any of the
remote's advertised promisors, store it and then use that promisor's
filter to perform the actual clone.
> In the same way, if it appears that it's better to use a different
> filter to access a promisor remote, it could be helpful if the client
> could automatically use it.
>
> To allow this, let's introduce a new "promisor.storeFields"
> configuration variable.
>
> Like "promisor.checkFields" and "promisor.sendFields", it should
> contain a comma or space separated list of field names. Only the
> "partialCloneFilter" and "token" field names are supported for now.
>
> When a server advertises a promisor remote, for example "foo", along
> with for example "token=XXXXX" to a client, and on the client side
> "promisor.storeFields" contains "token", then the client will store
> XXXXX for the "remote.foo.token" variable in its configuration file
> and reload its configuration so it can immediately use this new
> configuration variable.
>
> A message is emitted on stderr to warn users when the config is
> changed.
>
> Note that even if "promisor.acceptFromServer" is set to "all", a
> promisor remote has to be already configured on the client side for
> some of its config to be changed. In any case no new remote is
> configured and no new URL is stored.
Hm, okay, so that's not yet part of this series. I assume this is going
to be part of a subsequent patch series then?
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 5d8151cedb..8d6d2d7b76 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -403,6 +403,14 @@ static struct string_list *fields_checked(void)
> return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
> }
>
> +static struct string_list *fields_stored(void)
> +{
> + static struct string_list fields_list = STRING_LIST_INIT_NODUP;
> + static int initialized;
> +
> + return initialize_fields_list(&fields_list, &initialized, "promisor.storeFields");
> +}
I'm a bit worried about all the function-local state that we're
accumulating in those functions. Wouldn't it be preferable if we instead
had a `struct promisor_remote` that encapsulates the information?
> @@ -692,6 +700,132 @@ static struct promisor_info *parse_one_advertised_remote(const char *remote_info
> return info;
> }
>
> +static bool store_one_field(struct repository *repo, const char *remote_name,
> + const char *field_name, const char *field_key,
> + const char *advertised, const char *current)
> +{
> + if (advertised && (!current || strcmp(current, advertised))) {
> + char *key = xstrfmt("remote.%s.%s", remote_name, field_key);
> +
> + fprintf(stderr, _("Storing new %s from server for remote '%s'.\n"
> + " '%s' -> '%s'\n"),
> + field_name, remote_name,
> + current ? current : "",
> + advertised);
> +
> + repo_config_set_worktree_gently(repo, key, advertised);
Why do we store this information in the current per-worktree config? I'd
expect that this should be stored in the local config.
> + free(key);
> +
> + return true;
> + }
> +
> + return false;
> +}
> +
> +/* Check that a filter is valid by parsing it */
> +static bool valid_filter(const char *filter, const char *remote_name)
> +{
> + struct list_objects_filter_options filter_opts = LIST_OBJECTS_FILTER_INIT;
> + struct strbuf err = STRBUF_INIT;
> + int res = gently_parse_list_objects_filter(&filter_opts, filter, &err);
> +
> + if (res)
> + warning(_("invalid filter '%s' for remote '%s' "
> + "will not be stored: %s"),
> + filter, remote_name, err.buf);
> +
> + list_objects_filter_release(&filter_opts);
> + strbuf_release(&err);
> +
> + return !res;
> +}
> +
> +/* Check that a token doesn't contain any control character */
> +static bool valid_token(const char *token, const char *remote_name)
> +{
> + const char *c = token;
> +
> + for (; *c; c++)
> + if (iscntrl(*c)) {
Makes sense. I was also wondering about whether we want to check for
non-space whitespace characters, like newlines.
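
For reference, here is a small standalone sketch of what such a check rejects when written against the standard <ctype.h> in the default "C" locale. It is an assumption that the iscntrl() available inside Git's source tree classifies characters the same way, and has_control_char() is a hypothetical name.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Sketch using the standard <ctype.h> iscntrl(), which may differ
 * from the iscntrl() available inside Git's own source tree.
 */
static bool has_control_char(const char *s)
{
	for (; *s; s++)
		if (iscntrl((unsigned char)*s))	/* cast avoids negative char values */
			return true;
	return false;
}
```

With the standard function in the "C" locale, newline, tab and carriage return all count as control characters, while a plain space does not.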
> + warning(_("invalid token '%s' for remote '%s' "
> + "will not be stored"),
> + token, remote_name);
> + return false;
> + }
> +
> + return true;
> +}
> +
> +struct store_info {
> + struct repository *repo;
> + struct string_list config_info;
> + bool store_filter;
> + bool store_token;
> +};
> +
> +static struct store_info *new_store_info(struct repository *repo)
This should be called `store_info_new()` according to our coding
guidelines.
> +{
> + struct string_list *fields_to_store = fields_stored();
> + struct store_info *s = xmalloc(sizeof(*s));
> +
> + s->repo = repo;
> +
> + string_list_init_nodup(&s->config_info);
> + promisor_config_info_list(repo, &s->config_info, fields_to_store);
> + string_list_sort(&s->config_info);
> +
> + s->store_filter = !!string_list_lookup(fields_to_store, promisor_field_filter);
> + s->store_token = !!string_list_lookup(fields_to_store, promisor_field_token);
> +
> + return s;
> +}
> +
> +static void free_store_info(struct store_info *s)
Likewise, this would be `store_info_free()`.
> diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
> index 023735d6a8..a726af214a 100755
> --- a/t/t5710-promisor-remote-capability.sh
> +++ b/t/t5710-promisor-remote-capability.sh
> @@ -360,6 +360,55 @@ test_expect_success "clone with promisor.checkFields" '
> check_missing_objects server 1 "$oid"
> '
>
> +test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + git -C server remote add otherLop "https://invalid.invalid" &&
> + git -C server config remote.otherLop.token "fooBar" &&
> + git -C server config remote.otherLop.stuff "baz" &&
> + git -C server config remote.otherLop.partialCloneFilter "blob:limit=10k" &&
> + test_when_finished "git -C server remote remove otherLop" &&
> +
> + git -C server config remote.lop.token "fooXXX" &&
> + git -C server config remote.lop.partialCloneFilter "blob:limit=8k" &&
> +
> + test_config -C server promisor.sendFields "partialCloneFilter, token" &&
> + test_when_finished "rm trace" &&
> +
> + # Clone from server to create a client
> + GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
> + -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.lop.url="file://$(pwd)/lop" \
> + -c remote.lop.token="fooYYY" \
> + -c remote.lop.partialCloneFilter="blob:none" \
> + -c promisor.acceptfromserver=All \
> + -c promisor.storeFields=partialcloneFilter \
> + --no-local --filter="blob:limit=5k" server client 2>err &&
One thing that's missing in these tests is verifying that a subsequent
git-fetch(1) updates the configuration.
Patrick
* Re: [PATCH 2/9] promisor-remote: allow a client to store fields
2026-01-07 10:05 ` Patrick Steinhardt
@ 2026-02-04 10:20 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 10:20 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Dec 23, 2025 at 12:11:06PM +0100, Christian Couder wrote:
> > A previous commit allowed a server to pass additional fields through
> > the "promisor-remote" protocol capability after the "name" and "url"
> > fields, specifically the "partialCloneFilter" and "token" fields.
> >
> > Another previous commit, c213820c51 (promisor-remote: allow a client
> > to check fields, 2025-09-08), has made it possible for a client to
> > decide if it accepts a promisor remote advertised by a server based
> > on these additional fields.
> >
> > Often though, it would be interesting for the client to just store in
> > its configuration files these additional fields passed by the server,
> > so that it can use them when needed.
> >
> > For example if a token is necessary to access a promisor remote, that
> > token could be updated frequently only on the server side and then
> > passed to all the clients through the "promisor-remote" capability,
> > avoiding the need to update it on all the clients manually.
> >
> > Storing the token on the client side makes sure that the token is
> > available when the client needs to access the promisor remotes for a
> > lazy fetch.
>
> I guess another use case is that a client performs a fresh clone and
> doesn't know anything about the remote's promisors yet, right? In that
> case, the client may want to tell git-clone(1) to accept any of the
> remote's advertised promisors, store it and then use that promisor's
> filter to perform the actual clone.
Actually there are two issues with this.
The first one is the security issue with the client adding a new
promisor to its config that I will discuss below.
The second one is the fact that it's better if the filter suggested by
the server is used right away during the initial clone, but you have
to pass a `--filter=<filter-spec>` option to the clone command in the
first place when you start the initial clone, and the filter suggested
by the server might differ from the one you pass. This is why the
second part of the series implements `--filter=auto`.
> > In the same way, if it appears that it's better to use a different
> > filter to access a promisor remote, it could be helpful if the client
> > could automatically use it.
By the way, I have removed this paragraph in version 2, which I am
going to send soon, as it could be misleading.
> > To allow this, let's introduce a new "promisor.storeFields"
> > configuration variable.
> >
> > Like "promisor.checkFields" and "promisor.sendFields", it should
> > contain a comma or space separated list of field names. Only the
> > "partialCloneFilter" and "token" field names are supported for now.
> >
> > When a server advertises a promisor remote, for example "foo", along
> > with for example "token=XXXXX" to a client, and on the client side
> > "promisor.storeFields" contains "token", then the client will store
> > XXXXX for the "remote.foo.token" variable in its configuration file
> > and reload its configuration so it can immediately use this new
> > configuration variable.
> >
> > A message is emitted on stderr to warn users when the config is
> > changed.
> >
> > Note that even if "promisor.acceptFromServer" is set to "all", a
> > promisor remote has to be already configured on the client side for
> > some of its config to be changed. In any case no new remote is
> > configured and no new URL is stored.
>
> Hm, okay, so that's not yet part of this series. I assume this is going
> to be part of a subsequent patch series then?
My opinion is that we should indeed work on that in a future separate
series, as it could be very useful in setups where clients trust the
server, like corporate setups. For now I prefer to keep things safe by
default and not make it possible.
> > diff --git a/promisor-remote.c b/promisor-remote.c
> > index 5d8151cedb..8d6d2d7b76 100644
> > --- a/promisor-remote.c
> > +++ b/promisor-remote.c
> > @@ -403,6 +403,14 @@ static struct string_list *fields_checked(void)
> > return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
> > }
> >
> > +static struct string_list *fields_stored(void)
> > +{
> > + static struct string_list fields_list = STRING_LIST_INIT_NODUP;
> > + static int initialized;
> > +
> > + return initialize_fields_list(&fields_list, &initialized, "promisor.storeFields");
> > +}
>
> I'm a bit worried about all the function-local state that we're
> accumulating in those functions. Wouldn't it be preferable if we instead
> had a `struct promisor_remote` that encapsulates the information?
I don't think we have a good standard way to manage information from
the config yet. Some suggestions have been made about using a new
struct for some config options, for example in:
https://lore.kernel.org/git/8899016f-eeef-404b-8da6-ff3a90e81cea@gmail.com/
and perhaps such a good standard way to manage config information will
result from these efforts, but I think it's too early to be sure.
In the meantime, I don't think it's a good idea to spend time on a
specialized way to do it just for promisor remotes.
> > @@ -692,6 +700,132 @@ static struct promisor_info *parse_one_advertised_remote(const char *remote_info
> > return info;
> > }
> >
> > +static bool store_one_field(struct repository *repo, const char *remote_name,
> > + const char *field_name, const char *field_key,
> > + const char *advertised, const char *current)
> > +{
> > + if (advertised && (!current || strcmp(current, advertised))) {
> > + char *key = xstrfmt("remote.%s.%s", remote_name, field_key);
> > +
> > + fprintf(stderr, _("Storing new %s from server for remote '%s'.\n"
> > + " '%s' -> '%s'\n"),
> > + field_name, remote_name,
> > + current ? current : "",
> > + advertised);
> > +
> > + repo_config_set_worktree_gently(repo, key, advertised);
>
> Why do we store this information in the current per-worktree config? I'd
> expect that this should be stored in the local config.
Right, repo_config_set_gently() is used now instead.
> > + free(key);
> > +
> > + return true;
> > + }
[...]
> > +struct store_info {
> > + struct repository *repo;
> > + struct string_list config_info;
> > + bool store_filter;
> > + bool store_token;
> > +};
> > +
> > +static struct store_info *new_store_info(struct repository *repo)
>
> This should be called `store_info_new()` according to our coding
> guidelines.
Fine, `store_info_new()` and `store_info_free()` are now used as you suggest.
> > diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
> > index 023735d6a8..a726af214a 100755
> > --- a/t/t5710-promisor-remote-capability.sh
> > +++ b/t/t5710-promisor-remote-capability.sh
> > @@ -360,6 +360,55 @@ test_expect_success "clone with promisor.checkFields" '
> > check_missing_objects server 1 "$oid"
> > '
> >
> > +test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
> > + git -C server config promisor.advertise true &&
> > + test_when_finished "rm -rf client" &&
> > +
> > + git -C server remote add otherLop "https://invalid.invalid" &&
> > + git -C server config remote.otherLop.token "fooBar" &&
> > + git -C server config remote.otherLop.stuff "baz" &&
> > + git -C server config remote.otherLop.partialCloneFilter "blob:limit=10k" &&
> > + test_when_finished "git -C server remote remove otherLop" &&
> > +
> > + git -C server config remote.lop.token "fooXXX" &&
> > + git -C server config remote.lop.partialCloneFilter "blob:limit=8k" &&
> > +
> > + test_config -C server promisor.sendFields "partialCloneFilter, token" &&
> > + test_when_finished "rm trace" &&
> > +
> > + # Clone from server to create a client
> > + GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
> > + -c remote.lop.promisor=true \
> > + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> > + -c remote.lop.url="file://$(pwd)/lop" \
> > + -c remote.lop.token="fooYYY" \
> > + -c remote.lop.partialCloneFilter="blob:none" \
> > + -c promisor.acceptfromserver=All \
> > + -c promisor.storeFields=partialcloneFilter \
> > + --no-local --filter="blob:limit=5k" server client 2>err &&
>
> One thing that's missing in these tests is to verify that a subsequent
> git-fetch(1) updates the configuration.
Ok, I have added a test using `git fetch`.
Thanks.
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 3/9] clone: make filter_options local to cmd_clone()
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2025-12-23 11:11 ` [PATCH 1/9] promisor-remote: refactor initialising field lists Christian Couder
2025-12-23 11:11 ` [PATCH 2/9] promisor-remote: allow a client to store fields Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2025-12-23 11:11 ` [PATCH 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
` (7 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/clone.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
a bit less easy to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_clone() by
moving its definition into that function and making it non-static.
The only additional change to make this work is to pass it as an
argument to checkout(), so it's a small and quite cheap cleanup anyway.
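The ownership change described above can be sketched in isolation as follows. This is a simplified illustration, not the code from "builtin/clone.c": `toy_filter_options`, `toy_checkout()`, `toy_cmd_clone()` and `toy_filter_release()` are hypothetical names, and the struct only carries what is needed to show a callee borrowing state that the command function owns and releases.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much reduced stand-in for the filter options struct. */
struct toy_filter_options {
	char *spec;
	int choice;
};
#define TOY_FILTER_INIT { NULL, 0 }

static void toy_filter_release(struct toy_filter_options *opts)
{
	free(opts->spec);
	opts->spec = NULL;
	opts->choice = 0;
}

/* The callee borrows the options through a pointer; it owns nothing. */
static int toy_checkout(const struct toy_filter_options *filter_options,
			int filter_submodules)
{
	return filter_submodules && filter_options->choice;
}

/* The command function owns the options for its whole lifetime. */
static int toy_cmd_clone(const char *filter_arg, int filter_submodules)
{
	struct toy_filter_options filter_options = TOY_FILTER_INIT;
	int ret;

	if (filter_arg) {
		size_t len = strlen(filter_arg) + 1;

		filter_options.spec = malloc(len);
		assert(filter_options.spec);
		memcpy(filter_options.spec, filter_arg, len);
		filter_options.choice = 1;
	}

	ret = toy_checkout(&filter_options, filter_submodules);
	toy_filter_release(&filter_options);
	return ret;
}
```

With the variable local to the command function, the point where it is initialized and the point where it is released are visible in a single place.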
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/clone.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index b19b302b06..186e5498d4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -77,7 +77,6 @@ static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
static int max_jobs = -1;
static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static int config_filter_submodules = -1; /* unspecified */
static int option_remote_submodules;
@@ -634,7 +633,9 @@ static int git_sparse_checkout_init(const char *repo)
return result;
}
-static int checkout(int submodule_progress, int filter_submodules,
+static int checkout(int submodule_progress,
+ struct list_objects_filter_options *filter_options,
+ int filter_submodules,
enum ref_storage_format ref_storage_format)
{
struct object_id oid;
@@ -723,9 +724,9 @@ static int checkout(int submodule_progress, int filter_submodules,
strvec_pushf(&cmd.args, "--ref-format=%s",
ref_storage_format_to_name(ref_storage_format));
- if (filter_submodules && filter_options.choice)
+ if (filter_submodules && filter_options->choice)
strvec_pushf(&cmd.args, "--filter=%s",
- expand_list_objects_filter_spec(&filter_options));
+ expand_list_objects_filter_spec(filter_options));
if (option_single_branch >= 0)
strvec_push(&cmd.args, option_single_branch ?
@@ -903,6 +904,7 @@ int cmd_clone(int argc,
enum transport_family family = TRANSPORT_FAMILY_ALL;
struct string_list option_config = STRING_LIST_INIT_DUP;
int option_dissociate = 0;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
int option_filter_submodules = -1; /* unspecified */
struct string_list server_options = STRING_LIST_INIT_NODUP;
const char *bundle_uri = NULL;
@@ -1625,9 +1627,13 @@ int cmd_clone(int argc,
return 1;
junk_mode = JUNK_LEAVE_REPO;
- err = checkout(submodule_progress, filter_submodules,
+ err = checkout(submodule_progress,
+ &filter_options,
+ filter_submodules,
ref_storage_format);
+ list_objects_filter_release(&filter_options);
+
string_list_clear(&option_not, 0);
string_list_clear(&option_config, 0);
string_list_clear(&server_options, 0);
--
2.52.0.319.gfcaffa7898
^ permalink raw reply related [flat|nested] 80+ messages in thread
* [PATCH 4/9] fetch: make filter_options local to cmd_fetch()
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (2 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 3/9] clone: make filter_options local to cmd_clone() Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2026-01-07 10:05 ` Patrick Steinhardt
2025-12-23 11:11 ` [PATCH 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
` (6 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/fetch.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become a
bit less easy to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_fetch() by
moving its definition into that function and making it non-static.
This requires passing a pointer to it through the prepare_transport(),
do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one()
functions, but it's quite straightforward.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/fetch.c | 48 +++++++++++++++++++++++++++---------------------
1 file changed, 27 insertions(+), 21 deletions(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 288d3772ea..b984173447 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -97,7 +97,6 @@ static struct strbuf default_rla = STRBUF_INIT;
static struct transport *gtransport;
static struct transport *gsecondary;
static struct refspec refmap = REFSPEC_INIT_FETCH;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static struct string_list server_options = STRING_LIST_INIT_DUP;
static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
@@ -1449,7 +1448,8 @@ static void add_negotiation_tips(struct git_transport_options *smart_options)
smart_options->negotiation_tips = oids;
}
-static struct transport *prepare_transport(struct remote *remote, int deepen)
+static struct transport *prepare_transport(struct remote *remote, int deepen,
+ struct list_objects_filter_options *filter_options)
{
struct transport *transport;
@@ -1473,9 +1473,9 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
if (refetch)
set_option(transport, TRANS_OPT_REFETCH, "yes");
- if (filter_options.choice) {
+ if (filter_options->choice) {
const char *spec =
- expand_list_objects_filter_spec(&filter_options);
+ expand_list_objects_filter_spec(filter_options);
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, spec);
set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
}
@@ -1493,7 +1493,8 @@ static int backfill_tags(struct display_state *display_state,
struct ref_transaction *transaction,
struct ref *ref_map,
struct fetch_head *fetch_head,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
int retcode, cannot_reuse;
@@ -1507,7 +1508,7 @@ static int backfill_tags(struct display_state *display_state,
cannot_reuse = transport->cannot_reuse ||
deepen_since || deepen_not.nr;
if (cannot_reuse) {
- gsecondary = prepare_transport(transport->remote, 0);
+ gsecondary = prepare_transport(transport->remote, 0, filter_options);
transport = gsecondary;
}
@@ -1713,7 +1714,8 @@ static int commit_ref_transaction(struct ref_transaction **transaction,
static int do_fetch(struct transport *transport,
struct refspec *rs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct ref_transaction *transaction = NULL;
struct ref *ref_map = NULL;
@@ -1873,7 +1875,7 @@ static int do_fetch(struct transport *transport,
* the transaction and don't commit anything.
*/
if (backfill_tags(&display_state, transport, transaction, tags_ref_map,
- &fetch_head, config))
+ &fetch_head, config, filter_options))
retcode = 1;
}
@@ -2198,20 +2200,21 @@ static int fetch_multiple(struct string_list *list, int max_children,
* Fetching from the promisor remote should use the given filter-spec
* or inherit the default filter-spec from the config.
*/
-static inline void fetch_one_setup_partial(struct remote *remote)
+static inline void fetch_one_setup_partial(struct remote *remote,
+ struct list_objects_filter_options *filter_options)
{
/*
* Explicit --no-filter argument overrides everything, regardless
* of any prior partial clones and fetches.
*/
- if (filter_options.no_filter)
+ if (filter_options->no_filter)
return;
/*
* If no prior partial clone/fetch and the current fetch DID NOT
* request a partial-fetch, do a normal fetch.
*/
- if (!repo_has_promisor_remote(the_repository) && !filter_options.choice)
+ if (!repo_has_promisor_remote(the_repository) && !filter_options->choice)
return;
/*
@@ -2220,8 +2223,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* filter-spec as the default for subsequent fetches to this
* remote if there is currently no default filter-spec.
*/
- if (filter_options.choice) {
- partial_clone_register(remote->name, &filter_options);
+ if (filter_options->choice) {
+ partial_clone_register(remote->name, filter_options);
return;
}
@@ -2230,14 +2233,15 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* explicitly given filter-spec or inherit the filter-spec from
* the config.
*/
- if (!filter_options.choice)
- partial_clone_get_default_filter_spec(&filter_options, remote->name);
+ if (!filter_options->choice)
+ partial_clone_get_default_filter_spec(filter_options, remote->name);
return;
}
static int fetch_one(struct remote *remote, int argc, const char **argv,
int prune_tags_ok, int use_stdin_refspecs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct refspec rs = REFSPEC_INIT_FETCH;
int i;
@@ -2249,7 +2253,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
die(_("no remote repository specified; please specify either a URL or a\n"
"remote name from which new revisions should be fetched"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, filter_options);
if (prune < 0) {
/* no command line request */
@@ -2304,7 +2308,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
sigchain_push_common(unlock_pack_on_signal);
atexit(unlock_pack_atexit);
sigchain_push(SIGPIPE, SIG_IGN);
- exit_code = do_fetch(gtransport, &rs, config);
+ exit_code = do_fetch(gtransport, &rs, config, filter_options);
sigchain_pop(SIGPIPE);
refspec_clear(&rs);
transport_disconnect(gtransport);
@@ -2329,6 +2333,7 @@ int cmd_fetch(int argc,
const char *submodule_prefix = "";
const char *bundle_uri;
struct string_list list = STRING_LIST_INIT_DUP;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
struct remote *remote = NULL;
int all = -1, multiple = 0;
int result = 0;
@@ -2594,7 +2599,7 @@ int cmd_fetch(int argc,
trace2_region_enter("fetch", "negotiate-only", the_repository);
if (!remote)
die(_("must supply remote when using --negotiate-only"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, &filter_options);
if (gtransport->smart_options) {
gtransport->smart_options->acked_commits = &acked_commits;
} else {
@@ -2616,12 +2621,12 @@ int cmd_fetch(int argc,
} else if (remote) {
if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
trace2_region_enter("fetch", "setup-partial", the_repository);
- fetch_one_setup_partial(remote);
+ fetch_one_setup_partial(remote, &filter_options);
trace2_region_leave("fetch", "setup-partial", the_repository);
}
trace2_region_enter("fetch", "fetch-one", the_repository);
result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
- &config);
+ &config, &filter_options);
trace2_region_leave("fetch", "fetch-one", the_repository);
} else {
int max_children = max_jobs;
@@ -2727,5 +2732,6 @@ int cmd_fetch(int argc,
cleanup:
string_list_clear(&list, 0);
+ list_objects_filter_release(&filter_options);
return result;
}
--
2.52.0.319.gfcaffa7898
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH 4/9] fetch: make filter_options local to cmd_fetch()
2025-12-23 11:11 ` [PATCH 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
@ 2026-01-07 10:05 ` Patrick Steinhardt
0 siblings, 0 replies; 80+ messages in thread
From: Patrick Steinhardt @ 2026-01-07 10:05 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Tue, Dec 23, 2025 at 12:11:08PM +0100, Christian Couder wrote:
> The `struct list_objects_filter_options filter_options` variable used
> in "builtin/fetch.c" to store the parsed filters specified by
> `--filter=<filterspec>` is currently a static variable global to the
> file.
>
> As we are going to use it more in a following commit, it could become a
> bit less easy to understand how it's managed.
>
> To avoid that, let's make it clear that it's owned by cmd_fetch() by
> moving its definition into that function and making it non-static.
>
> This requires passing a pointer to it through the prepare_transport(),
> do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one()
> functions, but it's quite straightforward.
Nice cleanups. I'm always happy to see less global state.
Patrick
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 5/9] doc: fetch: document `--filter=<filter-spec>` option
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (3 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2025-12-26 13:33 ` Jean-Noël AVILA
2025-12-23 11:11 ` [PATCH 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
` (5 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
Let's fix that and document that option properly in the same way as it
is already documented for `git clone`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index fcba46ee9e..70a9818331 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -88,6 +88,16 @@ linkgit:git-config[1].
This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
precedence over the `fetch.output` config option.
+--filter=<filter-spec>::
+ Use the partial clone feature and request that the server sends
+ a subset of reachable objects according to a given object filter.
+ When using `--filter`, the supplied _<filter-spec>_ is used for
+ the partial fetch. For example, `--filter=blob:none` will filter
+ out all blobs (file contents) until needed by Git. Also,
+ `--filter=blob:limit=<size>` will filter out all blobs of size
+ at least _<size>_. For more details on filter specifications, see
+ the `--filter` option in linkgit:git-rev-list[1].
+
ifndef::git-pull[]
`--write-fetch-head`::
`--no-write-fetch-head`::
--
2.52.0.319.gfcaffa7898
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH 5/9] doc: fetch: document `--filter=<filter-spec>` option
2025-12-23 11:11 ` [PATCH 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
@ 2025-12-26 13:33 ` Jean-Noël AVILA
2026-02-04 11:19 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Jean-Noël AVILA @ 2025-12-26 13:33 UTC (permalink / raw)
To: git, Christian Couder
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
On Tuesday, 23 December 2025 12:11:09 CET Christian Couder wrote:
> The `--filter=<filter-spec>` option is documented in most commands that
> support it except `git fetch`.
>
> Let's fix that and document that option properly in the same way as it
> is already documented for `git clone`.
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
> Documentation/fetch-options.adoc | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
> index fcba46ee9e..70a9818331 100644
> --- a/Documentation/fetch-options.adoc
> +++ b/Documentation/fetch-options.adoc
> @@ -88,6 +88,16 @@ linkgit:git-config[1].
> This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
> precedence over the `fetch.output` config option.
>
> +--filter=<filter-spec>::
The option itself must also be back-ticked.
`--filter=<filter-spec>`::
> + Use the partial clone feature and request that the server sends
> + a subset of reachable objects according to a given object filter.
> + When using `--filter`, the supplied _<filter-spec>_ is used for
> + the partial fetch. For example, `--filter=blob:none` will filter
Isn't this second sentence redundant? What new information is brought?
> + out all blobs (file contents) until needed by Git. Also,
> + `--filter=blob:limit=<size>` will filter out all blobs of size
> + at least _<size>_. For more details on filter specifications, see
> + the `--filter` option in linkgit:git-rev-list[1].
> +
> ifndef::git-pull[]
> `--write-fetch-head`::
> `--no-write-fetch-head`::
Thank you
Jean-Noël
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 5/9] doc: fetch: document `--filter=<filter-spec>` option
2025-12-26 13:33 ` Jean-Noël AVILA
@ 2026-02-04 11:19 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:19 UTC (permalink / raw)
To: avila.jn
Cc: chriscool, christian.couder, git, gitster, karthik.188, me,
newren, ps
(Sorry but I cannot find the email sent by Jean-Noël in Gmail so I am
using `git send-email` instead of Gmail to reply.)
On Fri, 26 Dec 2025 14:33:38 Jean-Noël AVILA wrote:
> On Tuesday, 23 December 2025 12:11:09 CET Christian Couder wrote:
> > diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
> > index fcba46ee9e..70a9818331 100644
> > --- a/Documentation/fetch-options.adoc
> > +++ b/Documentation/fetch-options.adoc
> > @@ -88,6 +88,16 @@ linkgit:git-config[1].
> > This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
> > precedence over the `fetch.output` config option.
> >
> > +--filter=<filter-spec>::
>
> The option itself must also be back-ticked.
>
> `--filter=<filter-spec>`::
Yeah, I have back-ticked it in v2.
> > + Use the partial clone feature and request that the server sends
> > + a subset of reachable objects according to a given object filter.
> > + When using `--filter`, the supplied _<filter-spec>_ is used for
> > + the partial fetch. For example, `--filter=blob:none` will filter
>
> Isn't this second sentence redundant? What new information is brought?
I agree it's redundant, but I copied it from the `git-clone`
documentation as-is because the goal here is not to improve on the
existing documentation but to fix the fact that some documentation is
missing.
That's why the commit message said "in the same way as it is already
documented for `git clone`". I have improved the commit message to
make the commit goal clearer though.
If the documentation was wrong, I agree that copying it as-is would
not be the right thing to do, but here it's not wrong. And it's better
to have some docs that are a bit redundant than to miss some docs.
Also I think it's better to improve on the documentation in a separate
commit because this way:
- the `git-clone` documentation could be improved like the `git-fetch`
documentation in a single commit (so we get consistent documentation
using consistent documentation changes),
- how to best remove the redundancy is just a separate topic that I
prefer to avoid at least for now.
Thanks.
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 6/9] list-objects-filter-options: support 'auto' mode for --filter
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (4 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2026-01-07 10:05 ` Patrick Steinhardt
2025-12-23 11:11 ` [PATCH 7/9] list-objects-filter-options: implement auto filter resolution Christian Couder
` (4 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
In a following commit, we are going to allow passing "auto" as a
<filterspec> to the `--filter=<filterspec>` option, but only for some
commands. Other commands that support the `--filter=<filterspec>`
option should still die() when 'auto' is passed.
Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
support that:
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
accepted or not.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
- Update list_objects_filter_release() to preserve the
allow_auto_filter flag, as this function is often called (via
opt_parse_list_objects_filter) to reset the struct before parsing a
new value.
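The gating and flag-preserving behaviour listed above can be sketched in isolation like this. It is a reduced illustration, not the patch itself: `toy_choice`, `toy_filter_options`, `toy_gently_parse()` and `toy_release()` are hypothetical names, only two filter specs are recognized, the "invalid filter-spec" message is made up for the sketch, and combined filters are left out entirely.

```c
#include <assert.h>
#include <string.h>

enum toy_choice {
	TOY_DISABLED = 0,
	TOY_BLOB_NONE,
	TOY_AUTO
};

struct toy_filter_options {
	enum toy_choice choice;
	/* Set by commands that accept "auto"; a policy, not parsed state. */
	unsigned int allow_auto_filter : 1;
};

/* Returns 0 on success and 1 on error, reporting the reason via `err`. */
static int toy_gently_parse(struct toy_filter_options *opts, const char *arg,
			    const char **err)
{
	if (!strcmp(arg, "auto")) {
		if (!opts->allow_auto_filter) {
			*err = "'auto' filter not supported by this command";
			return 1;
		}
		opts->choice = TOY_AUTO;
		return 0;
	}
	if (!strcmp(arg, "blob:none")) {
		opts->choice = TOY_BLOB_NONE;
		return 0;
	}
	*err = "invalid filter-spec";
	return 1;
}

/* Reset the parsed state but keep the caller's policy flag. */
static void toy_release(struct toy_filter_options *opts)
{
	unsigned int allow_auto_filter;

	if (!opts)
		return;

	/* Read the flag only after the NULL check. */
	allow_auto_filter = opts->allow_auto_filter;
	memset(opts, 0, sizeof(*opts));
	opts->allow_auto_filter = allow_auto_filter;
}
```

Keeping the flag across a release matters because the option parser may release and re-parse the same struct when `--filter` is given more than once.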
Let's also update `list-objects-filter.c` to recognize the new
`LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
before filtering actually begins, initializing a filter with
`LOFC_AUTO` is invalid and will trigger a BUG().
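A reduced sketch of that idea, assuming an init table indexed by the choice enum: every choice gets an entry so the table and the enum stay in step, and the entry for the unresolved placeholder only aborts. All names here (`toy_choice`, `toy_filter`, `toy_inits`, `toy_filter_init()`) are hypothetical, and abort() stands in for Git's BUG().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

enum toy_choice {
	TOY_DISABLED = 0,
	TOY_BLOB_NONE,
	TOY_AUTO,
	TOY__COUNT /* must be last */
};

struct toy_filter {
	int ready;
};

static void toy_blob_none_init(struct toy_filter *filter)
{
	filter->ready = 1;
}

/* Reaching this means the caller forgot to resolve "auto" first. */
static void toy_auto_init(struct toy_filter *filter)
{
	(void)filter;
	fprintf(stderr, "BUG: 'auto' should have been resolved first\n");
	abort();
}

typedef void (*toy_init_fn)(struct toy_filter *filter);

/* Must be in the same order as `enum toy_choice`. */
static const toy_init_fn toy_inits[] = {
	NULL, /* TOY_DISABLED: nothing to set up */
	toy_blob_none_init,
	toy_auto_init,
};

static void toy_filter_init(enum toy_choice choice, struct toy_filter *filter)
{
	assert(choice < TOY__COUNT);
	if (toy_inits[choice])
		toy_inits[choice](filter);
}
```

Giving the placeholder its own entry, rather than leaving a hole in the table, turns a forgotten resolution step into an immediate and clearly labelled failure.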
Note that ideally combining "auto" with "auto" could be allowed, but in
practice, it's probably not worth the added code complexity. And if we
really want it, nothing prevents us from allowing it in future work.
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us from doing that in future work either.
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Makefile | 1 +
list-objects-filter-options.c | 36 +++++++++++--
list-objects-filter-options.h | 6 +++
list-objects-filter.c | 8 +++
t/meson.build | 1 +
t/unit-tests/u-list-objects-filter-options.c | 53 ++++++++++++++++++++
6 files changed, 102 insertions(+), 3 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
diff --git a/Makefile b/Makefile
index 89d8d73ec0..85b2ff09f4 100644
--- a/Makefile
+++ b/Makefile
@@ -1507,6 +1507,7 @@ CLAR_TEST_SUITES += u-dir
CLAR_TEST_SUITES += u-example-decorate
CLAR_TEST_SUITES += u-hash
CLAR_TEST_SUITES += u-hashmap
+CLAR_TEST_SUITES += u-list-objects-filter-options
CLAR_TEST_SUITES += u-mem-pool
CLAR_TEST_SUITES += u-oid-array
CLAR_TEST_SUITES += u-oidmap
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7420bf81fe..f13ae5caeb 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -20,6 +20,8 @@ const char *list_object_filter_config_name(enum list_objects_filter_choice c)
case LOFC_DISABLED:
/* we have no name for "no filter at all" */
break;
+ case LOFC_AUTO:
+ return "auto";
case LOFC_BLOB_NONE:
return "blob:none";
case LOFC_BLOB_LIMIT:
@@ -52,7 +54,17 @@ int gently_parse_list_objects_filter(
if (filter_options->choice)
BUG("filter_options already populated");
- if (!strcmp(arg, "blob:none")) {
+ if (!strcmp(arg, "auto")) {
+ if (!filter_options->allow_auto_filter) {
+ strbuf_addstr(
+ errbuf,
+ _("'auto' filter not supported by this command"));
+ return 1;
+ }
+ filter_options->choice = LOFC_AUTO;
+ return 0;
+
+ } else if (!strcmp(arg, "blob:none")) {
filter_options->choice = LOFC_BLOB_NONE;
return 0;
@@ -146,10 +158,20 @@ static int parse_combine_subfilter(
decoded = url_percent_decode(subspec->buf);
- result = has_reserved_character(subspec, errbuf) ||
- gently_parse_list_objects_filter(
+ result = has_reserved_character(subspec, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = gently_parse_list_objects_filter(
&filter_options->sub[new_index], decoded, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = (filter_options->sub[new_index].choice == LOFC_AUTO);
+ if (result)
+ strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
+cleanup:
free(decoded);
return result;
}
@@ -263,6 +285,9 @@ void parse_list_objects_filter(
} else {
struct list_objects_filter_options *sub;
+ if (filter_options->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
/*
* Make filter_options an LOFC_COMBINE spec so we can trivially
* add subspecs to it.
@@ -277,6 +302,9 @@ void parse_list_objects_filter(
if (gently_parse_list_objects_filter(sub, arg, &errbuf))
die("%s", errbuf.buf);
+ if (sub->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
strbuf_addch(&filter_options->filter_spec, '+');
filter_spec_append_urlencode(filter_options, arg);
}
@@ -317,6 +345,7 @@ void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
{
size_t sub;
+ unsigned int allow_auto_filter = filter_options->allow_auto_filter;
if (!filter_options)
return;
@@ -326,6 +355,7 @@ void list_objects_filter_release(
list_objects_filter_release(&filter_options->sub[sub]);
free(filter_options->sub);
list_objects_filter_init(filter_options);
+ filter_options->allow_auto_filter = allow_auto_filter;
}
void partial_clone_register(
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 7b2108b986..77d7bbc846 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -18,6 +18,7 @@ enum list_objects_filter_choice {
LOFC_SPARSE_OID,
LOFC_OBJECT_TYPE,
LOFC_COMBINE,
+ LOFC_AUTO,
LOFC__COUNT /* must be last */
};
@@ -50,6 +51,11 @@ struct list_objects_filter_options {
*/
unsigned int no_filter : 1;
+ /*
+ * Is LOFC_AUTO a valid option?
+ */
+ unsigned int allow_auto_filter : 1;
+
/*
* BEGIN choice-specific parsed values from within the filter-spec. Only
* some values will be defined for any given choice.
diff --git a/list-objects-filter.c b/list-objects-filter.c
index acd65ebb73..78316e7f90 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -745,6 +745,13 @@ static void filter_combine__init(
filter->finalize_omits_fn = filter_combine__finalize_omits;
}
+static void filter_auto__init(
+ struct list_objects_filter_options *filter_options UNUSED,
+ struct filter *filter UNUSED)
+{
+ BUG("LOFC_AUTO should have been resolved before initializing the filter");
+}
+
typedef void (*filter_init_fn)(
struct list_objects_filter_options *filter_options,
struct filter *filter);
@@ -760,6 +767,7 @@ static filter_init_fn s_filters[] = {
filter_sparse_oid__init,
filter_object_type__init,
filter_combine__init,
+ filter_auto__init,
};
struct filter *list_objects_filter__init(
diff --git a/t/meson.build b/t/meson.build
index 459c52a489..0bd66cc6ce 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -4,6 +4,7 @@ clar_test_suites = [
'unit-tests/u-example-decorate.c',
'unit-tests/u-hash.c',
'unit-tests/u-hashmap.c',
+ 'unit-tests/u-list-objects-filter-options.c',
'unit-tests/u-mem-pool.c',
'unit-tests/u-oid-array.c',
'unit-tests/u-oidmap.c',
diff --git a/t/unit-tests/u-list-objects-filter-options.c b/t/unit-tests/u-list-objects-filter-options.c
new file mode 100644
index 0000000000..f7d73701b5
--- /dev/null
+++ b/t/unit-tests/u-list-objects-filter-options.c
@@ -0,0 +1,53 @@
+#include "unit-test.h"
+#include "list-objects-filter-options.h"
+#include "strbuf.h"
+
+/* Helper to test gently_parse_list_objects_filter() */
+static void check_gentle_parse(const char *filter_spec,
+ int expect_success,
+ int allow_auto,
+ enum list_objects_filter_choice expected_choice)
+{
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf errbuf = STRBUF_INIT;
+ int ret;
+
+ filter_options.allow_auto_filter = allow_auto;
+
+ ret = gently_parse_list_objects_filter(&filter_options, filter_spec, &errbuf);
+
+ if (expect_success) {
+ cl_assert_equal_i(ret, 0);
+ cl_assert_equal_i(expected_choice, filter_options.choice);
+ cl_assert_equal_i(errbuf.len, 0);
+ } else {
+ cl_assert(ret != 0);
+ cl_assert(errbuf.len > 0);
+ }
+
+ strbuf_release(&errbuf);
+ list_objects_filter_release(&filter_options);
+}
+
+void test_list_objects_filter_options__regular_filters(void)
+{
+ check_gentle_parse("blob:none", 1, 0, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:none", 1, 1, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:limit=5k", 1, 0, LOFC_BLOB_LIMIT);
+ check_gentle_parse("blob:limit=5k", 1, 1, LOFC_BLOB_LIMIT);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 0, LOFC_COMBINE);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 1, LOFC_COMBINE);
+}
+
+void test_list_objects_filter_options__auto_allowed(void)
+{
+ check_gentle_parse("auto", 1, 1, LOFC_AUTO);
+ check_gentle_parse("auto", 0, 0, 0);
+}
+
+void test_list_objects_filter_options__combine_auto_fails(void)
+{
+ check_gentle_parse("combine:auto+blob:none", 0, 1, 0);
+ check_gentle_parse("combine:blob:none+auto", 0, 1, 0);
+ check_gentle_parse("combine:auto+auto", 0, 1, 0);
+}
--
2.52.0.319.gfcaffa7898
* Re: [PATCH 6/9] list-objects-filter-options: support 'auto' mode for --filter
2025-12-23 11:11 ` [PATCH 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
@ 2026-01-07 10:05 ` Patrick Steinhardt
2026-02-04 10:21 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-01-07 10:05 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Tue, Dec 23, 2025 at 12:11:10PM +0100, Christian Couder wrote:
> In a following commit, we are going to allow passing "auto" as a
> <filterspec> to the `--filter=<filterspec>` option, but only for some
> commands. Other commands that support the `--filter=<filterspec>`
> option should still die() when 'auto' is passed.
Okay. I assume the idea is that the user can eventually say `git clone
--filter=auto`, and Git would automatically pick the best filter
advertised by the remote. Sounds reasonable to me.
> Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
> support that:
>
> - Add a new `unsigned int allow_auto_filter : 1;` flag to
> `struct list_objects_filter_options` which specifies if "auto" is
> accepted or not.
> - Change gently_parse_list_objects_filter() to parse "auto" if it's
> accepted.
> - Make sure we die() if "auto" is combined with another filter.
> - Update list_objects_filter_release() to preserve the
> allow_auto_filter flag, as this function is often called (via
> opt_parse_list_objects_filter) to reset the struct before parsing a
> new value.
>
> Let's also update `list-objects-filter.c` to recognize the new
> `LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
> before filtering actually begins, initializing a filter with
> `LOFC_AUTO` is invalid and will trigger a BUG().
>
> Note that ideally combining "auto" with "auto" could be allowed, but in
> practice, it's probably not worth the added code complexity. And if we
> really want it, nothing prevents us to allow it in future work.
I guess the question is what this would even mean, and I cannot think
of any benefit in allowing `--filter=combine:auto+auto`. So agreed.
> If we ever want to give a meaning to combining "auto" with a different
> filter too, nothing prevents us to do that in future work either.
So basically the case where the user knows that they definitely don't
want blobs, and in addition they want to pick the best filter advertised
by the server? Yeah, that sounds like it could eventually be a nice
addition.
> diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
> index 7420bf81fe..f13ae5caeb 100644
> --- a/list-objects-filter-options.c
> +++ b/list-objects-filter-options.c
> @@ -52,7 +54,17 @@ int gently_parse_list_objects_filter(
> if (filter_options->choice)
> BUG("filter_options already populated");
>
> - if (!strcmp(arg, "blob:none")) {
> + if (!strcmp(arg, "auto")) {
> + if (!filter_options->allow_auto_filter) {
> + strbuf_addstr(
> + errbuf,
> + _("'auto' filter not supported by this command"));
Tiny nit: the indentation looks a bit weird here.
> @@ -146,10 +158,20 @@ static int parse_combine_subfilter(
>
> decoded = url_percent_decode(subspec->buf);
>
> - result = has_reserved_character(subspec, errbuf) ||
> - gently_parse_list_objects_filter(
> + result = has_reserved_character(subspec, errbuf);
> + if (result)
> + goto cleanup;
> +
> + result = gently_parse_list_objects_filter(
> &filter_options->sub[new_index], decoded, errbuf);
> + if (result)
> + goto cleanup;
> +
> + result = (filter_options->sub[new_index].choice == LOFC_AUTO);
> + if (result)
> + strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
Nit: let's maybe also add the `goto cleanup` here. I'm not a fan of
leaving it away for the final statement as it makes it easy to forget
backfilling it in case this function needs to be extended in the future.
> @@ -317,6 +345,7 @@ void list_objects_filter_release(
> struct list_objects_filter_options *filter_options)
> {
> size_t sub;
> + unsigned int allow_auto_filter = filter_options->allow_auto_filter;
>
> if (!filter_options)
> return;
> @@ -326,6 +355,7 @@ void list_objects_filter_release(
> list_objects_filter_release(&filter_options->sub[sub]);
> free(filter_options->sub);
> list_objects_filter_init(filter_options);
> + filter_options->allow_auto_filter = allow_auto_filter;
> }
Why do we do this extra step to restore the `allow_auto_filter` option
here? Are there any callers that reuse the filter after it has been
released?
In any case, this function does have clearing semantics as it also knows
to re-init the filter options. So it's somewhat misnamed and really
should be called `list_objects_filter_clear()` according to our coding
guidelines. That's certainly outside the scope of this patch series
though.
Patrick
* Re: [PATCH 6/9] list-objects-filter-options: support 'auto' mode for --filter
2026-01-07 10:05 ` Patrick Steinhardt
@ 2026-02-04 10:21 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 10:21 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Dec 23, 2025 at 12:11:10PM +0100, Christian Couder wrote:
> > In a following commit, we are going to allow passing "auto" as a
> > <filterspec> to the `--filter=<filterspec>` option, but only for some
> > commands. Other commands that support the `--filter=<filterspec>`
> > option should still die() when 'auto' is passed.
>
> Okay. I assume the idea is that the user can eventually say `git clone
> --filter=auto`, and Git would automatically pick the best filter
> advertised by the remote. Sounds reasonable to me.
Yeah, that's the idea.
> > Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
> > support that:
> >
> > - Add a new `unsigned int allow_auto_filter : 1;` flag to
> > `struct list_objects_filter_options` which specifies if "auto" is
> > accepted or not.
> > - Change gently_parse_list_objects_filter() to parse "auto" if it's
> > accepted.
> > - Make sure we die() if "auto" is combined with another filter.
> > - Update list_objects_filter_release() to preserve the
> > allow_auto_filter flag, as this function is often called (via
> > opt_parse_list_objects_filter) to reset the struct before parsing a
> > new value.
> >
> > Let's also update `list-objects-filter.c` to recognize the new
> > `LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
> > before filtering actually begins, initializing a filter with
> > `LOFC_AUTO` is invalid and will trigger a BUG().
> >
> > Note that ideally combining "auto" with "auto" could be allowed, but in
> > practice, it's probably not worth the added code complexity. And if we
> > really want it, nothing prevents us to allow it in future work.
>
> I guess the question is what this would even mean, and I cannot think
> of any benefit in allowing `--filter=combine:auto+auto`. So agreed.
We could allow `--filter=combine:auto+auto` to mean the same as just
`--filter=auto`. But I also don't see a benefit in allowing this now.
> > If we ever want to give a meaning to combining "auto" with a different
> > filter too, nothing prevents us to do that in future work either.
>
> So basically the case where the user knows that they definitely don't
> want blobs, and in addition they want to pick the best filter advertised
> by the server? Yeah, that sounds like it could eventually be a nice
> addition.
Yeah, but I think it's also not needed for now.
> > diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
> > index 7420bf81fe..f13ae5caeb 100644
> > --- a/list-objects-filter-options.c
> > +++ b/list-objects-filter-options.c
> > @@ -52,7 +54,17 @@ int gently_parse_list_objects_filter(
> > if (filter_options->choice)
> > BUG("filter_options already populated");
> >
> > - if (!strcmp(arg, "blob:none")) {
> > + if (!strcmp(arg, "auto")) {
> > + if (!filter_options->allow_auto_filter) {
> > + strbuf_addstr(
> > + errbuf,
> > + _("'auto' filter not supported by this command"));
>
> Tiny nit: the indentation looks a bit weird here.
I have changed it. Hope it's better now.
> > @@ -146,10 +158,20 @@ static int parse_combine_subfilter(
> >
> > decoded = url_percent_decode(subspec->buf);
> >
> > - result = has_reserved_character(subspec, errbuf) ||
> > - gently_parse_list_objects_filter(
> > + result = has_reserved_character(subspec, errbuf);
> > + if (result)
> > + goto cleanup;
> > +
> > + result = gently_parse_list_objects_filter(
> > &filter_options->sub[new_index], decoded, errbuf);
> > + if (result)
> > + goto cleanup;
> > +
> > + result = (filter_options->sub[new_index].choice == LOFC_AUTO);
> > + if (result)
> > + strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
>
> Nit: let's maybe also add the `goto cleanup` here. I'm not a fan of
> leaving it away for the final statement as it makes it easy to forget
> backfilling it in case this function needs to be extended in the future.
Ok, I have added the `goto cleanup`.
> > @@ -317,6 +345,7 @@ void list_objects_filter_release(
> > struct list_objects_filter_options *filter_options)
> > {
> > size_t sub;
> > + unsigned int allow_auto_filter = filter_options->allow_auto_filter;
> >
> > if (!filter_options)
> > return;
> > @@ -326,6 +355,7 @@ void list_objects_filter_release(
> > list_objects_filter_release(&filter_options->sub[sub]);
> > free(filter_options->sub);
> > list_objects_filter_init(filter_options);
> > + filter_options->allow_auto_filter = allow_auto_filter;
> > }
>
> Why do we do this extra step to restore the `allow_auto_filter` option
> here? Are there any callers that reuse the filter after it has been
> released?
As you noticed below, list_objects_filter_release() doesn't just
release resources but actually resets the state. That's because the
filter options are indeed reused during command-line parsing.
In cmd_clone() a single `struct list_objects_filter_options` called
"filter_options" is declared and then pointers to it are passed to a
number of functions. In particular, opt_parse_list_objects_filter()
handles the `--no-filter` case by calling
list_objects_filter_set_no_filter() which calls
list_objects_filter_release().
So yeah, if the user runs something like `git fetch --no-filter
--filter=auto`, then "filter_options" is released when `--no-filter`
is processed and reused when `--filter=auto` is processed.
Also note that the `allow_auto_filter` field is a configuration bit
set by the command (e.g., cmd_fetch) before parsing begins. It
indicates that the command supports the 'auto' mode. It's not data
provided by users, so it doesn't change depending on which
filter-related options are passed.
I have added the following to the commit message:
"Also note that the new `allow_auto_filter` flag depends on the command,
not user choices, so it should be reset to the command default when
`struct list_objects_filter_options` instances are reset."
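To illustrate the reuse described above, here is a minimal standalone
sketch. It is not Git's code: `struct demo_filter_options`,
`demo_set_filter()` and `demo_release()` are hypothetical stand-ins,
and the sketch only assumes that "release" frees user data, re-inits
the struct, and keeps the command-level bit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct list_objects_filter_options. */
struct demo_filter_options {
	char *filter_spec;                  /* user-provided data, heap-allocated */
	unsigned int allow_auto_filter : 1; /* capability set by the command */
};

/* Store a filter spec; "auto" is rejected unless the command allows it. */
static int demo_set_filter(struct demo_filter_options *o, const char *arg)
{
	size_t len = strlen(arg) + 1;

	assert(!o->filter_spec); /* must be released before being reused */
	if (!strcmp(arg, "auto") && !o->allow_auto_filter)
		return -1;
	o->filter_spec = malloc(len);
	if (!o->filter_spec)
		return -1;
	memcpy(o->filter_spec, arg, len);
	return 0;
}

/* Free user data and re-init the struct, but keep the command-level bit. */
static void demo_release(struct demo_filter_options *o)
{
	unsigned int allow_auto_filter;

	if (!o)
		return;
	/* Read the bit only after the NULL check above. */
	allow_auto_filter = o->allow_auto_filter;
	free(o->filter_spec);
	memset(o, 0, sizeof(*o));
	o->allow_auto_filter = allow_auto_filter;
}
```

With this shape, a `--no-filter` followed by `--filter=auto` maps to
demo_release() followed by demo_set_filter(..., "auto"), and the second
call still sees the bit the command set before parsing.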
> In any case, this function does have clearing semantics as it also knows
> to re-init the filter options. So it's somewhat misnamed and really
> should be called `list_objects_filter_clear()` according to our coding
> guidelines. That's certainly outside the scope of this patch series
> though.
Yeah, it can be done separately.
* [PATCH 7/9] list-objects-filter-options: implement auto filter resolution
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (5 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2026-01-07 10:05 ` Patrick Steinhardt
2025-12-23 11:11 ` [PATCH 8/9] promisor-remote: keep advertised filter in memory Christian Couder
` (3 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
In a following commit, we will need to aggregate filters from multiple
accepted promisor remotes into a single filter.
For that purpose, let's add a `list_objects_filter_combine()` helper
function that takes a list of filter specifications and combines them
into a single string. If multiple filters are provided, it constructs a
"combine:..." filter, ensuring that sub-filters are properly
URL-encoded using the existing `allow_unencoded` logic.
In a following commit, we will add a `--filter=auto` option that will
enable a client to use the filters suggested by the server for the
promisor remotes the client accepted.
To simplify the filter processing related to this new feature, let's
also add a small `list_objects_filter_resolve_auto()` function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
list-objects-filter-options.c | 35 ++++++++++++++++++++
list-objects-filter-options.h | 19 +++++++++++
t/unit-tests/u-list-objects-filter-options.c | 33 ++++++++++++++++++
3 files changed, 87 insertions(+)
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index f13ae5caeb..4a9c1991c1 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -230,6 +230,41 @@ static void filter_spec_append_urlencode(
filter->filter_spec.buf + orig_len);
}
+char *list_objects_filter_combine(const struct string_list *specs)
+{
+ struct strbuf buf = STRBUF_INIT;
+
+ if (!specs->nr)
+ return NULL;
+
+ if (specs->nr == 1)
+ return xstrdup(specs->items[0].string);
+
+ strbuf_addstr(&buf, "combine:");
+
+ for (size_t i = 0; i < specs->nr; i++) {
+ const char *spec = specs->items[i].string;
+ if (i > 0)
+ strbuf_addch(&buf, '+');
+
+ strbuf_addstr_urlencode(&buf, spec, allow_unencoded);
+ }
+
+ return strbuf_detach(&buf, NULL);
+}
+
+void list_objects_filter_resolve_auto(struct list_objects_filter_options *filter_options,
+ char *new_filter, struct strbuf *errbuf)
+{
+ if (filter_options->choice != LOFC_AUTO)
+ return;
+
+ list_objects_filter_release(filter_options);
+
+ if (new_filter)
+ gently_parse_list_objects_filter(filter_options, new_filter, errbuf);
+}
+
/*
* Changes filter_options into an equivalent LOFC_COMBINE filter options
* instance. Does not do anything if filter_options is already LOFC_COMBINE.
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 77d7bbc846..832d615c17 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -6,6 +6,7 @@
#include "strbuf.h"
struct option;
+struct string_list;
/*
* The list of defined filters for list-objects.
@@ -168,4 +169,22 @@ void list_objects_filter_copy(
struct list_objects_filter_options *dest,
const struct list_objects_filter_options *src);
+/*
+ * Combine the filter specs in 'specs' into a combined filter string
+ * like "combine:<spec1>+<spec2>", where <spec1>, <spec2>, etc are
+ * properly urlencoded. If 'specs' contains no element, NULL is
+ * returned. If 'specs' contains a single element, a copy of that
+ * element is returned.
+ */
+char *list_objects_filter_combine(const struct string_list *specs);
+
+/*
+ * Check if 'filter_options' are an 'auto' filter, and if that's the
+ * case populate it with the filter specified by 'new_filter'.
+ */
+void list_objects_filter_resolve_auto(
+ struct list_objects_filter_options *filter_options,
+ char *new_filter,
+ struct strbuf *errbuf);
+
#endif /* LIST_OBJECTS_FILTER_OPTIONS_H */
diff --git a/t/unit-tests/u-list-objects-filter-options.c b/t/unit-tests/u-list-objects-filter-options.c
index f7d73701b5..84a012af3c 100644
--- a/t/unit-tests/u-list-objects-filter-options.c
+++ b/t/unit-tests/u-list-objects-filter-options.c
@@ -1,6 +1,7 @@
#include "unit-test.h"
#include "list-objects-filter-options.h"
#include "strbuf.h"
+#include "string-list.h"
/* Helper to test gently_parse_list_objects_filter() */
static void check_gentle_parse(const char *filter_spec,
@@ -51,3 +52,35 @@ void test_list_objects_filter_options__combine_auto_fails(void)
check_gentle_parse("combine:blob:none+auto", 0, 1, 0);
check_gentle_parse("combine:auto+auto", 0, 1, 0);
}
+
+/* Helper to test list_objects_filter_combine() */
+static void check_combine(const char **specs, size_t nr, const char *expected)
+{
+ struct string_list spec_list = STRING_LIST_INIT_NODUP;
+ char *actual;
+
+ for (size_t i = 0; i < nr; i++)
+ string_list_append(&spec_list, specs[i]);
+
+ actual = list_objects_filter_combine(&spec_list);
+
+ cl_assert_equal_s(actual, expected);
+
+ free(actual);
+ string_list_clear(&spec_list, 0);
+}
+
+void test_list_objects_filter_options__combine_helper(void)
+{
+ const char *empty[] = { NULL };
+ const char *one[] = { "blob:none" };
+ const char *two[] = { "blob:none", "tree:0" };
+ const char *complex[] = { "blob:limit=1k", "object:type=tag" };
+ const char *needs_encoding[] = { "blob:none", "combine:tree:0+blob:limit=1k" };
+
+ check_combine(empty, 0, NULL);
+ check_combine(one, 1, "blob:none");
+ check_combine(two, 2, "combine:blob:none+tree:0");
+ check_combine(complex, 2, "combine:blob:limit=1k+object:type=tag");
+ check_combine(needs_encoding, 2, "combine:blob:none+combine:tree:0%2bblob:limit=1k");
+}
--
2.52.0.319.gfcaffa7898
* Re: [PATCH 7/9] list-objects-filter-options: implement auto filter resolution
2025-12-23 11:11 ` [PATCH 7/9] list-objects-filter-options: implement auto filter resolution Christian Couder
@ 2026-01-07 10:05 ` Patrick Steinhardt
2026-02-04 10:29 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-01-07 10:05 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Tue, Dec 23, 2025 at 12:11:11PM +0100, Christian Couder wrote:
> In a following commit, we will need to aggregate filters from multiple
> accepted promisor remotes into a single filter.
Ah, interesting. I was always operating under the assumption that when
the server advertises multiple promisors, the client will pick only one
of them. And that made me wonder how the client knows which one to pick
in the first place.
But of course it's possible to just pick _all_ of them by combining the
filter.
> diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
> index f13ae5caeb..4a9c1991c1 100644
> --- a/list-objects-filter-options.c
> +++ b/list-objects-filter-options.c
> @@ -230,6 +230,41 @@ static void filter_spec_append_urlencode(
> filter->filter_spec.buf + orig_len);
> }
>
> +char *list_objects_filter_combine(const struct string_list *specs)
> +{
> + struct strbuf buf = STRBUF_INIT;
> +
> + if (!specs->nr)
> + return NULL;
> +
> + if (specs->nr == 1)
> + return xstrdup(specs->items[0].string);
> +
> + strbuf_addstr(&buf, "combine:");
> +
> + for (size_t i = 0; i < specs->nr; i++) {
> + const char *spec = specs->items[i].string;
> + if (i > 0)
> + strbuf_addch(&buf, '+');
> +
> + strbuf_addstr_urlencode(&buf, spec, allow_unencoded);
Shouldn't we use `filter_spec_append_urlencode()` to do this?
> + }
> +
> + return strbuf_detach(&buf, NULL);
> +}
I'm surprised we didn't have such a function yet.
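For readers who want to try the combining rule outside of Git, the
following is a rough standalone sketch of it. It is an approximation
under stated assumptions: `demo_combine()` and `demo_needs_encoding()`
are hypothetical names, and the encoding predicate is a simplification
that only escapes '%', '+' and bytes up to the space character, whereas
the real `allow_unencoded` logic reserves more characters.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified assumption: only these bytes are percent-encoded. */
static int demo_needs_encoding(unsigned char ch)
{
	return ch <= ' ' || ch == '%' || ch == '+';
}

/*
 * Join 'nr' filter specs into "combine:<spec1>+<spec2>...".
 * Returns NULL for an empty list, a plain copy for a single spec,
 * and a malloc'd combined string otherwise. The caller frees it.
 */
static char *demo_combine(const char *const *specs, size_t nr)
{
	size_t len = strlen("combine:") + 1, i;
	char *out, *p;

	if (!nr)
		return NULL;
	if (nr == 1) {
		size_t n = strlen(specs[0]) + 1;
		out = malloc(n);
		if (out)
			memcpy(out, specs[0], n);
		return out;
	}

	/* Worst case: every byte becomes "%xx", plus one '+' per spec. */
	for (i = 0; i < nr; i++)
		len += 3 * strlen(specs[i]) + 1;
	out = malloc(len);
	if (!out)
		return NULL;

	p = out + sprintf(out, "combine:");
	for (i = 0; i < nr; i++) {
		const unsigned char *s;

		if (i)
			*p++ = '+';
		for (s = (const unsigned char *)specs[i]; *s; s++) {
			if (demo_needs_encoding(*s))
				p += sprintf(p, "%%%02x", (unsigned int)*s);
			else
				*p++ = (char)*s;
		}
	}
	*p = '\0';
	assert((size_t)(p - out) < len);
	return out;
}
```

The point of encoding each sub-filter is that a '+' inside a nested
"combine:" spec must not be mistaken for a separator of the outer one.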
> +void list_objects_filter_resolve_auto(struct list_objects_filter_options *filter_options,
> + char *new_filter, struct strbuf *errbuf)
> +{
> + if (filter_options->choice != LOFC_AUTO)
> + return;
I wonder whether we should rather `BUG()` in case the filter is not an
"auto" filter. Otherwise it's easy to get the callsite wrong, as the
user may expect that the filter gets resolved to the new filter, but
it's actually not because the original filter wasn't an "auto" filter in
the first place.
> + list_objects_filter_release(filter_options);
> +
> + if (new_filter)
> + gently_parse_list_objects_filter(filter_options, new_filter, errbuf);
> +}
So, as mentioned in a preceding commit, `list_objects_filter_release()`
will retain the `allow_auto_filter` option. But when resolving "auto"
filters I'd expect us to not accept "auto" in the resolved filter anymore.
Otherwise, if `new_filter` was "auto", we'd still end up with an auto
filter, wouldn't we? I'd rather expect us to abort in that case.
Patrick
* Re: [PATCH 7/9] list-objects-filter-options: implement auto filter resolution
2026-01-07 10:05 ` Patrick Steinhardt
@ 2026-02-04 10:29 ` Christian Couder
2026-02-11 11:48 ` Patrick Steinhardt
0 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-04 10:29 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Dec 23, 2025 at 12:11:11PM +0100, Christian Couder wrote:
> > In a following commit, we will need to aggregate filters from multiple
> > accepted promisor remotes into a single filter.
>
> Ah, interesting. I was always operating under the assumption that when
> the server advertises multiple promisors, the client will pick only one
> of them. And that made me wonder how the client knows which one to pick
> in the first place.
>
> But of course it's possible to just pick _all_ of them by combining the
> filter.
Yeah, that's the idea.
> > diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
> > index f13ae5caeb..4a9c1991c1 100644
> > --- a/list-objects-filter-options.c
> > +++ b/list-objects-filter-options.c
> > @@ -230,6 +230,41 @@ static void filter_spec_append_urlencode(
> > filter->filter_spec.buf + orig_len);
> > }
> >
> > +char *list_objects_filter_combine(const struct string_list *specs)
> > +{
> > + struct strbuf buf = STRBUF_INIT;
> > +
> > + if (!specs->nr)
> > + return NULL;
> > +
> > + if (specs->nr == 1)
> > + return xstrdup(specs->items[0].string);
> > +
> > + strbuf_addstr(&buf, "combine:");
> > +
> > + for (size_t i = 0; i < specs->nr; i++) {
> > + const char *spec = specs->items[i].string;
> > + if (i > 0)
> > + strbuf_addch(&buf, '+');
> > +
> > + strbuf_addstr_urlencode(&buf, spec, allow_unencoded);
>
> Shouldn't we use `filter_spec_append_urlencode()` to do this?
Yeah, probably, see below.
> > + }
> > +
> > + return strbuf_detach(&buf, NULL);
> > +}
>
> I'm surprised we didn't have such a function yet.
I have refactored the code so that we use a temporary `struct
list_objects_filter_options` and `gently_parse_list_objects_filter()`
to construct a combined filter in the next commit instead of this
function.
This also takes care of your comment above about
`strbuf_addstr_urlencode()` vs `filter_spec_append_urlencode()`.
> > +void list_objects_filter_resolve_auto(struct list_objects_filter_options *filter_options,
> > + char *new_filter, struct strbuf *errbuf)
> > +{
> > + if (filter_options->choice != LOFC_AUTO)
> > + return;
>
> I wonder whether we should rather `BUG()` in case the filter is not an
> "auto" filter. Otherwise it's easy to get the callsite wrong, as the
> user may expect that the filter gets resolved to the new filter, but
> it's actually not because the original filter wasn't an "auto" filter in
> the first place.
Actually the list_objects_filter_resolve_auto() function is not very
useful, so I have just removed it too in the v2 I will send.
This way this whole patch is not necessary and has been removed in v2.
> > + list_objects_filter_release(filter_options);
> > +
> > + if (new_filter)
> > + gently_parse_list_objects_filter(filter_options, new_filter, errbuf);
> > +}
>
> So, as mentioned in a preceding commit, `list_objects_filter_release()`
> will retain the `allow_auto_filter` option. But when resolving "auto"
> filters I'd expect us to not accept "auto" in the resolved filter anymore.
> Otherwise, if `new_filter` was "auto", we'd still end up with an auto
> filter, wouldn't we? I'd rather expect us to abort in that case.
Right, I have fixed this by adding the following in the next commit:
/* The result of resolving an 'auto' filter must not be 'auto' */
args->filter_options.allow_auto_filter = 0;
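The effect of clearing the bit before re-parsing can be sketched in a
standalone way as follows. This is only an illustration under
assumptions, not the code from the series: the `demo_*` names are
hypothetical, and the parser knows just two specs.

```c
#include <assert.h>
#include <string.h>

enum demo_choice { DEMO_UNSET = 0, DEMO_BLOB_NONE, DEMO_AUTO };

/* Hypothetical, simplified stand-in for the filter options struct. */
struct demo_filter_options {
	enum demo_choice choice;
	unsigned int allow_auto_filter : 1;
};

/* Parse a spec; returns 0 on success and -1 on error. */
static int demo_parse(struct demo_filter_options *o, const char *arg)
{
	assert(o->choice == DEMO_UNSET); /* must not be populated already */
	if (!strcmp(arg, "auto")) {
		if (!o->allow_auto_filter)
			return -1;
		o->choice = DEMO_AUTO;
		return 0;
	}
	if (!strcmp(arg, "blob:none")) {
		o->choice = DEMO_BLOB_NONE;
		return 0;
	}
	return -1;
}

/* Replace an 'auto' placeholder with the filter it resolved to. */
static int demo_resolve_auto(struct demo_filter_options *o,
			     const char *resolved)
{
	if (o->choice != DEMO_AUTO)
		return 0;
	o->choice = DEMO_UNSET;
	/* The result of resolving 'auto' must not be 'auto' again. */
	o->allow_auto_filter = 0;
	return demo_parse(o, resolved);
}
```

Because the bit is cleared first, a resolved spec of "auto" is rejected
by the ordinary parsing path instead of leaving an unresolved filter.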
Thanks!
* Re: [PATCH 7/9] list-objects-filter-options: implement auto filter resolution
2026-02-04 10:29 ` Christian Couder
@ 2026-02-11 11:48 ` Patrick Steinhardt
2026-02-12 10:07 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-11 11:48 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Feb 04, 2026 at 11:29:43AM +0100, Christian Couder wrote:
> On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > On Tue, Dec 23, 2025 at 12:11:11PM +0100, Christian Couder wrote:
> > > In a following commit, we will need to aggregate filters from multiple
> > > accepted promisor remotes into a single filter.
> >
> > Ah, interesting. I was always operating under the assumption that when
> > the server advertises multiple promisors, the client will pick only one
> > of them. And that made me wonder how the client knows which one to pick
> > in the first place.
> >
> > But of course it's possible to just pick _all_ of them by combining the
> > filter.
>
> Yeah, that's the idea.
One thought I recently had: if one selects multiple promisor remotes,
how does the client know which promisor remote to fetch a certain object
from? We don't always have enough information about a missing object to
be able to tell which of the filters would have excluded it, so it's not
possible to basically "reverse" the filtering and deduce from them which
remote should have them.
Patrick
* Re: [PATCH 7/9] list-objects-filter-options: implement auto filter resolution
2026-02-11 11:48 ` Patrick Steinhardt
@ 2026-02-12 10:07 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:07 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder, Kaartic Sivaraam
On Wed, Feb 11, 2026 at 12:48 PM Patrick Steinhardt <ps@pks.im> wrote:
> One thought I recently had: if one selects multiple promisor remotes,
> how does the client know which promisor remote to fetch a certain object
> from? We don't always have enough information about a missing object to
> be able to tell which of the filters would have excluded it, so it's not
> possible to basically "reverse" the filtering and deduce from them which
> remote should have them.
When there are multiple promisor remotes, the client will try to fetch
the missing objects from the promisor remotes in the order they appear
in the config file, then it will try the "main remote" if it still
couldn't fetch some objects.
Note that this isn't changed by this patch series. This is how it has
worked ever since it became possible to configure multiple promisor
remotes. It's also documented in the "Using many promisor remotes"
section of "Documentation/technical/partial-clone.adoc".
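As a rough illustration of that ordering, here is a standalone sketch.
It is not how Git implements it: `demo_fetch_missing()`,
`demo_fetch_fn` and `demo_fake_fetch()` are hypothetical, the remote
names are made up, and a single object is fetched instead of a batch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport callback: returns 0 if 'remote' supplied 'oid'. */
typedef int (*demo_fetch_fn)(const char *remote, const char *oid);

/*
 * Try the promisor remotes in configuration order, then fall back to
 * the main remote. Returns the remote that supplied the object, or
 * NULL if none could.
 */
static const char *demo_fetch_missing(const char *const *promisors, size_t nr,
				      const char *main_remote, const char *oid,
				      demo_fetch_fn fetch)
{
	size_t i;

	assert(fetch);
	for (i = 0; i < nr; i++)
		if (!fetch(promisors[i], oid))
			return promisors[i];
	if (main_remote && !fetch(main_remote, oid))
		return main_remote;
	return NULL;
}

/* Fake transport for the demo: every remote except "cdn" has the object. */
static int demo_fake_fetch(const char *remote, const char *oid)
{
	(void)oid;
	return strcmp(remote, "cdn") ? 0 : -1;
}
```

The client does not need to know which filter excluded the object; it
simply asks each remote in turn until one of them can provide it.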
By the way the doc says "the long term plan should be to make the
order somehow fully configurable" and this is what the "Implement
promisor remote fetch ordering" GSoC 2026 project is about. See:
https://git.github.io/SoC-2026-Ideas/
(Thanks to Kaartic Sivaraam who recently submitted the PR to add this
and other projects to that page.)
* [PATCH 8/9] promisor-remote: keep advertised filter in memory
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (6 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 7/9] list-objects-filter-options: implement auto filter resolution Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2026-01-07 10:05 ` Patrick Steinhardt
2025-12-23 11:11 ` [PATCH 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
` (2 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Currently, advertised filters are only kept in memory temporarily
during parsing, or persisted to disk if `promisor.storeFields`
contains 'partialCloneFilter'.
In a following commit though, we will add a `--filter=auto` option.
This option will enable the client to use the filters that the server
is suggesting for the promisor remotes the client accepts.
To use them even if `promisor.storeFields` is not configured, these
filters should be stored somewhere for the current session.
Let's add an `advertised_filter` field to `struct promisor_remote`
for that purpose.
To ensure that the filters are available in all cases,
filter_promisor_remote() captures them into a temporary list and
applies them to the `promisor_remote` structs after the potential
configuration reload.
Then the accepted remotes are marked as `accepted` in the repository
state. This ensures that subsequent calls to look up accepted remotes
(like in the filter construction below) actually find them.
In a following commit, we will add a `--filter=auto` option that will
enable a client to use the filters suggested by the server for the
promisor remotes the client accepted.
To enable the client to construct a filter spec based on these filters,
let's add a `promisor_remote_construct_filter(repo)` function.
This function:
- iterates over all accepted promisor remotes in the repository,
- collects the filters advertised for them (using `advertised_filter`
which a previous commit added to `struct promisor_remote`), and
- generates a single filter spec for them (using the
`list_objects_filter_combine()` function added by a previous commit).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++
promisor-remote.h | 6 ++++++
2 files changed, 54 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 8d6d2d7b76..d5f3223cd0 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -193,6 +193,7 @@ void promisor_remote_clear(struct promisor_remote_config *config)
while (config->promisors) {
struct promisor_remote *r = config->promisors;
free(r->partial_clone_filter);
+ free(r->advertised_filter);
config->promisors = config->promisors->next;
free(r);
}
@@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
struct store_info *store_info = NULL;
struct string_list_item *item;
bool reload_config = false;
+ struct string_list captured_filters = STRING_LIST_INIT_DUP;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -879,6 +881,13 @@ static void filter_promisor_remote(struct repository *repo,
reload_config = true;
strvec_push(accepted, advertised->name);
+
+ /* Capture advertised filters for accepted remotes */
+ if (advertised->filter) {
+ struct string_list_item *i;
+ i = string_list_append(&captured_filters, advertised->name);
+ i->util = xstrdup(advertised->filter);
+ }
}
promisor_info_free(advertised);
@@ -890,6 +899,25 @@ static void filter_promisor_remote(struct repository *repo,
if (reload_config)
repo_promisor_remote_reinit(repo);
+
+ /* Apply captured filters to the stable repo state */
+ for_each_string_list_item(item, &captured_filters) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
+ if (r) {
+ free(r->advertised_filter);
+ r->advertised_filter = item->util;
+ item->util = NULL;
+ }
+ }
+
+ string_list_clear(&captured_filters, 1);
+
+ /* Mark the remotes as accepted in the repository state */
+ for (size_t i = 0; i < accepted->nr; i++) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, accepted->v[i]);
+ if (r)
+ r->accepted = 1;
+ }
}
char *promisor_remote_reply(const char *info)
@@ -935,3 +963,23 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
string_list_clear(&accepted_remotes, 0);
}
+
+char *promisor_remote_construct_filter(struct repository *repo)
+{
+ struct string_list advertised_filters = STRING_LIST_INIT_NODUP;
+ struct promisor_remote *r;
+ char *result;
+
+ promisor_remote_init(repo);
+
+ for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
+ if (r->accepted && r->advertised_filter)
+ string_list_append(&advertised_filters, r->advertised_filter);
+ }
+
+ result = list_objects_filter_combine(&advertised_filters);
+
+ string_list_clear(&advertised_filters, 0);
+
+ return result;
+}
diff --git a/promisor-remote.h b/promisor-remote.h
index 263d331a55..98a0f05e03 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -15,6 +15,7 @@ struct object_id;
struct promisor_remote {
struct promisor_remote *next;
char *partial_clone_filter;
+ char *advertised_filter;
unsigned int accepted : 1;
const char name[FLEX_ARRAY];
};
@@ -67,4 +68,9 @@ void mark_promisor_remotes_as_accepted(struct repository *repo, const char *remo
*/
int repo_has_accepted_promisor_remote(struct repository *r);
+/*
+ * Use the filters from the accepted remotes to create a filter.
+ */
+char *promisor_remote_construct_filter(struct repository *repo);
+
#endif /* PROMISOR_REMOTE_H */
--
2.52.0.319.gfcaffa7898
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH 8/9] promisor-remote: keep advertised filter in memory
2025-12-23 11:11 ` [PATCH 8/9] promisor-remote: keep advertised filter in memory Christian Couder
@ 2026-01-07 10:05 ` Patrick Steinhardt
2026-02-04 10:57 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-01-07 10:05 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Tue, Dec 23, 2025 at 12:11:12PM +0100, Christian Couder wrote:
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 8d6d2d7b76..d5f3223cd0 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
> struct store_info *store_info = NULL;
> struct string_list_item *item;
> bool reload_config = false;
> + struct string_list captured_filters = STRING_LIST_INIT_DUP;
>
> if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
> if (!*accept_str || !strcasecmp("None", accept_str))
Nit: I found the "captured" terminology to be somewhat confusing. Can we
maybe rename this to `advertised_filters` to clarify?
> @@ -890,6 +899,25 @@ static void filter_promisor_remote(struct repository *repo,
>
> if (reload_config)
> repo_promisor_remote_reinit(repo);
> +
> + /* Apply captured filters to the stable repo state */
> + for_each_string_list_item(item, &captured_filters) {
> + struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
> + if (r) {
> + free(r->advertised_filter);
> + r->advertised_filter = item->util;
> + item->util = NULL;
> + }
> + }
> +
> + string_list_clear(&captured_filters, 1);
Ah, I was wondering about memory lifetime first because we ask
`string_list_clear()` to free the `->util` pointers. But above we set
that pointer to `NULL` in case we retain it.
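A minimal standalone sketch of that ownership handover, for illustration
only. The `struct item`, `struct remote`, `hand_over_filters()` and
`clear_items()` names are hypothetical stand-ins, not Git's `string_list`
API. The point is that the item's pointer is set to NULL once the remote
owns the string, so the later cleanup cannot free it a second time.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a string_list item and a promisor remote. */
struct item {
	const char *name;
	char *util;	/* heap-allocated filter, owned by the list until handed over */
};

struct remote {
	const char *name;
	char *advertised_filter;
};

/*
 * Move ownership of each item's util to the remote with the same name.
 * Returns the number of filters handed over.
 */
static size_t hand_over_filters(struct item *items, size_t nr,
				struct remote *remotes, size_t remote_nr)
{
	size_t moved = 0;

	for (size_t i = 0; i < nr; i++) {
		for (size_t j = 0; j < remote_nr; j++) {
			if (strcmp(items[i].name, remotes[j].name))
				continue;
			free(remotes[j].advertised_filter);
			remotes[j].advertised_filter = items[i].util;
			items[i].util = NULL;	/* the list no longer owns it */
			moved++;
			break;
		}
	}
	return moved;
}

/* Free whatever the list still owns; free(NULL) is a no-op. */
static void clear_items(struct item *items, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		free(items[i].util);
		items[i].util = NULL;
	}
}
```

Items whose remote was not found keep their pointer and are freed by the
cleanup, which matches the behaviour described above.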
> @@ -935,3 +963,23 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
>
> string_list_clear(&accepted_remotes, 0);
> }
> +
> +char *promisor_remote_construct_filter(struct repository *repo)
> +{
> + struct string_list advertised_filters = STRING_LIST_INIT_NODUP;
> + struct promisor_remote *r;
> + char *result;
> +
> + promisor_remote_init(repo);
> +
> + for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
> + if (r->accepted && r->advertised_filter)
> + string_list_append(&advertised_filters, r->advertised_filter);
Would we ever accept a promisor remote that _doesn't_ have an advertised
filter? If not, should we maybe `BUG()` in case the advertised filter
has not been set?
Patrick
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 8/9] promisor-remote: keep advertised filter in memory
2026-01-07 10:05 ` Patrick Steinhardt
@ 2026-02-04 10:57 ` Christian Couder
2026-02-11 11:48 ` Patrick Steinhardt
0 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-04 10:57 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Dec 23, 2025 at 12:11:12PM +0100, Christian Couder wrote:
> > diff --git a/promisor-remote.c b/promisor-remote.c
> > index 8d6d2d7b76..d5f3223cd0 100644
> > --- a/promisor-remote.c
> > +++ b/promisor-remote.c
> > @@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
> > struct store_info *store_info = NULL;
> > struct string_list_item *item;
> > bool reload_config = false;
> > + struct string_list captured_filters = STRING_LIST_INIT_DUP;
> >
> > if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
> > if (!*accept_str || !strcasecmp("None", accept_str))
>
> Nit: I found the "captured" terminology to be somewhat confusing. Can we
> maybe rename this to `advertised_filters` to clarify?
Well "advertised_filter" is already used and I think it might be
confusing to use a very similar name, so for now until we find a
better name, I kept "captured" in v2 even if it's not the best.
What about using `server_filters`?
> > @@ -935,3 +963,23 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
> >
> > string_list_clear(&accepted_remotes, 0);
> > }
> > +
> > +char *promisor_remote_construct_filter(struct repository *repo)
> > +{
> > + struct string_list advertised_filters = STRING_LIST_INIT_NODUP;
> > + struct promisor_remote *r;
> > + char *result;
> > +
> > + promisor_remote_init(repo);
> > +
> > + for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
> > + if (r->accepted && r->advertised_filter)
> > + string_list_append(&advertised_filters, r->advertised_filter);
>
> Would we ever accept a promisor remote that _doesn't_ have an advertised
> filter? If not, should we maybe `BUG()` in case the advertised filter
> has not been set?
I think it should be fine to accept a promisor remote without an
advertised filter. The server might prefer to not advertise filters
because it thinks that the client should determine the best filter
based on the client needs. That's how it works now.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 8/9] promisor-remote: keep advertised filter in memory
2026-02-04 10:57 ` Christian Couder
@ 2026-02-11 11:48 ` Patrick Steinhardt
2026-02-11 16:59 ` Junio C Hamano
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-11 11:48 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Feb 04, 2026 at 11:57:42AM +0100, Christian Couder wrote:
> On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > On Tue, Dec 23, 2025 at 12:11:12PM +0100, Christian Couder wrote:
> > > diff --git a/promisor-remote.c b/promisor-remote.c
> > > index 8d6d2d7b76..d5f3223cd0 100644
> > > --- a/promisor-remote.c
> > > +++ b/promisor-remote.c
> > > @@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
> > > struct store_info *store_info = NULL;
> > > struct string_list_item *item;
> > > bool reload_config = false;
> > > + struct string_list captured_filters = STRING_LIST_INIT_DUP;
> > >
> > > if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
> > > if (!*accept_str || !strcasecmp("None", accept_str))
> >
> > Nit: I found the "captured" terminology to be somewhat confusing. Can we
> > maybe rename this to `advertised_filters` to clarify?
>
> Well "advertised_filter" is already used and I think it might be
> confusing to use a very similar name, so for now until we find a
> better name, I kept "captured" in v2 even if it's not the best.
>
> What about using `server_filters`?
I think that'd work better than "captured". But we should probably not
call it "server" but "remote" instead, so `remote_filters`. I would be
happy with such a rename.
Another alternative would be `accepted_filters` to stress the fact that
it's not the complete list of filters. I'd be happy with either though.
> > > @@ -935,3 +963,23 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
> > >
> > > string_list_clear(&accepted_remotes, 0);
> > > }
> > > +
> > > +char *promisor_remote_construct_filter(struct repository *repo)
> > > +{
> > > + struct string_list advertised_filters = STRING_LIST_INIT_NODUP;
> > > + struct promisor_remote *r;
> > > + char *result;
> > > +
> > > + promisor_remote_init(repo);
> > > +
> > > + for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
> > > + if (r->accepted && r->advertised_filter)
> > > + string_list_append(&advertised_filters, r->advertised_filter);
> >
> > Would we ever accept a promisor remote that _doesn't_ have an advertised
> > filter? If not, should we maybe `BUG()` in case the advertised filter
> > has not been set?
>
> I think it should be fine to accept a promisor remote without an
> advertised filter. The server might prefer to not advertise filters
> because it thinks that the client should determine the best filter
> based on the client needs. That's how it works now.
Okay, fair.
Patrick
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 8/9] promisor-remote: keep advertised filter in memory
2026-02-11 11:48 ` Patrick Steinhardt
@ 2026-02-11 16:59 ` Junio C Hamano
2026-02-12 10:07 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Junio C Hamano @ 2026-02-11 16:59 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Christian Couder, git, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
Patrick Steinhardt <ps@pks.im> writes:
> Another alternative would be `accepted_filters` to stress the fact that
> it's not the complete list of filters. I'd be happy with either though.
So advertised is a superset, from which we choose some that become accepted?
Sounds very logical to me.
;-)
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 8/9] promisor-remote: keep advertised filter in memory
2026-02-11 16:59 ` Junio C Hamano
@ 2026-02-12 10:07 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:07 UTC (permalink / raw)
To: Junio C Hamano
Cc: Patrick Steinhardt, git, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder
On Wed, Feb 11, 2026 at 5:59 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Another alternative would be `accepted_filters` to stress the fact that
> > it's not the complete list of filters. I'd be happy with either though.
>
> So advertised is a superset, from which we choose some that become accepted?
> Sounds very logical to me.
Fine, `accepted_filters` it is now.
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH 9/9] fetch-pack: wire up and enable auto filter logic
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (7 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 8/9] promisor-remote: keep advertised filter in memory Christian Couder
@ 2025-12-23 11:11 ` Christian Couder
2026-01-07 10:05 ` Patrick Steinhardt
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2025-12-23 11:11 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Previous commits have set up an infrastructure for `--filter=auto` to
automatically prepare a partial clone filter based on what the server
advertised and the client accepted.
Using that infrastructure, let's now enable the `--filter=auto` option
in `git clone` and `git fetch` by setting `allow_auto_filter` to 1.
Note that these small changes mean that when `git clone --filter=auto`
or `git fetch --filter=auto` are used, "auto" is automatically saved
as the partial clone filter for the server on the client. Therefore
subsequent calls to `git fetch` on the client will automatically use
this "auto" mode even without `--filter=auto`.
Let's also set `allow_auto_filter` to 1 in `transport.c`, as the
transport layer must be able to accept the "auto" filter spec even if
the invoking command hasn't fully parsed it yet.
When an "auto" filter is requested, let's have the "fetch-pack.c" code
in `do_fetch_pack_v2()` compute a filter and send it to the server.
In `do_fetch_pack_v2()` the logic also needs to check for the
"promisor-remote" capability and call `promisor_remote_reply()` to
parse advertised remotes and populate the list of those accepted (and
their filters).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 19 ++++++---
Documentation/git-clone.adoc | 25 ++++++++---
Documentation/gitprotocol-v2.adoc | 16 ++++---
builtin/clone.c | 2 +
builtin/fetch.c | 2 +
fetch-pack.c | 20 +++++++++
t/t5710-promisor-remote-capability.sh | 60 +++++++++++++++++++++++++++
transport.c | 1 +
8 files changed, 130 insertions(+), 15 deletions(-)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index 70a9818331..f7432d4b29 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -92,11 +92,20 @@ precedence over the `fetch.output` config option.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial fetch. For example, `--filter=blob:none` will filter
- out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial fetch.
++
+If `--filter=auto` is used, the filter specification is determined
+automatically by combining the filter specifications advertised by
+the server for the promisor remotes that the client accepts (see
+linkgit:gitprotocol-v2[5] and the `promisor.acceptFromServer`
+configuration option in linkgit:git-config[1]).
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
ifndef::git-pull[]
`--write-fetch-head`::
diff --git a/Documentation/git-clone.adoc b/Documentation/git-clone.adoc
index 57cdfb7620..0db2d1e5f0 100644
--- a/Documentation/git-clone.adoc
+++ b/Documentation/git-clone.adoc
@@ -187,11 +187,26 @@ objects from the source repository into a pack in the cloned repository.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial clone filter. For example, `--filter=blob:none` will
- filter out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial clone filter.
++
+If `--filter=auto` is used the filter specification is determined
+automatically through the 'promisor-remote' protocol (see
+linkgit:gitprotocol-v2[5]) by combining the filter specifications
+advertised by the server for the promisor remotes that the client
+accepts (see the `promisor.acceptFromServer` configuration option in
+linkgit:git-config[1]). This allows the server to suggest the optimal
+filter for the available promisor remotes.
++
+As with other filter specifications, the "auto" value is persisted in
+the configuration. This ensures that future fetches will continue to
+adapt to the server's current recommendation.
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
`--also-filter-submodules`::
Also apply the partial clone filter to any submodules in the repository.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index d93dd279ea..f985cb4c47 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -812,10 +812,15 @@ MUST appear first in each pr-fields, in that order.
After these mandatory fields, the server MAY advertise the following
optional fields in any order:
-`partialCloneFilter`:: The filter specification used by the remote.
+`partialCloneFilter`:: The filter specification for the remote. It
+corresponds to the "remote.<name>.partialCloneFilter" config setting.
Clients can use this to determine if the remote's filtering strategy
-is compatible with their needs (e.g., checking if both use "blob:none").
-It corresponds to the "remote.<name>.partialCloneFilter" config setting.
+is compatible with their needs (e.g., checking if both use
+"blob:none"). Additionally they can use this through the
+`--filter=auto` option in linkgit:git-clone[1]. With that option, the
+filter specification of the clone will be automatically computed by
+combining the filter specifications of the promisor remotes the client
+accepts.
`token`:: An authentication token that clients can use when
connecting to the remote. It corresponds to the "remote.<name>.token"
@@ -828,8 +833,9 @@ future protocol extensions.
The client can use information transmitted through these fields to
decide if it accepts the advertised promisor remote. Also, the client
-can be configured to store the values of these fields (see
-"promisor.storeFields" in linkgit:git-config[1]).
+can be configured to store the values of these fields or use them
+to automatically configure the repository (see "promisor.storeFields"
+in linkgit:git-config[1] and `--filter=auto` in linkgit:git-clone[1]).
Field values MUST be urlencoded.
diff --git a/builtin/clone.c b/builtin/clone.c
index 186e5498d4..41bbaea72a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1001,6 +1001,8 @@ int cmd_clone(int argc,
NULL
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("clone");
repo_config(the_repository, git_clone_config, NULL);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index b984173447..ddc30a0d30 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2439,6 +2439,8 @@ int cmd_fetch(int argc,
OPT_END()
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("fetch");
/* Record the command line for the reflog */
diff --git a/fetch-pack.c b/fetch-pack.c
index 40316c9a34..12ccea0dab 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -35,6 +35,7 @@
#include "sigchain.h"
#include "mergesort.h"
#include "prio-queue.h"
+#include "promisor-remote.h"
static int transfer_unpack_limit = -1;
static int fetch_unpack_limit = -1;
@@ -1661,6 +1662,25 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
struct string_list packfile_uris = STRING_LIST_INIT_DUP;
int i;
struct strvec index_pack_args = STRVEC_INIT;
+ const char *promisor_remote_config;
+
+ if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
+ char *remote_name = promisor_remote_reply(promisor_remote_config);
+ free(remote_name);
+ }
+
+ if (args->filter_options.choice == LOFC_AUTO) {
+ struct strbuf errbuf = STRBUF_INIT;
+ char *constructed_filter = promisor_remote_construct_filter(r);
+
+ list_objects_filter_resolve_auto(&args->filter_options,
+ constructed_filter, &errbuf);
+ if (errbuf.len > 0)
+ die(_("couldn't resolve 'auto' filter: %s"), errbuf.buf);
+
+ free(constructed_filter);
+ strbuf_release(&errbuf);
+ }
negotiator = &negotiator_alloc;
if (args->refetch)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index a726af214a..21543bce20 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -409,6 +409,66 @@ test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone and fetch with --filter=auto" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client trace" &&
+
+ git -C server config remote.lop.partialCloneFilter "blob:limit=9500" &&
+ test_config -C server promisor.sendFields "partialCloneFilter" &&
+
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c promisor.acceptfromserver=All \
+ --no-local --filter=auto server client 2>err &&
+
+ test_grep "filter blob:limit=9500" trace &&
+ test_grep ! "filter auto" trace &&
+
+ # Verify "auto" is persisted in config
+ echo auto >expected &&
+ git -C client config remote.origin.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Now change the filter on the server
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5678" &&
+
+ # Get a new commit on the server to ensure "git fetch" actually runs fetch-pack
+ test_commit -C template new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITH --filter=auto
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch --filter=auto &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5678" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Change the filter on the server again
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5432" &&
+
+ # Get yet a new commit on the server to ensure fetch-pack runs
+ test_commit -C template yet-a-new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITHOUT --filter=auto
+ # Relies on "auto" being persisted in the client config
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5432" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
diff --git a/transport.c b/transport.c
index c7f06a7382..cde8d83a57 100644
--- a/transport.c
+++ b/transport.c
@@ -1219,6 +1219,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
*/
struct git_transport_data *data = xcalloc(1, sizeof(*data));
list_objects_filter_init(&data->options.filter_options);
+ data->options.filter_options.allow_auto_filter = 1;
ret->data = data;
ret->vtable = &builtin_smart_vtable;
ret->smart_options = &(data->options);
--
2.52.0.319.gfcaffa7898
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH 9/9] fetch-pack: wire up and enable auto filter logic
2025-12-23 11:11 ` [PATCH 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
@ 2026-01-07 10:05 ` Patrick Steinhardt
2026-02-04 11:06 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-01-07 10:05 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Tue, Dec 23, 2025 at 12:11:13PM +0100, Christian Couder wrote:
> diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
> index 70a9818331..f7432d4b29 100644
> --- a/Documentation/fetch-options.adoc
> +++ b/Documentation/fetch-options.adoc
> @@ -92,11 +92,20 @@ precedence over the `fetch.output` config option.
> Use the partial clone feature and request that the server sends
> a subset of reachable objects according to a given object filter.
> When using `--filter`, the supplied _<filter-spec>_ is used for
> - the partial fetch. For example, `--filter=blob:none` will filter
> - out all blobs (file contents) until needed by Git. Also,
> - `--filter=blob:limit=<size>` will filter out all blobs of size
> - at least _<size>_. For more details on filter specifications, see
> - the `--filter` option in linkgit:git-rev-list[1].
> + the partial fetch.
> ++
> +If `--filter=auto` is used, the filter specification is determined
> +automatically by combining the filter specifications advertised by
> +the server for the promisor remotes that the client accepts (see
> +linkgit:gitprotocol-v2[5] and the `promisor.acceptFromServer`
> +configuration option in linkgit:git-config[1]).
Okay, so if "promisor.acceptFromServer" enables a subset of advertised
promisors we will automatically use their advertised filters. But what
about the case where we already have a set of local promisors with their
own filters, would those also be honored by "--filter=auto"?
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 186e5498d4..41bbaea72a 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1001,6 +1001,8 @@ int cmd_clone(int argc,
> NULL
> };
>
> + filter_options.allow_auto_filter = 1;
> +
> packet_trace_identity("clone");
>
> repo_config(the_repository, git_clone_config, NULL);
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index b984173447..ddc30a0d30 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -2439,6 +2439,8 @@ int cmd_fetch(int argc,
> OPT_END()
> };
>
> + filter_options.allow_auto_filter = 1;
> +
> packet_trace_identity("fetch");
>
> /* Record the command line for the reflog */
Nice that both of these changes are so easy now.
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 40316c9a34..12ccea0dab 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1661,6 +1662,25 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> struct string_list packfile_uris = STRING_LIST_INIT_DUP;
> int i;
> struct strvec index_pack_args = STRVEC_INIT;
> + const char *promisor_remote_config;
> +
> + if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
> + char *remote_name = promisor_remote_reply(promisor_remote_config);
> + free(remote_name);
> + }
> +
> + if (args->filter_options.choice == LOFC_AUTO) {
> + struct strbuf errbuf = STRBUF_INIT;
> + char *constructed_filter = promisor_remote_construct_filter(r);
> +
> + list_objects_filter_resolve_auto(&args->filter_options,
> + constructed_filter, &errbuf);
> + if (errbuf.len > 0)
> + die(_("couldn't resolve 'auto' filter: %s"), errbuf.buf);
Now that I see it being used I think that the calling convention of this
function is a bit weird. I would've expected the function to return an
error code that the caller can consult instead of having to check for
`errbuf.len`.
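To illustrate the convention I have in mind, here is a rough standalone
sketch. `resolve_auto_filter()` is a hypothetical function, not the
`list_objects_filter_resolve_auto()` from this series, and its failure
conditions are made up for the example. It returns 0 on success and -1 on
error, and only fills the error buffer when it fails.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical resolver: copies the constructed filter spec into out.
 * Returns 0 on success and -1 on failure. On failure a message is
 * written to err. The failure conditions are purely illustrative.
 */
static int resolve_auto_filter(const char *constructed,
			       char *out, size_t out_len,
			       char *err, size_t err_len)
{
	if (!constructed || !*constructed) {
		snprintf(err, err_len, "no filter spec to resolve");
		return -1;
	}
	if (strlen(constructed) >= out_len) {
		snprintf(err, err_len, "filter spec too long");
		return -1;
	}
	strcpy(out, constructed);
	return 0;
}
```

A caller can then write `if (resolve_auto_filter(...) < 0)` and report the
message, without having to inspect the length of the error buffer.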
Patrick
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH 9/9] fetch-pack: wire up and enable auto filter logic
2026-01-07 10:05 ` Patrick Steinhardt
@ 2026-02-04 11:06 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:06 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Christian Couder
On Wed, Jan 7, 2026 at 11:05 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Tue, Dec 23, 2025 at 12:11:13PM +0100, Christian Couder wrote:
> > diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
> > index 70a9818331..f7432d4b29 100644
> > --- a/Documentation/fetch-options.adoc
> > +++ b/Documentation/fetch-options.adoc
> > @@ -92,11 +92,20 @@ precedence over the `fetch.output` config option.
> > Use the partial clone feature and request that the server sends
> > a subset of reachable objects according to a given object filter.
> > When using `--filter`, the supplied _<filter-spec>_ is used for
> > - the partial fetch. For example, `--filter=blob:none` will filter
> > - out all blobs (file contents) until needed by Git. Also,
> > - `--filter=blob:limit=<size>` will filter out all blobs of size
> > - at least _<size>_. For more details on filter specifications, see
> > - the `--filter` option in linkgit:git-rev-list[1].
> > + the partial fetch.
> > ++
> > +If `--filter=auto` is used, the filter specification is determined
> > +automatically by combining the filter specifications advertised by
> > +the server for the promisor remotes that the client accepts (see
> > +linkgit:gitprotocol-v2[5] and the `promisor.acceptFromServer`
> > +configuration option in linkgit:git-config[1]).
>
> Okay, so if "promisor.acceptFromServer" enables a subset of advertised
> promisors we will automatically use their advertised filters. But what
> about the case where we already have a set of local promisors with their
> > own filters, would those also be honored by "--filter=auto"?
No, they wouldn't be honored. 'auto' means that the client fully
accepts the filters advertised by the server. Maybe we could add a new
mode for using the locally configured filter by default and only using
the advertised filter if there is no locally configured filter for the
remote, but we can do that later.
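As a rough sketch of the difference between the two policies (purely
illustrative, nothing like this exists in the series): `MODE_PREFER_LOCAL`
and `pick_filter()` below are hypothetical names for the possible future
mode, while `MODE_AUTO` models what 'auto' does now.

```c
#include <stddef.h>

enum filter_mode {
	MODE_AUTO,		/* current 'auto': use the advertised filter only */
	MODE_PREFER_LOCAL	/* hypothetical: local filter first, advertised as fallback */
};

/*
 * Pick the filter spec to use for one remote. Either argument may be
 * NULL when no such filter is configured or advertised.
 */
static const char *pick_filter(enum filter_mode mode,
			       const char *local, const char *advertised)
{
	if (mode == MODE_PREFER_LOCAL && local)
		return local;
	return advertised;
}
```

With `MODE_AUTO` a locally configured filter is ignored even when it is
set, which is the behaviour described above.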
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index 40316c9a34..12ccea0dab 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1661,6 +1662,25 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> > struct string_list packfile_uris = STRING_LIST_INIT_DUP;
> > int i;
> > struct strvec index_pack_args = STRVEC_INIT;
> > + const char *promisor_remote_config;
> > +
> > + if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
> > + char *remote_name = promisor_remote_reply(promisor_remote_config);
> > + free(remote_name);
> > + }
> > +
> > + if (args->filter_options.choice == LOFC_AUTO) {
> > + struct strbuf errbuf = STRBUF_INIT;
> > + char *constructed_filter = promisor_remote_construct_filter(r);
> > +
> > + list_objects_filter_resolve_auto(&args->filter_options,
> > + constructed_filter, &errbuf);
> > + if (errbuf.len > 0)
> > + die(_("couldn't resolve 'auto' filter: %s"), errbuf.buf);
>
> Now that I see it being used I think that the calling convention of this
> function is a bit weird. I would've expected the function to return an
> error code that the caller can consult instead of having to check for
> `errbuf.len`.
Right, anyway I have removed that `list_objects_filter_resolve_auto()`
function altogether by removing the patch that introduced it in v2.
Thanks!
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto`
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (8 preceding siblings ...)
2025-12-23 11:11 ` [PATCH 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 1/8] promisor-remote: refactor initialising field lists Christian Couder
` (8 more replies)
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
10 siblings, 9 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder
Introduction
============
A previous patch series added the possibility to pass additional
fields, a "partialCloneFilter" and a "token" for each advertised
promisor remote, from a server to a client through the
"promisor-remote" capability.
On the client side though, it has so far only been possible to use
this new information to compare it with local information and then
decide if the corresponding advertised promisor remote is accepted or
not.
For the "token" it would be useful if it could be stored on the
client. For example, in a setup where the client uses specialized
remote helpers which need a token to access the promisor remotes
advertised by the server, storing the token would allow it to be
used when the client directly accesses a promisor remote, for
example to lazy fetch some blobs it now needs.
To enable such a workflow, where the server can rotate tokens and the
client can have updated tokens from the server by simply fetching from
it, the first part of this series introduces a new
"promisor.storeFields" configuration option on the client side,
similar to the "promisor.checkFields" configuration option. When field
names, "token" or "partialCloneFilter", are listed in this new
configuration option, then the values of these field names transmitted
by the server are stored in the local configuration on the client
side.
Note that for security reasons, the corresponding remote name and URL
of the advertised promisor remotes must have already been configured
on the client side. No new remote name or URL is configured.
For the "partialCloneFilter" field, simply storing the value is not
enough to enable dynamic updates. Currently, when a user initiates a
partial clone with `--filter=<filter-spec>`, that specific
<filter-spec> is saved in the client's local configuration (e.g.,
remote.origin.partialCloneFilter). Subsequent fetches then reuse this
value, ignoring suggestions from the server.
To avoid breaking this mechanism and still be able to use the
<filter-spec> that the server suggests for the promisor remotes that
the client accepts, the second part of this series introduces a new
`--filter=auto` mode for `git clone` and `git fetch`.
When `--filter=auto` is used, then "auto" is still saved as the
<filter-spec> for the server locally on the client, and then when a
fetch-pack happens, instead of passing just "auto", the actual filter
requested by the client is computed by combining the <filter-spec>s
that the server suggested for the promisor remotes that the client
accepted. This uses the "combine" filter mechanism that already exists
in "list-objects-filter-options.{c,h}".
This way, by just using `--filter=auto` when cloning, a client makes
sure it will use the <filter-spec>s suggested by the server for the
promisor remotes it accepts.
This work is part of the "LOP" effort documented in:
Documentation/technical/large-object-promisors.adoc
See that doc for more information on the broader context.
Overview of the patches
=======================
Patches 1/8 and 2/8 are the first part of the series and implement the
new "promisor.storeFields" configuration option. Patch 1/8 is a small
preparatory refactoring.
Patches from 3/8 to 8/8 implement the `--filter=auto` option:
- Patches 3/8 and 4/8 are cleanups of "builtin/clone.c" and
"builtin/fetch.c" respectively that make the `filter_options`
variable local to cmd_clone() or cmd_fetch().
- Patch 5/8 is a doc update as `--filter=<filter-spec>` wasn't
documented for `git fetch`.
- Patch 6/8 improves "list-objects-filter-options.{c,h}" to
support the new 'auto' mode.
- Patch 7/8 improves "promisor-remote.{c,h}" to support the new
'auto' mode.
- Patch 8/8 makes the new 'auto' mode actually work by wiring up
everything together.
CI Report
=========
All the tests pass, see:
https://github.com/chriscool/git/actions/runs/21665784738
Changes since v1
================
Thanks to Patrick Steinhardt and Jean-Noël Avila for reviewing the
previous version!
In patch 2/8:
- A note has been added to the commit message to clarify why the new
"promisor.storeFields" configuration variable might not be very
useful for partial clone filters.
- repo_config_set_gently() is used instead of
repo_config_set_worktree_gently().
- new_store_info() and free_store_info() have been renamed
store_info_new() and store_info_free() respectively.
In patch 5/8:
- The commit message has been clarified to say that we use the same
words as in the `git clone` documentation and that we are not
trying to improve on them.
- Backticks have been added around "--filter=<filter-spec>".
In patch 6/8:
- The commit message has been clarified to note that the new
`allow_auto_filter` flag depends on the command not on user input.
- A call to strbuf_addstr() has been indented better.
- A `goto cleanup;` instruction has been added.
Patch 7/9 in v1 has been removed, as another way to combine filters has
been implemented (see below).
In patch 7/8:
- A typo in the commit message subject has been fixed (missing 's').
- The commit message has been fixed to say that `advertised_filter`
is added in the current commit, not a previous one, and to remove
a mention of `list_objects_filter_combine()` as the previous
commit that introduced that function has been removed.
- promisor_remote_construct_filter() now uses a temporary `struct
list_objects_filter_options` to construct a combined filter
(instead of `list_objects_filter_combine()`).
- The comment on top of the declaration of
promisor_remote_construct_filter() in "promisor-remote.h" has been
improved a bit.
In patch 8/8:
- Instead of using list_objects_filter_resolve_auto(), we use
gently_parse_list_objects_filter() directly.
- We unset `allow_auto_filter` before parsing the combined filter as
it must not be 'auto'.
- A die() message has been improved a bit.
Range diff since v1
===================
1: fcaffa7898 = 1: e19b1518cd promisor-remote: refactor initialising field lists
2: 9bcfa03987 ! 2: 8f20baac17 promisor-remote: allow a client to store fields
@@ Commit message
available when the client needs to access the promisor remotes for a
lazy fetch.
- In the same way, if it appears that it's better to use a different
- filter to access a promisor remote, it could be helpful if the client
- could automatically use it.
-
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
- Like "promisor.checkFields" and "promisor.sendFields", it should
- contain a comma or space separated list of field names. Only the
- "partialCloneFilter" and "token" field names are supported for now.
+ Note that for a partial clone filter, it's less interesting to have
+ it stored on the client. This is because a filter should be used
+ right away and we already pass a `--filter=<filter-spec>` option to
+ `git clone` when starting a partial clone. Storing the filter could
+ perhaps still be interesting for information purposes.
+
+ Like "promisor.checkFields" and "promisor.sendFields", the new
+ configuration variable should contain a comma or space separated list
+ of field names. Only the "partialCloneFilter" and "token" field names
+ are supported for now.
When a server advertises a promisor remote, for example "foo", along
with for example "token=XXXXX" to a client, and on the client side
@@ promisor-remote.c: static struct promisor_info *parse_one_advertised_remote(cons
+ current ? current : "",
+ advertised);
+
-+ repo_config_set_worktree_gently(repo, key, advertised);
++ repo_config_set_gently(repo, key, advertised);
+ free(key);
+
+ return true;
@@ promisor-remote.c: static struct promisor_info *parse_one_advertised_remote(cons
+ bool store_token;
+};
+
-+static struct store_info *new_store_info(struct repository *repo)
++static struct store_info *store_info_new(struct repository *repo)
+{
+ struct string_list *fields_to_store = fields_stored();
+ struct store_info *s = xmalloc(sizeof(*s));
@@ promisor-remote.c: static struct promisor_info *parse_one_advertised_remote(cons
+ return s;
+}
+
-+static void free_store_info(struct store_info *s)
++static void store_info_free(struct store_info *s)
+{
+ if (s) {
+ promisor_info_list_clear(&s->config_info);
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
- if (should_accept_remote(accept, advertised, &config_info))
+ if (should_accept_remote(accept, advertised, &config_info)) {
+ if (!store_info)
-+ store_info = new_store_info(repo);
++ store_info = store_info_new(repo);
+ if (promisor_store_advertised_fields(advertised, store_info))
+ reload_config = true;
+
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
promisor_info_list_clear(&config_info);
string_list_clear(&remote_info, 0);
-+ free_store_info(store_info);
++ store_info_free(store_info);
+
+ if (reload_config)
+ repo_promisor_remote_reinit(repo);
3: 629b1ba1af = 3: 9d53a79600 clone: make filter_options local to cmd_clone()
4: ab9105062d = 4: b24907e6dc fetch: make filter_options local to cmd_fetch()
5: 72924115c1 ! 5: 90fb77360b doc: fetch: document `--filter=<filter-spec>` option
@@ Commit message
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
- Let's fix that and document that option properly in the same way as it
- is already documented for `git clone`.
+ Let's fix that and document that option using the same words already
+ used to document it for `git clone`.
+
+ Those words could probably be improved, but they are not wrong, so
+ let's just use them for now and leave improving them for future work.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
@@ Documentation/fetch-options.adoc: linkgit:git-config[1].
This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
precedence over the `fetch.output` config option.
-+--filter=<filter-spec>::
++`--filter=<filter-spec>`::
+ Use the partial clone feature and request that the server sends
+ a subset of reachable objects according to a given object filter.
+ When using `--filter`, the supplied _<filter-spec>_ is used for
6: c2f18b2055 ! 6: b524b24024 list-objects-filter-options: support 'auto' mode for --filter
@@ Commit message
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
- accepted or not.
+ accepted or not by the current command.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
@@ Commit message
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us to do that in future work either.
+ Also note that the new `allow_auto_filter` flag depends on the command,
+ not user choices, so it should be reset to the command default when
+ `struct list_objects_filter_options` instances are reset.
+
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
@@ list-objects-filter-options.c: int gently_parse_list_objects_filter(
- if (!strcmp(arg, "blob:none")) {
+ if (!strcmp(arg, "auto")) {
+ if (!filter_options->allow_auto_filter) {
-+ strbuf_addstr(
-+ errbuf,
-+ _("'auto' filter not supported by this command"));
++ strbuf_addstr(errbuf,
++ _("'auto' filter not supported by this command"));
+ return 1;
+ }
+ filter_options->choice = LOFC_AUTO;
@@ list-objects-filter-options.c: static int parse_combine_subfilter(
+ goto cleanup;
+
+ result = (filter_options->sub[new_index].choice == LOFC_AUTO);
-+ if (result)
++ if (result) {
+ strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
++ goto cleanup;
++ }
+cleanup:
free(decoded);
7: 1b88355e66 < -: ---------- list-objects-filter-options: implement auto filter resolution
8: 6ac8bf81f0 ! 7: 4ec51ee88f promisor-remote: keep advertised filter in memory
@@ Metadata
Author: Christian Couder <chriscool@tuxfamily.org>
## Commit message ##
- promisor-remote: keep advertised filter in memory
+ promisor-remote: keep advertised filters in memory
Currently, advertised filters are only kept in memory temporarily
during parsing, or persisted to disk if `promisor.storeFields`
@@ Commit message
promisor remotes the client accepted.
To enable the client to construct a filter spec based on these filters,
- let's add a `promisor_remote_construct_filter(repo)` function.
+ let's also add a `promisor_remote_construct_filter(repo)` function.
This function:
- iterates over all accepted promisor remotes in the repository,
- collects the filters advertised for them (using `advertised_filter`
- which a previous commit added to `struct promisor_remote`), and
- - generates a single filter spec for them (using the
- `list_objects_filter_combine()` function added by a previous commit).
+ added in this commit, and
+ - generates a single filter spec for them.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
@@ promisor-remote.c: void mark_promisor_remotes_as_accepted(struct repository *r,
+
+char *promisor_remote_construct_filter(struct repository *repo)
+{
-+ struct string_list advertised_filters = STRING_LIST_INIT_NODUP;
+ struct promisor_remote *r;
-+ char *result;
++ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
++ struct strbuf err = STRBUF_INIT;
++ char *result = NULL;
+
+ promisor_remote_init(repo);
+
+ for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
+ if (r->accepted && r->advertised_filter)
-+ string_list_append(&advertised_filters, r->advertised_filter);
++ if (gently_parse_list_objects_filter(&filter_options,
++ r->advertised_filter,
++ &err)) {
++ warning(_("promisor remote '%s' advertised invalid filter '%s': %s"),
++ r->name, r->advertised_filter, err.buf);
++ strbuf_reset(&err);
++ continue;
++ }
+ }
+
-+ result = list_objects_filter_combine(&advertised_filters);
++ if (filter_options.choice)
++ result = xstrdup(expand_list_objects_filter_spec(&filter_options));
+
-+ string_list_clear(&advertised_filters, 0);
++ list_objects_filter_release(&filter_options);
++ strbuf_release(&err);
+
+ return result;
+}
@@ promisor-remote.h: void mark_promisor_remotes_as_accepted(struct repository *rep
int repo_has_accepted_promisor_remote(struct repository *r);
+/*
-+ * Use the filters from the accepted remotes to create a filter.
++ * Use the filters from the accepted remotes to create a combined
++ * filter (useful in `--filter=auto` mode).
+ */
+char *promisor_remote_construct_filter(struct repository *repo);
+
9: 7c822499e2 ! 8: 994ecb3317 fetch-pack: wire up and enable auto filter logic
@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
+ struct strbuf errbuf = STRBUF_INIT;
+ char *constructed_filter = promisor_remote_construct_filter(r);
+
-+ list_objects_filter_resolve_auto(&args->filter_options,
-+ constructed_filter, &errbuf);
++ list_objects_filter_release(&args->filter_options);
++ /* The result of resolving an 'auto' filter must not be 'auto' */
++ args->filter_options.allow_auto_filter = 0;
++
++ if (constructed_filter)
++ gently_parse_list_objects_filter(&args->filter_options,
++ constructed_filter,
++ &errbuf);
++
+ if (errbuf.len > 0)
-+ die(_("couldn't resolve 'auto' filter: %s"), errbuf.buf);
++ die(_("couldn't resolve 'auto' filter '%s': %s"),
++ constructed_filter, errbuf.buf);
+
+ free(constructed_filter);
+ strbuf_release(&errbuf);
Christian Couder (8):
promisor-remote: refactor initialising field lists
promisor-remote: allow a client to store fields
clone: make filter_options local to cmd_clone()
fetch: make filter_options local to cmd_fetch()
doc: fetch: document `--filter=<filter-spec>` option
list-objects-filter-options: support 'auto' mode for --filter
promisor-remote: keep advertised filters in memory
fetch-pack: wire up and enable auto filter logic
Documentation/config/promisor.adoc | 33 +++
Documentation/fetch-options.adoc | 19 ++
Documentation/git-clone.adoc | 25 +-
Documentation/gitprotocol-v2.adoc | 24 +-
Makefile | 1 +
builtin/clone.c | 18 +-
builtin/fetch.c | 50 ++--
fetch-pack.c | 28 +++
list-objects-filter-options.c | 37 ++-
list-objects-filter-options.h | 6 +
list-objects-filter.c | 8 +
promisor-remote.c | 232 +++++++++++++++++--
promisor-remote.h | 7 +
t/meson.build | 1 +
t/t5710-promisor-remote-capability.sh | 109 +++++++++
t/unit-tests/u-list-objects-filter-options.c | 53 +++++
transport.c | 1 +
17 files changed, 596 insertions(+), 56 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
--
2.53.0.rc2.10.g12663a1c75.dirty
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH v2 1/8] promisor-remote: refactor initialising field lists
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 2/8] promisor-remote: allow a client to store fields Christian Couder
` (7 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
In "promisor-remote.c", the fields_sent() and fields_checked()
functions serve similar purposes and contain a small amount of
duplicated code.
As we are going to add a similar function in a following commit,
let's refactor this common code into a new initialize_fields_list()
function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/promisor-remote.c b/promisor-remote.c
index 77ebf537e2..5d8151cedb 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -375,18 +375,24 @@ static char *fields_from_config(struct string_list *fields_list, const char *con
return fields;
}
+static struct string_list *initialize_fields_list(struct string_list *fields_list, int *initialized,
+ const char *config_key)
+{
+ if (!*initialized) {
+ fields_list->cmp = strcasecmp;
+ fields_from_config(fields_list, config_key);
+ *initialized = 1;
+ }
+
+ return fields_list;
+}
+
static struct string_list *fields_sent(void)
{
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.sendFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.sendFields");
}
static struct string_list *fields_checked(void)
@@ -394,13 +400,7 @@ static struct string_list *fields_checked(void)
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.checkFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
/*
--
2.53.0.rc2.10.g12663a1c75.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* [PATCH v2 2/8] promisor-remote: allow a client to store fields
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2026-02-04 11:08 ` [PATCH v2 1/8] promisor-remote: refactor initialising field lists Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 3/8] clone: make filter_options local to cmd_clone() Christian Couder
` (6 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Another previous commit, c213820c51 (promisor-remote: allow a client
to check fields, 2025-09-08), has made it possible for a client to
decide if it accepts a promisor remote advertised by a server based
on these additional fields.
Often though, it would be interesting for the client to just store in
its configuration files these additional fields passed by the server,
so that it can use them when needed.
For example if a token is necessary to access a promisor remote, that
token could be updated frequently only on the server side and then
passed to all the clients through the "promisor-remote" capability,
avoiding the need to update it on all the clients manually.
Storing the token on the client side makes sure that the token is
available when the client needs to access the promisor remotes for a
lazy fetch.
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
Note that for a partial clone filter, it's less interesting to have
it stored on the client. This is because a filter should be used
right away and we already pass a `--filter=<filter-spec>` option to
`git clone` when starting a partial clone. Storing the filter could
perhaps still be interesting for information purposes.
Like "promisor.checkFields" and "promisor.sendFields", the new
configuration variable should contain a comma or space separated list
of field names. Only the "partialCloneFilter" and "token" field names
are supported for now.
When a server advertises a promisor remote, for example "foo", along
with for example "token=XXXXX" to a client, and on the client side
"promisor.storeFields" contains "token", then the client will store
XXXXX for the "remote.foo.token" variable in its configuration file
and reload its configuration so it can immediately use this new
configuration variable.
A message is emitted on stderr to warn users when the config is
changed.
Note that even if "promisor.acceptFromServer" is set to "all", a
promisor remote has to be already configured on the client side for
some of its config to be changed. In any case no new remote is
configured and no new URL is stored.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/config/promisor.adoc | 33 ++++++
Documentation/gitprotocol-v2.adoc | 12 ++-
promisor-remote.c | 148 +++++++++++++++++++++++++-
t/t5710-promisor-remote-capability.sh | 49 +++++++++
4 files changed, 236 insertions(+), 6 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index 93e5e0d9b5..b0fa43b839 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -89,3 +89,36 @@ variable. The fields are checked only if the
`promisor.acceptFromServer` config variable is not set to "None". If
set to "None", this config variable has no effect. See
linkgit:gitprotocol-v2[5].
+
+promisor.storeFields::
+ A comma or space separated list of additional remote related
+ field names. If a client accepts an advertised remote, the
+ client will store the values associated with these field names
+ taken from the remote advertisement into its configuration,
+ and then reload its remote configuration. Currently,
+ "partialCloneFilter" and "token" are the only supported field
+ names.
++
+For example if a server advertises "partialCloneFilter=blob:limit=20k"
+for remote "foo", and that remote is accepted, then "blob:limit=20k"
+will be stored for the "remote.foo.partialCloneFilter" configuration
+variable.
++
+If the new field value from an advertised remote is the same as the
+existing field value for that remote on the client side, then no
+change is made to the client configuration though.
++
+When a new value is stored, a message is printed to standard error to
+let users know about this.
++
+Note that for security reasons, if the remote is not already
+configured on the client side, nothing will be stored for that
+remote. In any case, no new remote will be created and no URL will be
+stored.
++
+Before storing a partial clone filter, it's parsed to check it's
+valid. If it's not, a warning is emitted and it's not stored.
++
+Before storing a token, a check is performed to ensure it contains no
+control character. If the check fails, a warning is emitted and it's
+not stored.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index c7db103299..d93dd279ea 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -826,9 +826,10 @@ are case-sensitive and MUST be transmitted exactly as specified
above. Clients MUST ignore fields they don't recognize to allow for
future protocol extensions.
-For now, the client can only use information transmitted through these
-fields to decide if it accepts the advertised promisor remote. In the
-future that information might be used for other purposes though.
+The client can use information transmitted through these fields to
+decide if it accepts the advertised promisor remote. Also, the client
+can be configured to store the values of these fields (see
+"promisor.storeFields" in linkgit:git-config[1]).
Field values MUST be urlencoded.
@@ -856,8 +857,9 @@ the server advertised, the client shouldn't advertise the
On the server side, the "promisor.advertise" and "promisor.sendFields"
configuration options can be used to control what it advertises. On
the client side, the "promisor.acceptFromServer" configuration option
-can be used to control what it accepts. See the documentation of these
-configuration options for more information.
+can be used to control what it accepts, and the "promisor.storeFields"
+option, to control what it stores. See the documentation of these
+configuration options in linkgit:git-config[1] for more information.
Note that in the future it would be nice if the "promisor-remote"
protocol capability could be used by the server, when responding to
diff --git a/promisor-remote.c b/promisor-remote.c
index 5d8151cedb..59997dd4c7 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -403,6 +403,14 @@ static struct string_list *fields_checked(void)
return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
+static struct string_list *fields_stored(void)
+{
+ static struct string_list fields_list = STRING_LIST_INIT_NODUP;
+ static int initialized;
+
+ return initialize_fields_list(&fields_list, &initialized, "promisor.storeFields");
+}
+
/*
* Struct for promisor remotes involved in the "promisor-remote"
* protocol capability.
@@ -692,6 +700,132 @@ static struct promisor_info *parse_one_advertised_remote(const char *remote_info
return info;
}
+static bool store_one_field(struct repository *repo, const char *remote_name,
+ const char *field_name, const char *field_key,
+ const char *advertised, const char *current)
+{
+ if (advertised && (!current || strcmp(current, advertised))) {
+ char *key = xstrfmt("remote.%s.%s", remote_name, field_key);
+
+ fprintf(stderr, _("Storing new %s from server for remote '%s'.\n"
+ " '%s' -> '%s'\n"),
+ field_name, remote_name,
+ current ? current : "",
+ advertised);
+
+ repo_config_set_gently(repo, key, advertised);
+ free(key);
+
+ return true;
+ }
+
+ return false;
+}
+
+/* Check that a filter is valid by parsing it */
+static bool valid_filter(const char *filter, const char *remote_name)
+{
+ struct list_objects_filter_options filter_opts = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ int res = gently_parse_list_objects_filter(&filter_opts, filter, &err);
+
+ if (res)
+ warning(_("invalid filter '%s' for remote '%s' "
+ "will not be stored: %s"),
+ filter, remote_name, err.buf);
+
+ list_objects_filter_release(&filter_opts);
+ strbuf_release(&err);
+
+ return !res;
+}
+
+/* Check that a token doesn't contain any control character */
+static bool valid_token(const char *token, const char *remote_name)
+{
+ const char *c = token;
+
+ for (; *c; c++)
+ if (iscntrl(*c)) {
+ warning(_("invalid token '%s' for remote '%s' "
+ "will not be stored"),
+ token, remote_name);
+ return false;
+ }
+
+ return true;
+}
+
+struct store_info {
+ struct repository *repo;
+ struct string_list config_info;
+ bool store_filter;
+ bool store_token;
+};
+
+static struct store_info *store_info_new(struct repository *repo)
+{
+ struct string_list *fields_to_store = fields_stored();
+ struct store_info *s = xmalloc(sizeof(*s));
+
+ s->repo = repo;
+
+ string_list_init_nodup(&s->config_info);
+ promisor_config_info_list(repo, &s->config_info, fields_to_store);
+ string_list_sort(&s->config_info);
+
+ s->store_filter = !!string_list_lookup(fields_to_store, promisor_field_filter);
+ s->store_token = !!string_list_lookup(fields_to_store, promisor_field_token);
+
+ return s;
+}
+
+static void store_info_free(struct store_info *s)
+{
+ if (s) {
+ promisor_info_list_clear(&s->config_info);
+ free(s);
+ }
+}
+
+static bool promisor_store_advertised_fields(struct promisor_info *advertised,
+ struct store_info *store_info)
+{
+ struct promisor_info *p;
+ struct string_list_item *item;
+ const char *remote_name = advertised->name;
+ bool reload_config = false;
+
+ if (!(store_info->store_filter || store_info->store_token))
+ return false;
+
+ /*
+ * Get existing config info for the advertised promisor
+ * remote. This ensures the remote is already configured on
+ * the client side.
+ */
+ item = string_list_lookup(&store_info->config_info, remote_name);
+
+ if (!item)
+ return false;
+
+ p = item->util;
+
+ if (store_info->store_filter && advertised->filter &&
+ valid_filter(advertised->filter, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "filter", promisor_field_filter,
+ advertised->filter, p->filter);
+
+ if (store_info->store_token && advertised->token &&
+ valid_token(advertised->token, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "token", promisor_field_token,
+ advertised->token, p->token);
+
+ return reload_config;
+}
+
static void filter_promisor_remote(struct repository *repo,
struct strvec *accepted,
const char *info)
@@ -700,7 +834,9 @@ static void filter_promisor_remote(struct repository *repo,
enum accept_promisor accept = ACCEPT_NONE;
struct string_list config_info = STRING_LIST_INIT_NODUP;
struct string_list remote_info = STRING_LIST_INIT_DUP;
+ struct store_info *store_info = NULL;
struct string_list_item *item;
+ bool reload_config = false;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -736,14 +872,24 @@ static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
- if (should_accept_remote(accept, advertised, &config_info))
+ if (should_accept_remote(accept, advertised, &config_info)) {
+ if (!store_info)
+ store_info = store_info_new(repo);
+ if (promisor_store_advertised_fields(advertised, store_info))
+ reload_config = true;
+
strvec_push(accepted, advertised->name);
+ }
promisor_info_free(advertised);
}
promisor_info_list_clear(&config_info);
string_list_clear(&remote_info, 0);
+ store_info_free(store_info);
+
+ if (reload_config)
+ repo_promisor_remote_reinit(repo);
}
char *promisor_remote_reply(const char *info)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 023735d6a8..a726af214a 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -360,6 +360,55 @@ test_expect_success "clone with promisor.checkFields" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ git -C server remote add otherLop "https://invalid.invalid" &&
+ git -C server config remote.otherLop.token "fooBar" &&
+ git -C server config remote.otherLop.stuff "baz" &&
+ git -C server config remote.otherLop.partialCloneFilter "blob:limit=10k" &&
+ test_when_finished "git -C server remote remove otherLop" &&
+
+ git -C server config remote.lop.token "fooXXX" &&
+ git -C server config remote.lop.partialCloneFilter "blob:limit=8k" &&
+
+ test_config -C server promisor.sendFields "partialCloneFilter, token" &&
+ test_when_finished "rm trace" &&
+
+ # Clone from server to create a client
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c remote.lop.token="fooYYY" \
+ -c remote.lop.partialCloneFilter="blob:none" \
+ -c promisor.acceptfromserver=All \
+ -c promisor.storeFields=partialcloneFilter \
+ --no-local --filter="blob:limit=5k" server client 2>err &&
+
+ # Check that the filter from the server is stored
+ echo "blob:limit=8k" >expected &&
+ git -C client config remote.lop.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that user is notified when the filter is stored
+ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
+ test_grep "'\''blob:none'\'' -> '\''blob:limit=8k'\''" err &&
+
+ # Check that the token from the server is NOT stored
+ echo "fooYYY" >expected &&
+ git -C client config remote.lop.token >actual &&
+ test_cmp expected actual &&
+ test_grep ! "Storing new token from server" err &&
+
+ # Check that the filter for an unknown remote is NOT stored
+ test_must_fail git -C client config remote.otherLop.partialCloneFilter >actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
--
2.53.0.rc2.10.g12663a1c75.dirty
* [PATCH v2 3/8] clone: make filter_options local to cmd_clone()
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2026-02-04 11:08 ` [PATCH v2 1/8] promisor-remote: refactor initialising field lists Christian Couder
2026-02-04 11:08 ` [PATCH v2 2/8] promisor-remote: allow a client to store fields Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 4/8] fetch: make filter_options local to cmd_fetch() Christian Couder
` (5 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/clone.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
harder to follow how it is managed.
To avoid that, let's make it clear that it's owned by cmd_clone() by
moving its definition into that function and making it non-static.
The only additional change needed to make this work is to pass it as an
argument to checkout(), so this is a small and cheap cleanup.
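The ownership pattern described above can be sketched in isolation. This is a minimal illustration, not code from the patch: `struct demo_filter_options` and every `demo_*` name are hypothetical stand-ins for `struct list_objects_filter_options`, checkout() and cmd_clone().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct list_objects_filter_options. */
struct demo_filter_options {
	int choice;	/* 0 means "no filter requested" */
	char *spec;	/* owned copy of the filter spec, or NULL */
};

#define DEMO_FILTER_INIT { 0, NULL }

static void demo_filter_set(struct demo_filter_options *o, const char *spec)
{
	size_t len = strlen(spec) + 1;

	free(o->spec);
	o->spec = malloc(len);
	assert(o->spec);
	memcpy(o->spec, spec, len);
	o->choice = 1;
}

static void demo_filter_release(struct demo_filter_options *o)
{
	free(o->spec);
	o->spec = NULL;
	o->choice = 0;
}

/* Like checkout(): borrows the options through a pointer, never owns them. */
static int demo_checkout(const struct demo_filter_options *o,
			 int filter_submodules)
{
	/* Returns 1 when a --filter argument would be passed on. */
	return filter_submodules && o->choice;
}

/* Like cmd_clone(): defines, uses and releases the options itself. */
static int demo_cmd_clone(const char *spec, int filter_submodules)
{
	struct demo_filter_options filter_options = DEMO_FILTER_INIT;
	int passed_filter;

	if (spec)
		demo_filter_set(&filter_options, spec);
	passed_filter = demo_checkout(&filter_options, filter_submodules);
	demo_filter_release(&filter_options);
	return passed_filter;
}
```

With the variable local to one function, its whole lifetime (initialization, use, release) is visible in a single place, which is the readability gain this commit is after.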
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/clone.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index b40cee5968..51f4b5809d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -77,7 +77,6 @@ static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
static int max_jobs = -1;
static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static int config_filter_submodules = -1; /* unspecified */
static int option_remote_submodules;
@@ -634,7 +633,9 @@ static int git_sparse_checkout_init(const char *repo)
return result;
}
-static int checkout(int submodule_progress, int filter_submodules,
+static int checkout(int submodule_progress,
+ struct list_objects_filter_options *filter_options,
+ int filter_submodules,
enum ref_storage_format ref_storage_format)
{
struct object_id oid;
@@ -723,9 +724,9 @@ static int checkout(int submodule_progress, int filter_submodules,
strvec_pushf(&cmd.args, "--ref-format=%s",
ref_storage_format_to_name(ref_storage_format));
- if (filter_submodules && filter_options.choice)
+ if (filter_submodules && filter_options->choice)
strvec_pushf(&cmd.args, "--filter=%s",
- expand_list_objects_filter_spec(&filter_options));
+ expand_list_objects_filter_spec(filter_options));
if (option_single_branch >= 0)
strvec_push(&cmd.args, option_single_branch ?
@@ -903,6 +904,7 @@ int cmd_clone(int argc,
enum transport_family family = TRANSPORT_FAMILY_ALL;
struct string_list option_config = STRING_LIST_INIT_DUP;
int option_dissociate = 0;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
int option_filter_submodules = -1; /* unspecified */
struct string_list server_options = STRING_LIST_INIT_NODUP;
const char *bundle_uri = NULL;
@@ -1625,9 +1627,13 @@ int cmd_clone(int argc,
return 1;
junk_mode = JUNK_LEAVE_REPO;
- err = checkout(submodule_progress, filter_submodules,
+ err = checkout(submodule_progress,
+ &filter_options,
+ filter_submodules,
ref_storage_format);
+ list_objects_filter_release(&filter_options);
+
string_list_clear(&option_not, 0);
string_list_clear(&option_config, 0);
string_list_clear(&server_options, 0);
--
2.53.0.rc2.10.g12663a1c75.dirty
* [PATCH v2 4/8] fetch: make filter_options local to cmd_fetch()
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (2 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 3/8] clone: make filter_options local to cmd_clone() Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 5/8] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
` (4 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/fetch.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
harder to follow how it is managed.
To avoid that, let's make it clear that it's owned by cmd_fetch() by
moving its definition into that function and making it non-static.
This requires passing a pointer to it through the prepare_transport(),
do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one()
functions, but it's quite straightforward.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/fetch.c | 48 +++++++++++++++++++++++++++---------------------
1 file changed, 27 insertions(+), 21 deletions(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 288d3772ea..b984173447 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -97,7 +97,6 @@ static struct strbuf default_rla = STRBUF_INIT;
static struct transport *gtransport;
static struct transport *gsecondary;
static struct refspec refmap = REFSPEC_INIT_FETCH;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static struct string_list server_options = STRING_LIST_INIT_DUP;
static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
@@ -1449,7 +1448,8 @@ static void add_negotiation_tips(struct git_transport_options *smart_options)
smart_options->negotiation_tips = oids;
}
-static struct transport *prepare_transport(struct remote *remote, int deepen)
+static struct transport *prepare_transport(struct remote *remote, int deepen,
+ struct list_objects_filter_options *filter_options)
{
struct transport *transport;
@@ -1473,9 +1473,9 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
if (refetch)
set_option(transport, TRANS_OPT_REFETCH, "yes");
- if (filter_options.choice) {
+ if (filter_options->choice) {
const char *spec =
- expand_list_objects_filter_spec(&filter_options);
+ expand_list_objects_filter_spec(filter_options);
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, spec);
set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
}
@@ -1493,7 +1493,8 @@ static int backfill_tags(struct display_state *display_state,
struct ref_transaction *transaction,
struct ref *ref_map,
struct fetch_head *fetch_head,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
int retcode, cannot_reuse;
@@ -1507,7 +1508,7 @@ static int backfill_tags(struct display_state *display_state,
cannot_reuse = transport->cannot_reuse ||
deepen_since || deepen_not.nr;
if (cannot_reuse) {
- gsecondary = prepare_transport(transport->remote, 0);
+ gsecondary = prepare_transport(transport->remote, 0, filter_options);
transport = gsecondary;
}
@@ -1713,7 +1714,8 @@ static int commit_ref_transaction(struct ref_transaction **transaction,
static int do_fetch(struct transport *transport,
struct refspec *rs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct ref_transaction *transaction = NULL;
struct ref *ref_map = NULL;
@@ -1873,7 +1875,7 @@ static int do_fetch(struct transport *transport,
* the transaction and don't commit anything.
*/
if (backfill_tags(&display_state, transport, transaction, tags_ref_map,
- &fetch_head, config))
+ &fetch_head, config, filter_options))
retcode = 1;
}
@@ -2198,20 +2200,21 @@ static int fetch_multiple(struct string_list *list, int max_children,
* Fetching from the promisor remote should use the given filter-spec
* or inherit the default filter-spec from the config.
*/
-static inline void fetch_one_setup_partial(struct remote *remote)
+static inline void fetch_one_setup_partial(struct remote *remote,
+ struct list_objects_filter_options *filter_options)
{
/*
* Explicit --no-filter argument overrides everything, regardless
* of any prior partial clones and fetches.
*/
- if (filter_options.no_filter)
+ if (filter_options->no_filter)
return;
/*
* If no prior partial clone/fetch and the current fetch DID NOT
* request a partial-fetch, do a normal fetch.
*/
- if (!repo_has_promisor_remote(the_repository) && !filter_options.choice)
+ if (!repo_has_promisor_remote(the_repository) && !filter_options->choice)
return;
/*
@@ -2220,8 +2223,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* filter-spec as the default for subsequent fetches to this
* remote if there is currently no default filter-spec.
*/
- if (filter_options.choice) {
- partial_clone_register(remote->name, &filter_options);
+ if (filter_options->choice) {
+ partial_clone_register(remote->name, filter_options);
return;
}
@@ -2230,14 +2233,15 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* explicitly given filter-spec or inherit the filter-spec from
* the config.
*/
- if (!filter_options.choice)
- partial_clone_get_default_filter_spec(&filter_options, remote->name);
+ if (!filter_options->choice)
+ partial_clone_get_default_filter_spec(filter_options, remote->name);
return;
}
static int fetch_one(struct remote *remote, int argc, const char **argv,
int prune_tags_ok, int use_stdin_refspecs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct refspec rs = REFSPEC_INIT_FETCH;
int i;
@@ -2249,7 +2253,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
die(_("no remote repository specified; please specify either a URL or a\n"
"remote name from which new revisions should be fetched"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, filter_options);
if (prune < 0) {
/* no command line request */
@@ -2304,7 +2308,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
sigchain_push_common(unlock_pack_on_signal);
atexit(unlock_pack_atexit);
sigchain_push(SIGPIPE, SIG_IGN);
- exit_code = do_fetch(gtransport, &rs, config);
+ exit_code = do_fetch(gtransport, &rs, config, filter_options);
sigchain_pop(SIGPIPE);
refspec_clear(&rs);
transport_disconnect(gtransport);
@@ -2329,6 +2333,7 @@ int cmd_fetch(int argc,
const char *submodule_prefix = "";
const char *bundle_uri;
struct string_list list = STRING_LIST_INIT_DUP;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
struct remote *remote = NULL;
int all = -1, multiple = 0;
int result = 0;
@@ -2594,7 +2599,7 @@ int cmd_fetch(int argc,
trace2_region_enter("fetch", "negotiate-only", the_repository);
if (!remote)
die(_("must supply remote when using --negotiate-only"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, &filter_options);
if (gtransport->smart_options) {
gtransport->smart_options->acked_commits = &acked_commits;
} else {
@@ -2616,12 +2621,12 @@ int cmd_fetch(int argc,
} else if (remote) {
if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
trace2_region_enter("fetch", "setup-partial", the_repository);
- fetch_one_setup_partial(remote);
+ fetch_one_setup_partial(remote, &filter_options);
trace2_region_leave("fetch", "setup-partial", the_repository);
}
trace2_region_enter("fetch", "fetch-one", the_repository);
result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
- &config);
+ &config, &filter_options);
trace2_region_leave("fetch", "fetch-one", the_repository);
} else {
int max_children = max_jobs;
@@ -2727,5 +2732,6 @@ int cmd_fetch(int argc,
cleanup:
string_list_clear(&list, 0);
+ list_objects_filter_release(&filter_options);
return result;
}
--
2.53.0.rc2.10.g12663a1c75.dirty
* [PATCH v2 5/8] doc: fetch: document `--filter=<filter-spec>` option
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (3 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 4/8] fetch: make filter_options local to cmd_fetch() Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-11 11:48 ` Patrick Steinhardt
2026-02-04 11:08 ` [PATCH v2 6/8] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
` (3 subsequent siblings)
8 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
Let's fix that and document that option using the same words already
used to document it for `git clone`.
Those words could probably be improved, but they are not wrong, so
let's just use them for now and leave improving them for future work.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index fcba46ee9e..1ef9807d00 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -88,6 +88,16 @@ linkgit:git-config[1].
This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
precedence over the `fetch.output` config option.
+`--filter=<filter-spec>`::
+ Use the partial clone feature and request that the server sends
+ a subset of reachable objects according to a given object filter.
+ When using `--filter`, the supplied _<filter-spec>_ is used for
+ the partial fetch. For example, `--filter=blob:none` will filter
+ out all blobs (file contents) until needed by Git. Also,
+ `--filter=blob:limit=<size>` will filter out all blobs of size
+ at least _<size>_. For more details on filter specifications, see
+ the `--filter` option in linkgit:git-rev-list[1].
+
ifndef::git-pull[]
`--write-fetch-head`::
`--no-write-fetch-head`::
--
2.53.0.rc2.10.g12663a1c75.dirty
* Re: [PATCH v2 5/8] doc: fetch: document `--filter=<filter-spec>` option
2026-02-04 11:08 ` [PATCH v2 5/8] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
@ 2026-02-11 11:48 ` Patrick Steinhardt
2026-02-12 10:06 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-11 11:48 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila, Christian Couder
On Wed, Feb 04, 2026 at 12:08:10PM +0100, Christian Couder wrote:
> The `--filter=<filter-spec>` option is documented in most commands that
> support it except `git fetch`.
>
> Let's fix that and document that option using the same words already
> used to document it for `git clone`.
>
> Those words could probably be improved, but they are not wrong, so
> let's just use them for now and leave improving them for future work.
Heh, this reads quite funny to me. I prefer the commit message from v1
myself, but don't care strongly about this.
Patrick
* Re: [PATCH v2 5/8] doc: fetch: document `--filter=<filter-spec>` option
2026-02-11 11:48 ` Patrick Steinhardt
@ 2026-02-12 10:06 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:06 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila, Christian Couder
On Wed, Feb 11, 2026 at 12:48 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Wed, Feb 04, 2026 at 12:08:10PM +0100, Christian Couder wrote:
> > The `--filter=<filter-spec>` option is documented in most commands that
> > support it except `git fetch`.
> >
> > Let's fix that and document that option using the same words already
> > used to document it for `git clone`.
> >
> > Those words could probably be improved, but they are not wrong, so
> > let's just use them for now and leave improving them for future work.
>
> Heh, this reads quite funny to me. I prefer the commit message from v1
> myself, but don't care strongly about this.
Yeah, what about the following then:
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
Let's fix that and document this option. To ensure consistency across
commands, let's reuse the exact description currently found in
`git clone`.
* [PATCH v2 6/8] list-objects-filter-options: support 'auto' mode for --filter
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (4 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 5/8] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 7/8] promisor-remote: keep advertised filters in memory Christian Couder
` (2 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
In a following commit, we are going to allow passing "auto" as a
<filterspec> to the `--filter=<filterspec>` option, but only for some
commands. Other commands that support the `--filter=<filterspec>`
option should still die() when 'auto' is passed.
Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
support that:
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
accepted or not by the current command.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
- Update list_objects_filter_release() to preserve the
allow_auto_filter flag, as this function is often called (via
opt_parse_list_objects_filter) to reset the struct before parsing a
new value.
Let's also update `list-objects-filter.c` to recognize the new
`LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
before filtering actually begins, initializing a filter with
`LOFC_AUTO` is invalid and will trigger a BUG().
Note that ideally combining "auto" with "auto" could be allowed, but in
practice, it's probably not worth the added code complexity. And if we
really want it, nothing prevents us from allowing it in future work.
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us from doing that in future work either.
Also note that the new `allow_auto_filter` flag depends on the command,
not user choices, so it should be reset to the command default when
`struct list_objects_filter_options` instances are reset.
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
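The rules listed above can be sketched as a small standalone model. This is an illustration under stated assumptions, not the code from this patch: the `demo_*` names, the reduced enum and the `const char **err` error channel are all hypothetical simplifications of `struct list_objects_filter_options`, gently_parse_list_objects_filter() and the strbuf-based error reporting.

```c
#include <assert.h>
#include <string.h>

enum demo_choice { DEMO_DISABLED = 0, DEMO_BLOB_NONE, DEMO_AUTO };

/* Hypothetical, reduced model of struct list_objects_filter_options. */
struct demo_options {
	enum demo_choice choice;
	unsigned int allow_auto_filter : 1;
};

/* Returns 0 on success and 1 on error, like the "gentle" parser. */
static int demo_parse(struct demo_options *o, const char *arg,
		      const char **err)
{
	assert(o->choice == DEMO_DISABLED); /* the real code calls BUG() */

	if (!strcmp(arg, "auto")) {
		if (!o->allow_auto_filter) {
			*err = "'auto' filter not supported by this command";
			return 1;
		}
		o->choice = DEMO_AUTO;
		return 0;
	}
	if (!strcmp(arg, "blob:none")) {
		o->choice = DEMO_BLOB_NONE;
		return 0;
	}
	*err = "invalid filter-spec";
	return 1;
}

/* Parses one part of a combined filter; "auto" is rejected there. */
static int demo_parse_sub(struct demo_options *sub, const char *arg,
			  const char **err)
{
	if (demo_parse(sub, arg, err))
		return 1;
	if (sub->choice == DEMO_AUTO) {
		*err = "an 'auto' filter cannot be combined";
		return 1;
	}
	return 0;
}

/* Resets the parsed state but keeps the per-command flag. */
static void demo_release(struct demo_options *o)
{
	unsigned int allow_auto_filter;

	if (!o)
		return;
	allow_auto_filter = o->allow_auto_filter;
	memset(o, 0, sizeof(*o));
	o->allow_auto_filter = allow_auto_filter;
}

/* Convenience wrapper: returns the parsed choice, or -1 on error. */
static int demo_try(const char *arg, int allow_auto, int as_sub)
{
	struct demo_options o = { DEMO_DISABLED, 0 };
	const char *err = NULL;
	int ret, choice;

	o.allow_auto_filter = !!allow_auto;
	ret = as_sub ? demo_parse_sub(&o, arg, &err) : demo_parse(&o, arg, &err);
	choice = ret ? -1 : (int)o.choice;

	demo_release(&o);
	/* The flag describes the command, so it must survive a reset. */
	assert(o.allow_auto_filter == (unsigned int)!!allow_auto);
	return choice;
}
```

Keeping the flag across demo_release() matters because a reset typically happens right before the next value is parsed; if the flag were cleared, a command that accepts "auto" would start rejecting it after the first reset.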
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Makefile | 1 +
list-objects-filter-options.c | 37 ++++++++++++--
list-objects-filter-options.h | 6 +++
list-objects-filter.c | 8 +++
t/meson.build | 1 +
t/unit-tests/u-list-objects-filter-options.c | 53 ++++++++++++++++++++
6 files changed, 103 insertions(+), 3 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
diff --git a/Makefile b/Makefile
index 8aa489f3b6..04256f747c 100644
--- a/Makefile
+++ b/Makefile
@@ -1516,6 +1516,7 @@ CLAR_TEST_SUITES += u-dir
CLAR_TEST_SUITES += u-example-decorate
CLAR_TEST_SUITES += u-hash
CLAR_TEST_SUITES += u-hashmap
+CLAR_TEST_SUITES += u-list-objects-filter-options
CLAR_TEST_SUITES += u-mem-pool
CLAR_TEST_SUITES += u-oid-array
CLAR_TEST_SUITES += u-oidmap
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7420bf81fe..ad92cbaa37 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -20,6 +20,8 @@ const char *list_object_filter_config_name(enum list_objects_filter_choice c)
case LOFC_DISABLED:
/* we have no name for "no filter at all" */
break;
+ case LOFC_AUTO:
+ return "auto";
case LOFC_BLOB_NONE:
return "blob:none";
case LOFC_BLOB_LIMIT:
@@ -52,7 +54,16 @@ int gently_parse_list_objects_filter(
if (filter_options->choice)
BUG("filter_options already populated");
- if (!strcmp(arg, "blob:none")) {
+ if (!strcmp(arg, "auto")) {
+ if (!filter_options->allow_auto_filter) {
+ strbuf_addstr(errbuf,
+ _("'auto' filter not supported by this command"));
+ return 1;
+ }
+ filter_options->choice = LOFC_AUTO;
+ return 0;
+
+ } else if (!strcmp(arg, "blob:none")) {
filter_options->choice = LOFC_BLOB_NONE;
return 0;
@@ -146,10 +157,22 @@ static int parse_combine_subfilter(
decoded = url_percent_decode(subspec->buf);
- result = has_reserved_character(subspec, errbuf) ||
- gently_parse_list_objects_filter(
+ result = has_reserved_character(subspec, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = gently_parse_list_objects_filter(
&filter_options->sub[new_index], decoded, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = (filter_options->sub[new_index].choice == LOFC_AUTO);
+ if (result) {
+ strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
+ goto cleanup;
+ }
+cleanup:
free(decoded);
return result;
}
@@ -263,6 +286,9 @@ void parse_list_objects_filter(
} else {
struct list_objects_filter_options *sub;
+ if (filter_options->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
/*
* Make filter_options an LOFC_COMBINE spec so we can trivially
* add subspecs to it.
@@ -277,6 +303,9 @@ void parse_list_objects_filter(
if (gently_parse_list_objects_filter(sub, arg, &errbuf))
die("%s", errbuf.buf);
+ if (sub->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
strbuf_addch(&filter_options->filter_spec, '+');
filter_spec_append_urlencode(filter_options, arg);
}
@@ -317,6 +346,7 @@ void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
{
size_t sub;
+ unsigned int allow_auto_filter = filter_options ? filter_options->allow_auto_filter : 0;
if (!filter_options)
return;
@@ -326,6 +356,7 @@ void list_objects_filter_release(
list_objects_filter_release(&filter_options->sub[sub]);
free(filter_options->sub);
list_objects_filter_init(filter_options);
+ filter_options->allow_auto_filter = allow_auto_filter;
}
void partial_clone_register(
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 7b2108b986..77d7bbc846 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -18,6 +18,7 @@ enum list_objects_filter_choice {
LOFC_SPARSE_OID,
LOFC_OBJECT_TYPE,
LOFC_COMBINE,
+ LOFC_AUTO,
LOFC__COUNT /* must be last */
};
@@ -50,6 +51,11 @@ struct list_objects_filter_options {
*/
unsigned int no_filter : 1;
+ /*
+ * Is LOFC_AUTO a valid option?
+ */
+ unsigned int allow_auto_filter : 1;
+
/*
* BEGIN choice-specific parsed values from within the filter-spec. Only
* some values will be defined for any given choice.
diff --git a/list-objects-filter.c b/list-objects-filter.c
index acd65ebb73..78316e7f90 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -745,6 +745,13 @@ static void filter_combine__init(
filter->finalize_omits_fn = filter_combine__finalize_omits;
}
+static void filter_auto__init(
+ struct list_objects_filter_options *filter_options UNUSED,
+ struct filter *filter UNUSED)
+{
+ BUG("LOFC_AUTO should have been resolved before initializing the filter");
+}
+
typedef void (*filter_init_fn)(
struct list_objects_filter_options *filter_options,
struct filter *filter);
@@ -760,6 +767,7 @@ static filter_init_fn s_filters[] = {
filter_sparse_oid__init,
filter_object_type__init,
filter_combine__init,
+ filter_auto__init,
};
struct filter *list_objects_filter__init(
diff --git a/t/meson.build b/t/meson.build
index 459c52a489..0bd66cc6ce 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -4,6 +4,7 @@ clar_test_suites = [
'unit-tests/u-example-decorate.c',
'unit-tests/u-hash.c',
'unit-tests/u-hashmap.c',
+ 'unit-tests/u-list-objects-filter-options.c',
'unit-tests/u-mem-pool.c',
'unit-tests/u-oid-array.c',
'unit-tests/u-oidmap.c',
diff --git a/t/unit-tests/u-list-objects-filter-options.c b/t/unit-tests/u-list-objects-filter-options.c
new file mode 100644
index 0000000000..f7d73701b5
--- /dev/null
+++ b/t/unit-tests/u-list-objects-filter-options.c
@@ -0,0 +1,53 @@
+#include "unit-test.h"
+#include "list-objects-filter-options.h"
+#include "strbuf.h"
+
+/* Helper to test gently_parse_list_objects_filter() */
+static void check_gentle_parse(const char *filter_spec,
+ int expect_success,
+ int allow_auto,
+ enum list_objects_filter_choice expected_choice)
+{
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf errbuf = STRBUF_INIT;
+ int ret;
+
+ filter_options.allow_auto_filter = allow_auto;
+
+ ret = gently_parse_list_objects_filter(&filter_options, filter_spec, &errbuf);
+
+ if (expect_success) {
+ cl_assert_equal_i(ret, 0);
+ cl_assert_equal_i(expected_choice, filter_options.choice);
+ cl_assert_equal_i(errbuf.len, 0);
+ } else {
+ cl_assert(ret != 0);
+ cl_assert(errbuf.len > 0);
+ }
+
+ strbuf_release(&errbuf);
+ list_objects_filter_release(&filter_options);
+}
+
+void test_list_objects_filter_options__regular_filters(void)
+{
+ check_gentle_parse("blob:none", 1, 0, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:none", 1, 1, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:limit=5k", 1, 0, LOFC_BLOB_LIMIT);
+ check_gentle_parse("blob:limit=5k", 1, 1, LOFC_BLOB_LIMIT);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 0, LOFC_COMBINE);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 1, LOFC_COMBINE);
+}
+
+void test_list_objects_filter_options__auto_allowed(void)
+{
+ check_gentle_parse("auto", 1, 1, LOFC_AUTO);
+ check_gentle_parse("auto", 0, 0, 0);
+}
+
+void test_list_objects_filter_options__combine_auto_fails(void)
+{
+ check_gentle_parse("combine:auto+blob:none", 0, 1, 0);
+ check_gentle_parse("combine:blob:none+auto", 0, 1, 0);
+ check_gentle_parse("combine:auto+auto", 0, 1, 0);
+}
--
2.53.0.rc2.10.g12663a1c75.dirty
* [PATCH v2 7/8] promisor-remote: keep advertised filters in memory
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (5 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 6/8] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-04 11:08 ` [PATCH v2 8/8] fetch-pack: wire up and enable auto filter logic Christian Couder
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
Currently, advertised filters are only kept in memory temporarily
during parsing, or persisted to disk if `promisor.storeFields`
contains 'partialCloneFilter'.
In a following commit though, we will add a `--filter=auto` option.
This option will enable the client to use the filters that the server
is suggesting for the promisor remotes the client accepts.
To use them even if `promisor.storeFields` is not configured, these
filters should be stored somewhere for the current session.
Let's add an `advertised_filter` field to `struct promisor_remote`
for that purpose.
To ensure that the filters are available in all cases,
filter_promisor_remote() captures them into a temporary list and
applies them to the `promisor_remote` structs after the potential
configuration reload.
Then the accepted remotes are marked as `accepted` in the repository
state. This ensures that subsequent calls to look up accepted remotes
(like in the filter construction below) actually find them.
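The capture-then-apply sequence described above can be sketched on its own. This is a simplified model, not the code from this patch: all `demo_*` names are hypothetical, plain arrays stand in for the string lists and the linked list of promisor remotes, and demo_reinit() only imitates the effect a configuration reload has on memory-only state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct promisor_remote. */
struct demo_remote {
	const char *name;
	char *advertised_filter;	/* owned; lives only in memory */
	int accepted;
};

/* One accepted remote captured while parsing the advertisement. */
struct demo_capture {
	const char *name;
	char *filter;			/* owned until handed over; may be NULL */
};

static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	assert(copy);
	memcpy(copy, s, len);
	return copy;
}

static struct demo_remote *demo_find(struct demo_remote *remotes, size_t nr,
				     const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(remotes[i].name, name))
			return &remotes[i];
	return NULL;
}

/* Imitates a configuration reload: memory-only state is discarded. */
static void demo_reinit(struct demo_remote *remotes, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		free(remotes[i].advertised_filter);
		remotes[i].advertised_filter = NULL;
		remotes[i].accepted = 0;
	}
}

/* Applies the captured state once the remote list is stable again. */
static void demo_apply(struct demo_remote *remotes, size_t nr,
		       struct demo_capture *caps, size_t nr_caps)
{
	size_t i;

	for (i = 0; i < nr_caps; i++) {
		struct demo_remote *r = demo_find(remotes, nr, caps[i].name);

		if (r) {
			free(r->advertised_filter);
			r->advertised_filter = caps[i].filter; /* transfer ownership */
			caps[i].filter = NULL;
			r->accepted = 1;
		}
		free(caps[i].filter);	/* no-op after a transfer */
		caps[i].filter = NULL;
	}
}
```

Applying the captured values only after the reload is the point of the design: anything written into the remote structs before the reload would be thrown away together with the old structs.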
To enable the client to construct a filter spec based on these filters
when `--filter=auto` is used, let's also add a
`promisor_remote_construct_filter(repo)` function.
This function:
- iterates over all accepted promisor remotes in the repository,
- collects the filters advertised for them (using the
`advertised_filter` field added in this commit), and
- generates a single filter spec for them.
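The three steps above can be sketched as follows. How the real function merges several filters is not shown in this part of the series, so the rule used here is an assumption: one filter is returned as is, and several are joined as `combine:<a>+<b>`. The `demo_*` names are hypothetical, and the URL-encoding of reserved characters that the real "combine:" syntax requires is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct promisor_remote. */
struct demo_remote {
	const char *name;
	const char *advertised_filter;	/* NULL if none was advertised */
	int accepted;
};

static int demo_usable(const struct demo_remote *r)
{
	return r->accepted && r->advertised_filter;
}

/*
 * Returns a newly allocated filter spec for all accepted remotes that
 * advertised a filter, or NULL when there is none. The caller frees it.
 * Assumption: multiple filters are merged with the "combine:" syntax.
 */
static char *demo_construct_filter(const struct demo_remote *remotes,
				   size_t nr)
{
	size_t i, count = 0, len = 0;
	int first = 1;
	char *spec;

	for (i = 0; i < nr; i++) {
		if (!demo_usable(&remotes[i]))
			continue;
		count++;
		/* one extra byte per filter: a '+' separator or the NUL */
		len += strlen(remotes[i].advertised_filter) + 1;
	}
	if (!count)
		return NULL;

	spec = malloc(len + strlen("combine:"));
	assert(spec);
	spec[0] = '\0';
	if (count > 1)
		strcat(spec, "combine:");

	for (i = 0; i < nr; i++) {
		if (!demo_usable(&remotes[i]))
			continue;
		if (!first)
			strcat(spec, "+");
		strcat(spec, remotes[i].advertised_filter);
		first = 0;
	}
	return spec;
}
```

Remotes that were not accepted, or that advertised no filter, contribute nothing, so a caller can treat a NULL result as "the server suggested no usable filter".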
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
promisor-remote.h | 7 ++++++
2 files changed, 65 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 59997dd4c7..d0bfb209dc 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -193,6 +193,7 @@ void promisor_remote_clear(struct promisor_remote_config *config)
while (config->promisors) {
struct promisor_remote *r = config->promisors;
free(r->partial_clone_filter);
+ free(r->advertised_filter);
config->promisors = config->promisors->next;
free(r);
}
@@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
struct store_info *store_info = NULL;
struct string_list_item *item;
bool reload_config = false;
+ struct string_list captured_filters = STRING_LIST_INIT_DUP;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -879,6 +881,13 @@ static void filter_promisor_remote(struct repository *repo,
reload_config = true;
strvec_push(accepted, advertised->name);
+
+ /* Capture advertised filters for accepted remotes */
+ if (advertised->filter) {
+ struct string_list_item *i;
+ i = string_list_append(&captured_filters, advertised->name);
+ i->util = xstrdup(advertised->filter);
+ }
}
promisor_info_free(advertised);
@@ -890,6 +899,25 @@ static void filter_promisor_remote(struct repository *repo,
if (reload_config)
repo_promisor_remote_reinit(repo);
+
+ /* Apply captured filters to the stable repo state */
+ for_each_string_list_item(item, &captured_filters) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
+ if (r) {
+ free(r->advertised_filter);
+ r->advertised_filter = item->util;
+ item->util = NULL;
+ }
+ }
+
+ string_list_clear(&captured_filters, 1);
+
+ /* Mark the remotes as accepted in the repository state */
+ for (size_t i = 0; i < accepted->nr; i++) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, accepted->v[i]);
+ if (r)
+ r->accepted = 1;
+ }
}
char *promisor_remote_reply(const char *info)
@@ -935,3 +963,33 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
string_list_clear(&accepted_remotes, 0);
}
+
+char *promisor_remote_construct_filter(struct repository *repo)
+{
+ struct promisor_remote *r;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ char *result = NULL;
+
+ promisor_remote_init(repo);
+
+ for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
+ if (r->accepted && r->advertised_filter)
+ if (gently_parse_list_objects_filter(&filter_options,
+ r->advertised_filter,
+ &err)) {
+ warning(_("promisor remote '%s' advertised invalid filter '%s': %s"),
+ r->name, r->advertised_filter, err.buf);
+ strbuf_reset(&err);
+ continue;
+ }
+ }
+
+ if (filter_options.choice)
+ result = xstrdup(expand_list_objects_filter_spec(&filter_options));
+
+ list_objects_filter_release(&filter_options);
+ strbuf_release(&err);
+
+ return result;
+}
diff --git a/promisor-remote.h b/promisor-remote.h
index 263d331a55..d227299fd0 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -15,6 +15,7 @@ struct object_id;
struct promisor_remote {
struct promisor_remote *next;
char *partial_clone_filter;
+ char *advertised_filter;
unsigned int accepted : 1;
const char name[FLEX_ARRAY];
};
@@ -67,4 +68,10 @@ void mark_promisor_remotes_as_accepted(struct repository *repo, const char *remo
*/
int repo_has_accepted_promisor_remote(struct repository *r);
+/*
+ * Use the filters from the accepted remotes to create a combined
+ * filter (useful in `--filter=auto` mode).
+ */
+char *promisor_remote_construct_filter(struct repository *repo);
+
#endif /* PROMISOR_REMOTE_H */
--
2.53.0.rc2.10.g12663a1c75.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* [PATCH v2 8/8] fetch-pack: wire up and enable auto filter logic
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (6 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 7/8] promisor-remote: keep advertised filters in memory Christian Couder
@ 2026-02-04 11:08 ` Christian Couder
2026-02-11 11:48 ` Patrick Steinhardt
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
8 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-04 11:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
Previous commits have set up an infrastructure for `--filter=auto` to
automatically prepare a partial clone filter based on what the server
advertised and the client accepted.
Using that infrastructure, let's now enable the `--filter=auto` option
in `git clone` and `git fetch` by setting `allow_auto_filter` to 1.
Note that these small changes mean that when `git clone --filter=auto`
or `git fetch --filter=auto` are used, "auto" is automatically saved
as the partial clone filter for the server on the client. Therefore
subsequent calls to `git fetch` on the client will automatically use
this "auto" mode even without `--filter=auto`.
Let's also set `allow_auto_filter` to 1 in `transport.c`, as the
transport layer must be able to accept the "auto" filter spec even if
the invoking command hasn't fully parsed it yet.
When an "auto" filter is requested, let's have the "fetch-pack.c" code
in `do_fetch_pack_v2()` compute a filter and send it to the server.
In `do_fetch_pack_v2()` the logic also needs to check for the
"promisor-remote" capability and call `promisor_remote_reply()` to
parse advertised remotes and populate the list of those accepted (and
their filters).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 19 ++++++---
Documentation/git-clone.adoc | 25 ++++++++---
Documentation/gitprotocol-v2.adoc | 16 ++++---
builtin/clone.c | 2 +
builtin/fetch.c | 2 +
fetch-pack.c | 28 +++++++++++++
t/t5710-promisor-remote-capability.sh | 60 +++++++++++++++++++++++++++
transport.c | 1 +
8 files changed, 138 insertions(+), 15 deletions(-)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index 1ef9807d00..a0cfb50d89 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -92,11 +92,20 @@ precedence over the `fetch.output` config option.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial fetch. For example, `--filter=blob:none` will filter
- out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial fetch.
++
+If `--filter=auto` is used, the filter specification is determined
+automatically by combining the filter specifications advertised by
+the server for the promisor remotes that the client accepts (see
+linkgit:gitprotocol-v2[5] and the `promisor.acceptFromServer`
+configuration option in linkgit:git-config[1]).
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
ifndef::git-pull[]
`--write-fetch-head`::
diff --git a/Documentation/git-clone.adoc b/Documentation/git-clone.adoc
index 57cdfb7620..0db2d1e5f0 100644
--- a/Documentation/git-clone.adoc
+++ b/Documentation/git-clone.adoc
@@ -187,11 +187,26 @@ objects from the source repository into a pack in the cloned repository.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial clone filter. For example, `--filter=blob:none` will
- filter out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial clone filter.
++
+If `--filter=auto` is used the filter specification is determined
+automatically through the 'promisor-remote' protocol (see
+linkgit:gitprotocol-v2[5]) by combining the filter specifications
+advertised by the server for the promisor remotes that the client
+accepts (see the `promisor.acceptFromServer` configuration option in
+linkgit:git-config[1]). This allows the server to suggest the optimal
+filter for the available promisor remotes.
++
+As with other filter specifications, the "auto" value is persisted in
+the configuration. This ensures that future fetches will continue to
+adapt to the server's current recommendation.
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
`--also-filter-submodules`::
Also apply the partial clone filter to any submodules in the repository.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index d93dd279ea..f985cb4c47 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -812,10 +812,15 @@ MUST appear first in each pr-fields, in that order.
After these mandatory fields, the server MAY advertise the following
optional fields in any order:
-`partialCloneFilter`:: The filter specification used by the remote.
+`partialCloneFilter`:: The filter specification for the remote. It
+corresponds to the "remote.<name>.partialCloneFilter" config setting.
Clients can use this to determine if the remote's filtering strategy
-is compatible with their needs (e.g., checking if both use "blob:none").
-It corresponds to the "remote.<name>.partialCloneFilter" config setting.
+is compatible with their needs (e.g., checking if both use
+"blob:none"). Additionally they can use this through the
+`--filter=auto` option in linkgit:git-clone[1]. With that option, the
+filter specification of the clone will be automatically computed by
+combining the filter specifications of the promisor remotes the client
+accepts.
`token`:: An authentication token that clients can use when
connecting to the remote. It corresponds to the "remote.<name>.token"
@@ -828,8 +833,9 @@ future protocol extensions.
The client can use information transmitted through these fields to
decide if it accepts the advertised promisor remote. Also, the client
-can be configured to store the values of these fields (see
-"promisor.storeFields" in linkgit:git-config[1]).
+can be configured to store the values of these fields or use them
+to automatically configure the repository (see "promisor.storeFields"
+in linkgit:git-config[1] and `--filter=auto` in linkgit:git-clone[1]).
Field values MUST be urlencoded.
diff --git a/builtin/clone.c b/builtin/clone.c
index 51f4b5809d..67c7db104f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1001,6 +1001,8 @@ int cmd_clone(int argc,
NULL
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("clone");
repo_config(the_repository, git_clone_config, NULL);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index b984173447..ddc30a0d30 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2439,6 +2439,8 @@ int cmd_fetch(int argc,
OPT_END()
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("fetch");
/* Record the command line for the reflog */
diff --git a/fetch-pack.c b/fetch-pack.c
index 40316c9a34..5e9a969e31 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -35,6 +35,7 @@
#include "sigchain.h"
#include "mergesort.h"
#include "prio-queue.h"
+#include "promisor-remote.h"
static int transfer_unpack_limit = -1;
static int fetch_unpack_limit = -1;
@@ -1661,6 +1662,33 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
struct string_list packfile_uris = STRING_LIST_INIT_DUP;
int i;
struct strvec index_pack_args = STRVEC_INIT;
+ const char *promisor_remote_config;
+
+ if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
+ char *remote_name = promisor_remote_reply(promisor_remote_config);
+ free(remote_name);
+ }
+
+ if (args->filter_options.choice == LOFC_AUTO) {
+ struct strbuf errbuf = STRBUF_INIT;
+ char *constructed_filter = promisor_remote_construct_filter(r);
+
+ list_objects_filter_release(&args->filter_options);
+ /* The result of resolving an 'auto' filter must not be 'auto' */
+ args->filter_options.allow_auto_filter = 0;
+
+ if (constructed_filter)
+ gently_parse_list_objects_filter(&args->filter_options,
+ constructed_filter,
+ &errbuf);
+
+ if (errbuf.len > 0)
+ die(_("couldn't resolve 'auto' filter '%s': %s"),
+ constructed_filter, errbuf.buf);
+
+ free(constructed_filter);
+ strbuf_release(&errbuf);
+ }
negotiator = &negotiator_alloc;
if (args->refetch)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index a726af214a..21543bce20 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -409,6 +409,66 @@ test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone and fetch with --filter=auto" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client trace" &&
+
+ git -C server config remote.lop.partialCloneFilter "blob:limit=9500" &&
+ test_config -C server promisor.sendFields "partialCloneFilter" &&
+
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c promisor.acceptfromserver=All \
+ --no-local --filter=auto server client 2>err &&
+
+ test_grep "filter blob:limit=9500" trace &&
+ test_grep ! "filter auto" trace &&
+
+ # Verify "auto" is persisted in config
+ echo auto >expected &&
+ git -C client config remote.origin.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Now change the filter on the server
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5678" &&
+
+ # Get a new commit on the server to ensure "git fetch" actually runs fetch-pack
+ test_commit -C template new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITH --filter=auto
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch --filter=auto &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5678" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Change the filter on the server again
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5432" &&
+
+ # Get yet a new commit on the server to ensure fetch-pack runs
+ test_commit -C template yet-a-new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITHOUT --filter=auto
+ # Relies on "auto" being persisted in the client config
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5432" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
diff --git a/transport.c b/transport.c
index c7f06a7382..cde8d83a57 100644
--- a/transport.c
+++ b/transport.c
@@ -1219,6 +1219,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
*/
struct git_transport_data *data = xcalloc(1, sizeof(*data));
list_objects_filter_init(&data->options.filter_options);
+ data->options.filter_options.allow_auto_filter = 1;
ret->data = data;
ret->vtable = &builtin_smart_vtable;
ret->smart_options = &(data->options);
--
2.53.0.rc2.10.g12663a1c75.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH v2 8/8] fetch-pack: wire up and enable auto filter logic
2026-02-04 11:08 ` [PATCH v2 8/8] fetch-pack: wire up and enable auto filter logic Christian Couder
@ 2026-02-11 11:48 ` Patrick Steinhardt
2026-02-12 10:07 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-11 11:48 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila, Christian Couder
On Wed, Feb 04, 2026 at 12:08:13PM +0100, Christian Couder wrote:
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 40316c9a34..5e9a969e31 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1661,6 +1662,33 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> struct string_list packfile_uris = STRING_LIST_INIT_DUP;
> int i;
> struct strvec index_pack_args = STRVEC_INIT;
> + const char *promisor_remote_config;
> +
> + if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
> + char *remote_name = promisor_remote_reply(promisor_remote_config);
> + free(remote_name);
> + }
Huh. Do we only call this function because it calls
`filter_promisor_remote()`? We don't seem to care about anything else
and do some more work to assemble the `remote_name` string that
ultimately ends up being pointless.
Maybe we should instead expose that function?
> + if (args->filter_options.choice == LOFC_AUTO) {
> + struct strbuf errbuf = STRBUF_INIT;
> + char *constructed_filter = promisor_remote_construct_filter(r);
> +
> + list_objects_filter_release(&args->filter_options);
> + /* The result of resolving an 'auto' filter must not be 'auto' */
> + args->filter_options.allow_auto_filter = 0;
We didn't resolve though, we only released it. So the comment doesn't
seem accurate to me anymore.
> + if (constructed_filter)
> + gently_parse_list_objects_filter(&args->filter_options,
> + constructed_filter,
> + &errbuf);
> +
> + if (errbuf.len > 0)
> + die(_("couldn't resolve 'auto' filter '%s': %s"),
> + constructed_filter, errbuf.buf);
I think `gently_parse_list_objects_filter()` already returns non-zero in
all failure cases, so shouldn't we rather:
	if (constructed_filter &&
	    gently_parse_list_objects_filter(&args->filter_options,
					     constructed_filter,
					     &errbuf))
		die(_("couldn't resolve 'auto' filter '%s': %s"),
		    constructed_filter, errbuf.buf);
Patrick
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH v2 8/8] fetch-pack: wire up and enable auto filter logic
2026-02-11 11:48 ` Patrick Steinhardt
@ 2026-02-12 10:07 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:07 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila, Christian Couder
On Wed, Feb 11, 2026 at 12:48 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Wed, Feb 04, 2026 at 12:08:13PM +0100, Christian Couder wrote:
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index 40316c9a34..5e9a969e31 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1661,6 +1662,33 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> > struct string_list packfile_uris = STRING_LIST_INIT_DUP;
> > int i;
> > struct strvec index_pack_args = STRVEC_INIT;
> > + const char *promisor_remote_config;
> > +
> > + if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
> > + char *remote_name = promisor_remote_reply(promisor_remote_config);
> > + free(remote_name);
> > + }
>
> Huh. Do we only call this function because it calls
> `filter_promisor_remote()`? We don't seem to care about anything else
> and do some more work to assemble the `remote_name` string that
> ultimately ends up being pointless.
>
> Maybe we should instead expose that function?
Yeah, we could expose that function, but then we would discard the
`struct strvec` that the function requires and populates, so the "huh"
factor might in some way be even bigger.
I think it would be better to change the signature of
promisor_remote_reply() to:
void promisor_remote_reply(const char *info, char **accepted)
This way we could pass NULL as the second argument and the function
would not assemble a string in that case.
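To illustrate the idea, here is a minimal sketch of an optional
out-parameter. It is not the actual patch: `sketch_reply()` and
`sketch_replies_built` are hypothetical names, and the "reply" is just a
copy of the input. It only shows that the side-effect work always runs
while the string is assembled only for callers that ask for it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts how often a reply string was actually assembled. */
static int sketch_replies_built;

/*
 * Hypothetical stand-in for the proposed promisor_remote_reply():
 * the side-effect work always happens, but the reply string is only
 * assembled when the caller passes a non-NULL `accepted`.
 */
static void sketch_reply(const char *info, char **accepted)
{
	assert(info);

	/* ... parse `info` and record the accepted remotes here ... */

	if (!accepted)
		return; /* the caller does not want the reply string */

	*accepted = malloc(strlen(info) + 1);
	if (*accepted) {
		strcpy(*accepted, info);
		sketch_replies_built++;
	}
}
```

A caller that only cares about the side effects would call
`sketch_reply(info, NULL)`; a caller that needs the string passes the
address of a `char *` and frees it afterwards.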
> > + if (args->filter_options.choice == LOFC_AUTO) {
> > + struct strbuf errbuf = STRBUF_INIT;
> > + char *constructed_filter = promisor_remote_construct_filter(r);
> > +
> > + list_objects_filter_release(&args->filter_options);
> > + /* The result of resolving an 'auto' filter must not be 'auto' */
> > + args->filter_options.allow_auto_filter = 0;
>
> We didn't resolve though, we only released it. So the comment doesn't
> seem accurate to me anymore.
What the comment wanted to say is that when we are going to resolve an
auto filter, in gently_parse_list_objects_filter() below, the result
must not be 'auto', so we disallow 'auto'.
So maybe something like /* Disallow 'auto' as a result of the
resolution of this 'auto' filter below */ ?
> > + if (constructed_filter)
> > + gently_parse_list_objects_filter(&args->filter_options,
> > + constructed_filter,
> > + &errbuf);
> > +
> > + if (errbuf.len > 0)
> > + die(_("couldn't resolve 'auto' filter '%s': %s"),
> > + constructed_filter, errbuf.buf);
>
> I think `gently_parse_list_objects_filter()` already returns non-zero in
> all failure cases, so shouldn't we rather:
>
> 	if (constructed_filter &&
> 	    gently_parse_list_objects_filter(&args->filter_options,
> 					     constructed_filter,
> 					     &errbuf))
> 		die(_("couldn't resolve 'auto' filter '%s': %s"),
> 		    constructed_filter, errbuf.buf);
Yeah, it might be easier to understand. I will use your suggestion.
Thanks.
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto`
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (7 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 8/8] fetch-pack: wire up and enable auto filter logic Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 1/9] promisor-remote: refactor initialising field lists Christian Couder
` (10 more replies)
8 siblings, 11 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder
Introduction
============
A previous patch series added the possibility to pass additional
fields, a "partialCloneFilter" and a "token" for each advertised
promisor remote, from a server to a client through the
"promisor-remote" capability.
On the client side though, it has so far only been possible to use
this new information to compare it with local information and then
decide if the corresponding advertised promisor remote is accepted or
not.
For the "token" it would be useful if it could be stored on the
client. For example in a setup where the client uses specialized
remote helpers which need a token to access the promisor remotes
advertised by the server, storing the token would allow the token to
be used when the client directly accesses a promisor remote, for
example to lazy fetch some blobs it now needs.
To enable such a workflow, where the server can rotate tokens and the
client can have updated tokens from the server by simply fetching from
it, the first part of this series introduces a new
"promisor.storeFields" configuration option on the client side,
similar to the "promisor.checkFields" configuration option. When field
names, "token" or "partialCloneFilter", are listed in this new
configuration option, then the values of these field names transmitted
by the server are stored in the local configuration on the client
side.
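As a rough illustration of how a field name could be matched against
such a configuration value, here is a small sketch. It assumes the
value is a comma- or space-separated list and that names compare
case-insensitively (the series sets `strcasecmp` as the list comparison
function). `sketch_field_listed()` is a hypothetical helper, not a
function from "promisor-remote.c".

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/*
 * Return 1 if `field` appears in `list`, a comma- or space-separated
 * string of field names, comparing case-insensitively; 0 otherwise.
 * Hypothetical helper for illustration only.
 */
static int sketch_field_listed(const char *list, const char *field)
{
	const char *delims = ", ";
	size_t field_len = strlen(field);

	assert(field_len > 0); /* an empty name can never be listed */

	while (*list) {
		size_t tok_len;

		list += strspn(list, delims);    /* skip separators */
		tok_len = strcspn(list, delims); /* length of the next name */
		if (tok_len == field_len &&
		    !strncasecmp(list, field, tok_len))
			return 1;
		list += tok_len;
	}
	return 0;
}
```

With this sketch, a value such as "token, partialCloneFilter" matches
both "Token" and "partialclonefilter", while "tokens" does not match
"token".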
Note that for security reasons, the corresponding remote name and url
of the advertised promisor remotes must have already been configured
on the client side. No new remote name nor url are configured.
For the "partialCloneFilter" field, simply storing the value is not
enough to enable dynamic updates. Currently, when a user initiates a
partial clone with `--filter=<filter-spec>`, that specific
<filter-spec> is saved in the client's local configuration (e.g.,
remote.origin.partialCloneFilter). Subsequent fetches then reuse this
value, ignoring suggestions from the server.
To avoid breaking this mechanism and still be able to use the
<filter-spec> that the server suggests for the promisor remotes that
the client accepts, the second part of this series introduces a new
`--filter=auto` mode for `git clone` and `git fetch`.
When `--filter=auto` is used, then "auto" is still saved as the
<filter-spec> for the server locally on the client, and then when a
fetch-pack happens, instead of passing just "auto", the actual filter
requested by the client is computed by combining the <filter-spec>s
that the server suggested for the promisor remotes that the client
accepted. This uses the "combine" filter mechanism that already exists
in "list-objects-filter-options.{c,h}".
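The combining step can be sketched as follows. This is only an
illustration under stated assumptions, not the code from the series:
`struct sketch_remote` and `sketch_construct_filter()` are hypothetical,
the output follows the `combine:<filter1>+<filter2>` form documented for
`--filter` in git-rev-list, and, unlike Git, the sketch neither
validates nor URL-encodes the sub-filters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for `struct promisor_remote`; not Git's type. */
struct sketch_remote {
	struct sketch_remote *next;
	const char *name;
	const char *advertised_filter; /* NULL if the server sent none */
	unsigned int accepted : 1;
};

/*
 * Join the filters of accepted remotes with '+', prefixed by "combine:"
 * when there is more than one. Returns a malloc'd string, or NULL if
 * no accepted remote advertised a filter (or if allocation fails).
 */
static char *sketch_construct_filter(const struct sketch_remote *remotes)
{
	const struct sketch_remote *r;
	size_t count = 0, len = 0, cap;
	char *result;

	for (r = remotes; r; r = r->next) {
		if (!r->accepted || !r->advertised_filter)
			continue;
		count++;
		len += strlen(r->advertised_filter) + 1; /* '+' or NUL */
	}
	if (!count)
		return NULL;

	cap = strlen("combine:") + len;
	result = malloc(cap);
	if (!result)
		return NULL;
	result[0] = '\0';
	if (count > 1)
		strcpy(result, "combine:");

	count = 0;
	for (r = remotes; r; r = r->next) {
		if (!r->accepted || !r->advertised_filter)
			continue;
		if (count++)
			strcat(result, "+");
		strcat(result, r->advertised_filter);
	}
	assert(strlen(result) < cap); /* the buffer was sized to fit */
	return result;
}
```

With a single accepted remote the result is just that remote's filter,
which matches what the test in this series greps for in the packet
trace ("filter blob:limit=9500").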
This way by just using `--filter=auto` when cloning, a client makes
sure it will use the <filter-spec>s suggested by the server for the
promisor remotes it accepts.
This work is part of the "LOP" effort documented in:
Documentation/technical/large-object-promisors.adoc
See that doc for more information on the broader context.
Overview of the patches
=======================
Patches 1/9 and 2/9 are the first part of the series and implement the
new "promisor.storeFields" configuration option. Patch 1/9 is a small
preparatory refactoring.
Patches from 3/9 to 9/9 implement the `--filter=auto` option:
- Patches 3/9 and 4/9 are cleanups of "builtin/clone.c" and
"builtin/fetch.c" respectively that make the `filter_options`
variable local to cmd_clone() or cmd_fetch().
- Patch 5/9 is a doc update as `--filter=<filter-spec>` wasn't
documented for `git fetch`.
- Patch 6/9 improves "list-objects-filter-options.{c,h}" to
support the new 'auto' mode.
- Patches 7/9 and 8/9 improve "promisor-remote.{c,h}" to support
the new 'auto' mode.
- Patch 9/9 makes the new 'auto' mode actually work by wiring up
everything together.
CI Report
=========
All the tests pass, see:
https://github.com/chriscool/git/actions/runs/21940309492
Changes since v2
================
Thanks to Patrick Steinhardt, Jean-Noël Avila and Junio Hamano for
reviewing the previous version!
The patch series has been rebased on top of current 'master' at
864f55e190 (The second batch, 2026-02-09) to avoid a small conflict.
In patch 2/9, new checks have been added to the "clone with
promisor.storeFields=partialCloneFilter" test. We now check that a
subsequent fetch can update the configuration.
In patch 4/9, a small change has been made to the arguments of
`backfill_tags()` in "builtin/fetch.c" to fix a conflict with 'master'.
In patch 5/9, the commit message has been improved.
In patch 7/9, `captured_filters` has been renamed `accepted_filters`.
Patch 8/9 is new. It changes the signature of
`promisor_remote_reply()` and allows this function to not assemble a
reply string if this is not needed by the caller.
Patch 9/9 has a number of small changes in "fetch-pack.c":
- The call to `promisor_remote_reply()` is simplified a bit as it
doesn't require a reply string to be assembled.
- A comment has been reworded for clarity.
- The call to `gently_parse_list_objects_filter()` and the check to
error out in case it fails have been simplified.
Range diff since v2
===================
1: e19b1518cd = 1: 79255ceba7 promisor-remote: refactor initialising field lists
2: 8f20baac17 ! 2: 012aa7ef19 promisor-remote: allow a client to store fields
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with promisor.
+ test_must_fail git -C client config remote.otherLop.partialCloneFilter >actual &&
+
+ # Check that the largest object is still missing on the server
-+ check_missing_objects server 1 "$oid"
++ check_missing_objects server 1 "$oid" &&
++
++ # Change the configuration on the server and fetch from the client
++ git -C server config remote.lop.partialCloneFilter "blob:limit=7k" &&
++ GIT_NO_LAZY_FETCH=0 git -C client fetch \
++ --filter="blob:limit=5k" ../server 2>err &&
++
++ # Check that the fetch updated the configuration on the client
++ echo "blob:limit=7k" >expected &&
++ git -C client config remote.lop.partialCloneFilter >actual &&
++ test_cmp expected actual &&
++
++ # Check that user is notified when the new filter is stored
++ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
++ test_grep "'\''blob:limit=8k'\'' -> '\''blob:limit=7k'\''" err
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
3: 9d53a79600 = 3: f17a62e73e clone: make filter_options local to cmd_clone()
4: b24907e6dc ! 4: 3c6e28dd84 fetch: make filter_options local to cmd_fetch()
@@ builtin/fetch.c: static struct transport *prepare_transport(struct remote *remot
set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
}
@@ builtin/fetch.c: static int backfill_tags(struct display_state *display_state,
- struct ref_transaction *transaction,
struct ref *ref_map,
struct fetch_head *fetch_head,
-- const struct fetch_config *config)
-+ const struct fetch_config *config,
+ const struct fetch_config *config,
+- struct ref_update_display_info_array *display_array)
++ struct ref_update_display_info_array *display_array,
+ struct list_objects_filter_options *filter_options)
{
int retcode, cannot_reuse;
@@ builtin/fetch.c: static int do_fetch(struct transport *transport,
* the transaction and don't commit anything.
*/
if (backfill_tags(&display_state, transport, transaction, tags_ref_map,
-- &fetch_head, config))
-+ &fetch_head, config, filter_options))
+- &fetch_head, config, &display_array))
++ &fetch_head, config, &display_array, filter_options))
retcode = 1;
}
5: 90fb77360b ! 5: 3037d546b2 doc: fetch: document `--filter=<filter-spec>` option
@@ Commit message
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
- Let's fix that and document that option using the same words already
- used to document it for `git clone`.
-
- Those words could probably be improved, but they are not wrong, so
- let's just use them for now and leave improving them for future work.
+ Let's fix that and document this option. To ensure consistency across
+ commands, let's reuse the exact description currently found in
+ `git clone`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
6: b524b24024 = 6: 9ce57b88dc list-objects-filter-options: support 'auto' mode for --filter
7: 4ec51ee88f ! 7: 37042f7019 promisor-remote: keep advertised filters in memory
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
struct store_info *store_info = NULL;
struct string_list_item *item;
bool reload_config = false;
-+ struct string_list captured_filters = STRING_LIST_INIT_DUP;
++ struct string_list accepted_filters = STRING_LIST_INIT_DUP;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
+ /* Capture advertised filters for accepted remotes */
+ if (advertised->filter) {
+ struct string_list_item *i;
-+ i = string_list_append(&captured_filters, advertised->name);
++ i = string_list_append(&accepted_filters, advertised->name);
+ i->util = xstrdup(advertised->filter);
+ }
}
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
if (reload_config)
repo_promisor_remote_reinit(repo);
+
-+ /* Apply captured filters to the stable repo state */
-+ for_each_string_list_item(item, &captured_filters) {
++ /* Apply accepted remote filters to the stable repo state */
++ for_each_string_list_item(item, &accepted_filters) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
+ if (r) {
+ free(r->advertised_filter);
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
+ }
+ }
+
-+ string_list_clear(&captured_filters, 1);
++ string_list_clear(&accepted_filters, 1);
+
+ /* Mark the remotes as accepted in the repository state */
+ for (size_t i = 0; i < accepted->nr; i++) {
-: ---------- > 8: dd17069aad promisor-remote: change promisor_remote_reply()'s signature
8: 994ecb3317 ! 9: 0f9675f477 fetch-pack: wire up and enable auto filter logic
@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
struct strvec index_pack_args = STRVEC_INIT;
+ const char *promisor_remote_config;
+
-+ if (server_feature_v2("promisor-remote", &promisor_remote_config)) {
-+ char *remote_name = promisor_remote_reply(promisor_remote_config);
-+ free(remote_name);
-+ }
++ if (server_feature_v2("promisor-remote", &promisor_remote_config))
++ promisor_remote_reply(promisor_remote_config, NULL);
+
+ if (args->filter_options.choice == LOFC_AUTO) {
+ struct strbuf errbuf = STRBUF_INIT;
+ char *constructed_filter = promisor_remote_construct_filter(r);
+
+ list_objects_filter_release(&args->filter_options);
-+ /* The result of resolving an 'auto' filter must not be 'auto' */
++ /* Disallow 'auto' as a result of the resolution of this 'auto' filter below */
+ args->filter_options.allow_auto_filter = 0;
+
-+ if (constructed_filter)
-+ gently_parse_list_objects_filter(&args->filter_options,
-+ constructed_filter,
-+ &errbuf);
-+
-+ if (errbuf.len > 0)
++ if (constructed_filter &&
++ gently_parse_list_objects_filter(&args->filter_options,
++ constructed_filter,
++ &errbuf))
+ die(_("couldn't resolve 'auto' filter '%s': %s"),
+ constructed_filter, errbuf.buf);
+
@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
## t/t5710-promisor-remote-capability.sh ##
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
- check_missing_objects server 1 "$oid"
+ test_grep "'\''blob:limit=8k'\'' -> '\''blob:limit=7k'\''" err
'
+test_expect_success "clone and fetch with --filter=auto" '
Christian Couder (9):
promisor-remote: refactor initialising field lists
promisor-remote: allow a client to store fields
clone: make filter_options local to cmd_clone()
fetch: make filter_options local to cmd_fetch()
doc: fetch: document `--filter=<filter-spec>` option
list-objects-filter-options: support 'auto' mode for --filter
promisor-remote: keep advertised filters in memory
promisor-remote: change promisor_remote_reply()'s signature
fetch-pack: wire up and enable auto filter logic
Documentation/config/promisor.adoc | 33 +++
Documentation/fetch-options.adoc | 19 ++
Documentation/git-clone.adoc | 25 +-
Documentation/gitprotocol-v2.adoc | 24 +-
Makefile | 1 +
builtin/clone.c | 18 +-
builtin/fetch.c | 50 ++--
connect.c | 3 +-
fetch-pack.c | 24 ++
list-objects-filter-options.c | 37 ++-
list-objects-filter-options.h | 6 +
list-objects-filter.c | 8 +
promisor-remote.c | 256 +++++++++++++++++--
promisor-remote.h | 17 +-
t/meson.build | 1 +
t/t5710-promisor-remote-capability.sh | 123 +++++++++
t/unit-tests/u-list-objects-filter-options.c | 53 ++++
transport.c | 1 +
18 files changed, 626 insertions(+), 73 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
--
2.53.0.70.g3d1fd9d397.dirty
* [PATCH v3 1/9] promisor-remote: refactor initialising field lists
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 2/9] promisor-remote: allow a client to store fields Christian Couder
` (9 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
In "promisor-remote.c", the fields_sent() and fields_checked()
functions serve similar purposes and contain a small amount of
duplicated code.
As we are going to add a similar function in a following commit,
let's refactor this common code into a new initialize_fields_list()
function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
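Outside of Git, the lazy-initialisation pattern being factored out here
can be sketched roughly as below. This is only an illustration under
stated assumptions: `struct field_list`, `load_fields()`,
`initialize_field_list()`, `sent_fields()` and `checked_fields()` are
hypothetical stand-ins for `struct string_list`, fields_from_config(),
initialize_fields_list(), fields_sent() and fields_checked(); it is not
the code in "promisor-remote.c".

```c
/*
 * Hypothetical stand-in for Git's `struct string_list`; it only
 * records where it was loaded from and how often.
 */
struct field_list {
	const char *config_key; /* key the list was loaded from */
	int load_count;         /* number of times the list was loaded */
};

/* Hypothetical stand-in for fields_from_config(). */
static void load_fields(struct field_list *list, const char *config_key)
{
	list->config_key = config_key;
	list->load_count++;
}

/* Shared helper: load the list on first use only, then return it. */
static struct field_list *initialize_field_list(struct field_list *list,
						int *initialized,
						const char *config_key)
{
	if (!*initialized) {
		load_fields(list, config_key);
		*initialized = 1;
	}
	return list;
}

/* Each accessor keeps its own static state and delegates the rest. */
struct field_list *sent_fields(void)
{
	static struct field_list list;
	static int initialized;

	return initialize_field_list(&list, &initialized, "promisor.sendFields");
}

struct field_list *checked_fields(void)
{
	static struct field_list list;
	static int initialized;

	return initialize_field_list(&list, &initialized, "promisor.checkFields");
}
```

Each accessor still owns its own list and flag, so adding a third one
only takes a new pair of statics and one call to the helper.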
promisor-remote.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/promisor-remote.c b/promisor-remote.c
index 77ebf537e2..5d8151cedb 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -375,18 +375,24 @@ static char *fields_from_config(struct string_list *fields_list, const char *con
return fields;
}
+static struct string_list *initialize_fields_list(struct string_list *fields_list, int *initialized,
+ const char *config_key)
+{
+ if (!*initialized) {
+ fields_list->cmp = strcasecmp;
+ fields_from_config(fields_list, config_key);
+ *initialized = 1;
+ }
+
+ return fields_list;
+}
+
static struct string_list *fields_sent(void)
{
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.sendFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.sendFields");
}
static struct string_list *fields_checked(void)
@@ -394,13 +400,7 @@ static struct string_list *fields_checked(void)
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.checkFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
/*
--
2.53.0.70.g3d1fd9d397.dirty
* [PATCH v3 2/9] promisor-remote: allow a client to store fields
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2026-02-12 10:08 ` [PATCH v3 1/9] promisor-remote: refactor initialising field lists Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 3/9] clone: make filter_options local to cmd_clone() Christian Couder
` (8 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Another previous commit, c213820c51 (promisor-remote: allow a client
to check fields, 2025-09-08), has made it possible for a client to
decide if it accepts a promisor remote advertised by a server based
on these additional fields.
Often though, it would be useful for the client to simply store these
additional fields passed by the server in its configuration files, so
that it can use them when needed.
For example, if a token is necessary to access a promisor remote, that
token could be updated frequently on the server side only and then
passed to all the clients through the "promisor-remote" capability,
avoiding the need to update it on all the clients manually.
Storing the token on the client side makes sure that the token is
available when the client needs to access the promisor remotes for a
lazy fetch.
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
Note that for a partial clone filter, there is less value in storing
it on the client. This is because a filter should be used right away
and we already pass a `--filter=<filter-spec>` option to `git clone`
when starting a partial clone. Storing the filter could still be
useful for informational purposes.
Like "promisor.checkFields" and "promisor.sendFields", the new
configuration variable should contain a comma or space separated list
of field names. Only the "partialCloneFilter" and "token" field names
are supported for now.
When a server advertises a promisor remote, say "foo", to a client
along with a field such as "token=XXXXX", and "promisor.storeFields"
contains "token" on the client side, then the client will store
XXXXX for the "remote.foo.token" variable in its configuration file
and reload its configuration so it can immediately use this new
value.
A message is emitted on stderr to warn users when the config is
changed.
Note that even if "promisor.acceptFromServer" is set to "all", a
promisor remote has to be already configured on the client side for
some of its config to be changed. In any case no new remote is
configured and no new URL is stored.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
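The two checks a field goes through before being written can be sketched
in isolation as below. This is a simplified illustration, not the code
from the diff that follows: `needs_store()` and `token_is_storable()`
are hypothetical names, and the cast to `unsigned char` is an assumption
needed for the standard `<ctype.h>` (Git uses its own ctype helpers).

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * A value is written only if the server advertised one and it differs
 * from what the client already has (or the client has nothing yet).
 */
bool needs_store(const char *advertised, const char *current)
{
	return advertised && (!current || strcmp(current, advertised) != 0);
}

/* A token is rejected as soon as it contains any control character. */
bool token_is_storable(const char *token)
{
	const unsigned char *c = (const unsigned char *)token;

	for (; *c; c++)
		if (iscntrl(*c))
			return false;

	return true;
}
```

With these two helpers, an advertised "fooXXX" would replace a stored
"fooYYY", while an identical value or a token containing a newline
would leave the configuration untouched.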
Documentation/config/promisor.adoc | 33 ++++++
Documentation/gitprotocol-v2.adoc | 12 ++-
promisor-remote.c | 148 +++++++++++++++++++++++++-
t/t5710-promisor-remote-capability.sh | 63 +++++++++++
4 files changed, 250 insertions(+), 6 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index 93e5e0d9b5..b0fa43b839 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -89,3 +89,36 @@ variable. The fields are checked only if the
`promisor.acceptFromServer` config variable is not set to "None". If
set to "None", this config variable has no effect. See
linkgit:gitprotocol-v2[5].
+
+promisor.storeFields::
+ A comma or space separated list of additional remote related
+ field names. If a client accepts an advertised remote, the
+ client will store the values associated with these field names
+ taken from the remote advertisement into its configuration,
+ and then reload its remote configuration. Currently,
+ "partialCloneFilter" and "token" are the only supported field
+ names.
++
+For example if a server advertises "partialCloneFilter=blob:limit=20k"
+for remote "foo", and that remote is accepted, then "blob:limit=20k"
+will be stored for the "remote.foo.partialCloneFilter" configuration
+variable.
++
+If the new field value from an advertised remote is the same as the
+existing field value for that remote on the client side, then no
+change is made to the client configuration though.
++
+When a new value is stored, a message is printed to standard error to
+let users know about this.
++
+Note that for security reasons, if the remote is not already
+configured on the client side, nothing will be stored for that
+remote. In any case, no new remote will be created and no URL will be
+stored.
++
+Before storing a partial clone filter, it's parsed to check it's
+valid. If it's not, a warning is emitted and it's not stored.
++
+Before storing a token, a check is performed to ensure it contains no
+control character. If the check fails, a warning is emitted and it's
+not stored.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index c7db103299..d93dd279ea 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -826,9 +826,10 @@ are case-sensitive and MUST be transmitted exactly as specified
above. Clients MUST ignore fields they don't recognize to allow for
future protocol extensions.
-For now, the client can only use information transmitted through these
-fields to decide if it accepts the advertised promisor remote. In the
-future that information might be used for other purposes though.
+The client can use information transmitted through these fields to
+decide if it accepts the advertised promisor remote. Also, the client
+can be configured to store the values of these fields (see
+"promisor.storeFields" in linkgit:git-config[1]).
Field values MUST be urlencoded.
@@ -856,8 +857,9 @@ the server advertised, the client shouldn't advertise the
On the server side, the "promisor.advertise" and "promisor.sendFields"
configuration options can be used to control what it advertises. On
the client side, the "promisor.acceptFromServer" configuration option
-can be used to control what it accepts. See the documentation of these
-configuration options for more information.
+can be used to control what it accepts, and the "promisor.storeFields"
+option, to control what it stores. See the documentation of these
+configuration options in linkgit:git-config[1] for more information.
Note that in the future it would be nice if the "promisor-remote"
protocol capability could be used by the server, when responding to
diff --git a/promisor-remote.c b/promisor-remote.c
index 5d8151cedb..59997dd4c7 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -403,6 +403,14 @@ static struct string_list *fields_checked(void)
return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
+static struct string_list *fields_stored(void)
+{
+ static struct string_list fields_list = STRING_LIST_INIT_NODUP;
+ static int initialized;
+
+ return initialize_fields_list(&fields_list, &initialized, "promisor.storeFields");
+}
+
/*
* Struct for promisor remotes involved in the "promisor-remote"
* protocol capability.
@@ -692,6 +700,132 @@ static struct promisor_info *parse_one_advertised_remote(const char *remote_info
return info;
}
+static bool store_one_field(struct repository *repo, const char *remote_name,
+ const char *field_name, const char *field_key,
+ const char *advertised, const char *current)
+{
+ if (advertised && (!current || strcmp(current, advertised))) {
+ char *key = xstrfmt("remote.%s.%s", remote_name, field_key);
+
+ fprintf(stderr, _("Storing new %s from server for remote '%s'.\n"
+ " '%s' -> '%s'\n"),
+ field_name, remote_name,
+ current ? current : "",
+ advertised);
+
+ repo_config_set_gently(repo, key, advertised);
+ free(key);
+
+ return true;
+ }
+
+ return false;
+}
+
+/* Check that a filter is valid by parsing it */
+static bool valid_filter(const char *filter, const char *remote_name)
+{
+ struct list_objects_filter_options filter_opts = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ int res = gently_parse_list_objects_filter(&filter_opts, filter, &err);
+
+ if (res)
+ warning(_("invalid filter '%s' for remote '%s' "
+ "will not be stored: %s"),
+ filter, remote_name, err.buf);
+
+ list_objects_filter_release(&filter_opts);
+ strbuf_release(&err);
+
+ return !res;
+}
+
+/* Check that a token doesn't contain any control character */
+static bool valid_token(const char *token, const char *remote_name)
+{
+ const char *c = token;
+
+ for (; *c; c++)
+ if (iscntrl(*c)) {
+ warning(_("invalid token '%s' for remote '%s' "
+ "will not be stored"),
+ token, remote_name);
+ return false;
+ }
+
+ return true;
+}
+
+struct store_info {
+ struct repository *repo;
+ struct string_list config_info;
+ bool store_filter;
+ bool store_token;
+};
+
+static struct store_info *store_info_new(struct repository *repo)
+{
+ struct string_list *fields_to_store = fields_stored();
+ struct store_info *s = xmalloc(sizeof(*s));
+
+ s->repo = repo;
+
+ string_list_init_nodup(&s->config_info);
+ promisor_config_info_list(repo, &s->config_info, fields_to_store);
+ string_list_sort(&s->config_info);
+
+ s->store_filter = !!string_list_lookup(fields_to_store, promisor_field_filter);
+ s->store_token = !!string_list_lookup(fields_to_store, promisor_field_token);
+
+ return s;
+}
+
+static void store_info_free(struct store_info *s)
+{
+ if (s) {
+ promisor_info_list_clear(&s->config_info);
+ free(s);
+ }
+}
+
+static bool promisor_store_advertised_fields(struct promisor_info *advertised,
+ struct store_info *store_info)
+{
+ struct promisor_info *p;
+ struct string_list_item *item;
+ const char *remote_name = advertised->name;
+ bool reload_config = false;
+
+ if (!(store_info->store_filter || store_info->store_token))
+ return false;
+
+ /*
+ * Get existing config info for the advertised promisor
+ * remote. This ensures the remote is already configured on
+ * the client side.
+ */
+ item = string_list_lookup(&store_info->config_info, remote_name);
+
+ if (!item)
+ return false;
+
+ p = item->util;
+
+ if (store_info->store_filter && advertised->filter &&
+ valid_filter(advertised->filter, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "filter", promisor_field_filter,
+ advertised->filter, p->filter);
+
+ if (store_info->store_token && advertised->token &&
+ valid_token(advertised->token, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "token", promisor_field_token,
+ advertised->token, p->token);
+
+ return reload_config;
+}
+
static void filter_promisor_remote(struct repository *repo,
struct strvec *accepted,
const char *info)
@@ -700,7 +834,9 @@ static void filter_promisor_remote(struct repository *repo,
enum accept_promisor accept = ACCEPT_NONE;
struct string_list config_info = STRING_LIST_INIT_NODUP;
struct string_list remote_info = STRING_LIST_INIT_DUP;
+ struct store_info *store_info = NULL;
struct string_list_item *item;
+ bool reload_config = false;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -736,14 +872,24 @@ static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
- if (should_accept_remote(accept, advertised, &config_info))
+ if (should_accept_remote(accept, advertised, &config_info)) {
+ if (!store_info)
+ store_info = store_info_new(repo);
+ if (promisor_store_advertised_fields(advertised, store_info))
+ reload_config = true;
+
strvec_push(accepted, advertised->name);
+ }
promisor_info_free(advertised);
}
promisor_info_list_clear(&config_info);
string_list_clear(&remote_info, 0);
+ store_info_free(store_info);
+
+ if (reload_config)
+ repo_promisor_remote_reinit(repo);
}
char *promisor_remote_reply(const char *info)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 023735d6a8..6ef6431bd7 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -360,6 +360,69 @@ test_expect_success "clone with promisor.checkFields" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ git -C server remote add otherLop "https://invalid.invalid" &&
+ git -C server config remote.otherLop.token "fooBar" &&
+ git -C server config remote.otherLop.stuff "baz" &&
+ git -C server config remote.otherLop.partialCloneFilter "blob:limit=10k" &&
+ test_when_finished "git -C server remote remove otherLop" &&
+
+ git -C server config remote.lop.token "fooXXX" &&
+ git -C server config remote.lop.partialCloneFilter "blob:limit=8k" &&
+
+ test_config -C server promisor.sendFields "partialCloneFilter, token" &&
+ test_when_finished "rm trace" &&
+
+ # Clone from server to create a client
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c remote.lop.token="fooYYY" \
+ -c remote.lop.partialCloneFilter="blob:none" \
+ -c promisor.acceptfromserver=All \
+ -c promisor.storeFields=partialcloneFilter \
+ --no-local --filter="blob:limit=5k" server client 2>err &&
+
+ # Check that the filter from the server is stored
+ echo "blob:limit=8k" >expected &&
+ git -C client config remote.lop.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that user is notified when the filter is stored
+ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
+ test_grep "'\''blob:none'\'' -> '\''blob:limit=8k'\''" err &&
+
+ # Check that the token from the server is NOT stored
+ echo "fooYYY" >expected &&
+ git -C client config remote.lop.token >actual &&
+ test_cmp expected actual &&
+ test_grep ! "Storing new token from server" err &&
+
+ # Check that the filter for an unknown remote is NOT stored
+ test_must_fail git -C client config remote.otherLop.partialCloneFilter >actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Change the configuration on the server and fetch from the client
+ git -C server config remote.lop.partialCloneFilter "blob:limit=7k" &&
+ GIT_NO_LAZY_FETCH=0 git -C client fetch \
+ --filter="blob:limit=5k" ../server 2>err &&
+
+ # Check that the fetch updated the configuration on the client
+ echo "blob:limit=7k" >expected &&
+ git -C client config remote.lop.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that user is notified when the new filter is stored
+ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
+ test_grep "'\''blob:limit=8k'\'' -> '\''blob:limit=7k'\''" err
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
--
2.53.0.70.g3d1fd9d397.dirty
* [PATCH v3 3/9] clone: make filter_options local to cmd_clone()
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
2026-02-12 10:08 ` [PATCH v3 1/9] promisor-remote: refactor initialising field lists Christian Couder
2026-02-12 10:08 ` [PATCH v3 2/9] promisor-remote: allow a client to store fields Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
` (7 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/clone.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
harder to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_clone() by
moving its definition into that function and making it non-static.
The only additional change needed to make this work is to pass it as
an argument to checkout(), so it's a small and cheap cleanup.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
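The ownership change can be pictured with a small standalone sketch.
Everything in it is hypothetical and only mirrors the shape of the
refactoring: `struct filter_opts` stands in for
`struct list_objects_filter_options`, `wants_filter()` for the check
done in checkout(), and `run_command_sketch()` for cmd_clone().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for `struct list_objects_filter_options`. */
struct filter_opts {
	char *spec; /* NULL when no filter was requested */
};

/*
 * The helper receives the options through a pointer instead of
 * reading a file-level static variable.
 */
static int wants_filter(const struct filter_opts *opts, int filter_submodules)
{
	return filter_submodules && opts->spec != NULL;
}

static void filter_opts_release(struct filter_opts *opts)
{
	free(opts->spec);
	opts->spec = NULL;
}

/*
 * The "command" function owns the variable: it initialises it, lends
 * it to the helper, and releases it before returning.
 */
int run_command_sketch(const char *filter_arg, int filter_submodules)
{
	struct filter_opts opts = { NULL };
	int result;

	if (filter_arg) {
		size_t len = strlen(filter_arg) + 1;

		opts.spec = malloc(len);
		if (!opts.spec)
			return -1;
		memcpy(opts.spec, filter_arg, len);
	}

	result = wants_filter(&opts, filter_submodules);
	filter_opts_release(&opts);

	return result;
}
```

Keeping the definition, the use and the release in one function makes
the lifetime of the options visible at a glance.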
builtin/clone.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index b14a39a687..bb27472020 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -77,7 +77,6 @@ static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
static int max_jobs = -1;
static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static int config_filter_submodules = -1; /* unspecified */
static int option_remote_submodules;
@@ -634,7 +633,9 @@ static int git_sparse_checkout_init(const char *repo)
return result;
}
-static int checkout(int submodule_progress, int filter_submodules,
+static int checkout(int submodule_progress,
+ struct list_objects_filter_options *filter_options,
+ int filter_submodules,
enum ref_storage_format ref_storage_format)
{
struct object_id oid;
@@ -723,9 +724,9 @@ static int checkout(int submodule_progress, int filter_submodules,
strvec_pushf(&cmd.args, "--ref-format=%s",
ref_storage_format_to_name(ref_storage_format));
- if (filter_submodules && filter_options.choice)
+ if (filter_submodules && filter_options->choice)
strvec_pushf(&cmd.args, "--filter=%s",
- expand_list_objects_filter_spec(&filter_options));
+ expand_list_objects_filter_spec(filter_options));
if (option_single_branch >= 0)
strvec_push(&cmd.args, option_single_branch ?
@@ -903,6 +904,7 @@ int cmd_clone(int argc,
enum transport_family family = TRANSPORT_FAMILY_ALL;
struct string_list option_config = STRING_LIST_INIT_DUP;
int option_dissociate = 0;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
int option_filter_submodules = -1; /* unspecified */
struct string_list server_options = STRING_LIST_INIT_NODUP;
const char *bundle_uri = NULL;
@@ -1624,9 +1626,13 @@ int cmd_clone(int argc,
return 1;
junk_mode = JUNK_LEAVE_REPO;
- err = checkout(submodule_progress, filter_submodules,
+ err = checkout(submodule_progress,
+ &filter_options,
+ filter_submodules,
ref_storage_format);
+ list_objects_filter_release(&filter_options);
+
string_list_clear(&option_not, 0);
string_list_clear(&option_config, 0);
string_list_clear(&server_options, 0);
--
2.53.0.70.g3d1fd9d397.dirty
* [PATCH v3 4/9] fetch: make filter_options local to cmd_fetch()
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (2 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 3/9] clone: make filter_options local to cmd_clone() Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
` (6 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/fetch.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
harder to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_fetch() by
moving its definition into that function and making it non-static.
This requires passing a pointer to it through the prepare_transport(),
do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one()
functions, but it's quite straightforward.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
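Since fetch_one_setup_partial() is one of the functions that now takes
the options as a parameter, the order of its checks can be summarised
as a pure function. This is a sketch derived from the comments in that
function; `enum partial_setup` and `choose_partial_setup()` are
hypothetical names, and the three booleans stand in for
`filter_options->no_filter`, `filter_options->choice` and
repo_has_promisor_remote().

```c
#include <stdbool.h>

enum partial_setup {
	PARTIAL_SETUP_NONE,     /* leave the filter options untouched */
	PARTIAL_SETUP_REGISTER, /* register the given filter for the remote */
	PARTIAL_SETUP_INHERIT   /* inherit the default filter-spec from config */
};

enum partial_setup choose_partial_setup(bool no_filter, bool has_filter,
					bool has_promisor_remote)
{
	/* An explicit --no-filter overrides everything. */
	if (no_filter)
		return PARTIAL_SETUP_NONE;

	/* No prior partial clone/fetch and no filter requested: normal fetch. */
	if (!has_promisor_remote && !has_filter)
		return PARTIAL_SETUP_NONE;

	/* A filter given on the command line is registered for the remote. */
	if (has_filter)
		return PARTIAL_SETUP_REGISTER;

	/* Otherwise fall back to the filter-spec stored in the config. */
	return PARTIAL_SETUP_INHERIT;
}
```

Passing the state in as arguments, as the patch does with
`filter_options`, is what makes this kind of decision easy to reason
about without looking at file-level variables.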
builtin/fetch.c | 48 +++++++++++++++++++++++++++---------------------
1 file changed, 27 insertions(+), 21 deletions(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index a3bc7e9380..8fbf3557ce 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -97,7 +97,6 @@ static struct strbuf default_rla = STRBUF_INIT;
static struct transport *gtransport;
static struct transport *gsecondary;
static struct refspec refmap = REFSPEC_INIT_FETCH;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static struct string_list server_options = STRING_LIST_INIT_DUP;
static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
@@ -1562,7 +1561,8 @@ static void add_negotiation_tips(struct git_transport_options *smart_options)
smart_options->negotiation_tips = oids;
}
-static struct transport *prepare_transport(struct remote *remote, int deepen)
+static struct transport *prepare_transport(struct remote *remote, int deepen,
+ struct list_objects_filter_options *filter_options)
{
struct transport *transport;
@@ -1586,9 +1586,9 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
if (refetch)
set_option(transport, TRANS_OPT_REFETCH, "yes");
- if (filter_options.choice) {
+ if (filter_options->choice) {
const char *spec =
- expand_list_objects_filter_spec(&filter_options);
+ expand_list_objects_filter_spec(filter_options);
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, spec);
set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
}
@@ -1607,7 +1607,8 @@ static int backfill_tags(struct display_state *display_state,
struct ref *ref_map,
struct fetch_head *fetch_head,
const struct fetch_config *config,
- struct ref_update_display_info_array *display_array)
+ struct ref_update_display_info_array *display_array,
+ struct list_objects_filter_options *filter_options)
{
int retcode, cannot_reuse;
@@ -1621,7 +1622,7 @@ static int backfill_tags(struct display_state *display_state,
cannot_reuse = transport->cannot_reuse ||
deepen_since || deepen_not.nr;
if (cannot_reuse) {
- gsecondary = prepare_transport(transport->remote, 0);
+ gsecondary = prepare_transport(transport->remote, 0, filter_options);
transport = gsecondary;
}
@@ -1834,7 +1835,8 @@ static int commit_ref_transaction(struct ref_transaction **transaction,
static int do_fetch(struct transport *transport,
struct refspec *rs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct ref_transaction *transaction = NULL;
struct ref *ref_map = NULL;
@@ -1997,7 +1999,7 @@ static int do_fetch(struct transport *transport,
* the transaction and don't commit anything.
*/
if (backfill_tags(&display_state, transport, transaction, tags_ref_map,
- &fetch_head, config, &display_array))
+ &fetch_head, config, &display_array, filter_options))
retcode = 1;
}
@@ -2339,20 +2341,21 @@ static int fetch_multiple(struct string_list *list, int max_children,
* Fetching from the promisor remote should use the given filter-spec
* or inherit the default filter-spec from the config.
*/
-static inline void fetch_one_setup_partial(struct remote *remote)
+static inline void fetch_one_setup_partial(struct remote *remote,
+ struct list_objects_filter_options *filter_options)
{
/*
* Explicit --no-filter argument overrides everything, regardless
* of any prior partial clones and fetches.
*/
- if (filter_options.no_filter)
+ if (filter_options->no_filter)
return;
/*
* If no prior partial clone/fetch and the current fetch DID NOT
* request a partial-fetch, do a normal fetch.
*/
- if (!repo_has_promisor_remote(the_repository) && !filter_options.choice)
+ if (!repo_has_promisor_remote(the_repository) && !filter_options->choice)
return;
/*
@@ -2361,8 +2364,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* filter-spec as the default for subsequent fetches to this
* remote if there is currently no default filter-spec.
*/
- if (filter_options.choice) {
- partial_clone_register(remote->name, &filter_options);
+ if (filter_options->choice) {
+ partial_clone_register(remote->name, filter_options);
return;
}
@@ -2371,14 +2374,15 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* explicitly given filter-spec or inherit the filter-spec from
* the config.
*/
- if (!filter_options.choice)
- partial_clone_get_default_filter_spec(&filter_options, remote->name);
+ if (!filter_options->choice)
+ partial_clone_get_default_filter_spec(filter_options, remote->name);
return;
}
static int fetch_one(struct remote *remote, int argc, const char **argv,
int prune_tags_ok, int use_stdin_refspecs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct refspec rs = REFSPEC_INIT_FETCH;
int i;
@@ -2390,7 +2394,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
die(_("no remote repository specified; please specify either a URL or a\n"
"remote name from which new revisions should be fetched"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, filter_options);
if (prune < 0) {
/* no command line request */
@@ -2445,7 +2449,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
sigchain_push_common(unlock_pack_on_signal);
atexit(unlock_pack_atexit);
sigchain_push(SIGPIPE, SIG_IGN);
- exit_code = do_fetch(gtransport, &rs, config);
+ exit_code = do_fetch(gtransport, &rs, config, filter_options);
sigchain_pop(SIGPIPE);
refspec_clear(&rs);
transport_disconnect(gtransport);
@@ -2470,6 +2474,7 @@ int cmd_fetch(int argc,
const char *submodule_prefix = "";
const char *bundle_uri;
struct string_list list = STRING_LIST_INIT_DUP;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
struct remote *remote = NULL;
int all = -1, multiple = 0;
int result = 0;
@@ -2735,7 +2740,7 @@ int cmd_fetch(int argc,
trace2_region_enter("fetch", "negotiate-only", the_repository);
if (!remote)
die(_("must supply remote when using --negotiate-only"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, &filter_options);
if (gtransport->smart_options) {
gtransport->smart_options->acked_commits = &acked_commits;
} else {
@@ -2757,12 +2762,12 @@ int cmd_fetch(int argc,
} else if (remote) {
if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
trace2_region_enter("fetch", "setup-partial", the_repository);
- fetch_one_setup_partial(remote);
+ fetch_one_setup_partial(remote, &filter_options);
trace2_region_leave("fetch", "setup-partial", the_repository);
}
trace2_region_enter("fetch", "fetch-one", the_repository);
result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
- &config);
+ &config, &filter_options);
trace2_region_leave("fetch", "fetch-one", the_repository);
} else {
int max_children = max_jobs;
@@ -2868,5 +2873,6 @@ int cmd_fetch(int argc,
cleanup:
string_list_clear(&list, 0);
+ list_objects_filter_release(&filter_options);
return result;
}
--
2.53.0.70.g3d1fd9d397.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* [PATCH v3 5/9] doc: fetch: document `--filter=<filter-spec>` option
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (3 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
` (5 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
Let's fix that and document this option. To ensure consistency across
commands, let's reuse the exact description currently found in
`git clone`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index fcba46ee9e..1ef9807d00 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -88,6 +88,16 @@ linkgit:git-config[1].
This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
precedence over the `fetch.output` config option.
+`--filter=<filter-spec>`::
+ Use the partial clone feature and request that the server sends
+ a subset of reachable objects according to a given object filter.
+ When using `--filter`, the supplied _<filter-spec>_ is used for
+ the partial fetch. For example, `--filter=blob:none` will filter
+ out all blobs (file contents) until needed by Git. Also,
+ `--filter=blob:limit=<size>` will filter out all blobs of size
+ at least _<size>_. For more details on filter specifications, see
+ the `--filter` option in linkgit:git-rev-list[1].
+
ifndef::git-pull[]
`--write-fetch-head`::
`--no-write-fetch-head`::
--
2.53.0.70.g3d1fd9d397.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* [PATCH v3 6/9] list-objects-filter-options: support 'auto' mode for --filter
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (4 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-14 2:35 ` Jeff King
2026-02-12 10:08 ` [PATCH v3 7/9] promisor-remote: keep advertised filters in memory Christian Couder
` (4 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
In a following commit, we are going to allow passing "auto" as a
<filterspec> to the `--filter=<filterspec>` option, but only for some
commands. Other commands that support the `--filter=<filterspec>`
option should still die() when 'auto' is passed.
Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
support that:
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
accepted or not by the current command.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
- Update list_objects_filter_release() to preserve the
allow_auto_filter flag, as this function is often called (via
opt_parse_list_objects_filter) to reset the struct before parsing a
new value.
Let's also update `list-objects-filter.c` to recognize the new
`LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
before filtering actually begins, initializing a filter with
`LOFC_AUTO` is invalid and will trigger a BUG().
Note that ideally combining "auto" with "auto" could be allowed, but in
practice, it's probably not worth the added code complexity. And if we
really want it, nothing prevents us from allowing it in future work.
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us from doing that in future work either.
Also note that the new `allow_auto_filter` flag depends on the command,
not user choices, so it should be reset to the command default when
`struct list_objects_filter_options` instances are reset.
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Makefile | 1 +
list-objects-filter-options.c | 37 ++++++++++++--
list-objects-filter-options.h | 6 +++
list-objects-filter.c | 8 +++
t/meson.build | 1 +
t/unit-tests/u-list-objects-filter-options.c | 53 ++++++++++++++++++++
6 files changed, 103 insertions(+), 3 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
diff --git a/Makefile b/Makefile
index 4ac44331ea..9e174dd06c 100644
--- a/Makefile
+++ b/Makefile
@@ -1518,6 +1518,7 @@ CLAR_TEST_SUITES += u-dir
CLAR_TEST_SUITES += u-example-decorate
CLAR_TEST_SUITES += u-hash
CLAR_TEST_SUITES += u-hashmap
+CLAR_TEST_SUITES += u-list-objects-filter-options
CLAR_TEST_SUITES += u-mem-pool
CLAR_TEST_SUITES += u-oid-array
CLAR_TEST_SUITES += u-oidmap
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7420bf81fe..ad92cbaa37 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -20,6 +20,8 @@ const char *list_object_filter_config_name(enum list_objects_filter_choice c)
case LOFC_DISABLED:
/* we have no name for "no filter at all" */
break;
+ case LOFC_AUTO:
+ return "auto";
case LOFC_BLOB_NONE:
return "blob:none";
case LOFC_BLOB_LIMIT:
@@ -52,7 +54,16 @@ int gently_parse_list_objects_filter(
if (filter_options->choice)
BUG("filter_options already populated");
- if (!strcmp(arg, "blob:none")) {
+ if (!strcmp(arg, "auto")) {
+ if (!filter_options->allow_auto_filter) {
+ strbuf_addstr(errbuf,
+ _("'auto' filter not supported by this command"));
+ return 1;
+ }
+ filter_options->choice = LOFC_AUTO;
+ return 0;
+
+ } else if (!strcmp(arg, "blob:none")) {
filter_options->choice = LOFC_BLOB_NONE;
return 0;
@@ -146,10 +157,22 @@ static int parse_combine_subfilter(
decoded = url_percent_decode(subspec->buf);
- result = has_reserved_character(subspec, errbuf) ||
- gently_parse_list_objects_filter(
+ result = has_reserved_character(subspec, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = gently_parse_list_objects_filter(
&filter_options->sub[new_index], decoded, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = (filter_options->sub[new_index].choice == LOFC_AUTO);
+ if (result) {
+ strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
+ goto cleanup;
+ }
+cleanup:
free(decoded);
return result;
}
@@ -263,6 +286,9 @@ void parse_list_objects_filter(
} else {
struct list_objects_filter_options *sub;
+ if (filter_options->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
/*
* Make filter_options an LOFC_COMBINE spec so we can trivially
* add subspecs to it.
@@ -277,6 +303,9 @@ void parse_list_objects_filter(
if (gently_parse_list_objects_filter(sub, arg, &errbuf))
die("%s", errbuf.buf);
+ if (sub->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
strbuf_addch(&filter_options->filter_spec, '+');
filter_spec_append_urlencode(filter_options, arg);
}
@@ -317,6 +346,7 @@ void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
{
size_t sub;
+ unsigned int allow_auto_filter = filter_options->allow_auto_filter;
if (!filter_options)
return;
@@ -326,6 +356,7 @@ void list_objects_filter_release(
list_objects_filter_release(&filter_options->sub[sub]);
free(filter_options->sub);
list_objects_filter_init(filter_options);
+ filter_options->allow_auto_filter = allow_auto_filter;
}
void partial_clone_register(
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 7b2108b986..77d7bbc846 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -18,6 +18,7 @@ enum list_objects_filter_choice {
LOFC_SPARSE_OID,
LOFC_OBJECT_TYPE,
LOFC_COMBINE,
+ LOFC_AUTO,
LOFC__COUNT /* must be last */
};
@@ -50,6 +51,11 @@ struct list_objects_filter_options {
*/
unsigned int no_filter : 1;
+ /*
+ * Is LOFC_AUTO a valid option?
+ */
+ unsigned int allow_auto_filter : 1;
+
/*
* BEGIN choice-specific parsed values from within the filter-spec. Only
* some values will be defined for any given choice.
diff --git a/list-objects-filter.c b/list-objects-filter.c
index acd65ebb73..78316e7f90 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -745,6 +745,13 @@ static void filter_combine__init(
filter->finalize_omits_fn = filter_combine__finalize_omits;
}
+static void filter_auto__init(
+ struct list_objects_filter_options *filter_options UNUSED,
+ struct filter *filter UNUSED)
+{
+ BUG("LOFC_AUTO should have been resolved before initializing the filter");
+}
+
typedef void (*filter_init_fn)(
struct list_objects_filter_options *filter_options,
struct filter *filter);
@@ -760,6 +767,7 @@ static filter_init_fn s_filters[] = {
filter_sparse_oid__init,
filter_object_type__init,
filter_combine__init,
+ filter_auto__init,
};
struct filter *list_objects_filter__init(
diff --git a/t/meson.build b/t/meson.build
index a04a7a86cf..bec4c72327 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -4,6 +4,7 @@ clar_test_suites = [
'unit-tests/u-example-decorate.c',
'unit-tests/u-hash.c',
'unit-tests/u-hashmap.c',
+ 'unit-tests/u-list-objects-filter-options.c',
'unit-tests/u-mem-pool.c',
'unit-tests/u-oid-array.c',
'unit-tests/u-oidmap.c',
diff --git a/t/unit-tests/u-list-objects-filter-options.c b/t/unit-tests/u-list-objects-filter-options.c
new file mode 100644
index 0000000000..f7d73701b5
--- /dev/null
+++ b/t/unit-tests/u-list-objects-filter-options.c
@@ -0,0 +1,53 @@
+#include "unit-test.h"
+#include "list-objects-filter-options.h"
+#include "strbuf.h"
+
+/* Helper to test gently_parse_list_objects_filter() */
+static void check_gentle_parse(const char *filter_spec,
+ int expect_success,
+ int allow_auto,
+ enum list_objects_filter_choice expected_choice)
+{
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf errbuf = STRBUF_INIT;
+ int ret;
+
+ filter_options.allow_auto_filter = allow_auto;
+
+ ret = gently_parse_list_objects_filter(&filter_options, filter_spec, &errbuf);
+
+ if (expect_success) {
+ cl_assert_equal_i(ret, 0);
+ cl_assert_equal_i(expected_choice, filter_options.choice);
+ cl_assert_equal_i(errbuf.len, 0);
+ } else {
+ cl_assert(ret != 0);
+ cl_assert(errbuf.len > 0);
+ }
+
+ strbuf_release(&errbuf);
+ list_objects_filter_release(&filter_options);
+}
+
+void test_list_objects_filter_options__regular_filters(void)
+{
+ check_gentle_parse("blob:none", 1, 0, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:none", 1, 1, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:limit=5k", 1, 0, LOFC_BLOB_LIMIT);
+ check_gentle_parse("blob:limit=5k", 1, 1, LOFC_BLOB_LIMIT);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 0, LOFC_COMBINE);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 1, LOFC_COMBINE);
+}
+
+void test_list_objects_filter_options__auto_allowed(void)
+{
+ check_gentle_parse("auto", 1, 1, LOFC_AUTO);
+ check_gentle_parse("auto", 0, 0, 0);
+}
+
+void test_list_objects_filter_options__combine_auto_fails(void)
+{
+ check_gentle_parse("combine:auto+blob:none", 0, 1, 0);
+ check_gentle_parse("combine:blob:none+auto", 0, 1, 0);
+ check_gentle_parse("combine:auto+auto", 0, 1, 0);
+}
--
2.53.0.70.g3d1fd9d397.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH v3 6/9] list-objects-filter-options: support 'auto' mode for --filter
2026-02-12 10:08 ` [PATCH v3 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
@ 2026-02-14 2:35 ` Jeff King
2026-02-16 13:26 ` Christian Couder
0 siblings, 1 reply; 80+ messages in thread
From: Jeff King @ 2026-02-14 2:35 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Patrick Steinhardt, Taylor Blau,
Karthik Nayak, Elijah Newren, Jean-Noël Avila,
Christian Couder
On Thu, Feb 12, 2026 at 11:08:37AM +0100, Christian Couder wrote:
> @@ -317,6 +346,7 @@ void list_objects_filter_release(
> struct list_objects_filter_options *filter_options)
> {
> size_t sub;
> + unsigned int allow_auto_filter = filter_options->allow_auto_filter;
>
> if (!filter_options)
> return;
This will segfault if anybody passes in a NULL filter_options, before we
get to the NULL check in the context.
I don't think anybody does this in practice, but probably we should
either remove the NULL check, or you should push the assignment of your
local variable down below it.
(Noticed by Coverity).
-Peff
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH v3 6/9] list-objects-filter-options: support 'auto' mode for --filter
2026-02-14 2:35 ` Jeff King
@ 2026-02-16 13:26 ` Christian Couder
0 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:26 UTC (permalink / raw)
To: Jeff King
Cc: git, Junio C Hamano, Patrick Steinhardt, Taylor Blau,
Karthik Nayak, Elijah Newren, Jean-Noël Avila,
Christian Couder
On Sat, Feb 14, 2026 at 3:35 AM Jeff King <peff@peff.net> wrote:
>
> On Thu, Feb 12, 2026 at 11:08:37AM +0100, Christian Couder wrote:
>
> > @@ -317,6 +346,7 @@ void list_objects_filter_release(
> > struct list_objects_filter_options *filter_options)
> > {
> > size_t sub;
> > + unsigned int allow_auto_filter = filter_options->allow_auto_filter;
> >
> > if (!filter_options)
> > return;
>
> This will segfault if anybody passes in a NULL filter_options, before we
> get to the NULL check in the context.
>
> I don't think anybody does this in practice, but probably we should
> either remove the NULL check, or you should push the assignment of your
> local variable down below it.
Thanks Peff, I have moved the assignment of the local variable below
the NULL check.
A v4 with this single change compared to v3 has just been sent.
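For readers following along, the reordering described above can be
sketched as follows. This is a simplified, self-contained illustration
and not the actual v4 hunk: `struct demo_filter_options` and
`demo_filter_release()` are hypothetical stand-ins for
`struct list_objects_filter_options` and
`list_objects_filter_release()`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct list_objects_filter_options. */
struct demo_filter_options {
	int choice;
	unsigned int allow_auto_filter : 1;
	char *filter_spec;
};

/* Reset the struct while preserving the per-command allow_auto_filter flag. */
static void demo_filter_release(struct demo_filter_options *opts)
{
	unsigned int allow_auto_filter;

	if (!opts)
		return;

	/* Read the flag only after the NULL check, so NULL is never dereferenced. */
	allow_auto_filter = opts->allow_auto_filter;

	free(opts->filter_spec);
	memset(opts, 0, sizeof(*opts));
	opts->allow_auto_filter = allow_auto_filter;
}
```

The only point of the sketch is the ordering: the local variable is
declared up front but assigned after the early return.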
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH v3 7/9] promisor-remote: keep advertised filters in memory
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (5 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-12 10:08 ` [PATCH v3 8/9] promisor-remote: change promisor_remote_reply()'s signature Christian Couder
` (3 subsequent siblings)
10 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
Currently, advertised filters are only kept in memory temporarily
during parsing, or persisted to disk if `promisor.storeFields`
contains 'partialCloneFilter'.
In a following commit though, we will add a `--filter=auto` option.
This option will enable the client to use the filters that the server
is suggesting for the promisor remotes the client accepts.
To use them even if `promisor.storeFields` is not configured, these
filters should be stored somewhere for the current session.
Let's add an `advertised_filter` field to `struct promisor_remote`
for that purpose.
To ensure that the filters are available in all cases,
filter_promisor_remote() captures them into a temporary list and
applies them to the `promisor_remote` structs after the potential
configuration reload.
Then the accepted remotes are marked as `accepted` in the repository
state. This ensures that subsequent calls to look up accepted remotes
(like in the filter construction below) actually find them.
To enable the client to construct a filter spec based on these filters
for the upcoming `--filter=auto` option, let's also add a
`promisor_remote_construct_filter(repo)` function.
This function:
- iterates over all accepted promisor remotes in the repository,
- collects the filters advertised for them (using `advertised_filter`
added in this commit), and
- generates a single filter spec for them.
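The three steps above can be illustrated with a small, self-contained
sketch. It is not the code from this patch (which goes through
gently_parse_list_objects_filter() and
expand_list_objects_filter_spec()): `struct demo_remote` and
`demo_construct_filter()` are hypothetical names, and joining several
filters with the `combine:<a>+<b>` syntax, without URL-encoding the
sub-filters, is an assumption made only for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct promisor_remote. */
struct demo_remote {
	struct demo_remote *next;
	const char *advertised_filter; /* may be NULL */
	unsigned int accepted : 1;
};

/*
 * Join the advertised filters of accepted remotes. Returns NULL when no
 * filter applies, the filter itself when there is exactly one, and
 * "combine:<a>+<b>..." otherwise. The caller frees the result.
 * Assumption: sub-filters need no URL-encoding in this sketch.
 */
static char *demo_construct_filter(const struct demo_remote *list)
{
	const struct demo_remote *r;
	size_t count = 0, len = 0;
	char *out;

	/* First pass: count usable filters and size the buffer. */
	for (r = list; r; r = r->next) {
		if (r->accepted && r->advertised_filter) {
			count++;
			len += strlen(r->advertised_filter) + 1; /* '+' or NUL */
		}
	}
	if (!count)
		return NULL;

	out = malloc(len + strlen("combine:"));
	if (!out)
		return NULL;
	out[0] = '\0';
	if (count > 1)
		strcat(out, "combine:");

	/* Second pass: append each filter, separated by '+'. */
	count = 0;
	for (r = list; r; r = r->next) {
		if (!r->accepted || !r->advertised_filter)
			continue;
		if (count++)
			strcat(out, "+");
		strcat(out, r->advertised_filter);
	}
	return out;
}
```

Remotes that were not accepted, or that advertised no filter, are
skipped in both passes.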
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
promisor-remote.h | 7 ++++++
2 files changed, 65 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 59997dd4c7..f3bafb7731 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -193,6 +193,7 @@ void promisor_remote_clear(struct promisor_remote_config *config)
while (config->promisors) {
struct promisor_remote *r = config->promisors;
free(r->partial_clone_filter);
+ free(r->advertised_filter);
config->promisors = config->promisors->next;
free(r);
}
@@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
struct store_info *store_info = NULL;
struct string_list_item *item;
bool reload_config = false;
+ struct string_list accepted_filters = STRING_LIST_INIT_DUP;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -879,6 +881,13 @@ static void filter_promisor_remote(struct repository *repo,
reload_config = true;
strvec_push(accepted, advertised->name);
+
+ /* Capture advertised filters for accepted remotes */
+ if (advertised->filter) {
+ struct string_list_item *i;
+ i = string_list_append(&accepted_filters, advertised->name);
+ i->util = xstrdup(advertised->filter);
+ }
}
promisor_info_free(advertised);
@@ -890,6 +899,25 @@ static void filter_promisor_remote(struct repository *repo,
if (reload_config)
repo_promisor_remote_reinit(repo);
+
+ /* Apply accepted remote filters to the stable repo state */
+ for_each_string_list_item(item, &accepted_filters) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
+ if (r) {
+ free(r->advertised_filter);
+ r->advertised_filter = item->util;
+ item->util = NULL;
+ }
+ }
+
+ string_list_clear(&accepted_filters, 1);
+
+ /* Mark the remotes as accepted in the repository state */
+ for (size_t i = 0; i < accepted->nr; i++) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, accepted->v[i]);
+ if (r)
+ r->accepted = 1;
+ }
}
char *promisor_remote_reply(const char *info)
@@ -935,3 +963,33 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
string_list_clear(&accepted_remotes, 0);
}
+
+char *promisor_remote_construct_filter(struct repository *repo)
+{
+ struct promisor_remote *r;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ char *result = NULL;
+
+ promisor_remote_init(repo);
+
+ for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
+ if (r->accepted && r->advertised_filter)
+ if (gently_parse_list_objects_filter(&filter_options,
+ r->advertised_filter,
+ &err)) {
+ warning(_("promisor remote '%s' advertised invalid filter '%s': %s"),
+ r->name, r->advertised_filter, err.buf);
+ strbuf_reset(&err);
+ continue;
+ }
+ }
+
+ if (filter_options.choice)
+ result = xstrdup(expand_list_objects_filter_spec(&filter_options));
+
+ list_objects_filter_release(&filter_options);
+ strbuf_release(&err);
+
+ return result;
+}
diff --git a/promisor-remote.h b/promisor-remote.h
index 263d331a55..d227299fd0 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -15,6 +15,7 @@ struct object_id;
struct promisor_remote {
struct promisor_remote *next;
char *partial_clone_filter;
+ char *advertised_filter;
unsigned int accepted : 1;
const char name[FLEX_ARRAY];
};
@@ -67,4 +68,10 @@ void mark_promisor_remotes_as_accepted(struct repository *repo, const char *remo
*/
int repo_has_accepted_promisor_remote(struct repository *r);
+/*
+ * Use the filters from the accepted remotes to create a combined
+ * filter (useful in `--filter=auto` mode).
+ */
+char *promisor_remote_construct_filter(struct repository *repo);
+
#endif /* PROMISOR_REMOTE_H */
--
2.53.0.70.g3d1fd9d397.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* [PATCH v3 8/9] promisor-remote: change promisor_remote_reply()'s signature
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (6 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 7/9] promisor-remote: keep advertised filters in memory Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-13 11:25 ` Patrick Steinhardt
2026-02-12 10:08 ` [PATCH v3 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
` (2 subsequent siblings)
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
The `promisor_remote_reply()` function performs two tasks:
1. It uses filter_promisor_remote() to parse the server's
"promisor-remote" advertisement and to mark accepted remotes in the
repository configuration.
2. It assembles a reply string containing the accepted remote names to
send back to the server.
In a following commit, the fetch-pack logic will need to trigger the
side effect (1) to ensure the repository state is correct, but it will
not need to send a reply (2).
To avoid assembling a reply string when it is not needed, let's change
the signature of promisor_remote_reply(). It will now return `void` and
accept a second `char **accepted_out` argument. Only if that argument
is not NULL will a reply string be assembled and returned back to the
caller via that argument.
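As a rough illustration of this optional out-parameter pattern, here
is a minimal, self-contained sketch. `demo_reply()` and
`demo_side_effects` are hypothetical stand-ins, and unlike the real
function the sketch does not URL-encode the names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Counts how often the side effect (the "filtering" step) ran. */
static int demo_side_effects;

/*
 * Hypothetical stand-in for promisor_remote_reply(): the side effect
 * always happens, but the ';'-separated reply is only assembled when
 * the caller passes a non-NULL accepted_out.
 */
static void demo_reply(const char **accepted, size_t nr, char **accepted_out)
{
	size_t len = 0, i;
	char *buf;

	demo_side_effects++;

	if (!accepted_out)
		return;
	if (!nr) {
		*accepted_out = NULL;
		return;
	}

	for (i = 0; i < nr; i++)
		len += strlen(accepted[i]) + 1; /* name plus ';' or NUL */
	buf = malloc(len);
	if (!buf) {
		*accepted_out = NULL;
		return;
	}
	buf[0] = '\0';
	for (i = 0; i < nr; i++) {
		if (i)
			strcat(buf, ";");
		strcat(buf, accepted[i]);
	}
	*accepted_out = buf;
}
```

A caller that only needs the side effect passes NULL as the last
argument and has nothing to free afterwards.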
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
connect.c | 3 ++-
promisor-remote.c | 24 +++++++++++++-----------
promisor-remote.h | 10 +++++-----
3 files changed, 20 insertions(+), 17 deletions(-)
diff --git a/connect.c b/connect.c
index c6f76e3082..a02583a102 100644
--- a/connect.c
+++ b/connect.c
@@ -505,7 +505,8 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
reader->hash_algo = &hash_algos[GIT_HASH_SHA1_LEGACY];
}
if (server_feature_v2("promisor-remote", &promisor_remote_info)) {
- char *reply = promisor_remote_reply(promisor_remote_info);
+ char *reply;
+ promisor_remote_reply(promisor_remote_info, &reply);
if (reply) {
packet_write_fmt(fd_out, "promisor-remote=%s", reply);
free(reply);
diff --git a/promisor-remote.c b/promisor-remote.c
index f3bafb7731..96fa215b06 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -920,25 +920,27 @@ static void filter_promisor_remote(struct repository *repo,
}
}
-char *promisor_remote_reply(const char *info)
+void promisor_remote_reply(const char *info, char **accepted_out)
{
struct strvec accepted = STRVEC_INIT;
- struct strbuf reply = STRBUF_INIT;
filter_promisor_remote(the_repository, &accepted, info);
- if (!accepted.nr)
- return NULL;
-
- for (size_t i = 0; i < accepted.nr; i++) {
- if (i)
- strbuf_addch(&reply, ';');
- strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
+ if (accepted_out) {
+ if (accepted.nr) {
+ struct strbuf reply = STRBUF_INIT;
+ for (size_t i = 0; i < accepted.nr; i++) {
+ if (i)
+ strbuf_addch(&reply, ';');
+ strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
+ }
+ *accepted_out = strbuf_detach(&reply, NULL);
+ } else {
+ *accepted_out = NULL;
+ }
}
strvec_clear(&accepted);
-
- return strbuf_detach(&reply, NULL);
}
void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes)
diff --git a/promisor-remote.h b/promisor-remote.h
index d227299fd0..3d4d2de018 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -49,12 +49,12 @@ char *promisor_remote_info(struct repository *repo);
/*
* Prepare a reply to a "promisor-remote" advertisement from a server.
* Check the value of "promisor.acceptfromserver" and maybe the
- * configured promisor remotes, if any, to prepare the reply.
- * Return value is NULL if no promisor remote from the server
- * is accepted. Otherwise it contains the names of the accepted promisor
- * remotes separated by ';'. See gitprotocol-v2(5).
+ * configured promisor remotes, if any, to prepare the reply. If the
+ * `accepted_out` argument is not NULL, it is set to either NULL or to
+ * the names of the accepted promisor remotes separated by ';' if
+ * any. See gitprotocol-v2(5).
*/
-char *promisor_remote_reply(const char *info);
+void promisor_remote_reply(const char *info, char **accepted_out);
/*
* Set the 'accepted' flag for some promisor remotes. Useful on the
--
2.53.0.70.g3d1fd9d397.dirty
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH v3 8/9] promisor-remote: change promisor_remote_reply()'s signature
2026-02-12 10:08 ` [PATCH v3 8/9] promisor-remote: change promisor_remote_reply()'s signature Christian Couder
@ 2026-02-13 11:25 ` Patrick Steinhardt
0 siblings, 0 replies; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-13 11:25 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila, Christian Couder
On Thu, Feb 12, 2026 at 11:08:39AM +0100, Christian Couder wrote:
> diff --git a/promisor-remote.c b/promisor-remote.c
> index f3bafb7731..96fa215b06 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -920,25 +920,27 @@ static void filter_promisor_remote(struct repository *repo,
> }
> }
>
> -char *promisor_remote_reply(const char *info)
> +void promisor_remote_reply(const char *info, char **accepted_out)
> {
> struct strvec accepted = STRVEC_INIT;
> - struct strbuf reply = STRBUF_INIT;
>
> filter_promisor_remote(the_repository, &accepted, info);
>
> - if (!accepted.nr)
> - return NULL;
> -
> - for (size_t i = 0; i < accepted.nr; i++) {
> - if (i)
> - strbuf_addch(&reply, ';');
> - strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
> + if (accepted_out) {
> + if (accepted.nr) {
> + struct strbuf reply = STRBUF_INIT;
> + for (size_t i = 0; i < accepted.nr; i++) {
> + if (i)
> + strbuf_addch(&reply, ';');
> + strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
> + }
> + *accepted_out = strbuf_detach(&reply, NULL);
> + } else {
> + *accepted_out = NULL;
> + }
> }
>
> strvec_clear(&accepted);
> -
> - return strbuf_detach(&reply, NULL);
> }
Okay, makes sense. This directly addresses my comment on v2 that it's
kind of weird that we do all of this only to discard the result in the
next commit.
Patrick
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH v3 9/9] fetch-pack: wire up and enable auto filter logic
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (7 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 8/9] promisor-remote: change promisor_remote_reply()'s signature Christian Couder
@ 2026-02-12 10:08 ` Christian Couder
2026-02-13 11:26 ` Patrick Steinhardt
2026-02-13 11:26 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Patrick Steinhardt
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
10 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-02-12 10:08 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Christian Couder,
Christian Couder
Previous commits have set up an infrastructure for `--filter=auto` to
automatically prepare a partial clone filter based on what the server
advertised and the client accepted.
Using that infrastructure, let's now enable the `--filter=auto` option
in `git clone` and `git fetch` by setting `allow_auto_filter` to 1.
Note that these small changes mean that when `git clone --filter=auto`
or `git fetch --filter=auto` is used, "auto" is automatically saved
as the partial clone filter for the server on the client. Therefore
subsequent calls to `git fetch` on the client will automatically use
this "auto" mode even without `--filter=auto`.
Let's also set `allow_auto_filter` to 1 in `transport.c`, as the
transport layer must be able to accept the "auto" filter spec even if
the invoking command hasn't fully parsed it yet.
When an "auto" filter is requested, let's have the "fetch-pack.c" code
in `do_fetch_pack_v2()` compute a filter and send it to the server.
In `do_fetch_pack_v2()` the logic also needs to check for the
"promisor-remote" capability and call `promisor_remote_reply()` to
parse advertised remotes and populate the list of those accepted (and
their filters).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 19 ++++++---
Documentation/git-clone.adoc | 25 ++++++++---
Documentation/gitprotocol-v2.adoc | 16 ++++---
builtin/clone.c | 2 +
builtin/fetch.c | 2 +
fetch-pack.c | 24 +++++++++++
t/t5710-promisor-remote-capability.sh | 60 +++++++++++++++++++++++++++
transport.c | 1 +
8 files changed, 134 insertions(+), 15 deletions(-)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index 1ef9807d00..a0cfb50d89 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -92,11 +92,20 @@ precedence over the `fetch.output` config option.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial fetch. For example, `--filter=blob:none` will filter
- out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial fetch.
++
+If `--filter=auto` is used, the filter specification is determined
+automatically by combining the filter specifications advertised by
+the server for the promisor remotes that the client accepts (see
+linkgit:gitprotocol-v2[5] and the `promisor.acceptFromServer`
+configuration option in linkgit:git-config[1]).
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
ifndef::git-pull[]
`--write-fetch-head`::
diff --git a/Documentation/git-clone.adoc b/Documentation/git-clone.adoc
index 57cdfb7620..0db2d1e5f0 100644
--- a/Documentation/git-clone.adoc
+++ b/Documentation/git-clone.adoc
@@ -187,11 +187,26 @@ objects from the source repository into a pack in the cloned repository.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial clone filter. For example, `--filter=blob:none` will
- filter out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial clone filter.
++
+If `--filter=auto` is used the filter specification is determined
+automatically through the 'promisor-remote' protocol (see
+linkgit:gitprotocol-v2[5]) by combining the filter specifications
+advertised by the server for the promisor remotes that the client
+accepts (see the `promisor.acceptFromServer` configuration option in
+linkgit:git-config[1]). This allows the server to suggest the optimal
+filter for the available promisor remotes.
++
+As with other filter specifications, the "auto" value is persisted in
+the configuration. This ensures that future fetches will continue to
+adapt to the server's current recommendation.
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
`--also-filter-submodules`::
Also apply the partial clone filter to any submodules in the repository.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index d93dd279ea..f985cb4c47 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -812,10 +812,15 @@ MUST appear first in each pr-fields, in that order.
After these mandatory fields, the server MAY advertise the following
optional fields in any order:
-`partialCloneFilter`:: The filter specification used by the remote.
+`partialCloneFilter`:: The filter specification for the remote. It
+corresponds to the "remote.<name>.partialCloneFilter" config setting.
Clients can use this to determine if the remote's filtering strategy
-is compatible with their needs (e.g., checking if both use "blob:none").
-It corresponds to the "remote.<name>.partialCloneFilter" config setting.
+is compatible with their needs (e.g., checking if both use
+"blob:none"). Additionally they can use this through the
+`--filter=auto` option in linkgit:git-clone[1]. With that option, the
+filter specification of the clone will be automatically computed by
+combining the filter specifications of the promisor remotes the client
+accepts.
`token`:: An authentication token that clients can use when
connecting to the remote. It corresponds to the "remote.<name>.token"
@@ -828,8 +833,9 @@ future protocol extensions.
The client can use information transmitted through these fields to
decide if it accepts the advertised promisor remote. Also, the client
-can be configured to store the values of these fields (see
-"promisor.storeFields" in linkgit:git-config[1]).
+can be configured to store the values of these fields or use them
+to automatically configure the repository (see "promisor.storeFields"
+in linkgit:git-config[1] and `--filter=auto` in linkgit:git-clone[1]).
Field values MUST be urlencoded.
diff --git a/builtin/clone.c b/builtin/clone.c
index bb27472020..45d8fa0eed 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1001,6 +1001,8 @@ int cmd_clone(int argc,
NULL
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("clone");
repo_config(the_repository, git_clone_config, NULL);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8fbf3557ce..573c295241 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2580,6 +2580,8 @@ int cmd_fetch(int argc,
OPT_END()
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("fetch");
/* Record the command line for the reflog */
diff --git a/fetch-pack.c b/fetch-pack.c
index 40316c9a34..9f8f980516 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -35,6 +35,7 @@
#include "sigchain.h"
#include "mergesort.h"
#include "prio-queue.h"
+#include "promisor-remote.h"
static int transfer_unpack_limit = -1;
static int fetch_unpack_limit = -1;
@@ -1661,6 +1662,29 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
struct string_list packfile_uris = STRING_LIST_INIT_DUP;
int i;
struct strvec index_pack_args = STRVEC_INIT;
+ const char *promisor_remote_config;
+
+ if (server_feature_v2("promisor-remote", &promisor_remote_config))
+ promisor_remote_reply(promisor_remote_config, NULL);
+
+ if (args->filter_options.choice == LOFC_AUTO) {
+ struct strbuf errbuf = STRBUF_INIT;
+ char *constructed_filter = promisor_remote_construct_filter(r);
+
+ list_objects_filter_release(&args->filter_options);
+ /* Disallow 'auto' as a result of the resolution of this 'auto' filter below */
+ args->filter_options.allow_auto_filter = 0;
+
+ if (constructed_filter &&
+ gently_parse_list_objects_filter(&args->filter_options,
+ constructed_filter,
+ &errbuf))
+ die(_("couldn't resolve 'auto' filter '%s': %s"),
+ constructed_filter, errbuf.buf);
+
+ free(constructed_filter);
+ strbuf_release(&errbuf);
+ }
negotiator = &negotiator_alloc;
if (args->refetch)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 6ef6431bd7..532e6f0fea 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -423,6 +423,66 @@ test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
test_grep "'\''blob:limit=8k'\'' -> '\''blob:limit=7k'\''" err
'
+test_expect_success "clone and fetch with --filter=auto" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client trace" &&
+
+ git -C server config remote.lop.partialCloneFilter "blob:limit=9500" &&
+ test_config -C server promisor.sendFields "partialCloneFilter" &&
+
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c promisor.acceptfromserver=All \
+ --no-local --filter=auto server client 2>err &&
+
+ test_grep "filter blob:limit=9500" trace &&
+ test_grep ! "filter auto" trace &&
+
+ # Verify "auto" is persisted in config
+ echo auto >expected &&
+ git -C client config remote.origin.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Now change the filter on the server
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5678" &&
+
+ # Get a new commit on the server to ensure "git fetch" actually runs fetch-pack
+ test_commit -C template new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITH --filter=auto
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch --filter=auto &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5678" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Change the filter on the server again
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5432" &&
+
+ # Get yet a new commit on the server to ensure fetch-pack runs
+ test_commit -C template yet-a-new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITHOUT --filter=auto
+ # Relies on "auto" being persisted in the client config
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5432" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
diff --git a/transport.c b/transport.c
index c7f06a7382..cde8d83a57 100644
--- a/transport.c
+++ b/transport.c
@@ -1219,6 +1219,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
*/
struct git_transport_data *data = xcalloc(1, sizeof(*data));
list_objects_filter_init(&data->options.filter_options);
+ data->options.filter_options.allow_auto_filter = 1;
ret->data = data;
ret->vtable = &builtin_smart_vtable;
ret->smart_options = &(data->options);
--
2.53.0.70.g3d1fd9d397.dirty
* Re: [PATCH v3 9/9] fetch-pack: wire up and enable auto filter logic
2026-02-12 10:08 ` [PATCH v3 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
@ 2026-02-13 11:26 ` Patrick Steinhardt
0 siblings, 0 replies; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-13 11:26 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila, Christian Couder
On Thu, Feb 12, 2026 at 11:08:40AM +0100, Christian Couder wrote:
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 40316c9a34..9f8f980516 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1661,6 +1662,29 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> struct string_list packfile_uris = STRING_LIST_INIT_DUP;
> int i;
> struct strvec index_pack_args = STRVEC_INIT;
> + const char *promisor_remote_config;
> +
> + if (server_feature_v2("promisor-remote", &promisor_remote_config))
> + promisor_remote_reply(promisor_remote_config, NULL);
And here we now pass a `NULL` pointer so that we don't have to free the
result that we didn't want to have in the first place. Good.
Patrick
* Re: [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto`
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (8 preceding siblings ...)
2026-02-12 10:08 ` [PATCH v3 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
@ 2026-02-13 11:26 ` Patrick Steinhardt
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
10 siblings, 0 replies; 80+ messages in thread
From: Patrick Steinhardt @ 2026-02-13 11:26 UTC (permalink / raw)
To: Christian Couder
Cc: git, Junio C Hamano, Taylor Blau, Karthik Nayak, Elijah Newren,
Jean-Noël Avila
On Thu, Feb 12, 2026 at 11:08:31AM +0100, Christian Couder wrote:
> Changes since v2
> ================
>
> Thanks to Patrick Steinhardt, Jean-Noël Avila and Junio Hamano for
> reviewing the previous version!
>
> The patch series has been rebased on top of current 'master' at
> 864f55e190 (The second batch, 2026-02-09) to avoid a small conflict.
>
> In patch 2/9, new checks have been added to the "clone with
> promisor.storeFields=partialCloneFilter" test. We now check that a
> subsequent fetch can update the configuration.
>
> In patch 4/9, a small change has been made to the arguments of
> `backfill_tags()` in "builtin/fetch.c" to fix a conflict with 'master'.
>
> In patch 5/9, the commit message has been improved.
>
> In patch 7/9, `captured_filters` has been renamed `accepted_filters`.
>
> Patch 8/9 is new. It changes the signature of
> `promisor_remote_reply()` and allows this function to not assemble a
> reply string if this is not needed by the caller.
>
> Patch 9/9, has a number of small changes in "fetch-pack.c":
>
> - The call to `promisor_remote_reply()` is simplified a bit as it
> doesn't require a reply string to be assembled.
>
> - A comment has been reworded for clarity.
>
> - The call to `gently_parse_list_objects_filter()` and the check to
> error out in case it fails have been simplified.
All of these changes look good to me, thanks!
Patrick
* [PATCH v4 0/9] Implement `promisor.storeFields` and `--filter=auto`
2026-02-12 10:08 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (9 preceding siblings ...)
2026-02-13 11:26 ` [PATCH v3 0/9] Implement `promisor.storeFields` and `--filter=auto` Patrick Steinhardt
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 1/9] promisor-remote: refactor initialising field lists Christian Couder
` (8 more replies)
10 siblings, 9 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder
Introduction
============
A previous patch series added the possibility to pass additional
fields, a "partialCloneFilter" and a "token" for each advertised
promisor remote, from a server to a client through the
"promisor-remote" capability.
On the client side though, it has so far only been possible to use
this new information to compare it with local information and then
decide if the corresponding advertised promisor remote is accepted or
not.
For the "token" it would be useful if it could be stored on the
client. For example, in a setup where the client uses specialized
remote helpers which need a token to access the promisor remotes
advertised by the server, storing the token would allow it to be used
when the client directly accesses a promisor remote, for instance to
lazy fetch some blobs it now needs.
To enable such a workflow, where the server can rotate tokens and the
client can have updated tokens from the server by simply fetching from
it, the first part of this series introduces a new
"promisor.storeFields" configuration option on the client side,
similar to the "promisor.checkFields" configuration option. When field
names, "token" or "partialCloneFilter", are listed in this new
configuration option, then the values of these field names transmitted
by the server are stored in the local configuration on the client
side.
Note that for security reasons, the corresponding remote name and URL
of the advertised promisor remotes must have already been configured
on the client side. No new remote name or URL is configured.
For the "partialCloneFilter" field, simply storing the value is not
enough to enable dynamic updates. Currently, when a user initiates a
partial clone with `--filter=<filter-spec>`, that specific
<filter-spec> is saved in the client's local configuration (e.g.,
remote.origin.partialCloneFilter). Subsequent fetches then reuse this
value, ignoring suggestions from the server.
To avoid breaking this mechanism and still be able to use the
<filter-spec> that the server suggests for the promisor remotes that
the client accepts, the second part of this series introduces a new
`--filter=auto` mode for `git clone` and `git fetch`.
When `--filter=auto` is used, then "auto" is still saved as the
<filter-spec> for the server locally on the client, and then when a
fetch-pack happens, instead of passing just "auto", the actual filter
requested by the client is computed by combining the <filter-spec>s
that the server suggested for the promisor remotes that the client
accepted. This uses the "combine" filter mechanism that already exists
in "list-objects-filter-options.{c,h}".
This way, by just using `--filter=auto` when cloning, a client makes
sure it will use the <filter-spec>s suggested by the server for the
promisor remotes it accepts.
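
As a rough illustration of the combining step described above, here is a
minimal, self-contained sketch in C. It is not the code from this series:
`combine_filter_specs()` and `append_encoded()` are hypothetical names, the
fixed-size output buffer is a simplification, and only '+', '%' and bytes
<= 0x20 are percent-encoded, whereas Git's "combine" syntax reserves a
larger set of characters.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append `spec` to the string in `out`, percent-encoding characters
 * that would be ambiguous inside a "combine:" filter.
 * Assumption: only '+', '%' and bytes <= 0x20 are encoded here.
 */
static int append_encoded(char *out, size_t outlen, const char *spec)
{
	size_t len = strlen(out);

	for (; *spec; spec++) {
		unsigned char c = (unsigned char)*spec;
		int n;

		if (c == '+' || c == '%' || c <= 0x20)
			n = snprintf(out + len, outlen - len, "%%%02x", c);
		else
			n = snprintf(out + len, outlen - len, "%c", c);
		if (n < 0 || (size_t)n >= outlen - len)
			return -1; /* output buffer too small */
		len += (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: build the filter spec to send on the wire from
 * the specs of the accepted promisor remotes. Returns 0 on success,
 * -1 if there is nothing to combine or `out` is too small.
 */
static int combine_filter_specs(const char **specs, size_t nr,
				char *out, size_t outlen)
{
	size_t i;

	if (!nr || !outlen)
		return -1;

	if (nr == 1) {
		/* A single spec is used as-is, without "combine:" */
		if (strlen(specs[0]) >= outlen)
			return -1;
		strcpy(out, specs[0]);
		return 0;
	}

	if (strlen("combine:") >= outlen)
		return -1;
	strcpy(out, "combine:");

	for (i = 0; i < nr; i++) {
		if (i) {
			/* Sub-filters are separated by a literal '+' */
			size_t len = strlen(out);

			if (len + 1 >= outlen)
				return -1;
			out[len] = '+';
			out[len + 1] = '\0';
		}
		if (append_encoded(out, outlen, specs[i]))
			return -1;
	}
	return 0;
}
```

With two accepted remotes advertising "blob:limit=9500" and "tree:0", this
sketch would yield "combine:blob:limit=9500+tree:0", while the locally
stored value remains "auto".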
This work is part of the "LOP" effort documented in:
Documentation/technical/large-object-promisors.adoc
See that doc for more information on the broader context.
Overview of the patches
=======================
Patches 1/9 and 2/9 are the first part of the series and implement the
new "promisor.storeFields" configuration option. Patch 1/9 is a small
preparatory refactoring.
Patches from 3/9 to 9/9 implement the `--filter=auto` option:
- Patches 3/9 and 4/9 are cleanups of "builtin/clone.c" and
"builtin/fetch.c" respectively that make the `filter_options`
variable local to cmd_clone() or cmd_fetch().
- Patch 5/9 is a doc update as `--filter=<filter-spec>` wasn't
documented for `git fetch`.
- Patch 6/9 improves "list-objects-filter-options.{c,h}" to
support the new 'auto' mode.
- Patches 7/9 and 8/9 improve "promisor-remote.{c,h}" to support
the new 'auto' mode.
- Patch 9/9 makes the new 'auto' mode actually work by wiring up
everything together.
CI Report
=========
All the tests pass, see:
https://github.com/chriscool/git/actions/runs/22059799525
Changes since v3
================
Thanks to Patrick Steinhardt, Jean-Noël Avila, Peff and Junio Hamano
for reviewing or commenting on the previous version!
The only change compared to v3 is in patch 6/9 where in
"list-objects-filter-options.c" the `allow_auto_filter` variable in
`list_objects_filter_release()` is now initialized after the check to
return if `filter_options` is NULL instead of before that check.
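
To make the ordering issue concrete, here is a minimal sketch in C of the
corrected pattern. `struct filter_options` and `filter_options_release()`
are simplified, hypothetical stand-ins rather than the real Git
definitions, and restoring the flag after re-initialising is an assumption
about why the value is saved in the first place.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for `struct list_objects_filter_options` */
struct filter_options {
	char *filter_spec;
	unsigned int allow_auto_filter : 1;
};

static void filter_options_release(struct filter_options *opts)
{
	unsigned int allow_auto_filter;

	if (!opts)
		return;

	/* Dereferencing is only safe after the NULL check above */
	allow_auto_filter = opts->allow_auto_filter;

	free(opts->filter_spec);
	memset(opts, 0, sizeof(*opts)); /* re-initialise the struct */

	/* Assumption: the flag is meant to survive a release */
	opts->allow_auto_filter = allow_auto_filter;
}
```

Reading `opts->allow_auto_filter` in the declaration, before the NULL
check, would dereference a NULL pointer whenever a caller passes NULL.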
Range diff since v3
===================
1: 79255ceba7 = 1: 79255ceba7 promisor-remote: refactor initialising field lists
2: 012aa7ef19 = 2: 012aa7ef19 promisor-remote: allow a client to store fields
3: f17a62e73e = 3: f17a62e73e clone: make filter_options local to cmd_clone()
4: 3c6e28dd84 = 4: 3c6e28dd84 fetch: make filter_options local to cmd_fetch()
5: 3037d546b2 = 5: 3037d546b2 doc: fetch: document `--filter=<filter-spec>` option
6: 9ce57b88dc ! 6: 366c93e836 list-objects-filter-options: support 'auto' mode for --filter
@@ list-objects-filter-options.c: void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
{
size_t sub;
-+ unsigned int allow_auto_filter = filter_options->allow_auto_filter;
++ unsigned int allow_auto_filter;
if (!filter_options)
return;
-@@ list-objects-filter-options.c: void list_objects_filter_release(
++
++ allow_auto_filter = filter_options->allow_auto_filter;
+ strbuf_release(&filter_options->filter_spec);
+ free(filter_options->sparse_oid_name);
+ for (sub = 0; sub < filter_options->sub_nr; sub++)
list_objects_filter_release(&filter_options->sub[sub]);
free(filter_options->sub);
list_objects_filter_init(filter_options);
7: 37042f7019 = 7: 2eb3b9cddd promisor-remote: keep advertised filters in memory
8: dd17069aad = 8: fae3e9089d promisor-remote: change promisor_remote_reply()'s signature
9: 0f9675f477 = 9: 4627d513d6 fetch-pack: wire up and enable auto filter logic
Christian Couder (9):
promisor-remote: refactor initialising field lists
promisor-remote: allow a client to store fields
clone: make filter_options local to cmd_clone()
fetch: make filter_options local to cmd_fetch()
doc: fetch: document `--filter=<filter-spec>` option
list-objects-filter-options: support 'auto' mode for --filter
promisor-remote: keep advertised filters in memory
promisor-remote: change promisor_remote_reply()'s signature
fetch-pack: wire up and enable auto filter logic
Documentation/config/promisor.adoc | 33 +++
Documentation/fetch-options.adoc | 19 ++
Documentation/git-clone.adoc | 25 +-
Documentation/gitprotocol-v2.adoc | 24 +-
Makefile | 1 +
builtin/clone.c | 18 +-
builtin/fetch.c | 50 ++--
connect.c | 3 +-
fetch-pack.c | 24 ++
list-objects-filter-options.c | 39 ++-
list-objects-filter-options.h | 6 +
list-objects-filter.c | 8 +
promisor-remote.c | 256 +++++++++++++++++--
promisor-remote.h | 17 +-
t/meson.build | 1 +
t/t5710-promisor-remote-capability.sh | 123 +++++++++
t/unit-tests/u-list-objects-filter-options.c | 53 ++++
transport.c | 1 +
18 files changed, 628 insertions(+), 73 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
--
2.53.0.77.g4627d513d6
* [PATCH v4 1/9] promisor-remote: refactor initialising field lists
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 2/9] promisor-remote: allow a client to store fields Christian Couder
` (7 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
In "promisor-remote.c", the fields_sent() and fields_checked()
functions serve similar purposes and contain a small amount of
duplicated code.
As we are going to add a similar function in a following commit,
let's refactor this common code into a new initialize_fields_list()
function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/promisor-remote.c b/promisor-remote.c
index 77ebf537e2..5d8151cedb 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -375,18 +375,24 @@ static char *fields_from_config(struct string_list *fields_list, const char *con
return fields;
}
+static struct string_list *initialize_fields_list(struct string_list *fields_list, int *initialized,
+ const char *config_key)
+{
+ if (!*initialized) {
+ fields_list->cmp = strcasecmp;
+ fields_from_config(fields_list, config_key);
+ *initialized = 1;
+ }
+
+ return fields_list;
+}
+
static struct string_list *fields_sent(void)
{
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.sendFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.sendFields");
}
static struct string_list *fields_checked(void)
@@ -394,13 +400,7 @@ static struct string_list *fields_checked(void)
static struct string_list fields_list = STRING_LIST_INIT_NODUP;
static int initialized;
- if (!initialized) {
- fields_list.cmp = strcasecmp;
- fields_from_config(&fields_list, "promisor.checkFields");
- initialized = 1;
- }
-
- return &fields_list;
+ return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
/*
--
2.53.0.77.g4627d513d6
* [PATCH v4 2/9] promisor-remote: allow a client to store fields
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
2026-02-16 13:23 ` [PATCH v4 1/9] promisor-remote: refactor initialising field lists Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 3/9] clone: make filter_options local to cmd_clone() Christian Couder
` (6 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Another previous commit, c213820c51 (promisor-remote: allow a client
to check fields, 2025-09-08), has made it possible for a client to
decide if it accepts a promisor remote advertised by a server based
on these additional fields.
Often though, it would be useful for the client to simply store these
additional fields passed by the server in its configuration files,
so that it can use them when needed.
For example if a token is necessary to access a promisor remote, that
token could be updated frequently only on the server side and then
passed to all the clients through the "promisor-remote" capability,
avoiding the need to update it on all the clients manually.
Storing the token on the client side makes sure that the token is
available when the client needs to access the promisor remotes for a
lazy fetch.
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
Note that for a partial clone filter, it's less interesting to have
it stored on the client. This is because a filter should be used
right away and we already pass a `--filter=<filter-spec>` option to
`git clone` when starting a partial clone. Storing the filter could
perhaps still be interesting for information purposes.
Like "promisor.checkFields" and "promisor.sendFields", the new
configuration variable should contain a comma or space separated list
of field names. Only the "partialCloneFilter" and "token" field names
are supported for now.
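
As an illustration of how such a list can be interpreted, here is a
minimal sketch in C. `field_is_listed()` is a hypothetical helper, not the
function used in "promisor-remote.c", which relies on a `string_list` with
`strcasecmp` as its comparison function. The sketch assumes that only ','
and ' ' act as separators.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h> /* strncasecmp(), POSIX */

/*
 * Return true if `name` appears in `list`, a comma or space separated
 * list of field names. Matching ignores case, mirroring the use of
 * strcasecmp for the field lists.
 */
static bool field_is_listed(const char *list, const char *name)
{
	const char *seps = ", ";
	size_t name_len = strlen(name);

	while (*list) {
		size_t len;

		list += strspn(list, seps);	/* skip separators */
		len = strcspn(list, seps);	/* length of next field name */
		if (len && len == name_len && !strncasecmp(list, name, len))
			return true;
		list += len;
	}
	return false;
}
```

For example, with "promisor.storeFields" set to "partialCloneFilter, token",
both `field_is_listed(value, "token")` and
`field_is_listed(value, "PartialCloneFilter")` would return true.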
When a server advertises a promisor remote, for example "foo", along
with for example "token=XXXXX" to a client, and on the client side
"promisor.storeFields" contains "token", then the client will store
XXXXX for the "remote.foo.token" variable in its configuration file
and reload its configuration so it can immediately use this new
configuration variable.
A message is emitted on stderr to warn users when the config is
changed.
Note that even if "promisor.acceptFromServer" is set to "all", a
promisor remote has to be already configured on the client side for
some of its config to be changed. In any case no new remote is
configured and no new URL is stored.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/config/promisor.adoc | 33 ++++++
Documentation/gitprotocol-v2.adoc | 12 ++-
promisor-remote.c | 148 +++++++++++++++++++++++++-
t/t5710-promisor-remote-capability.sh | 63 +++++++++++
4 files changed, 250 insertions(+), 6 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index 93e5e0d9b5..b0fa43b839 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -89,3 +89,36 @@ variable. The fields are checked only if the
`promisor.acceptFromServer` config variable is not set to "None". If
set to "None", this config variable has no effect. See
linkgit:gitprotocol-v2[5].
+
+promisor.storeFields::
+ A comma or space separated list of additional remote related
+ field names. If a client accepts an advertised remote, the
+ client will store the values associated with these field names
+ taken from the remote advertisement into its configuration,
+ and then reload its remote configuration. Currently,
+ "partialCloneFilter" and "token" are the only supported field
+ names.
++
+For example if a server advertises "partialCloneFilter=blob:limit=20k"
+for remote "foo", and that remote is accepted, then "blob:limit=20k"
+will be stored for the "remote.foo.partialCloneFilter" configuration
+variable.
++
+If the new field value from an advertised remote is the same as the
+existing field value for that remote on the client side, then no
+change is made to the client configuration though.
++
+When a new value is stored, a message is printed to standard error to
+let users know about this.
++
+Note that for security reasons, if the remote is not already
+configured on the client side, nothing will be stored for that
+remote. In any case, no new remote will be created and no URL will be
+stored.
++
+Before storing a partial clone filter, it's parsed to check it's
+valid. If it's not, a warning is emitted and it's not stored.
++
+Before storing a token, a check is performed to ensure it contains no
+control character. If the check fails, a warning is emitted and it's
+not stored.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index c7db103299..d93dd279ea 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -826,9 +826,10 @@ are case-sensitive and MUST be transmitted exactly as specified
above. Clients MUST ignore fields they don't recognize to allow for
future protocol extensions.
-For now, the client can only use information transmitted through these
-fields to decide if it accepts the advertised promisor remote. In the
-future that information might be used for other purposes though.
+The client can use information transmitted through these fields to
+decide if it accepts the advertised promisor remote. Also, the client
+can be configured to store the values of these fields (see
+"promisor.storeFields" in linkgit:git-config[1]).
Field values MUST be urlencoded.
@@ -856,8 +857,9 @@ the server advertised, the client shouldn't advertise the
On the server side, the "promisor.advertise" and "promisor.sendFields"
configuration options can be used to control what it advertises. On
the client side, the "promisor.acceptFromServer" configuration option
-can be used to control what it accepts. See the documentation of these
-configuration options for more information.
+can be used to control what it accepts, and the "promisor.storeFields"
+option, to control what it stores. See the documentation of these
+configuration options in linkgit:git-config[1] for more information.
Note that in the future it would be nice if the "promisor-remote"
protocol capability could be used by the server, when responding to
diff --git a/promisor-remote.c b/promisor-remote.c
index 5d8151cedb..59997dd4c7 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -403,6 +403,14 @@ static struct string_list *fields_checked(void)
return initialize_fields_list(&fields_list, &initialized, "promisor.checkFields");
}
+static struct string_list *fields_stored(void)
+{
+ static struct string_list fields_list = STRING_LIST_INIT_NODUP;
+ static int initialized;
+
+ return initialize_fields_list(&fields_list, &initialized, "promisor.storeFields");
+}
+
/*
* Struct for promisor remotes involved in the "promisor-remote"
* protocol capability.
@@ -692,6 +700,132 @@ static struct promisor_info *parse_one_advertised_remote(const char *remote_info
return info;
}
+static bool store_one_field(struct repository *repo, const char *remote_name,
+ const char *field_name, const char *field_key,
+ const char *advertised, const char *current)
+{
+ if (advertised && (!current || strcmp(current, advertised))) {
+ char *key = xstrfmt("remote.%s.%s", remote_name, field_key);
+
+ fprintf(stderr, _("Storing new %s from server for remote '%s'.\n"
+ " '%s' -> '%s'\n"),
+ field_name, remote_name,
+ current ? current : "",
+ advertised);
+
+ repo_config_set_gently(repo, key, advertised);
+ free(key);
+
+ return true;
+ }
+
+ return false;
+}
+
+/* Check that a filter is valid by parsing it */
+static bool valid_filter(const char *filter, const char *remote_name)
+{
+ struct list_objects_filter_options filter_opts = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ int res = gently_parse_list_objects_filter(&filter_opts, filter, &err);
+
+ if (res)
+ warning(_("invalid filter '%s' for remote '%s' "
+ "will not be stored: %s"),
+ filter, remote_name, err.buf);
+
+ list_objects_filter_release(&filter_opts);
+ strbuf_release(&err);
+
+ return !res;
+}
+
+/* Check that a token doesn't contain any control character */
+static bool valid_token(const char *token, const char *remote_name)
+{
+ const char *c = token;
+
+ for (; *c; c++)
+ if (iscntrl(*c)) {
+ warning(_("invalid token '%s' for remote '%s' "
+ "will not be stored"),
+ token, remote_name);
+ return false;
+ }
+
+ return true;
+}
+
+struct store_info {
+ struct repository *repo;
+ struct string_list config_info;
+ bool store_filter;
+ bool store_token;
+};
+
+static struct store_info *store_info_new(struct repository *repo)
+{
+ struct string_list *fields_to_store = fields_stored();
+ struct store_info *s = xmalloc(sizeof(*s));
+
+ s->repo = repo;
+
+ string_list_init_nodup(&s->config_info);
+ promisor_config_info_list(repo, &s->config_info, fields_to_store);
+ string_list_sort(&s->config_info);
+
+ s->store_filter = !!string_list_lookup(fields_to_store, promisor_field_filter);
+ s->store_token = !!string_list_lookup(fields_to_store, promisor_field_token);
+
+ return s;
+}
+
+static void store_info_free(struct store_info *s)
+{
+ if (s) {
+ promisor_info_list_clear(&s->config_info);
+ free(s);
+ }
+}
+
+static bool promisor_store_advertised_fields(struct promisor_info *advertised,
+ struct store_info *store_info)
+{
+ struct promisor_info *p;
+ struct string_list_item *item;
+ const char *remote_name = advertised->name;
+ bool reload_config = false;
+
+ if (!(store_info->store_filter || store_info->store_token))
+ return false;
+
+ /*
+ * Get existing config info for the advertised promisor
+ * remote. This ensures the remote is already configured on
+ * the client side.
+ */
+ item = string_list_lookup(&store_info->config_info, remote_name);
+
+ if (!item)
+ return false;
+
+ p = item->util;
+
+ if (store_info->store_filter && advertised->filter &&
+ valid_filter(advertised->filter, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "filter", promisor_field_filter,
+ advertised->filter, p->filter);
+
+ if (store_info->store_token && advertised->token &&
+ valid_token(advertised->token, remote_name))
+ reload_config |= store_one_field(store_info->repo, remote_name,
+ "token", promisor_field_token,
+ advertised->token, p->token);
+
+ return reload_config;
+}
+
static void filter_promisor_remote(struct repository *repo,
struct strvec *accepted,
const char *info)
@@ -700,7 +834,9 @@ static void filter_promisor_remote(struct repository *repo,
enum accept_promisor accept = ACCEPT_NONE;
struct string_list config_info = STRING_LIST_INIT_NODUP;
struct string_list remote_info = STRING_LIST_INIT_DUP;
+ struct store_info *store_info = NULL;
struct string_list_item *item;
+ bool reload_config = false;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -736,14 +872,24 @@ static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
- if (should_accept_remote(accept, advertised, &config_info))
+ if (should_accept_remote(accept, advertised, &config_info)) {
+ if (!store_info)
+ store_info = store_info_new(repo);
+ if (promisor_store_advertised_fields(advertised, store_info))
+ reload_config = true;
+
strvec_push(accepted, advertised->name);
+ }
promisor_info_free(advertised);
}
promisor_info_list_clear(&config_info);
string_list_clear(&remote_info, 0);
+ store_info_free(store_info);
+
+ if (reload_config)
+ repo_promisor_remote_reinit(repo);
}
char *promisor_remote_reply(const char *info)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 023735d6a8..6ef6431bd7 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -360,6 +360,69 @@ test_expect_success "clone with promisor.checkFields" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ git -C server remote add otherLop "https://invalid.invalid" &&
+ git -C server config remote.otherLop.token "fooBar" &&
+ git -C server config remote.otherLop.stuff "baz" &&
+ git -C server config remote.otherLop.partialCloneFilter "blob:limit=10k" &&
+ test_when_finished "git -C server remote remove otherLop" &&
+
+ git -C server config remote.lop.token "fooXXX" &&
+ git -C server config remote.lop.partialCloneFilter "blob:limit=8k" &&
+
+ test_config -C server promisor.sendFields "partialCloneFilter, token" &&
+ test_when_finished "rm trace" &&
+
+ # Clone from server to create a client
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c remote.lop.token="fooYYY" \
+ -c remote.lop.partialCloneFilter="blob:none" \
+ -c promisor.acceptfromserver=All \
+ -c promisor.storeFields=partialcloneFilter \
+ --no-local --filter="blob:limit=5k" server client 2>err &&
+
+ # Check that the filter from the server is stored
+ echo "blob:limit=8k" >expected &&
+ git -C client config remote.lop.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that user is notified when the filter is stored
+ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
+ test_grep "'\''blob:none'\'' -> '\''blob:limit=8k'\''" err &&
+
+ # Check that the token from the server is NOT stored
+ echo "fooYYY" >expected &&
+ git -C client config remote.lop.token >actual &&
+ test_cmp expected actual &&
+ test_grep ! "Storing new token from server" err &&
+
+ # Check that the filter for an unknown remote is NOT stored
+ test_must_fail git -C client config remote.otherLop.partialCloneFilter >actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Change the configuration on the server and fetch from the client
+ git -C server config remote.lop.partialCloneFilter "blob:limit=7k" &&
+ GIT_NO_LAZY_FETCH=0 git -C client fetch \
+ --filter="blob:limit=5k" ../server 2>err &&
+
+ # Check that the fetch updated the configuration on the client
+ echo "blob:limit=7k" >expected &&
+ git -C client config remote.lop.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that user is notified when the new filter is stored
+ test_grep "Storing new filter from server for remote '\''lop'\''" err &&
+ test_grep "'\''blob:limit=8k'\'' -> '\''blob:limit=7k'\''" err
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
--
2.53.0.77.g4627d513d6
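
The two checks that gate storing a field in the patch above can be
sketched standalone as follows. This is only an illustration, not the
code from "promisor-remote.c": the names token_is_storable() and
should_store() are hypothetical, and the sketch uses the standard
<ctype.h>, which is why the character is cast to unsigned char.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for valid_token(): reject any control character. */
static bool token_is_storable(const char *token)
{
	assert(token);
	for (const char *c = token; *c; c++)
		/* The cast keeps iscntrl() well defined when char is signed. */
		if (iscntrl((unsigned char)*c))
			return false;
	return true;
}

/*
 * Hypothetical stand-in for the condition in store_one_field(): store
 * only when a value is advertised and it differs from the current one
 * (or no current value is configured).
 */
static bool should_store(const char *advertised, const char *current)
{
	return advertised && (!current || strcmp(current, advertised) != 0);
}
```

With these helpers, an advertised "blob:limit=8k" replaces a configured
"blob:none", while an identical value or a token containing a newline
is left alone.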
* [PATCH v4 3/9] clone: make filter_options local to cmd_clone()
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
2026-02-16 13:23 ` [PATCH v4 1/9] promisor-remote: refactor initialising field lists Christian Couder
2026-02-16 13:23 ` [PATCH v4 2/9] promisor-remote: allow a client to store fields Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
` (5 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/clone.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
harder to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_clone() by
moving its definition into that function and making it non-static.
The only additional change to make this work is to pass it as an
argument to checkout(). So it's a small and cheap cleanup anyway.
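
The ownership change described above can be sketched in isolation as
follows. This is a simplified illustration, not the code in
"builtin/clone.c": `struct filter_opts` and checkout_args() are
hypothetical stand-ins for `struct list_objects_filter_options` and
checkout().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, much simplified stand-in for the real options struct. */
struct filter_opts {
	int choice;	/* 0 means "no filter requested" */
	char spec[64];
};

/*
 * The helper borrows the options through a pointer instead of reading
 * a file-level static, so the caller visibly owns the struct.
 * Returns the number of characters written to out.
 */
static int checkout_args(const struct filter_opts *opts, int filter_submodules,
			 char *out, size_t outlen)
{
	assert(opts && out && outlen);
	if (filter_submodules && opts->choice)
		return snprintf(out, outlen, "--filter=%s", opts->spec);
	out[0] = '\0';
	return 0;
}
```

The caller declares the struct as a local variable, passes its address
down, and releases it once at the end, which mirrors what cmd_clone()
does after this patch.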
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/clone.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index b14a39a687..bb27472020 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -77,7 +77,6 @@ static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
static struct string_list option_optional_reference = STRING_LIST_INIT_NODUP;
static int max_jobs = -1;
static struct string_list option_recurse_submodules = STRING_LIST_INIT_NODUP;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static int config_filter_submodules = -1; /* unspecified */
static int option_remote_submodules;
@@ -634,7 +633,9 @@ static int git_sparse_checkout_init(const char *repo)
return result;
}
-static int checkout(int submodule_progress, int filter_submodules,
+static int checkout(int submodule_progress,
+ struct list_objects_filter_options *filter_options,
+ int filter_submodules,
enum ref_storage_format ref_storage_format)
{
struct object_id oid;
@@ -723,9 +724,9 @@ static int checkout(int submodule_progress, int filter_submodules,
strvec_pushf(&cmd.args, "--ref-format=%s",
ref_storage_format_to_name(ref_storage_format));
- if (filter_submodules && filter_options.choice)
+ if (filter_submodules && filter_options->choice)
strvec_pushf(&cmd.args, "--filter=%s",
- expand_list_objects_filter_spec(&filter_options));
+ expand_list_objects_filter_spec(filter_options));
if (option_single_branch >= 0)
strvec_push(&cmd.args, option_single_branch ?
@@ -903,6 +904,7 @@ int cmd_clone(int argc,
enum transport_family family = TRANSPORT_FAMILY_ALL;
struct string_list option_config = STRING_LIST_INIT_DUP;
int option_dissociate = 0;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
int option_filter_submodules = -1; /* unspecified */
struct string_list server_options = STRING_LIST_INIT_NODUP;
const char *bundle_uri = NULL;
@@ -1624,9 +1626,13 @@ int cmd_clone(int argc,
return 1;
junk_mode = JUNK_LEAVE_REPO;
- err = checkout(submodule_progress, filter_submodules,
+ err = checkout(submodule_progress,
+ &filter_options,
+ filter_submodules,
ref_storage_format);
+ list_objects_filter_release(&filter_options);
+
string_list_clear(&option_not, 0);
string_list_clear(&option_config, 0);
string_list_clear(&server_options, 0);
--
2.53.0.77.g4627d513d6
* [PATCH v4 4/9] fetch: make filter_options local to cmd_fetch()
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
` (2 preceding siblings ...)
2026-02-16 13:23 ` [PATCH v4 3/9] clone: make filter_options local to cmd_clone() Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
` (4 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
The `struct list_objects_filter_options filter_options` variable used
in "builtin/fetch.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
harder to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_fetch() by
moving its definition into that function and making it non-static.
This requires passing a pointer to it through the prepare_transport(),
do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one()
functions, but it's quite straightforward.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
builtin/fetch.c | 48 +++++++++++++++++++++++++++---------------------
1 file changed, 27 insertions(+), 21 deletions(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index a3bc7e9380..8fbf3557ce 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -97,7 +97,6 @@ static struct strbuf default_rla = STRBUF_INIT;
static struct transport *gtransport;
static struct transport *gsecondary;
static struct refspec refmap = REFSPEC_INIT_FETCH;
-static struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
static struct string_list server_options = STRING_LIST_INIT_DUP;
static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
@@ -1562,7 +1561,8 @@ static void add_negotiation_tips(struct git_transport_options *smart_options)
smart_options->negotiation_tips = oids;
}
-static struct transport *prepare_transport(struct remote *remote, int deepen)
+static struct transport *prepare_transport(struct remote *remote, int deepen,
+ struct list_objects_filter_options *filter_options)
{
struct transport *transport;
@@ -1586,9 +1586,9 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
if (refetch)
set_option(transport, TRANS_OPT_REFETCH, "yes");
- if (filter_options.choice) {
+ if (filter_options->choice) {
const char *spec =
- expand_list_objects_filter_spec(&filter_options);
+ expand_list_objects_filter_spec(filter_options);
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER, spec);
set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
}
@@ -1607,7 +1607,8 @@ static int backfill_tags(struct display_state *display_state,
struct ref *ref_map,
struct fetch_head *fetch_head,
const struct fetch_config *config,
- struct ref_update_display_info_array *display_array)
+ struct ref_update_display_info_array *display_array,
+ struct list_objects_filter_options *filter_options)
{
int retcode, cannot_reuse;
@@ -1621,7 +1622,7 @@ static int backfill_tags(struct display_state *display_state,
cannot_reuse = transport->cannot_reuse ||
deepen_since || deepen_not.nr;
if (cannot_reuse) {
- gsecondary = prepare_transport(transport->remote, 0);
+ gsecondary = prepare_transport(transport->remote, 0, filter_options);
transport = gsecondary;
}
@@ -1834,7 +1835,8 @@ static int commit_ref_transaction(struct ref_transaction **transaction,
static int do_fetch(struct transport *transport,
struct refspec *rs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct ref_transaction *transaction = NULL;
struct ref *ref_map = NULL;
@@ -1997,7 +1999,7 @@ static int do_fetch(struct transport *transport,
* the transaction and don't commit anything.
*/
if (backfill_tags(&display_state, transport, transaction, tags_ref_map,
- &fetch_head, config, &display_array))
+ &fetch_head, config, &display_array, filter_options))
retcode = 1;
}
@@ -2339,20 +2341,21 @@ static int fetch_multiple(struct string_list *list, int max_children,
* Fetching from the promisor remote should use the given filter-spec
* or inherit the default filter-spec from the config.
*/
-static inline void fetch_one_setup_partial(struct remote *remote)
+static inline void fetch_one_setup_partial(struct remote *remote,
+ struct list_objects_filter_options *filter_options)
{
/*
* Explicit --no-filter argument overrides everything, regardless
* of any prior partial clones and fetches.
*/
- if (filter_options.no_filter)
+ if (filter_options->no_filter)
return;
/*
* If no prior partial clone/fetch and the current fetch DID NOT
* request a partial-fetch, do a normal fetch.
*/
- if (!repo_has_promisor_remote(the_repository) && !filter_options.choice)
+ if (!repo_has_promisor_remote(the_repository) && !filter_options->choice)
return;
/*
@@ -2361,8 +2364,8 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* filter-spec as the default for subsequent fetches to this
* remote if there is currently no default filter-spec.
*/
- if (filter_options.choice) {
- partial_clone_register(remote->name, &filter_options);
+ if (filter_options->choice) {
+ partial_clone_register(remote->name, filter_options);
return;
}
@@ -2371,14 +2374,15 @@ static inline void fetch_one_setup_partial(struct remote *remote)
* explicitly given filter-spec or inherit the filter-spec from
* the config.
*/
- if (!filter_options.choice)
- partial_clone_get_default_filter_spec(&filter_options, remote->name);
+ if (!filter_options->choice)
+ partial_clone_get_default_filter_spec(filter_options, remote->name);
return;
}
static int fetch_one(struct remote *remote, int argc, const char **argv,
int prune_tags_ok, int use_stdin_refspecs,
- const struct fetch_config *config)
+ const struct fetch_config *config,
+ struct list_objects_filter_options *filter_options)
{
struct refspec rs = REFSPEC_INIT_FETCH;
int i;
@@ -2390,7 +2394,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
die(_("no remote repository specified; please specify either a URL or a\n"
"remote name from which new revisions should be fetched"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, filter_options);
if (prune < 0) {
/* no command line request */
@@ -2445,7 +2449,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
sigchain_push_common(unlock_pack_on_signal);
atexit(unlock_pack_atexit);
sigchain_push(SIGPIPE, SIG_IGN);
- exit_code = do_fetch(gtransport, &rs, config);
+ exit_code = do_fetch(gtransport, &rs, config, filter_options);
sigchain_pop(SIGPIPE);
refspec_clear(&rs);
transport_disconnect(gtransport);
@@ -2470,6 +2474,7 @@ int cmd_fetch(int argc,
const char *submodule_prefix = "";
const char *bundle_uri;
struct string_list list = STRING_LIST_INIT_DUP;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
struct remote *remote = NULL;
int all = -1, multiple = 0;
int result = 0;
@@ -2735,7 +2740,7 @@ int cmd_fetch(int argc,
trace2_region_enter("fetch", "negotiate-only", the_repository);
if (!remote)
die(_("must supply remote when using --negotiate-only"));
- gtransport = prepare_transport(remote, 1);
+ gtransport = prepare_transport(remote, 1, &filter_options);
if (gtransport->smart_options) {
gtransport->smart_options->acked_commits = &acked_commits;
} else {
@@ -2757,12 +2762,12 @@ int cmd_fetch(int argc,
} else if (remote) {
if (filter_options.choice || repo_has_promisor_remote(the_repository)) {
trace2_region_enter("fetch", "setup-partial", the_repository);
- fetch_one_setup_partial(remote);
+ fetch_one_setup_partial(remote, &filter_options);
trace2_region_leave("fetch", "setup-partial", the_repository);
}
trace2_region_enter("fetch", "fetch-one", the_repository);
result = fetch_one(remote, argc, argv, prune_tags_ok, stdin_refspecs,
- &config);
+ &config, &filter_options);
trace2_region_leave("fetch", "fetch-one", the_repository);
} else {
int max_children = max_jobs;
@@ -2868,5 +2873,6 @@ int cmd_fetch(int argc,
cleanup:
string_list_clear(&list, 0);
+ list_objects_filter_release(&filter_options);
return result;
}
--
2.53.0.77.g4627d513d6
* [PATCH v4 5/9] doc: fetch: document `--filter=<filter-spec>` option
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
` (3 preceding siblings ...)
2026-02-16 13:23 ` [PATCH v4 4/9] fetch: make filter_options local to cmd_fetch() Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
` (3 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
Let's fix that and document this option. To ensure consistency across
commands, let's reuse the exact description currently found in
`git clone`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index fcba46ee9e..1ef9807d00 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -88,6 +88,16 @@ linkgit:git-config[1].
This is incompatible with `--recurse-submodules=(yes|on-demand)` and takes
precedence over the `fetch.output` config option.
+`--filter=<filter-spec>`::
+ Use the partial clone feature and request that the server sends
+ a subset of reachable objects according to a given object filter.
+ When using `--filter`, the supplied _<filter-spec>_ is used for
+ the partial fetch. For example, `--filter=blob:none` will filter
+ out all blobs (file contents) until needed by Git. Also,
+ `--filter=blob:limit=<size>` will filter out all blobs of size
+ at least _<size>_. For more details on filter specifications, see
+ the `--filter` option in linkgit:git-rev-list[1].
+
ifndef::git-pull[]
`--write-fetch-head`::
`--no-write-fetch-head`::
--
2.53.0.77.g4627d513d6
* [PATCH v4 6/9] list-objects-filter-options: support 'auto' mode for --filter
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
` (4 preceding siblings ...)
2026-02-16 13:23 ` [PATCH v4 5/9] doc: fetch: document `--filter=<filter-spec>` option Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 7/9] promisor-remote: keep advertised filters in memory Christian Couder
` (2 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
In a following commit, we are going to allow passing "auto" as a
<filterspec> to the `--filter=<filterspec>` option, but only for some
commands. Other commands that support the `--filter=<filterspec>`
option should still die() when 'auto' is passed.
Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
support that:
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
accepted or not by the current command.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
- Update list_objects_filter_release() to preserve the
allow_auto_filter flag, as this function is often called (via
opt_parse_list_objects_filter) to reset the struct before parsing a
new value.
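
The gating described in the list above can be sketched with a reduced
parser. This is an assumption-laden illustration, not the code in
"list-objects-filter-options.c": `struct parse_opts`, the CHOICE_*
values, gently_parse() and the fallback error message are
hypothetical; only the 'auto' error text is taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum choice { CHOICE_DISABLED = 0, CHOICE_BLOB_NONE, CHOICE_AUTO };

struct parse_opts {
	enum choice choice;
	unsigned int allow_auto_filter : 1;	/* set by the command */
};

/* Returns 0 on success, 1 on error with a message left in err. */
static int gently_parse(struct parse_opts *opts, const char *arg,
			char *err, size_t errlen)
{
	assert(opts && arg && err && errlen);
	err[0] = '\0';
	if (!strcmp(arg, "auto")) {
		if (!opts->allow_auto_filter) {
			snprintf(err, errlen,
				 "'auto' filter not supported by this command");
			return 1;
		}
		opts->choice = CHOICE_AUTO;
		return 0;
	}
	if (!strcmp(arg, "blob:none")) {
		opts->choice = CHOICE_BLOB_NONE;
		return 0;
	}
	/* Hypothetical message for anything this sketch does not know. */
	snprintf(err, errlen, "invalid filter-spec '%s'", arg);
	return 1;
}
```

A command that supports "auto" sets the flag before parsing; every
other command leaves it at zero and gets the error path, while regular
filters parse the same way in both cases.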
Let's also update `list-objects-filter.c` to recognize the new
`LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
before filtering actually begins, initializing a filter with
`LOFC_AUTO` is invalid and will trigger a BUG().
Note that ideally combining "auto" with "auto" could be allowed, but in
practice, it's probably not worth the added code complexity. And if we
really want it, nothing prevents us from allowing it in future work.
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us from doing that in future work either.
Also note that the new `allow_auto_filter` flag depends on the command,
not user choices, so it should be reset to the command default when
`struct list_objects_filter_options` instances are reset.
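
The "reset user choices, keep the command default" behaviour can be
sketched as follows. This is a minimal illustration, not the real
list_objects_filter_release(): `struct reset_opts` and both helper
functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct reset_opts {
	int choice;				/* user choice */
	char *spec;				/* user choice, heap-allocated */
	unsigned int allow_auto_filter : 1;	/* command default */
};

/* Record a user-provided filter, replacing any previous one. */
static void reset_opts_set(struct reset_opts *opts, int choice, const char *spec)
{
	size_t len;

	assert(opts && spec);
	len = strlen(spec) + 1;
	free(opts->spec);
	opts->spec = malloc(len);
	if (!opts->spec)
		abort();
	memcpy(opts->spec, spec, len);
	opts->choice = choice;
}

/* Drop everything the user provided, but keep the command's flag. */
static void reset_opts_release(struct reset_opts *opts)
{
	unsigned int allow_auto_filter;

	if (!opts)
		return;
	allow_auto_filter = opts->allow_auto_filter;
	free(opts->spec);
	opts->spec = NULL;
	opts->choice = 0;
	opts->allow_auto_filter = allow_auto_filter;
}
```

Saving the flag before the reset and restoring it afterwards means a
second `--filter` on the same command line is parsed under the same
rules as the first one.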
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Makefile | 1 +
list-objects-filter-options.c | 39 ++++++++++++--
list-objects-filter-options.h | 6 +++
list-objects-filter.c | 8 +++
t/meson.build | 1 +
t/unit-tests/u-list-objects-filter-options.c | 53 ++++++++++++++++++++
6 files changed, 105 insertions(+), 3 deletions(-)
create mode 100644 t/unit-tests/u-list-objects-filter-options.c
diff --git a/Makefile b/Makefile
index 4ac44331ea..9e174dd06c 100644
--- a/Makefile
+++ b/Makefile
@@ -1518,6 +1518,7 @@ CLAR_TEST_SUITES += u-dir
CLAR_TEST_SUITES += u-example-decorate
CLAR_TEST_SUITES += u-hash
CLAR_TEST_SUITES += u-hashmap
+CLAR_TEST_SUITES += u-list-objects-filter-options
CLAR_TEST_SUITES += u-mem-pool
CLAR_TEST_SUITES += u-oid-array
CLAR_TEST_SUITES += u-oidmap
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7420bf81fe..7f3e7b8f50 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -20,6 +20,8 @@ const char *list_object_filter_config_name(enum list_objects_filter_choice c)
case LOFC_DISABLED:
/* we have no name for "no filter at all" */
break;
+ case LOFC_AUTO:
+ return "auto";
case LOFC_BLOB_NONE:
return "blob:none";
case LOFC_BLOB_LIMIT:
@@ -52,7 +54,16 @@ int gently_parse_list_objects_filter(
if (filter_options->choice)
BUG("filter_options already populated");
- if (!strcmp(arg, "blob:none")) {
+ if (!strcmp(arg, "auto")) {
+ if (!filter_options->allow_auto_filter) {
+ strbuf_addstr(errbuf,
+ _("'auto' filter not supported by this command"));
+ return 1;
+ }
+ filter_options->choice = LOFC_AUTO;
+ return 0;
+
+ } else if (!strcmp(arg, "blob:none")) {
filter_options->choice = LOFC_BLOB_NONE;
return 0;
@@ -146,10 +157,22 @@ static int parse_combine_subfilter(
decoded = url_percent_decode(subspec->buf);
- result = has_reserved_character(subspec, errbuf) ||
- gently_parse_list_objects_filter(
+ result = has_reserved_character(subspec, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = gently_parse_list_objects_filter(
&filter_options->sub[new_index], decoded, errbuf);
+ if (result)
+ goto cleanup;
+
+ result = (filter_options->sub[new_index].choice == LOFC_AUTO);
+ if (result) {
+ strbuf_addstr(errbuf, _("an 'auto' filter cannot be combined"));
+ goto cleanup;
+ }
+cleanup:
free(decoded);
return result;
}
@@ -263,6 +286,9 @@ void parse_list_objects_filter(
} else {
struct list_objects_filter_options *sub;
+ if (filter_options->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
/*
* Make filter_options an LOFC_COMBINE spec so we can trivially
* add subspecs to it.
@@ -277,6 +303,9 @@ void parse_list_objects_filter(
if (gently_parse_list_objects_filter(sub, arg, &errbuf))
die("%s", errbuf.buf);
+ if (sub->choice == LOFC_AUTO)
+ die(_("an 'auto' filter is incompatible with any other filter"));
+
strbuf_addch(&filter_options->filter_spec, '+');
filter_spec_append_urlencode(filter_options, arg);
}
@@ -317,15 +346,19 @@ void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
{
size_t sub;
+ unsigned int allow_auto_filter;
if (!filter_options)
return;
+
+ allow_auto_filter = filter_options->allow_auto_filter;
strbuf_release(&filter_options->filter_spec);
free(filter_options->sparse_oid_name);
for (sub = 0; sub < filter_options->sub_nr; sub++)
list_objects_filter_release(&filter_options->sub[sub]);
free(filter_options->sub);
list_objects_filter_init(filter_options);
+ filter_options->allow_auto_filter = allow_auto_filter;
}
void partial_clone_register(
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 7b2108b986..77d7bbc846 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -18,6 +18,7 @@ enum list_objects_filter_choice {
LOFC_SPARSE_OID,
LOFC_OBJECT_TYPE,
LOFC_COMBINE,
+ LOFC_AUTO,
LOFC__COUNT /* must be last */
};
@@ -50,6 +51,11 @@ struct list_objects_filter_options {
*/
unsigned int no_filter : 1;
+ /*
+ * Is LOFC_AUTO a valid option?
+ */
+ unsigned int allow_auto_filter : 1;
+
/*
* BEGIN choice-specific parsed values from within the filter-spec. Only
* some values will be defined for any given choice.
diff --git a/list-objects-filter.c b/list-objects-filter.c
index acd65ebb73..78316e7f90 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -745,6 +745,13 @@ static void filter_combine__init(
filter->finalize_omits_fn = filter_combine__finalize_omits;
}
+static void filter_auto__init(
+ struct list_objects_filter_options *filter_options UNUSED,
+ struct filter *filter UNUSED)
+{
+ BUG("LOFC_AUTO should have been resolved before initializing the filter");
+}
+
typedef void (*filter_init_fn)(
struct list_objects_filter_options *filter_options,
struct filter *filter);
@@ -760,6 +767,7 @@ static filter_init_fn s_filters[] = {
filter_sparse_oid__init,
filter_object_type__init,
filter_combine__init,
+ filter_auto__init,
};
struct filter *list_objects_filter__init(
diff --git a/t/meson.build b/t/meson.build
index a04a7a86cf..bec4c72327 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -4,6 +4,7 @@ clar_test_suites = [
'unit-tests/u-example-decorate.c',
'unit-tests/u-hash.c',
'unit-tests/u-hashmap.c',
+ 'unit-tests/u-list-objects-filter-options.c',
'unit-tests/u-mem-pool.c',
'unit-tests/u-oid-array.c',
'unit-tests/u-oidmap.c',
diff --git a/t/unit-tests/u-list-objects-filter-options.c b/t/unit-tests/u-list-objects-filter-options.c
new file mode 100644
index 0000000000..f7d73701b5
--- /dev/null
+++ b/t/unit-tests/u-list-objects-filter-options.c
@@ -0,0 +1,53 @@
+#include "unit-test.h"
+#include "list-objects-filter-options.h"
+#include "strbuf.h"
+
+/* Helper to test gently_parse_list_objects_filter() */
+static void check_gentle_parse(const char *filter_spec,
+ int expect_success,
+ int allow_auto,
+ enum list_objects_filter_choice expected_choice)
+{
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf errbuf = STRBUF_INIT;
+ int ret;
+
+ filter_options.allow_auto_filter = allow_auto;
+
+ ret = gently_parse_list_objects_filter(&filter_options, filter_spec, &errbuf);
+
+ if (expect_success) {
+ cl_assert_equal_i(ret, 0);
+ cl_assert_equal_i(expected_choice, filter_options.choice);
+ cl_assert_equal_i(errbuf.len, 0);
+ } else {
+ cl_assert(ret != 0);
+ cl_assert(errbuf.len > 0);
+ }
+
+ strbuf_release(&errbuf);
+ list_objects_filter_release(&filter_options);
+}
+
+void test_list_objects_filter_options__regular_filters(void)
+{
+ check_gentle_parse("blob:none", 1, 0, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:none", 1, 1, LOFC_BLOB_NONE);
+ check_gentle_parse("blob:limit=5k", 1, 0, LOFC_BLOB_LIMIT);
+ check_gentle_parse("blob:limit=5k", 1, 1, LOFC_BLOB_LIMIT);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 0, LOFC_COMBINE);
+ check_gentle_parse("combine:blob:none+tree:0", 1, 1, LOFC_COMBINE);
+}
+
+void test_list_objects_filter_options__auto_allowed(void)
+{
+ check_gentle_parse("auto", 1, 1, LOFC_AUTO);
+ check_gentle_parse("auto", 0, 0, 0);
+}
+
+void test_list_objects_filter_options__combine_auto_fails(void)
+{
+ check_gentle_parse("combine:auto+blob:none", 0, 1, 0);
+ check_gentle_parse("combine:blob:none+auto", 0, 1, 0);
+ check_gentle_parse("combine:auto+auto", 0, 1, 0);
+}
--
2.53.0.77.g4627d513d6
* [PATCH v4 7/9] promisor-remote: keep advertised filters in memory
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
` (5 preceding siblings ...)
2026-02-16 13:23 ` [PATCH v4 6/9] list-objects-filter-options: support 'auto' mode for --filter Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 8/9] promisor-remote: change promisor_remote_reply()'s signature Christian Couder
2026-02-16 13:23 ` [PATCH v4 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
Currently, advertised filters are only kept in memory temporarily
during parsing, or persisted to disk if `promisor.storeFields`
contains 'partialCloneFilter'.
In a following commit though, we will add a `--filter=auto` option.
This option will enable the client to use the filters that the server
is suggesting for the promisor remotes the client accepts.
To use them even if `promisor.storeFields` is not configured, these
filters should be stored somewhere for the current session.
Let's add an `advertised_filter` field to `struct promisor_remote`
for that purpose.
To ensure that the filters are available in all cases,
filter_promisor_remote() captures them into a temporary list and
applies them to the `promisor_remote` structs after the potential
configuration reload.
Then the accepted remotes are marked as `accepted` in the repository
state. This ensures that subsequent calls to look up accepted remotes
(like in the filter construction below) actually find them.
To enable the client to construct a filter spec based on these filters,
let's also add a `promisor_remote_construct_filter(repo)` function.
This function:
- iterates over all accepted promisor remotes in the repository,
- collects the filters advertised for them (using `advertised_filter`
added in this commit), and
- generates a single filter spec for them.
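The three steps above can be sketched as follows. This is only an
illustration, not the code in this patch: `struct remote_sketch` and
construct_filter_sketch() are hypothetical names, and joining several
filters with the `combine:<a>+<b>` syntax is an assumption about how a
single spec could be produced. The sketch also skips the validation and
URL-encoding of sub-filters that the real filter machinery performs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant parts of `struct promisor_remote`. */
struct remote_sketch {
	const char *name;
	const char *advertised_filter; /* NULL if the server sent no filter */
	int accepted;
	struct remote_sketch *next;
};

/*
 * Build one filter spec from the filters advertised for accepted
 * remotes. Returns a newly allocated string, or NULL if no accepted
 * remote has a filter. Assumption: several filters are joined as
 * "combine:<a>+<b>"; a single filter is returned unchanged.
 */
char *construct_filter_sketch(const struct remote_sketch *list)
{
	const struct remote_sketch *r;
	size_t count = 0, len = 0;
	char *result;

	for (r = list; r; r = r->next) {
		if (!r->accepted || !r->advertised_filter)
			continue;
		count++;
		len += strlen(r->advertised_filter) + 1; /* '+' or NUL */
	}
	if (!count)
		return NULL;

	result = malloc(len + strlen("combine:"));
	if (!result)
		return NULL;
	result[0] = '\0';
	if (count > 1)
		strcat(result, "combine:");

	count = 0;
	for (r = list; r; r = r->next) {
		if (!r->accepted || !r->advertised_filter)
			continue;
		if (count++)
			strcat(result, "+");
		strcat(result, r->advertised_filter);
	}
	return result;
}
```

Remotes that were not accepted, or that advertised no filter, are
simply skipped, which mirrors the `r->accepted && r->advertised_filter`
check in the function added by this patch.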
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
promisor-remote.h | 7 ++++++
2 files changed, 65 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 59997dd4c7..f3bafb7731 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -193,6 +193,7 @@ void promisor_remote_clear(struct promisor_remote_config *config)
while (config->promisors) {
struct promisor_remote *r = config->promisors;
free(r->partial_clone_filter);
+ free(r->advertised_filter);
config->promisors = config->promisors->next;
free(r);
}
@@ -837,6 +838,7 @@ static void filter_promisor_remote(struct repository *repo,
struct store_info *store_info = NULL;
struct string_list_item *item;
bool reload_config = false;
+ struct string_list accepted_filters = STRING_LIST_INIT_DUP;
if (!repo_config_get_string_tmp(the_repository, "promisor.acceptfromserver", &accept_str)) {
if (!*accept_str || !strcasecmp("None", accept_str))
@@ -879,6 +881,13 @@ static void filter_promisor_remote(struct repository *repo,
reload_config = true;
strvec_push(accepted, advertised->name);
+
+ /* Capture advertised filters for accepted remotes */
+ if (advertised->filter) {
+ struct string_list_item *i;
+ i = string_list_append(&accepted_filters, advertised->name);
+ i->util = xstrdup(advertised->filter);
+ }
}
promisor_info_free(advertised);
@@ -890,6 +899,25 @@ static void filter_promisor_remote(struct repository *repo,
if (reload_config)
repo_promisor_remote_reinit(repo);
+
+ /* Apply accepted remote filters to the stable repo state */
+ for_each_string_list_item(item, &accepted_filters) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, item->string);
+ if (r) {
+ free(r->advertised_filter);
+ r->advertised_filter = item->util;
+ item->util = NULL;
+ }
+ }
+
+ string_list_clear(&accepted_filters, 1);
+
+ /* Mark the remotes as accepted in the repository state */
+ for (size_t i = 0; i < accepted->nr; i++) {
+ struct promisor_remote *r = repo_promisor_remote_find(repo, accepted->v[i]);
+ if (r)
+ r->accepted = 1;
+ }
}
char *promisor_remote_reply(const char *info)
@@ -935,3 +963,33 @@ void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes
string_list_clear(&accepted_remotes, 0);
}
+
+char *promisor_remote_construct_filter(struct repository *repo)
+{
+ struct promisor_remote *r;
+ struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT;
+ struct strbuf err = STRBUF_INIT;
+ char *result = NULL;
+
+ promisor_remote_init(repo);
+
+ for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
+ if (r->accepted && r->advertised_filter)
+ if (gently_parse_list_objects_filter(&filter_options,
+ r->advertised_filter,
+ &err)) {
+ warning(_("promisor remote '%s' advertised invalid filter '%s': %s"),
+ r->name, r->advertised_filter, err.buf);
+ strbuf_reset(&err);
+ continue;
+ }
+ }
+
+ if (filter_options.choice)
+ result = xstrdup(expand_list_objects_filter_spec(&filter_options));
+
+ list_objects_filter_release(&filter_options);
+ strbuf_release(&err);
+
+ return result;
+}
diff --git a/promisor-remote.h b/promisor-remote.h
index 263d331a55..d227299fd0 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -15,6 +15,7 @@ struct object_id;
struct promisor_remote {
struct promisor_remote *next;
char *partial_clone_filter;
+ char *advertised_filter;
unsigned int accepted : 1;
const char name[FLEX_ARRAY];
};
@@ -67,4 +68,10 @@ void mark_promisor_remotes_as_accepted(struct repository *repo, const char *remo
*/
int repo_has_accepted_promisor_remote(struct repository *r);
+/*
+ * Use the filters from the accepted remotes to create a combined
+ * filter (useful in `--filter=auto` mode).
+ */
+char *promisor_remote_construct_filter(struct repository *repo);
+
#endif /* PROMISOR_REMOTE_H */
--
2.53.0.77.g4627d513d6
* [PATCH v4 8/9] promisor-remote: change promisor_remote_reply()'s signature
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
` (6 preceding siblings ...)
2026-02-16 13:23 ` [PATCH v4 7/9] promisor-remote: keep advertised filters in memory Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
2026-02-16 13:23 ` [PATCH v4 9/9] fetch-pack: wire up and enable auto filter logic Christian Couder
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
The `promisor_remote_reply()` function performs two tasks:
1. It uses filter_promisor_remote() to parse the server's
"promisor-remote" advertisement and to mark accepted remotes in the
repository configuration.
2. It assembles a reply string containing the accepted remote names to
send back to the server.
In a following commit, the fetch-pack logic will need to trigger the
side effect (1) to ensure the repository state is correct, but it will
not need to send a reply (2).
To avoid assembling a reply string when it is not needed, let's change
the signature of promisor_remote_reply(). It will now return `void` and
accept a second `char **accepted_out` argument. Only if that argument
is not NULL will a reply string be assembled and returned back to the
caller via that argument.
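As an illustration of this optional out-parameter pattern, here is a
minimal standalone sketch. It is not the code in this patch:
reply_sketch() is a hypothetical name, the accepted names are passed in
directly instead of coming from filter_promisor_remote(), and the
URL-encoding done by the real function is left out.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: the side effect always happens, but the
 * ';'-separated reply is only assembled when the caller asks for it
 * by passing a non-NULL `accepted_out`.
 */
void reply_sketch(const char *const *accepted, size_t nr, char **accepted_out)
{
	size_t len = 0, i;
	char *reply;

	/* The side effect (marking remotes as accepted) would go here. */

	if (!accepted_out)
		return; /* caller only wanted the side effect */

	*accepted_out = NULL;
	if (!nr)
		return; /* nothing accepted: report NULL */

	for (i = 0; i < nr; i++)
		len += strlen(accepted[i]) + 1; /* ';' or NUL */

	reply = malloc(len);
	if (!reply)
		return;
	reply[0] = '\0';
	for (i = 0; i < nr; i++) {
		if (i)
			strcat(reply, ";");
		strcat(reply, accepted[i]);
	}
	*accepted_out = reply;
}
```

A caller that needs the reply passes the address of a `char *` and
frees the result; a caller that only needs the side effect passes NULL.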
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
connect.c | 3 ++-
promisor-remote.c | 24 +++++++++++++-----------
promisor-remote.h | 10 +++++-----
3 files changed, 20 insertions(+), 17 deletions(-)
diff --git a/connect.c b/connect.c
index c6f76e3082..a02583a102 100644
--- a/connect.c
+++ b/connect.c
@@ -505,7 +505,8 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
reader->hash_algo = &hash_algos[GIT_HASH_SHA1_LEGACY];
}
if (server_feature_v2("promisor-remote", &promisor_remote_info)) {
- char *reply = promisor_remote_reply(promisor_remote_info);
+ char *reply;
+ promisor_remote_reply(promisor_remote_info, &reply);
if (reply) {
packet_write_fmt(fd_out, "promisor-remote=%s", reply);
free(reply);
diff --git a/promisor-remote.c b/promisor-remote.c
index f3bafb7731..96fa215b06 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -920,25 +920,27 @@ static void filter_promisor_remote(struct repository *repo,
}
}
-char *promisor_remote_reply(const char *info)
+void promisor_remote_reply(const char *info, char **accepted_out)
{
struct strvec accepted = STRVEC_INIT;
- struct strbuf reply = STRBUF_INIT;
filter_promisor_remote(the_repository, &accepted, info);
- if (!accepted.nr)
- return NULL;
-
- for (size_t i = 0; i < accepted.nr; i++) {
- if (i)
- strbuf_addch(&reply, ';');
- strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
+ if (accepted_out) {
+ if (accepted.nr) {
+ struct strbuf reply = STRBUF_INIT;
+ for (size_t i = 0; i < accepted.nr; i++) {
+ if (i)
+ strbuf_addch(&reply, ';');
+ strbuf_addstr_urlencode(&reply, accepted.v[i], allow_unsanitized);
+ }
+ *accepted_out = strbuf_detach(&reply, NULL);
+ } else {
+ *accepted_out = NULL;
+ }
}
strvec_clear(&accepted);
-
- return strbuf_detach(&reply, NULL);
}
void mark_promisor_remotes_as_accepted(struct repository *r, const char *remotes)
diff --git a/promisor-remote.h b/promisor-remote.h
index d227299fd0..3d4d2de018 100644
--- a/promisor-remote.h
+++ b/promisor-remote.h
@@ -49,12 +49,12 @@ char *promisor_remote_info(struct repository *repo);
/*
* Prepare a reply to a "promisor-remote" advertisement from a server.
* Check the value of "promisor.acceptfromserver" and maybe the
- * configured promisor remotes, if any, to prepare the reply.
- * Return value is NULL if no promisor remote from the server
- * is accepted. Otherwise it contains the names of the accepted promisor
- * remotes separated by ';'. See gitprotocol-v2(5).
+ * configured promisor remotes, if any, to prepare the reply. If the
+ * `accepted_out` argument is not NULL, it is set to either NULL or to
+ * the names of the accepted promisor remotes separated by ';' if
+ * any. See gitprotocol-v2(5).
*/
-char *promisor_remote_reply(const char *info);
+void promisor_remote_reply(const char *info, char **accepted_out);
/*
* Set the 'accepted' flag for some promisor remotes. Useful on the
--
2.53.0.77.g4627d513d6
* [PATCH v4 9/9] fetch-pack: wire up and enable auto filter logic
2026-02-16 13:23 ` [PATCH v4 " Christian Couder
` (7 preceding siblings ...)
2026-02-16 13:23 ` [PATCH v4 8/9] promisor-remote: change promisor_remote_reply()'s signature Christian Couder
@ 2026-02-16 13:23 ` Christian Couder
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-02-16 13:23 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Jean-Noël Avila, Jeff King, Christian Couder,
Christian Couder
Previous commits have set up an infrastructure for `--filter=auto` to
automatically prepare a partial clone filter based on what the server
advertised and the client accepted.
Using that infrastructure, let's now enable the `--filter=auto` option
in `git clone` and `git fetch` by setting `allow_auto_filter` to 1.
Note that these small changes mean that when `git clone --filter=auto`
or `git fetch --filter=auto` is used, "auto" is automatically saved
as the partial clone filter for the server on the client. Therefore
subsequent calls to `git fetch` on the client will automatically use
this "auto" mode even without `--filter=auto`.
Let's also set `allow_auto_filter` to 1 in `transport.c`, as the
transport layer must be able to accept the "auto" filter spec even if
the invoking command hasn't fully parsed it yet.
When an "auto" filter is requested, let's have the "fetch-pack.c" code
in `do_fetch_pack_v2()` compute a filter and send it to the server.
In `do_fetch_pack_v2()` the logic also needs to check for the
"promisor-remote" capability and call `promisor_remote_reply()` to
parse advertised remotes and populate the list of those accepted (and
their filters).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/fetch-options.adoc | 19 ++++++---
Documentation/git-clone.adoc | 25 ++++++++---
Documentation/gitprotocol-v2.adoc | 16 ++++---
builtin/clone.c | 2 +
builtin/fetch.c | 2 +
fetch-pack.c | 24 +++++++++++
t/t5710-promisor-remote-capability.sh | 60 +++++++++++++++++++++++++++
transport.c | 1 +
8 files changed, 134 insertions(+), 15 deletions(-)
diff --git a/Documentation/fetch-options.adoc b/Documentation/fetch-options.adoc
index 1ef9807d00..a0cfb50d89 100644
--- a/Documentation/fetch-options.adoc
+++ b/Documentation/fetch-options.adoc
@@ -92,11 +92,20 @@ precedence over the `fetch.output` config option.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial fetch. For example, `--filter=blob:none` will filter
- out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial fetch.
++
+If `--filter=auto` is used, the filter specification is determined
+automatically by combining the filter specifications advertised by
+the server for the promisor remotes that the client accepts (see
+linkgit:gitprotocol-v2[5] and the `promisor.acceptFromServer`
+configuration option in linkgit:git-config[1]).
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
ifndef::git-pull[]
`--write-fetch-head`::
diff --git a/Documentation/git-clone.adoc b/Documentation/git-clone.adoc
index 57cdfb7620..0db2d1e5f0 100644
--- a/Documentation/git-clone.adoc
+++ b/Documentation/git-clone.adoc
@@ -187,11 +187,26 @@ objects from the source repository into a pack in the cloned repository.
Use the partial clone feature and request that the server sends
a subset of reachable objects according to a given object filter.
When using `--filter`, the supplied _<filter-spec>_ is used for
- the partial clone filter. For example, `--filter=blob:none` will
- filter out all blobs (file contents) until needed by Git. Also,
- `--filter=blob:limit=<size>` will filter out all blobs of size
- at least _<size>_. For more details on filter specifications, see
- the `--filter` option in linkgit:git-rev-list[1].
+ the partial clone filter.
++
+If `--filter=auto` is used the filter specification is determined
+automatically through the 'promisor-remote' protocol (see
+linkgit:gitprotocol-v2[5]) by combining the filter specifications
+advertised by the server for the promisor remotes that the client
+accepts (see the `promisor.acceptFromServer` configuration option in
+linkgit:git-config[1]). This allows the server to suggest the optimal
+filter for the available promisor remotes.
++
+As with other filter specifications, the "auto" value is persisted in
+the configuration. This ensures that future fetches will continue to
+adapt to the server's current recommendation.
++
+For details on all other available filter specifications, see the
+`--filter=<filter-spec>` option in linkgit:git-rev-list[1].
++
+For example, `--filter=blob:none` will filter out all blobs (file
+contents) until needed by Git. Also, `--filter=blob:limit=<size>` will
+filter out all blobs of size at least _<size>_.
`--also-filter-submodules`::
Also apply the partial clone filter to any submodules in the repository.
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index d93dd279ea..f985cb4c47 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -812,10 +812,15 @@ MUST appear first in each pr-fields, in that order.
After these mandatory fields, the server MAY advertise the following
optional fields in any order:
-`partialCloneFilter`:: The filter specification used by the remote.
+`partialCloneFilter`:: The filter specification for the remote. It
+corresponds to the "remote.<name>.partialCloneFilter" config setting.
Clients can use this to determine if the remote's filtering strategy
-is compatible with their needs (e.g., checking if both use "blob:none").
-It corresponds to the "remote.<name>.partialCloneFilter" config setting.
+is compatible with their needs (e.g., checking if both use
+"blob:none"). Additionally they can use this through the
+`--filter=auto` option in linkgit:git-clone[1]. With that option, the
+filter specification of the clone will be automatically computed by
+combining the filter specifications of the promisor remotes the client
+accepts.
`token`:: An authentication token that clients can use when
connecting to the remote. It corresponds to the "remote.<name>.token"
@@ -828,8 +833,9 @@ future protocol extensions.
The client can use information transmitted through these fields to
decide if it accepts the advertised promisor remote. Also, the client
-can be configured to store the values of these fields (see
-"promisor.storeFields" in linkgit:git-config[1]).
+can be configured to store the values of these fields or use them
+to automatically configure the repository (see "promisor.storeFields"
+in linkgit:git-config[1] and `--filter=auto` in linkgit:git-clone[1]).
Field values MUST be urlencoded.
diff --git a/builtin/clone.c b/builtin/clone.c
index bb27472020..45d8fa0eed 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1001,6 +1001,8 @@ int cmd_clone(int argc,
NULL
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("clone");
repo_config(the_repository, git_clone_config, NULL);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8fbf3557ce..573c295241 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2580,6 +2580,8 @@ int cmd_fetch(int argc,
OPT_END()
};
+ filter_options.allow_auto_filter = 1;
+
packet_trace_identity("fetch");
/* Record the command line for the reflog */
diff --git a/fetch-pack.c b/fetch-pack.c
index 40316c9a34..9f8f980516 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -35,6 +35,7 @@
#include "sigchain.h"
#include "mergesort.h"
#include "prio-queue.h"
+#include "promisor-remote.h"
static int transfer_unpack_limit = -1;
static int fetch_unpack_limit = -1;
@@ -1661,6 +1662,29 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
struct string_list packfile_uris = STRING_LIST_INIT_DUP;
int i;
struct strvec index_pack_args = STRVEC_INIT;
+ const char *promisor_remote_config;
+
+ if (server_feature_v2("promisor-remote", &promisor_remote_config))
+ promisor_remote_reply(promisor_remote_config, NULL);
+
+ if (args->filter_options.choice == LOFC_AUTO) {
+ struct strbuf errbuf = STRBUF_INIT;
+ char *constructed_filter = promisor_remote_construct_filter(r);
+
+ list_objects_filter_release(&args->filter_options);
+ /* Disallow 'auto' as a result of the resolution of this 'auto' filter below */
+ args->filter_options.allow_auto_filter = 0;
+
+ if (constructed_filter &&
+ gently_parse_list_objects_filter(&args->filter_options,
+ constructed_filter,
+ &errbuf))
+ die(_("couldn't resolve 'auto' filter '%s': %s"),
+ constructed_filter, errbuf.buf);
+
+ free(constructed_filter);
+ strbuf_release(&errbuf);
+ }
negotiator = &negotiator_alloc;
if (args->refetch)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 6ef6431bd7..532e6f0fea 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -423,6 +423,66 @@ test_expect_success "clone with promisor.storeFields=partialCloneFilter" '
test_grep "'\''blob:limit=8k'\'' -> '\''blob:limit=7k'\''" err
'
+test_expect_success "clone and fetch with --filter=auto" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client trace" &&
+
+ git -C server config remote.lop.partialCloneFilter "blob:limit=9500" &&
+ test_config -C server promisor.sendFields "partialCloneFilter" &&
+
+ GIT_TRACE_PACKET="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.lop.promisor=true \
+ -c remote.lop.url="file://$(pwd)/lop" \
+ -c promisor.acceptfromserver=All \
+ --no-local --filter=auto server client 2>err &&
+
+ test_grep "filter blob:limit=9500" trace &&
+ test_grep ! "filter auto" trace &&
+
+ # Verify "auto" is persisted in config
+ echo auto >expected &&
+ git -C client config remote.origin.partialCloneFilter >actual &&
+ test_cmp expected actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Now change the filter on the server
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5678" &&
+
+ # Get a new commit on the server to ensure "git fetch" actually runs fetch-pack
+ test_commit -C template new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITH --filter=auto
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch --filter=auto &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5678" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid" &&
+
+ # Change the filter on the server again
+ git -C server config remote.lop.partialCloneFilter "blob:limit=5432" &&
+
+ # Get yet a new commit on the server to ensure fetch-pack runs
+ test_commit -C template yet-a-new-commit &&
+ git -C template push --all "$(pwd)/server" &&
+
+ # Perform a fetch WITHOUT --filter=auto
+ # Relies on "auto" being persisted in the client config
+ rm -rf trace &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch &&
+
+ # Verify that the new filter was used
+ test_grep "filter blob:limit=5432" trace &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with promisor.advertise set to 'true' but don't delete the client" '
git -C server config promisor.advertise true &&
diff --git a/transport.c b/transport.c
index c7f06a7382..cde8d83a57 100644
--- a/transport.c
+++ b/transport.c
@@ -1219,6 +1219,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
*/
struct git_transport_data *data = xcalloc(1, sizeof(*data));
list_objects_filter_init(&data->options.filter_options);
+ data->options.filter_options.allow_auto_filter = 1;
ret->data = data;
ret->vtable = &builtin_smart_vtable;
ret->smart_options = &(data->options);
--
2.53.0.77.g4627d513d6
* [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist
2025-12-23 11:11 [PATCH 0/9] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
` (9 preceding siblings ...)
2026-02-04 11:08 ` [PATCH v2 0/8] Implement `promisor.storeFields` and `--filter=auto` Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-04-27 12:41 ` [PATCH v2 1/8] t5710: simplify 'mkdir X' followed by 'git -C X init' Christian Couder
` (8 more replies)
10 siblings, 9 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder
Currently, the "promisor-remote" protocol capability allows a server
to advertise promisor remotes (and their tokens/filters), but the
client's `promisor.acceptFromServer` mechanism requires these remotes
to already exist in the config.
This is a significant burden for users and administrators who have to
pre-configure remotes.
This patch series improves on this by introducing a new
`promisor.acceptFromServerUrl` config option, which provides an
additive, URL-based security allowlist.
Multiple `promisor.acceptFromServerUrl` config options can be provided
in different config files. Each one should contain a URL glob pattern
which can optionally be prefixed with a remote name in the
"[<name>=]<pattern>" format.
The goal is for something like a simple:
git config set --global promisor.acceptFromServerUrl "https://my-org.com/*"
to be all that is needed for internal work in many organizations.
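To illustrate the "[<name>=]<pattern>" format, here is a rough sketch
of how such a value could be split. It is not the parser from this
series: `struct allow_entry_sketch` and parse_allow_entry_sketch() are
hypothetical names, and the rule used to detect the prefix (an '='
that appears before any ':' or '/') is an assumption made so that an
'=' inside a URL query string is not mistaken for a name separator.
The real code additionally validates the name and normalizes the
pattern.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing one "[<name>=]<pattern>" value. */
struct allow_entry_sketch {
	char *name;    /* NULL when no "<name>=" prefix was given */
	char *pattern; /* the URL glob pattern */
};

/* Returns 0 on success, -1 on an empty name, empty pattern or OOM. */
int parse_allow_entry_sketch(const char *value, struct allow_entry_sketch *out)
{
	/* Assumption: a prefix ends at an '=' seen before any ':' or '/'. */
	size_t prefix = strcspn(value, "=:/");

	out->name = NULL;
	out->pattern = NULL;

	if (value[prefix] == '=') {
		if (!prefix)
			return -1; /* "=<pattern>" has an empty name */
		out->name = malloc(prefix + 1);
		if (!out->name)
			return -1;
		memcpy(out->name, value, prefix);
		out->name[prefix] = '\0';
		value += prefix + 1;
	}

	if (!*value)
		goto fail; /* empty pattern */

	out->pattern = malloc(strlen(value) + 1);
	if (!out->pattern)
		goto fail;
	strcpy(out->pattern, value);
	return 0;

fail:
	free(out->name);
	out->name = NULL;
	return -1;
}
```

With this sketch, "https://my-org.com/*" yields no name, while
"cdn=https://cdn.example.com/*" yields the name "cdn".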
With this new config option:
- The server can update fields (like tokens) for known remotes,
provided their URL matches the allowlist, even if
`acceptFromServer` is set to `None`.
- Unknown remotes advertised by the server can be automatically
configured on the client if their URL matches the allowlist.
- If there is no `<name>` prefix before the glob pattern matched, the
auto-configured remote is named using the
"promisor-auto-<sanitized-url>" format. So the same auto-configured
remote config entry will be reused for the same URL.
- If a `<name>` prefix is provided, it will be used for the
auto-configured remote config entry.
- If the chosen name (auto-generated or prefixed) already exists but
points to a different URL, overwriting the existing config is
prevented by appending a numeric suffix (e.g., -1, -2) to the name
and auto-configuring using that name.
- The server's originally advertised name is always saved in the
`remote.<name>.advertisedAs` config variable of the auto-configured
remote for tracing and debugging.
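The collision rule in the list above (reuse the name when it already
points to the same URL, otherwise try numeric suffixes) could be
sketched like this. All names here are hypothetical, the table of known
remotes stands in for the real config lookup, and the `max_suffix`
bound is an assumed parameter, not a value taken from this series.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view of an already configured remote. */
struct known_remote_sketch {
	const char *name;
	const char *url;
};

static const char *url_of(const struct known_remote_sketch *known, size_t nr,
			  const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(known[i].name, name))
			return known[i].url;
	return NULL;
}

/*
 * Pick a remote name for `url`, starting from `base`. A name is usable
 * if it is unused or already configured with the same URL. Otherwise
 * "-1", "-2", ... are appended, up to `max_suffix`. Returns a newly
 * allocated name, or NULL if no usable name was found.
 */
char *pick_remote_name_sketch(const struct known_remote_sketch *known, size_t nr,
			      const char *base, const char *url,
			      unsigned int max_suffix)
{
	size_t size = strlen(base) + 32; /* room for '-' and the digits */
	char *candidate = malloc(size);
	unsigned int i;

	if (!candidate)
		return NULL;

	for (i = 0; i <= max_suffix; i++) {
		const char *existing;

		if (i)
			snprintf(candidate, size, "%s-%u", base, i);
		else
			snprintf(candidate, size, "%s", base);

		existing = url_of(known, nr, candidate);
		if (!existing || !strcmp(existing, url))
			return candidate;
	}

	free(candidate);
	return NULL;
}
```

Because an existing entry is only reused when its URL is identical, a
server cannot redirect an already configured remote by advertising the
same name with a different URL.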
Security considerations:
- Advertised URLs and glob patterns are routed through
url_normalize() / url_normalize_pattern() before matching, to
prevent percent-encoding, case variation, or path-traversal (..)
bypasses.
- URL matching is done component by component: scheme and port
must match exactly (no wildcards), the host is matched with
WM_PATHNAME so a '*' cannot cross the '/' boundary into the
path, and the path is matched without WM_PATHNAME so '*' can
still span multi-level paths.
- Auto-generated remote names are sanitized (non-alphanumeric
characters are replaced with '-', runs of '-' are collapsed)
and prefixed with 'promisor-auto-'. User-supplied names (from
the 'name=<pattern>' syntax) are validated with
valid_remote_name(). Together, these prevent a server from
maliciously overwriting standard remotes (like 'origin').
- If the auto-generated or user-supplied name collides with an
existing remote configured to a different URL, a numeric
suffix ('-1', '-2', ...) is appended, up to a bounded limit,
so a server cannot hijack an existing remote by name.
- Known remotes are still subject to URL consistency checks:
even if an advertised URL matches the allowlist, it is only
accepted for a known remote if it matches the URL already
configured locally for that remote.
- The documentation explains in detail how to write secure glob
patterns in `promisor.acceptFromServerUrl`, and highlights the
risks of overly broad patterns on shared hosting platforms.
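As an illustration of the name sanitization described above, here is a
minimal sketch. auto_remote_name_sketch() is a hypothetical name, and
details such as whether a trailing '-' is trimmed or the case of
letters is changed are not stated here, so the sketch leaves both
untouched.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "promisor-auto-<sanitized-url>": every non-alphanumeric
 * character becomes '-', and runs of '-' are collapsed into one.
 * Returns a newly allocated string, or NULL on allocation failure.
 */
char *auto_remote_name_sketch(const char *url)
{
	static const char prefix[] = "promisor-auto-";
	size_t n = strlen(prefix);
	char *name = malloc(n + strlen(url) + 1);

	if (!name)
		return NULL;
	memcpy(name, prefix, n);

	for (; *url; url++) {
		if (isalnum((unsigned char)*url))
			name[n++] = *url;
		else if (name[n - 1] != '-')
			name[n++] = '-'; /* the prefix's own '-' also counts */
	}
	name[n] = '\0';
	return name;
}
```

Since the result always starts with 'promisor-auto-', it can never be
equal to a standard remote name such as 'origin'.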
High level description of the patches
=====================================
- Patch 1/8 is new. It is a very small preparatory patch that
simplifies some tests a bit.
- Patches 2/8 and 3/8 expose and adapt a url_normalize_pattern()
helper function in the urlmatch API.
- Patch 4/8 adapts `struct promisor_info` by adding a new
`local_name` member to it to prepare for the next patches.
- Patches 5/8 to 7/8 implement the core feature. They introduce the
parsing machinery, add the additive allowlist for known remotes
(with url_normalize() security), and finally implement the
auto-creation and collision resolution for unknown remotes.
- Patch 8/8 cleans up and modernizes the existing
`promisor.acceptFromServer` documentation.
Changes compared to v1
======================
Thanks to Patrick and Junio for reviewing the previous versions of
this series and of the preparatory series.
- A lot of preparatory patches have been moved to a preparatory series
that has already been merged. See:
https://lore.kernel.org/git/20260407115243.358642-1-christian.couder@gmail.com/
This is why this v2 contains only 8 patches compared to 16 patches
in v1.
- Everywhere in this series "whitelist" has been replaced with
"allowlist".
- In the tests added in this series, the new $TRASH_DIRECTORY_URL and
$ENCODED_TRASH_DIRECTORY_URL introduced by the preparatory series
are used instead of the previous $PWD_URL and $ENCODED_PWD_URL.
- Patch 1/8 ("t5710: simplify 'mkdir X' followed by 'git -C X init'")
is new.
- Patch 3/8 ("urlmatch: add url_normalize_pattern() helper") replaces
patch 3/16 ("urlmatch: add url_is_valid_pattern() helper") because
in subsequent patches we now normalize patterns to validate them
and match them component by component against URLs.
- In patch 5/8, previously 13/16, ("promisor-remote: introduce
promisor.acceptFromServerUrl"):
- We add a `struct url_info pattern_info;` to `struct allowed_url`,
so we can validate patterns using url_normalize_pattern() and, in
a subsequent patch, match URLs component by component. This
requires a new allowed_url_free() function that is passed to
string_list_clear_func() to clear the `struct allowed_url`
instances.
- We don't use a `static struct string_list` to store the URL
patterns we accept. Instead we load them from the config into a
`struct string_list` passed as argument. The function doing this
is renamed accordingly from accept_from_server_url() to
load_accept_from_server_url().
- A "clone with invalid promisor.acceptFromServerUrl" test is moved
from patch 15/16 to this patch as it's more relevant in this
patch (where we validate the content of the
`promisor.acceptFromServerUrl` config option).
- In patch 6/8, previously 14/16, ("promisor-remote: trust known
remotes matching acceptFromServerUrl"):
- In the commit message, an example, which shows how the new
"acceptFromServerUrl" config option can be useful, is added.
- The matching of URLs advertised by the server to URL patterns
from the config is now performed component by component. This is
reflected in the commit message, the documentation and the
code. This ensures a `*` in the host pattern cannot cross into
the path.
- In the code, we add a new match_one_url() function to perform the
matching.
- In patch 7/8, previously 15/16 ("promisor-remote: auto-configure
unknown remotes"):
- In the doc, the unclear "considered trusted by the client" is
clarified using "a client is allowed to act on" and subsequent
explanations. In general the doc is also improved a bit.
- In the tests, parsing the "remote.<name>.advertisedAs" config
option is now more careful about the possibility that more than
one such option exists.
- The test that was moved to patch 5/8 is still enhanced a bit in
this commit by checking that no "remote.<name>.advertisedAs"
config option has been added.
CI tests
========
They all pass, see:
https://github.com/chriscool/git/actions/runs/24992478331
Range diff since v1
===================
1: b2894eb33a < -: ---------- promisor-remote: try accepted remotes before others in get_direct()
-: ---------- > 1: 44e9a16455 t5710: simplify 'mkdir X' followed by 'git -C X init'
2: a3206a6ae9 = 2: 42f174910c urlmatch: change 'allow_globs' arg to bool
3: 51bbf65c52 < -: ---------- urlmatch: add url_is_valid_pattern() helper
4: f367beef72 < -: ---------- promisor-remote: clarify that a remote is ignored
5: 1faf74cb3f < -: ---------- promisor-remote: refactor has_control_char()
6: 40cf0af639 < -: ---------- promisor-remote: refactor accept_from_server()
7: b75dca8037 < -: ---------- promisor-remote: keep accepted promisor_info structs alive
8: f5e55dc407 < -: ---------- promisor-remote: remove the 'accepted' strvec
-: ---------- > 3: 8088374458 urlmatch: add url_normalize_pattern() helper
9: 63c1db30de ! 4: 6bfda89a79 promisor-remote: add 'local_name' to 'struct promisor_info'
@@ Commit message
In a following commit, we will store promisor remote information under
a remote name different than the one the server advertised.
- To prepare for this change, let's add a new 'char* local_name' member
+ To prepare for this change, let's add a new 'char *local_name' member
to 'struct promisor_info', and let's update the related functions.
While at it, let's also add a small promisor_info_internal_name()
@@ Commit message
## promisor-remote.c ##
@@ promisor-remote.c: static struct string_list *fields_stored(void)
-
- /*
* Struct for promisor remotes involved in the "promisor-remote"
-- * protocol capability.
-+ * protocol capability:
+ * protocol capability.
*
- * Except for "name", each <member> in this struct and its <value>
- * should correspond (either on the client side or on the server side)
- * to a "remote.<name>.<member>" config variable set to <value> where
- * "<name>" is a promisor remote name.
-+ * - "name" is the name the server advertised.
-+ * - "local_name" is the name we use locally (may be auto-generated).
-+ *
+ * Except for "name" and "local_name", each <member> in this struct
+ * and its <value> should correspond (either on the client side or on
+ * the server side) to a "remote.<name>.<member>" config variable set
+ * to <value> where "<name>" is a promisor remote name.
*/
struct promisor_info {
- const char *name;
-+ const char *local_name;
+- const char *name;
++ const char *name; /* name the server advertised */
++ const char *local_name; /* name used locally (may be auto-generated) */
const char *url;
const char *filter;
const char *token;
10: e9b8a64ab8 < -: ---------- promisor-remote: pass config entry to all_fields_match() directly
11: 2e1260190a < -: ---------- promisor-remote: refactor should_accept_remote() control flow
12: b33f06173a < -: ---------- t5710: use proper file:// URIs for absolute paths
13: 681b03e248 ! 5: fefa17e6dd promisor-remote: introduce promisor.acceptFromServerUrl
@@ promisor-remote.c: static bool has_control_char(const char *s)
+struct allowed_url {
+ char *remote_name;
+ char *url_pattern;
++ struct url_info pattern_info;
+};
+
++static void allowed_url_free(void *util, const char *str UNUSED)
++{
++ struct allowed_url *allowed = util;
++
++ if (!allowed)
++ return;
++
++ /* Depending on prefix, free either remote_name or url_pattern */
++ free(allowed->remote_name ? allowed->remote_name : allowed->url_pattern);
++ free(allowed->pattern_info.url);
++ free(allowed);
++}
++
+static struct allowed_url *valid_accept_url(const char *url)
+{
+ char *dup, *p;
@@ promisor-remote.c: static bool has_control_char(const char *s)
+ p = dup;
+ }
+
-+ if (has_control_char(p) || !url_is_valid_pattern(p)) {
++ if (has_control_char(p)) {
+ warning(_("invalid url pattern '%s' "
+ "in '%s' from promisor.acceptFromServerUrl config"), p, url);
+ free(dup);
@@ promisor-remote.c: static bool has_control_char(const char *s)
+ allowed = xmalloc(sizeof(*allowed));
+ allowed->remote_name = (p == dup) ? NULL : dup;
+ allowed->url_pattern = p;
++ allowed->pattern_info.url = url_normalize_pattern(p, &allowed->pattern_info);
++ if (!allowed->pattern_info.url) {
++ warning(_("invalid url pattern '%s' "
++ "in '%s' from promisor.acceptFromServerUrl config"), p, url);
++ free(dup);
++ free(allowed);
++ return NULL;
++ }
+
+ return allowed;
+}
+
-+static struct string_list *accept_from_server_url(struct repository *repo)
++static void load_accept_from_server_url(struct repository *repo,
++ struct string_list *accept_urls)
+{
-+ static struct string_list accept_urls = STRING_LIST_INIT_DUP;
-+ static int initialized;
+ const struct string_list *config_urls;
+
-+ if (initialized)
-+ return &accept_urls;
-+
-+ initialized = 1;
-+
+ if (!repo_config_get_string_multi(repo, "promisor.acceptfromserverurl", &config_urls)) {
+ struct string_list_item *item;
+
@@ promisor-remote.c: static bool has_control_char(const char *s)
+ struct allowed_url *allowed = valid_accept_url(item->string);
+ if (allowed) {
+ struct string_list_item *new;
-+ new = string_list_append(&accept_urls, item->string);
++ new = string_list_append(accept_urls, item->string);
+ new->util = allowed;
+ }
+ }
+ }
-+
-+ return &accept_urls;
+}
+
static int should_accept_remote(enum accept_promisor accept,
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
struct string_list_item *item;
bool reload_config = false;
enum accept_promisor accept = accept_from_server(repo);
-+ /* Pre-load and validate the acceptFromServerUrl config */
-+ (void)accept_from_server_url(repo);
++ struct string_list accept_urls = STRING_LIST_INIT_DUP;
++
++ /* Load and validate the acceptFromServerUrl config */
++ load_accept_from_server_url(repo, &accept_urls);
if (accept == ACCEPT_NONE)
return;
+@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
+ }
+ }
+
++ string_list_clear_func(&accept_urls, allowed_url_free);
+ promisor_info_list_clear(&config_info);
+ string_list_clear(&remote_info, 0);
+ store_info_free(store_info);
+
+ ## t/t5710-promisor-remote-capability.sh ##
+@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
+ check_missing_objects server 1 "$oid"
+ '
+
++test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
++ git -C server config promisor.advertise true &&
++ test_when_finished "rm -rf client" &&
++
++ # As "bad name" contains a space, which is not a valid remote name,
++ # the pattern should be rejected with a warning and no remote created.
++ GIT_NO_LAZY_FETCH=0 git clone \
++ -c promisor.acceptfromserver=None \
++ -c "promisor.acceptFromServerUrl=bad name=https://example.com/*" \
++ --no-local --filter="blob:limit=5k" server client 2>err &&
++
++ # Check that a warning was emitted
++ test_grep "invalid remote name '\''bad name'\''" err &&
++
++ # Check that the largest object is not missing on the server
++ check_missing_objects server 0 "" &&
++
++ # Reinitialize server so that the largest object is missing again
++ initialize_server 1 "$oid"
++'
++
+ test_expect_success "clone with promisor.sendFields" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
14: 8c04e48d66 ! 6: 2f238d0a7a promisor-remote: trust known remotes matching acceptFromServerUrl
@@ Commit message
To enable such targeted updates for trusted URLs, let's use the URL
patterns from `promisor.acceptFromServerUrl` as an additional URL
- based whitelist.
+ based allowlist.
Concretely, let's check the advertised URLs against the URL glob
patterns by introducing a new small helper function called
url_matches_accept_list(), which iterates over the glob patterns and
returns the first matching allowed_url entry (or NULL).
- (Before matching, the advertised URL is passed through url_normalize()
- so that case variations in the scheme/host, percent-encoding tricks,
- and ".." path segments cannot bypass the whitelist.)
+ The URL matching is done component by component: scheme and port are
+ compared exactly, the host is matched with wildmatch() using the
+ WM_PATHNAME flag (so '*' cannot cross the '/' boundary into the path),
+ and the path is matched with wildmatch() without WM_PATHNAME (so '*'
+ can still match multi-level paths). Before matching, the advertised
+ URL is passed through url_normalize() so that case variations in the
+ scheme/host, percent-encoding tricks, and ".." path segments cannot
+ bypass the allowlist.
Let's then use this helper at the tail of should_accept_remote() so
that, when `accept == ACCEPT_NONE`, a known remote whose URL matches
- the whitelist is still accepted.
+ the allowlist is still accepted.
To prepare for this new logic, let's also:
@@ Commit message
and relax its early return so that the function is entered when
`accept_urls` has entries even if `accept == ACCEPT_NONE`.
+ With this, many organizations may only need something like:
+
+ git config set --global \
+ promisor.acceptFromServerUrl "https://my-org.com/*"
+
+ to accept only their own remotes. And if they need to accept additional
+ remotes in some specific repos, they can also set:
+
+ git config set promisor.acceptFromServer knownUrl
+
+ and configure the additional remote manually only in the repos where
+ they are needed.
+
Let's then properly document `promisor.acceptFromServerUrl` in
- "promisor.adoc" as an additive security whitelist for known remotes,
- including the URL normalization behavior, and let's mention it in
- "gitprotocol-v2.adoc".
+ "promisor.adoc" as an additive security allowlist for known remotes,
+ including the URL normalization behavior and the component-wise
+ matching, and let's mention it in "gitprotocol-v2.adoc".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
@@ Documentation/config/promisor.adoc: promisor.acceptFromServer::
comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
+promisor.acceptFromServerUrl::
-+ A glob pattern to specify which URLs advertised by a server
-+ are considered trusted by the client. This option acts as an
-+ additive security whitelist that works in conjunction with
-+ `promisor.acceptFromServer`.
++ A glob pattern to specify which server-advertised URLs a
++ client is allowed to act on. When a URL matches, the client
++ will accept the advertised remote as a promisor remote and may
++ automatically accept field updates (such as authentication
++ tokens) from the server, even if `promisor.acceptFromServer`
++ is set to `none` (the default).
++
+This option can appear multiple times in config files. An advertised
+URL will be accepted if it matches _ANY_ glob pattern specified by
+this option in _ANY_ config file read by Git.
++
-+Be _VERY_ careful with these glob patterns, as it can be a big
-+security hole to allow any advertised remote to be auto-configured!
++Be _VERY_ careful with these patterns: `*` matches any sequence of
++characters within the 'host' and 'path' parts of a URL (but cannot
++cross part boundaries). An overly broad pattern is a major security
++risk, as a matching URL allows a server to update fields (such as
++authentication tokens) on known remotes without further confirmation.
+To minimize security risks, follow these guidelines:
++
+1. Start with a secure protocol scheme, like `https://` or `ssh://`.
@@ Documentation/config/promisor.adoc: promisor.acceptFromServer::
+ your specific organization or namespace (e.g.,
+ `https://gitlab.com/your-org/*`).
++
-+3. Don't use globs (`*`) in the domain name. For example
-+ `https://cdn.example.com/*` is much safer than
-+ `https://*.example.com/*`, because the latter matches
-+ `https://evil-hacker.net/fake.example.com/repo`.
++3. Never use globs at the end of domain names. For example,
++ `https://cdn.your-org.com/*` might be safe, but
++ `https://cdn.your-org.com*/*` is a major security risk because
++ the latter matches `https://cdn.your-org.com.hacker.net/repo`.
++
-+4. Make sure to have a `/` at the end of the domain name (or the end
-+ of specific directories). For example `https://cdn.example.com/*`
-+ is much safer than `https://cdn.example.com*`, because the latter
-+ matches `https://cdn.example.com.hacker.net/repo`.
++4. Be careful using globs at the beginning of domain names. While the
++ code ensures a `*` in the host cannot cross into the path, a
++ pattern like `https://*.example.com/*` will still match any
++ subdomain. This is extremely dangerous on shared hosting platforms
++ (e.g., `https://*.github.io/*` trusts every user's site on the
++ entire platform).
++
-+Before matching, the advertised URL is normalized: the scheme and
-+host are lowercased, percent-encoded characters are decoded where
-+possible, and path segments like `..` are resolved. Glob patterns
-+are matched against this normalized URL as-is, so patterns should
-+be written in normalized form (e.g., lowercase scheme and host).
++Before matching, both the advertised URL and the pattern are
++normalized: the scheme and host are lowercased, percent-encoded
++characters are decoded where possible, and path segments like `..`
++are resolved. The port must also match exactly (e.g.,
++`https://example.com:8080/*` will not match a URL advertised on
++port 9999).
++
-+Even if `promisor.acceptFromServer` is set to `None` (the default),
-+Git will still accept field updates (like tokens) for known remotes,
-+provided their URLs match a pattern in
-+`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
-+details on the protocol.
++For the security implications of accepting a promisor remote, see the
++documentation of `promisor.acceptFromServer`. For details on the
++protocol, see linkgit:gitprotocol-v2[5].
+
promisor.checkFields::
A comma or space separated list of additional remote related
@@ promisor-remote.c
struct promisor_remote_config {
struct promisor_remote *promisors;
-@@ promisor-remote.c: static struct string_list *accept_from_server_url(struct repository *repo)
- return &accept_urls;
+@@ promisor-remote.c: static void load_accept_from_server_url(struct repository *repo,
+ }
}
++static bool match_one_url(const struct url_info *pi, const struct url_info *ui)
++{
++ const char *pat = pi->url;
++ const char *url = ui->url;
++ char *p_str, *u_str;
++ bool res;
++
++ /*
++ * Schemes must match exactly. They are case-folded by
++ * url_normalize(), so strncmp() suffices.
++ */
++ if (pi->scheme_len != ui->scheme_len || strncmp(pat, url, pi->scheme_len))
++ return false;
++
++ /*
++ * Ports must match exactly. url_normalize() strips default
++ * ports (like 443 for https), so length and content
++ * comparisons are sufficient.
++ */
++ if (pi->port_len != ui->port_len ||
++ strncmp(pat + pi->port_off, url + ui->port_off, pi->port_len))
++ return false;
++
++ /*
++ * Match host and path separately to prevent a '*' in the host
++ * portion of the pattern from matching across the '/'
++ * boundary into the path. Use WM_PATHNAME for the host so '*'
++ * cannot cross '/' there, and 0 for the path so '*' can still
++ * match multi-level paths.
++ */
++
++ p_str = xstrndup(pat + pi->host_off, pi->host_len);
++ u_str = xstrndup(url + ui->host_off, ui->host_len);
++ res = !wildmatch(p_str, u_str, WM_PATHNAME);
++ free(p_str);
++ free(u_str);
++
++ if (!res)
++ return false;
++
++ p_str = xstrndup(pat + pi->path_off, pi->path_len);
++ u_str = xstrndup(url + ui->path_off, ui->path_len);
++ res = !wildmatch(p_str, u_str, 0);
++ free(p_str);
++ free(u_str);
++
++ return res;
++}
++
+static struct allowed_url *url_matches_accept_list(
+ struct string_list *accept_urls, const char *url)
+{
+ struct string_list_item *item;
-+ char *normalized = url_normalize(url, NULL);
++ struct url_info url_info;
++
++ url_info.url = url_normalize(url, &url_info);
+
-+ if (!normalized)
++ if (!url_info.url)
+ return NULL;
+
+ for_each_string_list_item(item, accept_urls) {
+ struct allowed_url *allowed = item->util;
+
-+ if (!wildmatch(allowed->url_pattern, normalized, 0)) {
-+ free(normalized);
++ if (match_one_url(&allowed->pattern_info, &url_info)) {
++ free(url_info.url);
+ return allowed;
+ }
+ }
+
-+ free(normalized);
++ free(url_info.url);
+ return NULL;
+}
+
@@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
+ /*
+ * Even if accept == ACCEPT_NONE, we MUST trust this known
+ * remote to update its token or other such fields if its URL
-+ * matches the acceptFromServerUrl whitelist!
++ * matches the acceptFromServerUrl allowlist!
+ */
+ if (url_matches_accept_list(accept_urls, remote_url))
+ return all_fields_match(advertised, config_info, p);
@@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
static int skip_field_name_prefix(const char *elem, const char *field_name, const char **value)
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
- struct string_list_item *item;
- bool reload_config = false;
- enum accept_promisor accept = accept_from_server(repo);
-- /* Pre-load and validate the acceptFromServerUrl config */
-- (void)accept_from_server_url(repo);
-+ struct string_list *accept_urls = accept_from_server_url(repo);
+ /* Load and validate the acceptFromServerUrl config */
+ load_accept_from_server_url(repo, &accept_urls);
- if (accept == ACCEPT_NONE)
-+ if (accept == ACCEPT_NONE && !accept_urls->nr)
++ if (accept == ACCEPT_NONE && !accept_urls.nr)
return;
/* Parse remote info received */
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
}
- if (should_accept_remote(accept, advertised, &config_info)) {
-+ if (should_accept_remote(accept, advertised, accept_urls, &config_info)) {
++ if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
if (!store_info)
store_info = store_info_new(repo);
if (promisor_store_advertised_fields(advertised, store_info))
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl
check_missing_objects server 1 "$oid"
'
-+test_expect_success "clone with 'None' but URL whitelisted" '
++test_expect_success "clone with 'None' but URL allowlisted" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-+ -c remote.lop.url="$PWD_URL/lop" \
++ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
-+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
-+test_expect_success "clone with 'None' but URL not in whitelist" '
++test_expect_success "clone with 'None' but URL not in allowlist" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-+ -c remote.lop.url="$PWD_URL/lop" \
++ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="https://example.com/*" \
+ --no-local --filter="blob:limit=5k" server client &&
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl
+ initialize_server 1 "$oid"
+'
+
-+test_expect_success "clone with 'None' but URL whitelisted in one pattern out of two" '
++test_expect_success "clone with 'None' but URL allowlisted in one pattern out of two" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-+ -c remote.lop.url="$PWD_URL/lop" \
++ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="https://example.com/*" \
-+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
-+test_expect_success "clone with 'None', URL whitelisted, but client has different URL" '
++test_expect_success "clone with 'None', URL allowlisted, but client has different URL" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ # The client configures "lop" with a different URL (serverTwo) than
+ # what the server advertises (lop). Even though the advertised URL
-+ # matches the whitelist, the remote is rejected because the
++ # matches the allowlist, the remote is rejected because the
+ # configured URL does not match the advertised one.
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
-+ -c remote.lop.url="$PWD_URL/serverTwo" \
++ -c remote.lop.url="$TRASH_DIRECTORY_URL/serverTwo" \
+ -c promisor.acceptfromserver=None \
-+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is not missing on the server
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl
+ initialize_server 1 "$oid"
+'
+
- test_expect_success "clone with promisor.sendFields" '
+ test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
git -C server config promisor.advertise true &&
test_when_finished "rm -rf client" &&
-@@ t/t5710-promisor-remote-capability.sh: test_expect_success "subsequent fetch from a client when promisor.advertise is f
- check_missing_objects server 1 "$oid"
- '
-
-+
-+
- test_done
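The difference between the v1 whole-URL match and the component-wise
match described in the 6/8 commit message above can be sketched in a
few lines. This is only an illustration, not the code from the patch:
it uses POSIX fnmatch() as a stand-in for Git's internal wildmatch()
(with FNM_PATHNAME playing the role of WM_PATHNAME), and it assumes
the caller has already split the URL into host and path. The function
names match_whole() and match_split() are hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>

/*
 * Whole-URL matching, roughly what v1 did: with no flags, '*' is
 * free to match '/' and so can run from the host into the path.
 */
bool match_whole(const char *pattern, const char *url)
{
	return !fnmatch(pattern, url, 0);
}

/*
 * Component-wise matching, roughly what v2 does: the host pattern is
 * only ever compared with the host, so a '*' in it cannot swallow
 * part of the path. The path is matched without FNM_PATHNAME so that
 * '*' can still span several path levels.
 */
bool match_split(const char *pat_host, const char *pat_path,
		 const char *host, const char *path)
{
	if (fnmatch(pat_host, host, FNM_PATHNAME))
		return false;
	return !fnmatch(pat_path, path, 0);
}
```

With these helpers, the pattern `https://*.example.com/*` matches
`https://evil-hacker.net/fake.example.com/repo` under match_whole(),
while match_split() rejects it because the host `evil-hacker.net` does
not match `*.example.com`. Scheme and port comparison are left out of
the sketch.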
15: 314150a860 ! 7: a077f33df4 promisor-remote: auto-configure unknown remotes
@@ Commit message
promisor-remote: auto-configure unknown remotes
Previous commits have introduced the `promisor.acceptFromServerUrl`
- config variable to whitelist some URLs advertised by a server through
+ config variable to allowlist some URLs advertised by a server through
the "promisor-remote" protocol capability.
However the new `promisor.acceptFromServerUrl` mechanism, like the old
@@ Commit message
## Documentation/config/promisor.adoc ##
@@ Documentation/config/promisor.adoc: promisor.acceptFromServer::
-
promisor.acceptFromServerUrl::
- A glob pattern to specify which URLs advertised by a server
-- are considered trusted by the client. This option acts as an
-- additive security whitelist that works in conjunction with
-- `promisor.acceptFromServer`.
-+ are allowed to be auto-configured (created and persisted) on
-+ the client side. Unlike `promisor.acceptFromServer`, which
-+ only accepts already configured remotes, a match against this
-+ option instructs Git to write a new `[remote "<name>"]`
-+ section to the client's configuration.
+ A glob pattern to specify which server-advertised URLs a
+ client is allowed to act on. When a URL matches, the client
+- will accept the advertised remote as a promisor remote and may
++ will accept the advertised remote as a promisor remote, may
++ automatically create a new remote configuration for it and may
+ automatically accept field updates (such as authentication
+ tokens) from the server, even if `promisor.acceptFromServer`
+ is set to `none` (the default).
+@@ Documentation/config/promisor.adoc: this option in _ANY_ config file read by Git.
+ Be _VERY_ careful with these patterns: `*` matches any sequence of
+ characters within the 'host' and 'path' parts of a URL (but cannot
+ cross part boundaries). An overly broad pattern is a major security
+-risk, as a matching URL allows a server to update fields (such as
+-authentication tokens) on known remotes without further confirmation.
+-To minimize security risks, follow these guidelines:
++risk, as a matching URL allows a server to auto-configure new remotes
++and to update fields (such as authentication tokens) on known remotes
++without further confirmation. To minimize security risks, follow these
++guidelines:
+ +
+ 1. Start with a secure protocol scheme, like `https://` or `ssh://`.
+
- This option can appear multiple times in config files. An advertised
- URL will be accepted if it matches _ANY_ glob pattern specified by
-@@ Documentation/config/promisor.adoc: possible, and path segments like `..` are resolved. Glob patterns
- are matched against this normalized URL as-is, so patterns should
- be written in normalized form (e.g., lowercase scheme and host).
+@@ Documentation/config/promisor.adoc: are resolved. The port must also match exactly (e.g.,
+ `https://example.com:8080/*` will not match a URL advertised on
+ port 9999).
+
--Even if `promisor.acceptFromServer` is set to `None` (the default),
--Git will still accept field updates (like tokens) for known remotes,
--provided their URLs match a pattern in
--`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
--details on the protocol.
+The glob pattern can optionally be prefixed with a remote name and an
+equals sign (e.g., `cdn=https://cdn.example.com/*`). If such a prefix
+is provided, accepted remotes will be saved under that name. If no
+such prefix is provided, a safe remote name will be automatically
+generated by sanitizing the URL and prefixing it with
-+`promisor-auto-`. If a remote with the chosen name already exists but
-+points to a different URL, Git will append a numeric suffix (e.g.,
-+`-1`, `-2`) to the name to prevent overwriting existing
-+configurations. You should make sure that this doesn't happen often
-+though, as remotes will be rejected if the numeric suffix increases
-+too much. In all cases, the original name advertised by the server is
-+recorded in the `remote.<name>.advertisedAs` configuration variable
-+for tracing and debugging purposes.
++`promisor-auto-`.
++
-+Note that this option acts as an additive security whitelist. It works
-+in conjunction with `promisor.acceptFromServer` (see the documentation
-+of that option for the implications of accepting a promisor
-+remote). Even if `promisor.acceptFromServer` is set to `None` (the
-+default), Git will still automatically configure new remotes, and
-+accept field updates (like tokens) for known remotes, provided their
-+URLs match a pattern in `promisor.acceptFromServerUrl`. See
-+linkgit:gitprotocol-v2[5] for details on the protocol.
-
- promisor.checkFields::
- A comma or space separated list of additional remote related
++If a remote with the chosen name already exists but points to a
++different URL, Git will append a numeric suffix (e.g., `-1`, `-2`) to
++the name to prevent overwriting existing configurations. You should
++make sure that this doesn't happen often though, as remotes will be
++rejected if the numeric suffix increases too much. In all cases, the
++original name advertised by the server is recorded in the
++`remote.<name>.advertisedAs` configuration variable for tracing and
++debugging purposes.
+++
+ For the security implications of accepting a promisor remote, see the
+ documentation of `promisor.acceptFromServer`. For details on the
+ protocol, see linkgit:gitprotocol-v2[5].
## Documentation/config/remote.adoc ##
@@ Documentation/config/remote.adoc: remote.<name>.promisor::
@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
-- if (should_accept_remote(accept, advertised, accept_urls, &config_info)) {
-+ if (should_accept_remote(repo, accept, advertised, accept_urls,
+- if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
++ if (should_accept_remote(repo, accept, advertised, &accept_urls,
+ &config_info, &reload_config)) {
if (!store_info)
store_info = store_info_new(repo);
if (promisor_store_advertised_fields(advertised, store_info))
## t/t5710-promisor-remote-capability.sh ##
-@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', URL whitelisted, but client has differen
+@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', URL allowlisted, but client has differen
initialize_server 1 "$oid"
'
-+test_expect_success "clone with URL whitelisted and no remote already configured" '
++test_expect_success "clone with URL allowlisted and no remote already configured" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
++ test_when_finished "rm -f full_names" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c promisor.acceptfromserver=None \
-+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
-+ # Check that a remote has been auto-created with the right fields.
-+ # The remote is identified by "remote.<name>.advertisedAs" == "lop".
-+ FULL_NAME=$(git -C client config --name-only --get-regexp "remote\..*\.advertisedas" "^lop$") &&
-+ REMOTE_NAME=$(echo "$FULL_NAME" | sed "s/remote\.\(.*\)\.advertisedas/\1/") &&
++ # Check that exactly one remote has been auto-created, identified
++ # by "remote.<name>.advertisedAs" == "lop".
++ git -C client config get --all --show-names --regexp \
++ "remote\..*\.advertisedas" >full_names &&
++ test_line_count = 1 full_names &&
++ REMOTE_NAME=$(sed "s/^remote\.\(.*\)\.advertisedas .*$/\1/" full_names) &&
+
+ # Check ".url" and ".promisor" values
-+ printf "%s\n" "$PWD_URL/lop" "true" >expect &&
++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" >expect &&
+ git -C client config "remote.$REMOTE_NAME.url" >actual &&
+ git -C client config "remote.$REMOTE_NAME.promisor" >>actual &&
+ test_cmp expect actual &&
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
+ check_missing_objects server 1 "$oid"
+'
+
-+test_expect_success "clone with named URL whitelisted and no pre-configured remote" '
++test_expect_success "clone with named URL allowlisted and no pre-configured remote" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c promisor.acceptfromserver=None \
-+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that a remote has been auto-created with the right "cdn" name and fields.
-+ printf "%s\n" "$PWD_URL/lop" "true" "lop" >expect &&
++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
+ git -C client config "remote.cdn.url" >actual &&
+ git -C client config "remote.cdn.promisor" >>actual &&
+ git -C client config "remote.cdn.advertisedAs" >>actual &&
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
+ check_missing_objects server 1 "$oid"
+'
+
-+test_expect_success "clone with URL whitelisted but colliding name" '
++test_expect_success "clone with URL allowlisted but colliding name" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
+ -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.cdn.url="https://example.com/cdn" \
+ -c promisor.acceptfromserver=None \
-+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that a remote has been auto-created with the right "cdn-1" name and fields.
-+ printf "%s\n" "$PWD_URL/lop" "true" "lop" >expect &&
++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
+ git -C client config "remote.cdn-1.url" >actual &&
+ git -C client config "remote.cdn-1.promisor" >>actual &&
+ git -C client config "remote.cdn-1.advertisedAs" >>actual &&
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
+ check_missing_objects server 1 "$oid"
+'
+
-+test_expect_success "clone with URL whitelisted and reusable remote" '
++test_expect_success "clone with URL allowlisted and reusable remote" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
-+ -c remote.cdn.url="$PWD_URL/lop" \
++ -c remote.cdn.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
-+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
++ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the existing "cdn" remote has been properly updated.
-+ printf "%s\n" "$PWD_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
+ git -C client config "remote.cdn.url" >actual &&
+ git -C client config "remote.cdn.promisor" >>actual &&
+ git -C client config "remote.cdn.advertisedAs" >>actual &&
@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
+ check_missing_objects server 1 "$oid"
+'
+
-+test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
-+ git -C server config promisor.advertise true &&
-+ test_when_finished "rm -rf client" &&
-+
-+ # As "bad name" contains a space, which is not a valid remote name,
-+ # the pattern should be rejected with a warning and no remote created.
-+ GIT_NO_LAZY_FETCH=0 git clone \
-+ -c promisor.acceptfromserver=None \
-+ -c "promisor.acceptFromServerUrl=bad name=https://example.com/*" \
-+ --no-local --filter="blob:limit=5k" server client 2>err &&
-+
-+ # Check that a warning was emitted
-+ test_grep "invalid remote name '\''bad name'\''" err &&
-+
-+ # Check that no remote was auto-created
-+ test_must_fail git -C client config --get-regexp "remote\..*\.advertisedas" &&
-+
-+ # Check that the largest object is not missing on the server
-+ check_missing_objects server 0 "" &&
-+
-+ # Reinitialize server so that the largest object is missing again
-+ initialize_server 1 "$oid"
-+'
-+
- test_expect_success "clone with promisor.sendFields" '
+ test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
git -C server config promisor.advertise true &&
test_when_finished "rm -rf client" &&
+@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
+ # Check that a warning was emitted
+ test_grep "invalid remote name '\''bad name'\''" err &&
+
++ # Check that no remote was auto-created
++ test_must_fail git -C client config get --regexp "remote\..*\.advertisedas" &&
++
+ # Check that the largest object is not missing on the server
+ check_missing_objects server 0 "" &&
+
16: 20f70b52bb ! 8: b68b9497aa doc: promisor: improve acceptFromServer entry
@@ Documentation/config/promisor.adoc: variable is set to "true", and the "name" an
+for protocol details.
promisor.acceptFromServerUrl::
- A glob pattern to specify which URLs advertised by a server
+ A glob pattern to specify which server-advertised URLs a
Christian Couder (8):
t5710: simplify 'mkdir X' followed by 'git -C X init'
urlmatch: change 'allow_globs' arg to bool
urlmatch: add url_normalize_pattern() helper
promisor-remote: add 'local_name' to 'struct promisor_info'
promisor-remote: introduce promisor.acceptFromServerUrl
promisor-remote: trust known remotes matching acceptFromServerUrl
promisor-remote: auto-configure unknown remotes
doc: promisor: improve acceptFromServer entry
Documentation/config/promisor.adoc | 123 ++++++--
Documentation/config/remote.adoc | 9 +
Documentation/gitprotocol-v2.adoc | 9 +-
promisor-remote.c | 410 ++++++++++++++++++++++++--
t/t5710-promisor-remote-capability.sh | 202 ++++++++++++-
urlmatch.c | 11 +-
urlmatch.h | 12 +
7 files changed, 730 insertions(+), 46 deletions(-)
--
2.54.0.19.gb68b9497aa
* [PATCH v2 1/8] t5710: simplify 'mkdir X' followed by 'git -C X init'
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-04-27 12:41 ` [PATCH v2 2/8] urlmatch: change 'allow_globs' arg to bool Christian Couder
` (7 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
It's simpler and more efficient to just use `git init client` instead
of `mkdir client && git -C client init`.
So let's replace the latter with the former.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
t/t5710-promisor-remote-capability.sh | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index b404ad9f0a..bf1cc54605 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -177,8 +177,7 @@ test_expect_success "init + fetch with promisor.advertise set to 'true'" '
git -C server config promisor.advertise true &&
test_when_finished "rm -rf client" &&
- mkdir client &&
- git -C client init &&
+ git init client &&
git -C client config remote.lop.promisor true &&
git -C client config remote.lop.fetch "+refs/heads/*:refs/remotes/lop/*" &&
git -C client config remote.lop.url "$TRASH_DIRECTORY_URL/lop" &&
@@ -231,8 +230,7 @@ test_expect_success "init + fetch two promisors but only one advertised" '
# Create a promisor that will be configured but not be used
git init --bare unused_lop &&
- mkdir client &&
- git -C client init &&
+ git init client &&
git -C client config remote.unused_lop.promisor true &&
git -C client config remote.unused_lop.fetch "+refs/heads/*:refs/remotes/unused_lop/*" &&
git -C client config remote.unused_lop.url "$TRASH_DIRECTORY_URL/unused_lop" &&
--
2.54.0.19.gb68b9497aa
* [PATCH v2 2/8] urlmatch: change 'allow_globs' arg to bool
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
2026-04-27 12:41 ` [PATCH v2 1/8] t5710: simplify 'mkdir X' followed by 'git -C X init' Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-04-27 12:41 ` [PATCH v2 3/8] urlmatch: add url_normalize_pattern() helper Christian Couder
` (6 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
The last argument of url_normalize_1() is `char allow_globs` but it is
used as a boolean, not as a char.
Let's convert it to a `bool`, and while at it convert the two calls to
url_normalize_1() so they pass 'true' or 'false' instead of '1' or '0'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
urlmatch.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/urlmatch.c b/urlmatch.c
index eea8300489..989bc7eb8b 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -111,7 +111,7 @@ static int match_host(const struct url_info *url_info,
return (!url_len && !pat_len);
}
-static char *url_normalize_1(const char *url, struct url_info *out_info, char allow_globs)
+static char *url_normalize_1(const char *url, struct url_info *out_info, bool allow_globs)
{
/*
* Normalize NUL-terminated url using the following rules:
@@ -437,7 +437,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
char *url_normalize(const char *url, struct url_info *out_info)
{
- return url_normalize_1(url, out_info, 0);
+ return url_normalize_1(url, out_info, false);
}
static size_t url_match_prefix(const char *url,
@@ -577,7 +577,7 @@ int urlmatch_config_entry(const char *var, const char *value,
struct url_info norm_info;
config_url = xmemdupz(key, dot - key);
- norm_url = url_normalize_1(config_url, &norm_info, 1);
+ norm_url = url_normalize_1(config_url, &norm_info, true);
if (norm_url)
retval = match_urls(url, &norm_info, &matched);
else if (collect->fallback_match_fn)
--
2.54.0.19.gb68b9497aa
* [PATCH v2 3/8] urlmatch: add url_normalize_pattern() helper
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
2026-04-27 12:41 ` [PATCH v2 1/8] t5710: simplify 'mkdir X' followed by 'git -C X init' Christian Couder
2026-04-27 12:41 ` [PATCH v2 2/8] urlmatch: change 'allow_globs' arg to bool Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-04-27 12:41 ` [PATCH v2 4/8] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
` (5 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
In a following commit, we will need to normalize a URL glob pattern
(which may contain '*' in the host portion) and extract its component
offsets (host, path, etc.) for separate matching. Let's export a
dedicated helper function url_normalize_pattern() for that purpose.
It works like url_normalize(), but passes allow_globs=true to the
internal url_normalize_1(), so that '*' characters in the host are
accepted rather than rejected.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
urlmatch.c | 5 +++++
urlmatch.h | 12 ++++++++++++
2 files changed, 17 insertions(+)
diff --git a/urlmatch.c b/urlmatch.c
index 989bc7eb8b..7e734e2660 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -440,6 +440,11 @@ char *url_normalize(const char *url, struct url_info *out_info)
return url_normalize_1(url, out_info, false);
}
+char *url_normalize_pattern(const char *url, struct url_info *out_info)
+{
+ return url_normalize_1(url, out_info, true);
+}
+
static size_t url_match_prefix(const char *url,
const char *url_prefix,
size_t url_prefix_len)
diff --git a/urlmatch.h b/urlmatch.h
index 5ba85cea13..32c5067f9b 100644
--- a/urlmatch.h
+++ b/urlmatch.h
@@ -36,6 +36,18 @@ struct url_info {
char *url_normalize(const char *, struct url_info *);
+/*
+ * Like url_normalize(), but also allows '*' glob characters in the host
+ * portion. Use this when normalizing URL patterns from user configuration.
+ *
+ * Note that '*' is a valid path character per RFC 3986 (as a sub-delim),
+ * so glob patterns using '*' in the path are also accepted.
+ *
+ * Returns a newly allocated normalized string and fills out_info if
+ * non-NULL, or NULL if the pattern is invalid.
+ */
+char *url_normalize_pattern(const char *url, struct url_info *out_info);
+
struct urlmatch_item {
size_t hostmatch_len;
size_t pathmatch_len;
--
2.54.0.19.gb68b9497aa
* [PATCH v2 4/8] promisor-remote: add 'local_name' to 'struct promisor_info'
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
` (2 preceding siblings ...)
2026-04-27 12:41 ` [PATCH v2 3/8] urlmatch: add url_normalize_pattern() helper Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-05-04 11:46 ` Toon Claes
2026-04-27 12:41 ` [PATCH v2 5/8] promisor-remote: introduce promisor.acceptFromServerUrl Christian Couder
` (4 subsequent siblings)
8 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
In a following commit, we will store promisor remote information under
a remote name different than the one the server advertised.
To prepare for this change, let's add a new 'char *local_name' member
to 'struct promisor_info', and let's update the related functions.
While at it, let's also add a small promisor_info_internal_name()
helper that returns `local_name` when set, `name` otherwise, and let's
use this small helper in promisor_store_advertised_fields() and in the
post-loop of filter_promisor_remote() so that lookups against the local
repo configuration use the right name.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/promisor-remote.c b/promisor-remote.c
index 38fa050542..7699e259eb 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -434,13 +434,14 @@ static struct string_list *fields_stored(void)
* Struct for promisor remotes involved in the "promisor-remote"
* protocol capability.
*
- * Except for "name", each <member> in this struct and its <value>
- * should correspond (either on the client side or on the server side)
- * to a "remote.<name>.<member>" config variable set to <value> where
- * "<name>" is a promisor remote name.
+ * Except for "name" and "local_name", each <member> in this struct
+ * and its <value> should correspond (either on the client side or on
+ * the server side) to a "remote.<name>.<member>" config variable set
+ * to <value> where "<name>" is a promisor remote name.
*/
struct promisor_info {
- const char *name;
+ const char *name; /* name the server advertised */
+ const char *local_name; /* name used locally (may be auto-generated) */
const char *url;
const char *filter;
const char *token;
@@ -449,6 +450,7 @@ struct promisor_info {
static void promisor_info_free(struct promisor_info *p)
{
free((char *)p->name);
+ free((char *)p->local_name);
free((char *)p->url);
free((char *)p->filter);
free((char *)p->token);
@@ -462,6 +464,11 @@ static void promisor_info_list_clear(struct string_list *list)
string_list_clear(list, 0);
}
+static const char *promisor_info_internal_name(struct promisor_info *p)
+{
+ return p->local_name ? p->local_name : p->name;
+}
+
static void set_one_field(struct promisor_info *p,
const char *field, const char *value)
{
@@ -829,7 +836,7 @@ static bool promisor_store_advertised_fields(struct promisor_info *advertised,
{
struct promisor_info *p;
struct string_list_item *item;
- const char *remote_name = advertised->name;
+ const char *remote_name = promisor_info_internal_name(advertised);
bool reload_config = false;
if (!(store_info->store_filter || store_info->store_token))
@@ -937,7 +944,8 @@ static void filter_promisor_remote(struct repository *repo,
/* Apply accepted remotes to the stable repo state */
for_each_string_list_item(item, accepted_remotes) {
struct promisor_info *info = item->util;
- struct promisor_remote *r = repo_promisor_remote_find(repo, info->name);
+ const char *local = promisor_info_internal_name(info);
+ struct promisor_remote *r = repo_promisor_remote_find(repo, local);
if (r) {
r->accepted = 1;
--
2.54.0.19.gb68b9497aa
* Re: [PATCH v2 4/8] promisor-remote: add 'local_name' to 'struct promisor_info'
2026-04-27 12:41 ` [PATCH v2 4/8] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
@ 2026-05-04 11:46 ` Toon Claes
0 siblings, 0 replies; 80+ messages in thread
From: Toon Claes @ 2026-05-04 11:46 UTC (permalink / raw)
To: Christian Couder, git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Christian Couder <christian.couder@gmail.com> writes:
> In a following commit, we will store promisor remote information under
> a remote name different than the one the server advertised.
>
> To prepare for this change, let's add a new 'char *local_name' member
> to 'struct promisor_info', and let's update the related functions.
>
> While at it, let's also add a small promisor_info_internal_name()
> helper that returns `local_name` when set, `name` otherwise, and let's
> use this small helper in promisor_store_advertised_fields() and in the
> post-loop of filter_promisor_remote() so that lookups against the local
> repo configuration use the right name.
It seems the `local_name` doesn't get filled in yet, so because
promisor_info_internal_name() falls back to `name` there is no
functional change in this commit. Okay.
--
Cheers,
Toon
* [PATCH v2 5/8] promisor-remote: introduce promisor.acceptFromServerUrl
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
` (3 preceding siblings ...)
2026-04-27 12:41 ` [PATCH v2 4/8] promisor-remote: add 'local_name' to 'struct promisor_info' Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-04-27 12:41 ` [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
` (3 subsequent siblings)
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
The "promisor-remote" protocol capability allows servers to advertise
promisor remotes, but doesn't allow these remotes to be automatically
configured on the client.
Let's introduce a new `promisor.acceptFromServerUrl` config variable
which contains a glob pattern, so that advertised remotes with a URL
matching that pattern will be automatically configured.
The glob pattern can optionally be prefixed with a remote name which
will be used as the name of the new local remote.
For now though, let's only introduce the functions to read and validate
the glob patterns and the optional prefixes.
Checking if the URLs of the advertised remotes match the glob patterns
and taking the appropriate action is left for a following commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
promisor-remote.c | 90 +++++++++++++++++++++++++++
t/t5710-promisor-remote-capability.sh | 21 +++++++
2 files changed, 111 insertions(+)
diff --git a/promisor-remote.c b/promisor-remote.c
index 7699e259eb..3f3924f587 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -12,6 +12,7 @@
#include "packfile.h"
#include "environment.h"
#include "url.h"
+#include "urlmatch.h"
#include "version.h"
struct promisor_remote_config {
@@ -657,6 +658,90 @@ static bool has_control_char(const char *s)
return false;
}
+struct allowed_url {
+ char *remote_name;
+ char *url_pattern;
+ struct url_info pattern_info;
+};
+
+static void allowed_url_free(void *util, const char *str UNUSED)
+{
+ struct allowed_url *allowed = util;
+
+ if (!allowed)
+ return;
+
+ /* Depending on prefix, free either remote_name or url_pattern */
+ free(allowed->remote_name ? allowed->remote_name : allowed->url_pattern);
+ free(allowed->pattern_info.url);
+ free(allowed);
+}
+
+static struct allowed_url *valid_accept_url(const char *url)
+{
+ char *dup, *p;
+ struct allowed_url *allowed;
+
+ if (!url)
+ return NULL;
+
+ dup = xstrdup(url);
+ p = strchr(dup, '=');
+ if (p) {
+ *p = '\0';
+ if (!valid_remote_name(dup)) {
+ warning(_("invalid remote name '%s' before '=' sign "
+ "in '%s' from promisor.acceptFromServerUrl config"),
+ dup, url);
+ free(dup);
+ return NULL;
+ }
+ p++;
+ } else {
+ p = dup;
+ }
+
+ if (has_control_char(p)) {
+ warning(_("invalid url pattern '%s' "
+ "in '%s' from promisor.acceptFromServerUrl config"), p, url);
+ free(dup);
+ return NULL;
+ }
+
+ allowed = xmalloc(sizeof(*allowed));
+ allowed->remote_name = (p == dup) ? NULL : dup;
+ allowed->url_pattern = p;
+ allowed->pattern_info.url = url_normalize_pattern(p, &allowed->pattern_info);
+ if (!allowed->pattern_info.url) {
+ warning(_("invalid url pattern '%s' "
+ "in '%s' from promisor.acceptFromServerUrl config"), p, url);
+ free(dup);
+ free(allowed);
+ return NULL;
+ }
+
+ return allowed;
+}
+
+static void load_accept_from_server_url(struct repository *repo,
+ struct string_list *accept_urls)
+{
+ const struct string_list *config_urls;
+
+ if (!repo_config_get_string_multi(repo, "promisor.acceptfromserverurl", &config_urls)) {
+ struct string_list_item *item;
+
+ for_each_string_list_item(item, config_urls) {
+ struct allowed_url *allowed = valid_accept_url(item->string);
+ if (allowed) {
+ struct string_list_item *new;
+ new = string_list_append(accept_urls, item->string);
+ new->util = allowed;
+ }
+ }
+ }
+}
+
static int should_accept_remote(enum accept_promisor accept,
struct promisor_info *advertised,
struct string_list *config_info)
@@ -901,6 +986,10 @@ static void filter_promisor_remote(struct repository *repo,
struct string_list_item *item;
bool reload_config = false;
enum accept_promisor accept = accept_from_server(repo);
+ struct string_list accept_urls = STRING_LIST_INIT_DUP;
+
+ /* Load and validate the acceptFromServerUrl config */
+ load_accept_from_server_url(repo, &accept_urls);
if (accept == ACCEPT_NONE)
return;
@@ -934,6 +1023,7 @@ static void filter_promisor_remote(struct repository *repo,
}
}
+ string_list_clear_func(&accept_urls, allowed_url_free);
promisor_info_list_clear(&config_info);
string_list_clear(&remote_info, 0);
store_info_free(store_info);
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index bf1cc54605..3b39505380 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -387,6 +387,27 @@ test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ # As "bad name" contains a space, which is not a valid remote name,
+ # the pattern should be rejected with a warning and no remote created.
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c promisor.acceptfromserver=None \
+ -c "promisor.acceptFromServerUrl=bad name=https://example.com/*" \
+ --no-local --filter="blob:limit=5k" server client 2>err &&
+
+ # Check that a warning was emitted
+ test_grep "invalid remote name '\''bad name'\''" err &&
+
+ # Check that the largest object is not missing on the server
+ check_missing_objects server 0 "" &&
+
+ # Reinitialize server so that the largest object is missing again
+ initialize_server 1 "$oid"
+'
+
test_expect_success "clone with promisor.sendFields" '
git -C server config promisor.advertise true &&
test_when_finished "rm -rf client" &&
--
2.54.0.19.gb68b9497aa
* [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
` (4 preceding siblings ...)
2026-04-27 12:41 ` [PATCH v2 5/8] promisor-remote: introduce promisor.acceptFromServerUrl Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-05-08 12:45 ` Toon Claes
2026-05-11 13:10 ` Toon Claes
2026-04-27 12:41 ` [PATCH v2 7/8] promisor-remote: auto-configure unknown remotes Christian Couder
` (2 subsequent siblings)
8 siblings, 2 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
A previous commit introduced the `promisor.acceptFromServerUrl` config
variable along with the machinery to parse and validate the URL glob
patterns and optional remote name prefixes it contains. However, these
URL patterns are not yet tied into the client's acceptance logic.
When a promisor remote is already configured locally, its fields (like
authentication tokens) may occasionally need to be refreshed by the
server. If `promisor.acceptFromServer` is set to the secure default
("None"), these updates are rejected, potentially causing future
fetches to fail.
To enable such targeted updates for trusted URLs, let's use the URL
patterns from `promisor.acceptFromServerUrl` as an additional URL
based allowlist.
Concretely, let's check the advertised URLs against the URL glob
patterns by introducing a new small helper function called
url_matches_accept_list(), which iterates over the glob patterns and
returns the first matching allowed_url entry (or NULL).
The URL matching is done component by component: scheme and port are
compared exactly, the host is matched with wildmatch() using the
WM_PATHNAME flag (so '*' cannot cross the '/' boundary into the path),
and the path is matched with wildmatch() without WM_PATHNAME (so '*'
can still match multi-level paths). Before matching, the advertised
URL is passed through url_normalize() so that case variations in the
scheme/host, percent-encoding tricks, and ".." path segments cannot
bypass the allowlist.
Let's then use this helper at the tail of should_accept_remote() so
that, when `accept == ACCEPT_NONE`, a known remote whose URL matches
the allowlist is still accepted.
To prepare for this new logic, let's also:
- Add an 'accept_urls' parameter to should_accept_remote().
- Replace the BUG() guard in the ACCEPT_KNOWN_URL case with an
explicit 'if (accept == ACCEPT_KNOWN_URL) return' and a new
BUG() guard in the ACCEPT_NONE case, so url_matches_accept_list()
is only called in the ACCEPT_NONE case.
- Relax the early return in filter_promisor_remote(), which already
calls load_accept_from_server_url(), so that the function is entered
when `accept_urls` has entries even if `accept == ACCEPT_NONE`.
With this, many organizations may only need something like:
git config set --global \
promisor.acceptFromServerUrl "https://my-org.com/*"
to accept only their own remotes. And if they need to accept additional
remotes in some specific repos, they can also set:
git config set promisor.acceptFromServer knownUrl
and configure the additional remote manually only in the repos where
they are needed.
Let's then properly document `promisor.acceptFromServerUrl` in
"promisor.adoc" as an additive security allowlist for known remotes,
including the URL normalization behavior and the component-wise
matching, and let's mention it in "gitprotocol-v2.adoc".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/config/promisor.adoc | 52 ++++++++++++++
Documentation/gitprotocol-v2.adoc | 9 +--
promisor-remote.c | 98 +++++++++++++++++++++++++--
t/t5710-promisor-remote-capability.sh | 71 +++++++++++++++++++
4 files changed, 220 insertions(+), 10 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index b0fa43b839..efc066c3f2 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -51,6 +51,58 @@ promisor.acceptFromServer::
to "fetch" and "clone" requests from the client. Name and URL
comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
+promisor.acceptFromServerUrl::
+ A glob pattern to specify which server-advertised URLs a
+ client is allowed to act on. When a URL matches, the client
+ will accept the advertised remote as a promisor remote and may
+ automatically accept field updates (such as authentication
+ tokens) from the server, even if `promisor.acceptFromServer`
+ is set to `none` (the default).
++
+This option can appear multiple times in config files. An advertised
+URL will be accepted if it matches _ANY_ glob pattern specified by
+this option in _ANY_ config file read by Git.
++
+Be _VERY_ careful with these patterns: `*` matches any sequence of
+characters within the 'host' and 'path' parts of a URL (but cannot
+cross part boundaries). An overly broad pattern is a major security
+risk, as a matching URL allows a server to update fields (such as
+authentication tokens) on known remotes without further confirmation.
+To minimize security risks, follow these guidelines:
++
+1. Start with a secure protocol scheme, like `https://` or `ssh://`.
++
+2. Only allow domain names or paths where you control and trust _ALL_
+ the content. Be especially careful with shared hosting platforms
+ like `github.com` or `gitlab.com`. A broad pattern like
+ `https://gitlab.com/*` is dangerous because it trusts every
+ repository on the entire platform. Always restrict such patterns to
+ your specific organization or namespace (e.g.,
+ `https://gitlab.com/your-org/*`).
++
+3. Never use globs at the end of domain names. For example,
+ `https://cdn.your-org.com/*` might be safe, but
+ `https://cdn.your-org.com*/*` is a major security risk because
+ the latter matches `https://cdn.your-org.com.hacker.net/repo`.
++
+4. Be careful using globs at the beginning of domain names. While the
+ code ensures a `*` in the host cannot cross into the path, a
+ pattern like `https://*.example.com/*` will still match any
+ subdomain. This is extremely dangerous on shared hosting platforms
+ (e.g., `https://*.github.io/*` trusts every user's site on the
+ entire platform).
++
+Before matching, both the advertised URL and the pattern are
+normalized: the scheme and host are lowercased, percent-encoded
+characters are decoded where possible, and path segments like `..`
+are resolved. The port must also match exactly (e.g.,
+`https://example.com:8080/*` will not match a URL advertised on
+port 9999).
++
+For the security implications of accepting a promisor remote, see the
+documentation of `promisor.acceptFromServer`. For details on the
+protocol, see linkgit:gitprotocol-v2[5].
+
promisor.checkFields::
A comma or space separated list of additional remote related
field names. A client checks if the values of these fields
diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
index befa697d21..2beb70595f 100644
--- a/Documentation/gitprotocol-v2.adoc
+++ b/Documentation/gitprotocol-v2.adoc
@@ -866,10 +866,11 @@ the server advertised, the client shouldn't advertise the
On the server side, the "promisor.advertise" and "promisor.sendFields"
configuration options can be used to control what it advertises. On
-the client side, the "promisor.acceptFromServer" configuration option
-can be used to control what it accepts, and the "promisor.storeFields"
-option, to control what it stores. See the documentation of these
-configuration options in linkgit:git-config[1] for more information.
+the client side, the "promisor.acceptFromServer" and
+"promisor.acceptFromServerUrl" configuration options can be used to
+control what it accepts, and the "promisor.storeFields" option, to
+control what it stores. See the documentation of these configuration
+options in linkgit:git-config[1] for more information.
Note that in the future it would be nice if the "promisor-remote"
protocol capability could be used by the server, when responding to
diff --git a/promisor-remote.c b/promisor-remote.c
index 3f3924f587..72d5b94bf7 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -14,6 +14,7 @@
#include "url.h"
#include "urlmatch.h"
#include "version.h"
+#include "wildmatch.h"
struct promisor_remote_config {
struct promisor_remote *promisors;
@@ -742,8 +743,82 @@ static void load_accept_from_server_url(struct repository *repo,
}
}
+static bool match_one_url(const struct url_info *pi, const struct url_info *ui)
+{
+ const char *pat = pi->url;
+ const char *url = ui->url;
+ char *p_str, *u_str;
+ bool res;
+
+ /*
+ * Schemes must match exactly. They are case-folded by
+ * url_normalize(), so strncmp() suffices.
+ */
+ if (pi->scheme_len != ui->scheme_len || strncmp(pat, url, pi->scheme_len))
+ return false;
+
+ /*
+ * Ports must match exactly. url_normalize() strips default
+ * ports (like 443 for https), so length and content
+ * comparisons are sufficient.
+ */
+ if (pi->port_len != ui->port_len ||
+ strncmp(pat + pi->port_off, url + ui->port_off, pi->port_len))
+ return false;
+
+ /*
+ * Match host and path separately to prevent a '*' in the host
+ * portion of the pattern from matching across the '/'
+ * boundary into the path. Use WM_PATHNAME for the host so '*'
+ * cannot cross '/' there, and 0 for the path so '*' can still
+ * match multi-level paths.
+ */
+
+ p_str = xstrndup(pat + pi->host_off, pi->host_len);
+ u_str = xstrndup(url + ui->host_off, ui->host_len);
+ res = !wildmatch(p_str, u_str, WM_PATHNAME);
+ free(p_str);
+ free(u_str);
+
+ if (!res)
+ return false;
+
+ p_str = xstrndup(pat + pi->path_off, pi->path_len);
+ u_str = xstrndup(url + ui->path_off, ui->path_len);
+ res = !wildmatch(p_str, u_str, 0);
+ free(p_str);
+ free(u_str);
+
+ return res;
+}
+
+static struct allowed_url *url_matches_accept_list(
+ struct string_list *accept_urls, const char *url)
+{
+ struct string_list_item *item;
+ struct url_info url_info;
+
+ url_info.url = url_normalize(url, &url_info);
+
+ if (!url_info.url)
+ return NULL;
+
+ for_each_string_list_item(item, accept_urls) {
+ struct allowed_url *allowed = item->util;
+
+ if (match_one_url(&allowed->pattern_info, &url_info)) {
+ free(url_info.url);
+ return allowed;
+ }
+ }
+
+ free(url_info.url);
+ return NULL;
+}
+
static int should_accept_remote(enum accept_promisor accept,
struct promisor_info *advertised,
+ struct string_list *accept_urls,
struct string_list *config_info)
{
struct promisor_info *p;
@@ -771,9 +846,6 @@ static int should_accept_remote(enum accept_promisor accept,
if (accept == ACCEPT_KNOWN_NAME)
return all_fields_match(advertised, config_info, p);
- if (accept != ACCEPT_KNOWN_URL)
- BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
-
if (strcmp(p->url, remote_url)) {
warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
"ignoring this remote"),
@@ -781,7 +853,21 @@ static int should_accept_remote(enum accept_promisor accept,
return 0;
}
- return all_fields_match(advertised, config_info, p);
+ if (accept == ACCEPT_KNOWN_URL)
+ return all_fields_match(advertised, config_info, p);
+
+ if (accept != ACCEPT_NONE)
+ BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
+
+ /*
+ * Even if accept == ACCEPT_NONE, we MUST trust this known
+ * remote to update its token or other such fields if its URL
+ * matches the acceptFromServerUrl allowlist!
+ */
+ if (url_matches_accept_list(accept_urls, remote_url))
+ return all_fields_match(advertised, config_info, p);
+
+ return 0;
}
static int skip_field_name_prefix(const char *elem, const char *field_name, const char **value)
@@ -991,7 +1077,7 @@ static void filter_promisor_remote(struct repository *repo,
/* Load and validate the acceptFromServerUrl config */
load_accept_from_server_url(repo, &accept_urls);
- if (accept == ACCEPT_NONE)
+ if (accept == ACCEPT_NONE && !accept_urls.nr)
return;
/* Parse remote info received */
@@ -1011,7 +1097,7 @@ static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
- if (should_accept_remote(accept, advertised, &config_info)) {
+ if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
if (!store_info)
store_info = store_info_new(repo);
if (promisor_store_advertised_fields(advertised, store_info))
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 3b39505380..0659b2ac15 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -387,6 +387,77 @@ test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
check_missing_objects server 1 "$oid"
'
+test_expect_success "clone with 'None' but URL allowlisted" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with 'None' but URL not in allowlist" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="https://example.com/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is not missing on the server
+ check_missing_objects server 0 "" &&
+
+ # Reinitialize server so that the largest object is missing again
+ initialize_server 1 "$oid"
+'
+
+test_expect_success "clone with 'None' but URL allowlisted in one pattern out of two" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="https://example.com/*" \
+ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with 'None', URL allowlisted, but client has different URL" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ # The client configures "lop" with a different URL (serverTwo) than
+ # what the server advertises (lop). Even though the advertised URL
+ # matches the allowlist, the remote is rejected because the
+ # configured URL does not match the advertised one.
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+ -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.lop.url="$TRASH_DIRECTORY_URL/serverTwo" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the largest object is not missing on the server
+ check_missing_objects server 0 "" &&
+
+ # Reinitialize server so that the largest object is missing again
+ initialize_server 1 "$oid"
+'
+
test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
git -C server config promisor.advertise true &&
test_when_finished "rm -rf client" &&
--
2.54.0.19.gb68b9497aa
* Re: [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl
2026-04-27 12:41 ` [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
@ 2026-05-08 12:45 ` Toon Claes
2026-05-11 13:10 ` Toon Claes
1 sibling, 0 replies; 80+ messages in thread
From: Toon Claes @ 2026-05-08 12:45 UTC (permalink / raw)
To: Christian Couder, git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Christian Couder <christian.couder@gmail.com> writes:
> A previous commit introduced the `promisor.acceptFromServerUrl` config
> variable along with the machinery to parse and validate the URL glob
> patterns and optional remote name prefixes it contains. However, these
> URL patterns are not yet tied into the client's acceptance logic.
>
> When a promisor remote is already configured locally, its fields (like
> authentication tokens) may occasionally need to be refreshed by the
> server. If `promisor.acceptFromServer` is set to the secure default
> ("None"), these updates are rejected, potentially causing future
> fetches to fail.
>
> To enable such targeted updates for trusted URLs, let's use the URL
> patterns from `promisor.acceptFromServerUrl` as an additional URL
> based allowlist.
>
> Concretely, let's check the advertised URLs against the URL glob
> patterns by introducing a new small helper function called
> url_matches_accept_list(), which iterates over the glob patterns and
> returns the first matching allowed_url entry (or NULL).
>
> The URL matching is done component by component: scheme and port are
> compared exactly, the host is matched with wildmatch() using the
> WM_PATHNAME flag (so '*' cannot cross the '/' boundary into the path),
> and the path is matched with wildmatch() without WM_PATHNAME (so '*'
> can still match multi-level paths). Before matching, the advertised
> URL is passed through url_normalize() so that case variations in the
> scheme/host, percent-encoding tricks, and ".." path segments cannot
> bypass the allowlist.
>
> Let's then use this helper at the tail of should_accept_remote() so
> that, when `accept == ACCEPT_NONE`, a known remote whose URL matches
> the allowlist is still accepted.
>
> To prepare for this new logic, let's also:
>
> - Add an 'accept_urls' parameter to should_accept_remote().
>
> - Replace the BUG() guard in the ACCEPT_KNOWN_URL case with an
> explicit 'if (accept == ACCEPT_KNOWN_URL) return' and a new
> BUG() guard in the ACCEPT_NONE case, so url_matches_accept_list()
> is only called in the ACCEPT_NONE case.
>
> - Call accept_from_server_url() from filter_promisor_remote()
> and relax its early return so that the function is entered when
> `accept_urls` has entries even if `accept == ACCEPT_NONE`.
>
> With this, many organizations may only need something like:
>
> git config set --global \
> promisor.acceptFromServerUrl "https://my-org.com/*"
>
> to accept only their own remotes. And if they need to accept additional
> remotes in some specific repos, they can also set:
>
> git config set promisor.acceptFromServer knownUrl
>
> and configure the additional remote manually only in the repos where
> they are needed.
>
> Let's then properly document `promisor.acceptFromServerUrl` in
> "promisor.adoc" as an additive security allowlist for known remotes,
> including the URL normalization behavior and the component-wise
> matching, and let's mention it in "gitprotocol-v2.adoc".
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
> Documentation/config/promisor.adoc | 52 ++++++++++++++
> Documentation/gitprotocol-v2.adoc | 9 +--
> promisor-remote.c | 98 +++++++++++++++++++++++++--
> t/t5710-promisor-remote-capability.sh | 71 +++++++++++++++++++
> 4 files changed, 220 insertions(+), 10 deletions(-)
>
> diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
> index b0fa43b839..efc066c3f2 100644
> --- a/Documentation/config/promisor.adoc
> +++ b/Documentation/config/promisor.adoc
> @@ -51,6 +51,58 @@ promisor.acceptFromServer::
> to "fetch" and "clone" requests from the client. Name and URL
> comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
>
> +promisor.acceptFromServerUrl::
> + A glob pattern to specify which server-advertised URLs a
> + client is allowed to act on. When a URL matches, the client
> + will accept the advertised remote as a promisor remote and may
> + automatically accept field updates (such as authentication
> + tokens) from the server, even if `promisor.acceptFromServer`
> + is set to `none` (the default).
> ++
> +This option can appear multiple times in config files. An advertised
> +URL will be accepted if it matches _ANY_ glob pattern specified by
> +this option in _ANY_ config file read by Git.
> ++
> +Be _VERY_ careful with these patterns: `*` matches any sequence of
> +characters within the 'host' and 'path' parts of a URL (but cannot
> +cross part boundaries). An overly broad pattern is a major security
> +risk, as a matching URL allows a server to update fields (such as
> +authentication tokens) on known remotes without further confirmation.
> +To minimize security risks, follow these guidelines:
> ++
> +1. Start with a secure protocol scheme, like `https://` or `ssh://`.
> ++
> +2. Only allow domain names or paths where you control and trust _ALL_
> + the content. Be especially careful with shared hosting platforms
> + like `github.com` or `gitlab.com`. A broad pattern like
> + `https://gitlab.com/*` is dangerous because it trusts every
> + repository on the entire platform. Always restrict such patterns to
> + your specific organization or namespace (e.g.,
> + `https://gitlab.com/your-org/*`).
> ++
> +3. Never use globs at the end of domain names. For example,
> + `https://cdn.your-org.com/*` might be safe, but
> + `https://cdn.your-org.com*/*` is a major security risk because
> + the latter matches `https://cdn.your-org.com.hacker.net/repo`.
> ++
> +4. Be careful using globs at the beginning of domain names. While the
> + code ensures a `*` in the host cannot cross into the path, a
> + pattern like `https://*.example.com/*` will still match any
> + subdomain. This is extremely dangerous on shared hosting platforms
> + (e.g., `https://*.github.io/*` trusts every user's site on the
> + entire platform).
> ++
> +Before matching, both the advertised URL and the pattern are
> +normalized: the scheme and host are lowercased, percent-encoded
> +characters are decoded where possible, and path segments like `..`
> +are resolved. The port must also match exactly (e.g.,
> +`https://example.com:8080/*` will not match a URL advertised on
> +port 9999).
> ++
> +For the security implications of accepting a promisor remote, see the
> +documentation of `promisor.acceptFromServer`. For details on the
> +protocol, see linkgit:gitprotocol-v2[5].
> +
> promisor.checkFields::
> A comma or space separated list of additional remote related
> field names. A client checks if the values of these fields
> diff --git a/Documentation/gitprotocol-v2.adoc b/Documentation/gitprotocol-v2.adoc
> index befa697d21..2beb70595f 100644
> --- a/Documentation/gitprotocol-v2.adoc
> +++ b/Documentation/gitprotocol-v2.adoc
> @@ -866,10 +866,11 @@ the server advertised, the client shouldn't advertise the
>
> On the server side, the "promisor.advertise" and "promisor.sendFields"
> configuration options can be used to control what it advertises. On
> -the client side, the "promisor.acceptFromServer" configuration option
> -can be used to control what it accepts, and the "promisor.storeFields"
> -option, to control what it stores. See the documentation of these
> -configuration options in linkgit:git-config[1] for more information.
> +the client side, the "promisor.acceptFromServer" and
> +"promisor.acceptFromServerUrl" configuration options can be used to
> +control what it accepts, and the "promisor.storeFields" option, to
> +control what it stores. See the documentation of these configuration
> +options in linkgit:git-config[1] for more information.
>
> Note that in the future it would be nice if the "promisor-remote"
> protocol capability could be used by the server, when responding to
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 3f3924f587..72d5b94bf7 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -14,6 +14,7 @@
> #include "url.h"
> #include "urlmatch.h"
> #include "version.h"
> +#include "wildmatch.h"
>
> struct promisor_remote_config {
> struct promisor_remote *promisors;
> @@ -742,8 +743,82 @@ static void load_accept_from_server_url(struct repository *repo,
> }
> }
>
> +static bool match_one_url(const struct url_info *pi, const struct url_info *ui)
> +{
> + const char *pat = pi->url;
> + const char *url = ui->url;
> + char *p_str, *u_str;
> + bool res;
> +
> + /*
> + * Schemes must match exactly. They are case-folded by
> + * url_normalize(), so strncmp() suffices.
> + */
> + if (pi->scheme_len != ui->scheme_len || strncmp(pat, url, pi->scheme_len))
> + return false;
> +
> + /*
> + * Ports must match exactly. url_normalize() strips default
> + * ports (like 443 for https), so length and content
> + * comparisons are sufficient.
> + */
> + if (pi->port_len != ui->port_len ||
> + strncmp(pat + pi->port_off, url + ui->port_off, pi->port_len))
> + return false;
> +
> + /*
> + * Match host and path separately to prevent a '*' in the host
> + * portion of the pattern from matching across the '/'
> + * boundary into the path. Use WM_PATHNAME for the host so '*'
> + * cannot cross '/' there, and 0 for the path so '*' can still
> + * match multi-level paths.
> + */
Do we actually need WM_PATHNAME here, given that we only xstrndup() the
host part anyway?
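To illustrate the question, here is a minimal sketch. It assumes POSIX
fnmatch() as a stand-in for wildmatch(), with FNM_PATHNAME playing the
role of WM_PATHNAME; the helper name glob_matches() is hypothetical and
not part of the patch. Since the extracted host string should never
contain a '/', both flag values are expected to give the same answer
for it, and the flag only matters for text that does contain a '/':

```c
#include <fnmatch.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not from the patch. fnmatch() returns 0 on a
 * match, which is the same convention wildmatch() uses.
 */
static bool glob_matches(const char *pattern, const char *text, int flags)
{
	return fnmatch(pattern, text, flags) == 0;
}
```

With this, glob_matches("*.example.com", "cdn.example.com", flags)
behaves the same with flags set to FNM_PATHNAME or 0, while
glob_matches("*", "a/b", flags) differs between the two.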
> +
> + p_str = xstrndup(pat + pi->host_off, pi->host_len);
> + u_str = xstrndup(url + ui->host_off, ui->host_len);
> + res = !wildmatch(p_str, u_str, WM_PATHNAME);
> + free(p_str);
> + free(u_str);
> +
> + if (!res)
> + return false;
> +
> + p_str = xstrndup(pat + pi->path_off, pi->path_len);
> + u_str = xstrndup(url + ui->path_off, ui->path_len);
> + res = !wildmatch(p_str, u_str, 0);
> + free(p_str);
> + free(u_str);
Is it correct that we intentionally do not compare the user and
password (at `user_off` and `passwd_off`)? I assume so, because this
allows the server to update those.
> +
> + return res;
> +}
> +
> +static struct allowed_url *url_matches_accept_list(
> + struct string_list *accept_urls, const char *url)
> +{
> + struct string_list_item *item;
> + struct url_info url_info;
> +
> + url_info.url = url_normalize(url, &url_info);
> +
> + if (!url_info.url)
> + return NULL;
> +
> + for_each_string_list_item(item, accept_urls) {
> + struct allowed_url *allowed = item->util;
> +
> + if (match_one_url(&allowed->pattern_info, &url_info)) {
> + free(url_info.url);
> + return allowed;
> + }
> + }
> +
> + free(url_info.url);
> + return NULL;
> +}
> +
> static int should_accept_remote(enum accept_promisor accept,
> struct promisor_info *advertised,
> + struct string_list *accept_urls,
> struct string_list *config_info)
> {
> struct promisor_info *p;
> @@ -771,9 +846,6 @@ static int should_accept_remote(enum accept_promisor accept,
> if (accept == ACCEPT_KNOWN_NAME)
> return all_fields_match(advertised, config_info, p);
>
> - if (accept != ACCEPT_KNOWN_URL)
> - BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
> -
> if (strcmp(p->url, remote_url)) {
> warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
> "ignoring this remote"),
> @@ -781,7 +853,21 @@ static int should_accept_remote(enum accept_promisor accept,
> return 0;
> }
>
> - return all_fields_match(advertised, config_info, p);
> + if (accept == ACCEPT_KNOWN_URL)
> + return all_fields_match(advertised, config_info, p);
> +
> + if (accept != ACCEPT_NONE)
> + BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
> +
> + /*
> + * Even if accept == ACCEPT_NONE, we MUST trust this known
> + * remote to update its token or other such fields if its URL
> + * matches the acceptFromServerUrl allowlist!
> + */
> + if (url_matches_accept_list(accept_urls, remote_url))
> + return all_fields_match(advertised, config_info, p);
I should verify this in the following patches, but it seems to me that
the advertised servers are stored in the local .git/config only when
promisor.acceptFromServer is set to None. Is that right?
> +
> + return 0;
> }
>
> static int skip_field_name_prefix(const char *elem, const char *field_name, const char **value)
> @@ -991,7 +1077,7 @@ static void filter_promisor_remote(struct repository *repo,
> /* Load and validate the acceptFromServerUrl config */
> load_accept_from_server_url(repo, &accept_urls);
>
> - if (accept == ACCEPT_NONE)
> + if (accept == ACCEPT_NONE && !accept_urls.nr)
> return;
>
> /* Parse remote info received */
> @@ -1011,7 +1097,7 @@ static void filter_promisor_remote(struct repository *repo,
> string_list_sort(&config_info);
> }
>
> - if (should_accept_remote(accept, advertised, &config_info)) {
> + if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
> if (!store_info)
> store_info = store_info_new(repo);
> if (promisor_store_advertised_fields(advertised, store_info))
> diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
> index 3b39505380..0659b2ac15 100755
> --- a/t/t5710-promisor-remote-capability.sh
> +++ b/t/t5710-promisor-remote-capability.sh
> @@ -387,6 +387,77 @@ test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
> check_missing_objects server 1 "$oid"
> '
>
> +test_expect_success "clone with 'None' but URL allowlisted" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
Why do some tests end with `initialize_server 1 "$oid"` while this one
does not? Isn't it odd for a test to prepare state for the next test?
> +
> +test_expect_success "clone with 'None' but URL not in allowlist" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="https://example.com/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is not missing on the server
> + check_missing_objects server 0 "" &&
> +
> + # Reinitialize server so that the largest object is missing again
> + initialize_server 1 "$oid"
> +'
> +
> +test_expect_success "clone with 'None' but URL allowlisted in one pattern out of two" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="https://example.com/*" \
> + -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> +test_expect_success "clone with 'None', URL allowlisted, but client has different URL" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + # The client configures "lop" with a different URL (serverTwo) than
> + # what the server advertises (lop). Even though the advertised URL
> + # matches the allowlist, the remote is rejected because the
> + # configured URL does not match the advertised one.
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.lop.url="$TRASH_DIRECTORY_URL/serverTwo" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is not missing on the server
> + check_missing_objects server 0 "" &&
> +
> + # Reinitialize server so that the largest object is missing again
> + initialize_server 1 "$oid"
> +'
> +
> test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> git -C server config promisor.advertise true &&
> test_when_finished "rm -rf client" &&
> --
> 2.54.0.19.gb68b9497aa
>
>
--
Cheers,
Toon
* Re: [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl
2026-04-27 12:41 ` [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
2026-05-08 12:45 ` Toon Claes
@ 2026-05-11 13:10 ` Toon Claes
1 sibling, 0 replies; 80+ messages in thread
From: Toon Claes @ 2026-05-11 13:10 UTC (permalink / raw)
To: Christian Couder, git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Christian Couder <christian.couder@gmail.com> writes:
> +static bool match_one_url(const struct url_info *pi, const struct url_info *ui)
> +{
> + const char *pat = pi->url;
> + const char *url = ui->url;
> + char *p_str, *u_str;
> + bool res;
> +
> + /*
> + * Schemes must match exactly. They are case-folded by
> + * url_normalize(), so strncmp() suffices.
> + */
> + if (pi->scheme_len != ui->scheme_len || strncmp(pat, url, pi->scheme_len))
> + return false;
> +
> + /*
> + * Ports must match exactly. url_normalize() strips default
> + * ports (like 443 for https), so length and content
> + * comparisons are sufficient.
> + */
> + if (pi->port_len != ui->port_len ||
> + strncmp(pat + pi->port_off, url + ui->port_off, pi->port_len))
> + return false;
> +
> + /*
> + * Match host and path separately to prevent a '*' in the host
> + * portion of the pattern from matching across the '/'
> + * boundary into the path. Use WM_PATHNAME for the host so '*'
> + * cannot cross '/' there, and 0 for the path so '*' can still
> + * match multi-level paths.
> + */
> +
> + p_str = xstrndup(pat + pi->host_off, pi->host_len);
> + u_str = xstrndup(url + ui->host_off, ui->host_len);
> + res = !wildmatch(p_str, u_str, WM_PATHNAME);
> + free(p_str);
> + free(u_str);
> +
> + if (!res)
I find it a bit confusing that you're negating the result from
wildmatch() only to negate it again here. Maybe keep using the int
return value, or rename the variable to 'matches'?
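A rough sketch of the renaming option, under some assumptions:
fnmatch() stands in for wildmatch() (both return 0 on a match), and the
host and path are passed in as separate strings rather than carved out
of struct url_info. The function name host_and_path_match() is
hypothetical:

```c
#include <fnmatch.h>
#include <stdbool.h>

/*
 * Sketch only, not the patch's code. 'matches' is true when the glob
 * matched, so no double negation is needed when reading the checks.
 */
static bool host_and_path_match(const char *pat_host, const char *url_host,
				const char *pat_path, const char *url_path)
{
	bool matches;

	/* Host: '*' must not cross a '/' boundary. */
	matches = !fnmatch(pat_host, url_host, FNM_PATHNAME);
	if (!matches)
		return false;

	/* Path: '*' may span several path levels. */
	matches = !fnmatch(pat_path, url_path, 0);
	return matches;
}
```

The single remaining '!' converts the "0 means match" convention into a
boolean once, at the call site, and everything after that reads
positively.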
> + return false;
> +
> + p_str = xstrndup(pat + pi->path_off, pi->path_len);
> + u_str = xstrndup(url + ui->path_off, ui->path_len);
> + res = !wildmatch(p_str, u_str, 0);
> + free(p_str);
> + free(u_str);
> +
> + return res;
> +}
--
Cheers,
Toon
* [PATCH v2 7/8] promisor-remote: auto-configure unknown remotes
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
` (5 preceding siblings ...)
2026-04-27 12:41 ` [PATCH v2 6/8] promisor-remote: trust known remotes matching acceptFromServerUrl Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-05-11 13:06 ` Toon Claes
2026-04-27 12:41 ` [PATCH v2 8/8] doc: promisor: improve acceptFromServer entry Christian Couder
2026-04-27 13:00 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
8 siblings, 1 reply; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Previous commits have introduced the `promisor.acceptFromServerUrl`
config variable to allowlist some URLs advertised by a server through
the "promisor-remote" protocol capability.
However, the new `promisor.acceptFromServerUrl` mechanism, like the old
`promisor.acceptFromServer` mechanism, still requires a remote to
already exist in the client's local configuration before it can be
accepted. This places a significant manual burden on users to
pre-configure these remotes, and creates friction for administrators
who have to troubleshoot or manually provision these setups for their
teams.
To eliminate this burden, let's automatically create a new `[remote]`
section in the client's config when a server advertises an unknown
remote whose URL matches a `promisor.acceptFromServerUrl` glob pattern.
Concretely, let's add four helpers:
- sanitize_remote_name(): turn an arbitrary URL-derived string into a
valid remote name by replacing non-alphanumeric characters,
collapsing runs of '-', and prepending "promisor-auto-".
- promisor_remote_name_from_url(): normalize the URL and extract
host+port+path to build a human-readable base name, then pass it
through sanitize_remote_name().
- configure_auto_promisor_remote(): write the remote.*.url,
remote.*.promisor and remote.*.advertisedAs keys to the repo
config.
- handle_matching_allowed_url(): pick the final name (user-supplied
alias or auto-generated), handle collisions by appending "-1",
"-2", etc., then call configure_auto_promisor_remote().
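As an illustration of the sanitizing rules listed above, here is a
minimal standalone sketch. It works on a plain C string instead of a
strbuf, uses the libc isalnum() (so it assumes the "C" locale), and
leaves out the final valid_remote_name() check; the function name
sketch_sanitize_name() is hypothetical:

```c
#include <ctype.h>
#include <string.h>

/*
 * Sketch only: writes "promisor-auto-<sanitized>" into 'out' and
 * returns 0, or returns -1 if nothing usable is left or 'out' is too
 * small.
 */
static int sketch_sanitize_name(const char *in, char *out, size_t outsz)
{
	static const char prefix[] = "promisor-auto-";
	const size_t prefix_len = sizeof(prefix) - 1;
	size_t n = prefix_len;
	char prev = '-';	/* pretend a '-' precedes the input: drops leading '-' */

	if (outsz <= n)
		return -1;
	memcpy(out, prefix, n);

	for (; *in; in++) {
		/* Any non-alphanumeric character becomes '-'. */
		char c = isalnum((unsigned char)*in) ? *in : '-';

		if (c == '-' && prev == '-')
			continue;	/* collapse runs of '-' */
		if (n + 1 >= outsz)
			return -1;	/* keep room for the final NUL */
		out[n++] = c;
		prev = c;
	}

	/* Drop a single trailing '-', if any. */
	if (n > prefix_len && out[n - 1] == '-')
		n--;
	if (n == prefix_len)
		return -1;	/* nothing but separators in the input */

	out[n] = '\0';
	return 0;
}
```

For example, feeding it "cdn.example.com-/org/repo" (host, '-', then
path) yields "promisor-auto-cdn-example-com-org-repo", while an input
made only of separators is rejected.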
Let's also add should_accept_new_remote_url() which reuses the
url_matches_accept_list() helper introduced in a previous commit to
find a matching pattern, then delegates to handle_matching_allowed_url()
to create the remote.
And then let's call should_accept_new_remote_url() from the '!item'
(unknown remote) branch of should_accept_remote(), setting
`reload_config` so that the newly-written config is picked up.
Finally, let's document all that by:
- expanding the `promisor.acceptFromServerUrl` entry to describe
auto-creation, the optional "name=" prefix syntax, the
"promisor-auto-*" generation rules, and numeric-suffix collision
handling, and by
- adding a "remote.<name>.advertisedAs" entry to "remote.adoc".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/config/promisor.adoc | 26 +++-
Documentation/config/remote.adoc | 9 ++
promisor-remote.c | 202 +++++++++++++++++++++++++-
t/t5710-promisor-remote-capability.sh | 104 +++++++++++++
4 files changed, 332 insertions(+), 9 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index efc066c3f2..ae1686a6e0 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -54,7 +54,8 @@ promisor.acceptFromServer::
promisor.acceptFromServerUrl::
A glob pattern to specify which server-advertised URLs a
client is allowed to act on. When a URL matches, the client
- will accept the advertised remote as a promisor remote and may
+ will accept the advertised remote as a promisor remote, may
+ automatically create a new remote configuration for it and may
automatically accept field updates (such as authentication
tokens) from the server, even if `promisor.acceptFromServer`
is set to `none` (the default).
@@ -66,9 +67,10 @@ this option in _ANY_ config file read by Git.
Be _VERY_ careful with these patterns: `*` matches any sequence of
characters within the 'host' and 'path' parts of a URL (but cannot
cross part boundaries). An overly broad pattern is a major security
-risk, as a matching URL allows a server to update fields (such as
-authentication tokens) on known remotes without further confirmation.
-To minimize security risks, follow these guidelines:
+risk, as a matching URL allows a server to auto-configure new remotes
+and to update fields (such as authentication tokens) on known remotes
+without further confirmation. To minimize security risks, follow these
+guidelines:
+
1. Start with a secure protocol scheme, like `https://` or `ssh://`.
+
@@ -99,6 +101,22 @@ are resolved. The port must also match exactly (e.g.,
`https://example.com:8080/*` will not match a URL advertised on
port 9999).
+
+The glob pattern can optionally be prefixed with a remote name and an
+equals sign (e.g., `cdn=https://cdn.example.com/*`). If such a prefix
+is provided, accepted remotes will be saved under that name. If no
+such prefix is provided, a safe remote name will be automatically
+generated by sanitizing the URL and prefixing it with
+`promisor-auto-`.
++
+If a remote with the chosen name already exists but points to a
+different URL, Git will append a numeric suffix (e.g., `-1`, `-2`) to
+the name to prevent overwriting existing configurations. You should
+make sure that this doesn't happen often though, as remotes will be
+rejected if the numeric suffix increases too much. In all cases, the
+original name advertised by the server is recorded in the
+`remote.<name>.advertisedAs` configuration variable for tracing and
+debugging purposes.
++
For the security implications of accepting a promisor remote, see the
documentation of `promisor.acceptFromServer`. For details on the
protocol, see linkgit:gitprotocol-v2[5].
diff --git a/Documentation/config/remote.adoc b/Documentation/config/remote.adoc
index 91e46f66f5..6e2bbdf457 100644
--- a/Documentation/config/remote.adoc
+++ b/Documentation/config/remote.adoc
@@ -91,6 +91,15 @@ remote.<name>.promisor::
When set to true, this remote will be used to fetch promisor
objects.
+remote.<name>.advertisedAs::
+ When a promisor remote is automatically configured using
+ information advertised by a server through the
+ `promisor-remote` protocol capability (see
+ `promisor.acceptFromServerUrl`), the server's originally
+ advertised name is saved in this variable. This is for
+ information, tracing and debugging purposes. Users should not
+ typically modify or create such configuration entries.
+
remote.<name>.partialclonefilter::
The filter that will be applied when fetching from this promisor remote.
Changing or clearing this value will only affect fetches for new commits.
diff --git a/promisor-remote.c b/promisor-remote.c
index 72d5b94bf7..8c8a798fdb 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -816,10 +816,197 @@ static struct allowed_url *url_matches_accept_list(
return NULL;
}
-static int should_accept_remote(enum accept_promisor accept,
+/*
+ * Sanitize the buffer to make it a valid remote name coming from the
+ * server by:
+ *
+ * - replacing any non alphanumeric character with a '-'
+ * - stripping any leading '-',
+ * - condensing multiple '-' into one,
+ * - prepending "promisor-auto-",
+ * - validating the result.
+ */
+static int sanitize_remote_name(struct strbuf *buf, const char *url)
+{
+ char prev = '-';
+ for (size_t i = 0; i < buf->len; ) {
+ if (!isalnum(buf->buf[i]))
+ buf->buf[i] = '-';
+ if (prev == '-' && buf->buf[i] == '-') {
+ strbuf_remove(buf, i, 1);
+ } else {
+ prev = buf->buf[i];
+ i++;
+ }
+ }
+
+ strbuf_strip_suffix(buf, "-");
+
+ if (!buf->len) {
+ warning(_("couldn't generate a valid remote name from "
+ "advertised url '%s', ignoring this remote"), url);
+ return -1;
+ }
+
+ strbuf_insertstr(buf, 0, "promisor-auto-");
+
+ if (!valid_remote_name(buf->buf)) {
+ warning(_("generated remote name '%s' from advertised url '%s' "
+ "is invalid, ignoring this remote"), buf->buf, url);
+ return -1;
+ }
+
+ return 0;
+}
+
+static char *promisor_remote_name_from_url(const char *url)
+{
+ struct url_info url_info = { 0 };
+ char *normalized = url_normalize(url, &url_info);
+ struct strbuf buf = STRBUF_INIT;
+
+ if (!normalized) {
+ warning(_("couldn't normalize advertised url '%s', "
+ "ignoring this remote"), url);
+ return NULL;
+ }
+
+ if (url_info.host_len) {
+ strbuf_add(&buf, normalized + url_info.host_off, url_info.host_len);
+ strbuf_addch(&buf, '-');
+ }
+
+ if (url_info.port_len) {
+ strbuf_add(&buf, normalized + url_info.port_off, url_info.port_len);
+ strbuf_addch(&buf, '-');
+ }
+
+ if (url_info.path_len) {
+ strbuf_add(&buf, normalized + url_info.path_off, url_info.path_len);
+ strbuf_trim_trailing_dir_sep(&buf);
+ strbuf_strip_suffix(&buf, ".git");
+ }
+
+ free(normalized);
+
+ if (sanitize_remote_name(&buf, url)) {
+ strbuf_release(&buf);
+ return NULL;
+ }
+
+ return strbuf_detach(&buf, NULL);
+}
+
+static void configure_auto_promisor_remote(struct repository *repo,
+ const char *name,
+ const char *url,
+ const char *advertised_as,
+ bool reuse)
+{
+ char *key;
+
+ if (!reuse) {
+ fprintf(stderr, _("Auto-creating promisor remote '%s' for URL '%s'\n"),
+ name, url);
+
+ key = xstrfmt("remote.%s.url", name);
+ repo_config_set_gently(repo, key, url);
+ free(key);
+ }
+
+ /* NB: when reusing, this promotes an existing non-promisor remote */
+ key = xstrfmt("remote.%s.promisor", name);
+ repo_config_set_gently(repo, key, "true");
+ free(key);
+
+ if (advertised_as) {
+ key = xstrfmt("remote.%s.advertisedAs", name);
+ repo_config_set_gently(repo, key, advertised_as);
+ free(key);
+ }
+}
+
+#define MAX_REMOTES_WITH_SIMILAR_NAMES 20
+
+/* Return the allocated local name, or NULL on failure */
+static char *handle_matching_allowed_url(struct repository *repo,
+ char *allowed_name,
+ const char *remote_url,
+ const char *remote_name)
+{
+ char *name;
+ char *basename = allowed_name ?
+ xstrdup(allowed_name) :
+ promisor_remote_name_from_url(remote_url);
+ int i = 0;
+ bool reuse = false;
+
+ if (!basename)
+ return NULL;
+
+ name = xstrdup(basename);
+
+ while (i < MAX_REMOTES_WITH_SIMILAR_NAMES) {
+ char *url_key = xstrfmt("remote.%s.url", name);
+ const char *existing_url;
+ int exists = !repo_config_get_string_tmp(repo, url_key, &existing_url);
+
+ free(url_key);
+
+ if (!exists)
+ break; /* Free to use */
+
+ if (!strcmp(existing_url, remote_url)) {
+ reuse = true;
+ break; /* Same URL, so safe to reuse */
+ }
+
+ i++;
+ free(name);
+ name = xstrfmt("%s-%d", basename, i);
+ }
+
+ if (i < MAX_REMOTES_WITH_SIMILAR_NAMES) {
+ configure_auto_promisor_remote(repo, name,
+ remote_url, remote_name,
+ reuse);
+ } else {
+ warning(_("too many remotes accepted with name like '%s-X', "
+ "ignoring this remote"), basename);
+ FREE_AND_NULL(name);
+ }
+
+ free(basename);
+ return name;
+}
+
+static int should_accept_new_remote_url(struct repository *repo,
+ struct string_list *accept_urls,
+ struct promisor_info *advertised)
+{
+ struct allowed_url *allowed = url_matches_accept_list(accept_urls,
+ advertised->url);
+ if (allowed) {
+ char *name = handle_matching_allowed_url(repo,
+ allowed->remote_name,
+ advertised->url,
+ advertised->name);
+ if (name) {
+ free((char *)advertised->local_name);
+ advertised->local_name = name;
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int should_accept_remote(struct repository *repo,
+ enum accept_promisor accept,
struct promisor_info *advertised,
struct string_list *accept_urls,
- struct string_list *config_info)
+ struct string_list *config_info,
+ bool *reload_config)
{
struct promisor_info *p;
struct string_list_item *item;
@@ -837,9 +1024,13 @@ static int should_accept_remote(enum accept_promisor accept,
/* Get config info for that promisor remote */
item = string_list_lookup(config_info, remote_name);
- if (!item)
+ if (!item) {
/* We don't know about that remote */
- return 0;
+ int res = should_accept_new_remote_url(repo, accept_urls, advertised);
+ if (res)
+ *reload_config = true;
+ return res;
+ }
p = item->util;
@@ -1097,7 +1288,8 @@ static void filter_promisor_remote(struct repository *repo,
string_list_sort(&config_info);
}
- if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
+ if (should_accept_remote(repo, accept, advertised, &accept_urls,
+ &config_info, &reload_config)) {
if (!store_info)
store_info = store_info_new(repo);
if (promisor_store_advertised_fields(advertised, store_info))
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 0659b2ac15..549acff23f 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -458,6 +458,107 @@ test_expect_success "clone with 'None', URL allowlisted, but client has differen
initialize_server 1 "$oid"
'
+test_expect_success "clone with URL allowlisted and no remote already configured" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+ test_when_finished "rm -f full_names" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that exactly one remote has been auto-created, identified
+ # by "remote.<name>.advertisedAs" == "lop".
+ git -C client config get --all --show-names --regexp \
+ "remote\..*\.advertisedas" >full_names &&
+ test_line_count = 1 full_names &&
+ REMOTE_NAME=$(sed "s/^remote\.\(.*\)\.advertisedas .*$/\1/" full_names) &&
+
+ # Check ".url" and ".promisor" values
+ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" >expect &&
+ git -C client config "remote.$REMOTE_NAME.url" >actual &&
+ git -C client config "remote.$REMOTE_NAME.promisor" >>actual &&
+ test_cmp expect actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with named URL allowlisted and no pre-configured remote" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that a remote has been auto-created with the right "cdn" name and fields.
+ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
+ git -C client config "remote.cdn.url" >actual &&
+ git -C client config "remote.cdn.promisor" >>actual &&
+ git -C client config "remote.cdn.advertisedAs" >>actual &&
+ test_cmp expect actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with URL allowlisted but colliding name" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone -c remote.cdn.promisor=true \
+ -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.cdn.url="https://example.com/cdn" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that a remote has been auto-created with the right "cdn-1" name and fields.
+ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
+ git -C client config "remote.cdn-1.url" >actual &&
+ git -C client config "remote.cdn-1.promisor" >>actual &&
+ git -C client config "remote.cdn-1.advertisedAs" >>actual &&
+ test_cmp expect actual &&
+
+ # Check that the original "cdn" remote was not overwritten.
+ printf "%s\n" "https://example.com/cdn" "true" >expect &&
+ git -C client config "remote.cdn.url" >actual &&
+ git -C client config "remote.cdn.promisor" >>actual &&
+ test_cmp expect actual &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "clone with URL allowlisted and reusable remote" '
+ git -C server config promisor.advertise true &&
+ test_when_finished "rm -rf client" &&
+
+ GIT_NO_LAZY_FETCH=0 git clone \
+ -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
+ -c remote.cdn.url="$TRASH_DIRECTORY_URL/lop" \
+ -c promisor.acceptfromserver=None \
+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
+ --no-local --filter="blob:limit=5k" server client &&
+
+ # Check that the existing "cdn" remote has been properly updated.
+ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
+ git -C client config "remote.cdn.url" >actual &&
+ git -C client config "remote.cdn.promisor" >>actual &&
+ git -C client config "remote.cdn.advertisedAs" >>actual &&
+ git -C client config "remote.cdn.fetch" >>actual &&
+ test_cmp expect actual &&
+
+ # Check that no new "cdn-1" remote has been created.
+ test_must_fail git -C client config "remote.cdn-1.url" &&
+
+ # Check that the largest object is still missing on the server
+ check_missing_objects server 1 "$oid"
+'
+
test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
git -C server config promisor.advertise true &&
test_when_finished "rm -rf client" &&
@@ -472,6 +573,9 @@ test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
# Check that a warning was emitted
test_grep "invalid remote name '\''bad name'\''" err &&
+ # Check that no remote was auto-created
+ test_must_fail git -C client config get --regexp "remote\..*\.advertisedas" &&
+
# Check that the largest object is not missing on the server
check_missing_objects server 0 "" &&
--
2.54.0.19.gb68b9497aa
^ permalink raw reply related [flat|nested] 80+ messages in thread
* Re: [PATCH v2 7/8] promisor-remote: auto-configure unknown remotes
2026-04-27 12:41 ` [PATCH v2 7/8] promisor-remote: auto-configure unknown remotes Christian Couder
@ 2026-05-11 13:06 ` Toon Claes
0 siblings, 0 replies; 80+ messages in thread
From: Toon Claes @ 2026-05-11 13:06 UTC (permalink / raw)
To: Christian Couder, git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
Christian Couder <christian.couder@gmail.com> writes:
> Previous commits have introduced the `promisor.acceptFromServerUrl`
> config variable to allowlist some URLs advertised by a server through
> the "promisor-remote" protocol capability.
>
> However the new `promisor.acceptFromServerUrl` mechanism, like the old
> `promisor.acceptFromServer` mechanism, still requires a remote to
> already exist in the client's local configuration before it can be
> accepted. This places a significant manual burden on users to
> pre-configure these remotes, and creates friction for administrators
> who have to troubleshoot or manually provision these setups for their
> teams.
>
> To eliminate this burden, let's automatically create a new `[remote]`
> section in the client's config when a server advertises an unknown
> remote whose URL matches a `promisor.acceptFromServerUrl` glob pattern.
>
> Concretely, let's add four helpers:
>
> - sanitize_remote_name(): turn an arbitrary URL-derived string into a
> valid remote name by replacing non-alphanumeric characters,
> collapsing runs of '-', and prepending "promisor-auto-".
>
> - promisor_remote_name_from_url(): normalize the URL and extract
> host+port+path to build a human-readable base name, then pass it
> through sanitize_remote_name().
>
> - configure_auto_promisor_remote(): write the remote.*.url,
> remote.*.promisor and remote.*.advertisedAs keys to the repo
> config.
>
> - handle_matching_allowed_url(): pick the final name (user-supplied
> alias or auto-generated), handle collisions by appending "-1",
> "-2", etc., then call configure_auto_promisor_remote().
>
> Let's also add should_accept_new_remote_url() which reuses the
> url_matches_accept_list() helper introduced in a previous commit to
> find a matching pattern, then delegates to handle_matching_allowed_url()
> to create the remote.
>
> And then let's call should_accept_new_remote_url() from the '!item'
> (unknown remote) branch of should_accept_remote(), setting
> `reload_config` so that the newly-written config is picked up.
>
> Finally let's document all that by:
>
> - expanding the `promisor.acceptFromServerUrl` entry to describe
> auto-creation, the optional "name=" prefix syntax, the
> "promisor-auto-*" generation rules, and numeric-suffix collision
> handling, and by
> - adding a "remote.<name>.advertisedAs" entry to "remote.adoc".
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
> Documentation/config/promisor.adoc | 26 +++-
> Documentation/config/remote.adoc | 9 ++
> promisor-remote.c | 202 +++++++++++++++++++++++++-
> t/t5710-promisor-remote-capability.sh | 104 +++++++++++++
> 4 files changed, 332 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
> index efc066c3f2..ae1686a6e0 100644
> --- a/Documentation/config/promisor.adoc
> +++ b/Documentation/config/promisor.adoc
> @@ -54,7 +54,8 @@ promisor.acceptFromServer::
> promisor.acceptFromServerUrl::
> A glob pattern to specify which server-advertised URLs a
> client is allowed to act on. When a URL matches, the client
> - will accept the advertised remote as a promisor remote and may
> + will accept the advertised remote as a promisor remote, may
> + automatically create a new remote configuration for it and may
> automatically accept field updates (such as authentication
> tokens) from the server, even if `promisor.acceptFromServer`
> is set to `none` (the default).
> @@ -66,9 +67,10 @@ this option in _ANY_ config file read by Git.
> Be _VERY_ careful with these patterns: `*` matches any sequence of
> characters within the 'host' and 'path' parts of a URL (but cannot
> cross part boundaries). An overly broad pattern is a major security
> -risk, as a matching URL allows a server to update fields (such as
> -authentication tokens) on known remotes without further confirmation.
> -To minimize security risks, follow these guidelines:
> +risk, as a matching URL allows a server to auto-configure new remotes
> +and to update fields (such as authentication tokens) on known remotes
> +without further confirmation. To minimize security risks, follow these
> +guidelines:
> +
> 1. Start with a secure protocol scheme, like `https://` or `ssh://`.
> +
> @@ -99,6 +101,22 @@ are resolved. The port must also match exactly (e.g.,
> `https://example.com:8080/*` will not match a URL advertised on
> port 9999).
> +
> +The glob pattern can optionally be prefixed with a remote name and an
> +equals sign (e.g., `cdn=https://cdn.example.com/*`). If such a prefix
> +is provided, accepted remotes will be saved under that name. If no
> +such prefix is provided, a safe remote name will be automatically
> +generated by sanitizing the URL and prefixing it with
> +`promisor-auto-`.
> ++
> +If a remote with the chosen name already exists but points to a
> +different URL, Git will append a numeric suffix (e.g., `-1`, `-2`) to
> +the name to prevent overwriting existing configurations. You should
> +make sure that this doesn't happen often though, as remotes will be
> +rejected if the numeric suffix increases too much. In all cases, the
> +original name advertised by the server is recorded in the
> +`remote.<name>.advertisedAs` configuration variable for tracing and
> +debugging purposes.
> ++
> For the security implications of accepting a promisor remote, see the
> documentation of `promisor.acceptFromServer`. For details on the
> protocol, see linkgit:gitprotocol-v2[5].
> diff --git a/Documentation/config/remote.adoc b/Documentation/config/remote.adoc
> index 91e46f66f5..6e2bbdf457 100644
> --- a/Documentation/config/remote.adoc
> +++ b/Documentation/config/remote.adoc
> @@ -91,6 +91,15 @@ remote.<name>.promisor::
> When set to true, this remote will be used to fetch promisor
> objects.
>
> +remote.<name>.advertisedAs::
> + When a promisor remote is automatically configured using
> + information advertised by a server through the
> + `promisor-remote` protocol capability (see
> + `promisor.acceptFromServerUrl`), the server's originally
> + advertised name is saved in this variable. This is for
> + information, tracing and debugging purposes. Users should not
> + typically modify or create such configuration entries.
> +
> remote.<name>.partialclonefilter::
> The filter that will be applied when fetching from this promisor remote.
> Changing or clearing this value will only affect fetches for new commits.
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 72d5b94bf7..8c8a798fdb 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -816,10 +816,197 @@ static struct allowed_url *url_matches_accept_list(
> return NULL;
> }
>
> -static int should_accept_remote(enum accept_promisor accept,
> +/*
> + * Sanitize the buffer to make it a valid remote name coming from the
> + * server by:
> + *
> + * - replacing any non-alphanumeric character with a '-',
> + * - stripping any leading or trailing '-',
> + * - condensing multiple '-' into one,
> + * - prepending "promisor-auto-",
> + * - validating the result.
> + */
> +static int sanitize_remote_name(struct strbuf *buf, const char *url)
> +{
> + char prev = '-';
> + for (size_t i = 0; i < buf->len; ) {
> + if (!isalnum(buf->buf[i]))
> + buf->buf[i] = '-';
> + if (prev == '-' && buf->buf[i] == '-') {
> + strbuf_remove(buf, i, 1);
> + } else {
> + prev = buf->buf[i];
> + i++;
> + }
> + }
> +
> + strbuf_strip_suffix(buf, "-");
> +
> + if (!buf->len) {
> + warning(_("couldn't generate a valid remote name from "
> + "advertised url '%s', ignoring this remote"), url);
> + return -1;
> + }
> +
> + strbuf_insertstr(buf, 0, "promisor-auto-");
> +
> + if (!valid_remote_name(buf->buf)) {
> + warning(_("generated remote name '%s' from advertised url '%s' "
> + "is invalid, ignoring this remote"), buf->buf, url);
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +static char *promisor_remote_name_from_url(const char *url)
> +{
> + struct url_info url_info = { 0 };
> + char *normalized = url_normalize(url, &url_info);
> + struct strbuf buf = STRBUF_INIT;
> +
> + if (!normalized) {
> + warning(_("couldn't normalize advertised url '%s', "
> + "ignoring this remote"), url);
> + return NULL;
> + }
> +
> + if (url_info.host_len) {
> + strbuf_add(&buf, normalized + url_info.host_off, url_info.host_len);
> + strbuf_addch(&buf, '-');
> + }
> +
> + if (url_info.port_len) {
> + strbuf_add(&buf, normalized + url_info.port_off, url_info.port_len);
> + strbuf_addch(&buf, '-');
If the URL doesn't have a path, this could lead to a generated name
ending in `example-com-8443`. But we have MAX_REMOTES_WITH_SIMILAR_NAMES
set to 20; would this be an issue for a second remote without a
configured name?
As far as I can tell from handle_matching_allowed_url(), it's not an
issue, because the numeric `-%d` suffix is only ever appended to the base
name and we never atoi() a number out of existing remote names in the
config.
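To make that concrete, here is a minimal standalone sketch of the
collision loop as I read it. It is not the patch code: `struct
fake_remote`, `lookup_url()` and `pick_name()` are made up for
illustration, and the array stands in for the `remote.<name>.url` config
lookups. The counter only drives formatting of new candidate names;
nothing is parsed back out of an existing name.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SIMILAR 20 /* mirrors MAX_REMOTES_WITH_SIMILAR_NAMES */

/* Hypothetical stand-in for the "remote.<name>.url" config entries. */
struct fake_remote {
	const char *name;
	const char *url;
};

/* Return the URL configured for `name`, or NULL if there is none. */
static const char *lookup_url(const struct fake_remote *remotes, size_t nr,
			      const char *name)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(remotes[i].name, name))
			return remotes[i].url;
	return NULL;
}

/*
 * Write the chosen name into `out`. Returns 0 for an unused name, 1 when
 * an existing remote with the same URL is reused, and -1 when every
 * candidate ("base", "base-1", ..., "base-19") is taken by another URL.
 */
static int pick_name(const struct fake_remote *remotes, size_t nr,
		     const char *base, const char *url,
		     char *out, size_t out_len)
{
	snprintf(out, out_len, "%s", base);

	for (int i = 0; i < MAX_SIMILAR; ) {
		const char *existing = lookup_url(remotes, nr, out);

		if (!existing)
			return 0; /* free to use */
		if (!strcmp(existing, url))
			return 1; /* same URL, safe to reuse */

		/* The suffix comes from the counter, never from a name. */
		i++;
		snprintf(out, out_len, "%s-%d", base, i);
	}

	return -1;
}
```

With one existing remote named `promisor-auto-example-com-8443` that
points at a different URL, this picks
`promisor-auto-example-com-8443-1`; the trailing `8443` is just part of
the base name and doesn't count against the limit.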
> + }
> +
> + if (url_info.path_len) {
> + strbuf_add(&buf, normalized + url_info.path_off, url_info.path_len);
> + strbuf_trim_trailing_dir_sep(&buf);
> + strbuf_strip_suffix(&buf, ".git");
> + }
> +
> + free(normalized);
> +
> + if (sanitize_remote_name(&buf, url)) {
> + strbuf_release(&buf);
> + return NULL;
> + }
> +
> + return strbuf_detach(&buf, NULL);
> +}
> +
> +static void configure_auto_promisor_remote(struct repository *repo,
> + const char *name,
> + const char *url,
> + const char *advertised_as,
> + bool reuse)
> +{
> + char *key;
> +
> + if (!reuse) {
> + fprintf(stderr, _("Auto-creating promisor remote '%s' for URL '%s'\n"),
> + name, url);
> +
> + key = xstrfmt("remote.%s.url", name);
> + repo_config_set_gently(repo, key, url);
> + free(key);
> + }
> +
> + /* NB: when reusing, this promotes an existing non-promisor remote */
> + key = xstrfmt("remote.%s.promisor", name);
> + repo_config_set_gently(repo, key, "true");
> + free(key);
> +
> + if (advertised_as) {
> + key = xstrfmt("remote.%s.advertisedAs", name);
> + repo_config_set_gently(repo, key, advertised_as);
> + free(key);
> + }
> +}
> +
> +#define MAX_REMOTES_WITH_SIMILAR_NAMES 20
> +
> +/* Return the allocated local name, or NULL on failure */
> +static char *handle_matching_allowed_url(struct repository *repo,
> + char *allowed_name,
> + const char *remote_url,
> + const char *remote_name)
> +{
> + char *name;
> + char *basename = allowed_name ?
> + xstrdup(allowed_name) :
> + promisor_remote_name_from_url(remote_url);
> + int i = 0;
> + bool reuse = false;
> +
> + if (!basename)
> + return NULL;
> +
> + name = xstrdup(basename);
> +
> + while (i < MAX_REMOTES_WITH_SIMILAR_NAMES) {
> + char *url_key = xstrfmt("remote.%s.url", name);
> + const char *existing_url;
> + int exists = !repo_config_get_string_tmp(repo, url_key, &existing_url);
> +
> + free(url_key);
> +
> + if (!exists)
> + break; /* Free to use */
> +
> + if (!strcmp(existing_url, remote_url)) {
> + reuse = true;
> + break; /* Same URL, so safe to reuse */
> + }
> +
> + i++;
> + free(name);
> + name = xstrfmt("%s-%d", basename, i);
> + }
> +
> + if (i < MAX_REMOTES_WITH_SIMILAR_NAMES) {
> + configure_auto_promisor_remote(repo, name,
> + remote_url, remote_name,
> + reuse);
> + } else {
> + warning(_("too many remotes accepted with name like '%s-X', "
> + "ignoring this remote"), basename);
> + FREE_AND_NULL(name);
> + }
> +
> + free(basename);
> + return name;
> +}
> +
> +static int should_accept_new_remote_url(struct repository *repo,
> + struct string_list *accept_urls,
> + struct promisor_info *advertised)
> +{
> + struct allowed_url *allowed = url_matches_accept_list(accept_urls,
> + advertised->url);
> + if (allowed) {
> + char *name = handle_matching_allowed_url(repo,
> + allowed->remote_name,
> + advertised->url,
> + advertised->name);
> + if (name) {
> + free((char *)advertised->local_name);
> + advertised->local_name = name;
> + return 1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int should_accept_remote(struct repository *repo,
> + enum accept_promisor accept,
> struct promisor_info *advertised,
> struct string_list *accept_urls,
> - struct string_list *config_info)
> + struct string_list *config_info,
> + bool *reload_config)
> {
> struct promisor_info *p;
> struct string_list_item *item;
> @@ -837,9 +1024,13 @@ static int should_accept_remote(enum accept_promisor accept,
> /* Get config info for that promisor remote */
> item = string_list_lookup(config_info, remote_name);
>
> - if (!item)
> + if (!item) {
> /* We don't know about that remote */
> - return 0;
> + int res = should_accept_new_remote_url(repo, accept_urls, advertised);
> + if (res)
> + *reload_config = true;
> + return res;
> + }
>
> p = item->util;
>
> @@ -1097,7 +1288,8 @@ static void filter_promisor_remote(struct repository *repo,
> string_list_sort(&config_info);
> }
>
> - if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
> + if (should_accept_remote(repo, accept, advertised, &accept_urls,
> + &config_info, &reload_config)) {
> if (!store_info)
> store_info = store_info_new(repo);
> if (promisor_store_advertised_fields(advertised, store_info))
> diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
> index 0659b2ac15..549acff23f 100755
> --- a/t/t5710-promisor-remote-capability.sh
> +++ b/t/t5710-promisor-remote-capability.sh
> @@ -458,6 +458,107 @@ test_expect_success "clone with 'None', URL allowlisted, but client has differen
> initialize_server 1 "$oid"
> '
>
> +test_expect_success "clone with URL allowlisted and no remote already configured" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> + test_when_finished "rm -f full_names" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
So promisor.acceptFromServerUrl only works if promisor.acceptFromServer
is "none"? I mean, which one should take precedence? If
promisor.acceptFromServer is set to "all", the promisor remote is
accepted by the client, but not saved to the config. Is that
intentional? Should we document that?
> + # Check that exactly one remote has been auto-created, identified
> + # by "remote.<name>.advertisedAs" == "lop".
> + git -C client config get --all --show-names --regexp \
> + "remote\..*\.advertisedas" >full_names &&
> + test_line_count = 1 full_names &&
> + REMOTE_NAME=$(sed "s/^remote\.\(.*\)\.advertisedas .*$/\1/" full_names) &&
> +
> + # Check ".url" and ".promisor" values
> + printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" >expect &&
> + git -C client config "remote.$REMOTE_NAME.url" >actual &&
> + git -C client config "remote.$REMOTE_NAME.promisor" >>actual &&
> + test_cmp expect actual &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> +test_expect_success "clone with named URL allowlisted and no pre-configured remote" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that a remote has been auto-created with the right "cdn" name and fields.
> + printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
> + git -C client config "remote.cdn.url" >actual &&
> + git -C client config "remote.cdn.promisor" >>actual &&
> + git -C client config "remote.cdn.advertisedAs" >>actual &&
> + test_cmp expect actual &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> +test_expect_success "clone with URL allowlisted but colliding name" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.cdn.promisor=true \
> + -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.cdn.url="https://example.com/cdn" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that a remote has been auto-created with the right "cdn-1" name and fields.
> + printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
> + git -C client config "remote.cdn-1.url" >actual &&
> + git -C client config "remote.cdn-1.promisor" >>actual &&
> + git -C client config "remote.cdn-1.advertisedAs" >>actual &&
> + test_cmp expect actual &&
> +
> + # Check that the original "cdn" remote was not overwritten.
> + printf "%s\n" "https://example.com/cdn" "true" >expect &&
> + git -C client config "remote.cdn.url" >actual &&
> + git -C client config "remote.cdn.promisor" >>actual &&
> + test_cmp expect actual &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> +test_expect_success "clone with URL allowlisted and reusable remote" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone \
> + -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.cdn.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the existing "cdn" remote has been properly updated.
> + printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
> + git -C client config "remote.cdn.url" >actual &&
> + git -C client config "remote.cdn.promisor" >>actual &&
> + git -C client config "remote.cdn.advertisedAs" >>actual &&
> + git -C client config "remote.cdn.fetch" >>actual &&
> + test_cmp expect actual &&
> +
> + # Check that no new "cdn-1" remote has been created.
> + test_must_fail git -C client config "remote.cdn-1.url" &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> git -C server config promisor.advertise true &&
> test_when_finished "rm -rf client" &&
> @@ -472,6 +573,9 @@ test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> # Check that a warning was emitted
> test_grep "invalid remote name '\''bad name'\''" err &&
>
> + # Check that no remote was auto-created
> + test_must_fail git -C client config get --regexp "remote\..*\.advertisedas" &&
> +
> # Check that the largest object is not missing on the server
> check_missing_objects server 0 "" &&
>
> --
> 2.54.0.19.gb68b9497aa
>
>
--
Cheers,
Toon
^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH v2 8/8] doc: promisor: improve acceptFromServer entry
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
` (6 preceding siblings ...)
2026-04-27 12:41 ` [PATCH v2 7/8] promisor-remote: auto-configure unknown remotes Christian Couder
@ 2026-04-27 12:41 ` Christian Couder
2026-04-27 13:00 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 12:41 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren, Christian Couder, Christian Couder
The entry for `promisor.acceptFromServer` in
"Documentation/config/promisor.adoc" has a number of issues:
- it's not clear if new remotes and URLs can be created,
- it looks like a big block of text,
- it's not easy to see all the options,
- it's not easy to see which option is the default one,
- for "knownName", it says "advertised by the client" instead of
"advertised by the server",
- it doesn't refer to the new related `acceptFromServerUrl`
option.
Let's address all these issues by rewording large parts of it
and using bullet points for the different options.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
Documentation/config/promisor.adoc | 53 ++++++++++++++++++++----------
1 file changed, 35 insertions(+), 18 deletions(-)
diff --git a/Documentation/config/promisor.adoc b/Documentation/config/promisor.adoc
index ae1686a6e0..095c1693ac 100644
--- a/Documentation/config/promisor.adoc
+++ b/Documentation/config/promisor.adoc
@@ -32,24 +32,41 @@ variable is set to "true", and the "name" and "url" fields are always
advertised regardless of this setting.
promisor.acceptFromServer::
- If set to "all", a client will accept all the promisor remotes
- a server might advertise using the "promisor-remote"
- capability. If set to "knownName" the client will accept
- promisor remotes which are already configured on the client
- and have the same name as those advertised by the client. This
- is not very secure, but could be used in a corporate setup
- where servers and clients are trusted to not switch name and
- URLs. If set to "knownUrl", the client will accept promisor
- remotes which have both the same name and the same URL
- configured on the client as the name and URL advertised by the
- server. This is more secure than "all" or "knownName", so it
- should be used if possible instead of those options. Default
- is "none", which means no promisor remote advertised by a
- server will be accepted. By accepting a promisor remote, the
- client agrees that the server might omit objects that are
- lazily fetchable from this promisor remote from its responses
- to "fetch" and "clone" requests from the client. Name and URL
- comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
+ Controls which promisor remotes advertised by a server (using the
+ "promisor-remote" protocol capability) a client will accept. By
+ accepting a promisor remote, the client agrees that the server
+ might omit objects that are lazily fetchable from this promisor
+ remote from its responses to "fetch" and "clone" requests.
++
+Note that this option does not cause new remotes to be automatically
+created in the client's configuration. It only allows remotes which
+are somehow already configured to be trusted for the current
+operation, or their fields to be updated (if `promisor.storeFields` is
+set and the remote already exists locally). To allow Git to
+automatically create and persist new remotes from server
+advertisements, use `promisor.acceptFromServerUrl`.
++
+The available options are:
++
+* `none` (default): No promisor remote advertised by a server will be
+ accepted.
++
+* `knownUrl`: The client will accept promisor remotes that are already
+ configured on the client and have both the same name and the same URL
+ as advertised by the server. This is more secure than `all` or
+ `knownName`, and should be used if possible instead of those options.
++
+* `knownName`: The client will accept promisor remotes that are already
+ configured on the client and have the same name as those advertised
+ by the server. This is not very secure, but could be used in a corporate
+ setup where servers and clients are trusted to not switch names and URLs.
++
+* `all`: The client will accept all the promisor remotes a server might
+ advertise. This is the least secure option and should only be used in
+ fully trusted environments.
++
+Name and URL comparisons are case-sensitive. See linkgit:gitprotocol-v2[5]
+for protocol details.
promisor.acceptFromServerUrl::
A glob pattern to specify which server-advertised URLs a
--
2.54.0.19.gb68b9497aa
* Re: [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist
2026-04-27 12:41 ` [PATCH v2 0/8] Auto-configure advertised remotes via URL allowlist Christian Couder
` (7 preceding siblings ...)
2026-04-27 12:41 ` [PATCH v2 8/8] doc: promisor: improve acceptFromServer entry Christian Couder
@ 2026-04-27 13:00 ` Christian Couder
8 siblings, 0 replies; 80+ messages in thread
From: Christian Couder @ 2026-04-27 13:00 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Karthik Nayak,
Elijah Newren
Sorry, it looks like I sent this series in reply to:
https://lore.kernel.org/git/20251223111113.47473-1-christian.couder@gmail.com/
instead of:
https://lore.kernel.org/git/20260323080520.887550-1-christian.couder@gmail.com/
I will try to do better next time.
Also I forgot to say that this series is based on a merge of 'master'
@ v2.54.0 and 'cc/promisor-auto-config-url' (which is in 'next' but is
marked with "Will merge to master" in the last "What's cooking..."
email).
On Mon, Apr 27, 2026 at 2:41 PM Christian Couder
<christian.couder@gmail.com> wrote:
>
> Currently, the "promisor-remote" protocol capability allows a server
> to advertise promisor remotes (and their tokens/filters), but the
> client's `promisor.acceptFromServer` mechanism requires these remotes
> to already exist in the config.
>
> This is a significant burden for users and administrators who have to
> pre-configure remotes.
>
> This patch series improves on this by introducing a new
> `promisor.acceptFromServerUrl` config option, which provides an
> additive, URL-based security allowlist.
>
> Multiple `promisor.acceptFromServerUrl` config options can be provided
> in different config files. Each one should contain a URL glob pattern
> which can optionally be prefixed with a remote name in the
> "[<name>=]<pattern>" format.
>
> The goal is for something like a simple:
>
> git config set --global promisor.acceptFromServerUrl "https://my-org.com/*"
>
> to be all that is needed for internal work in many organizations.
>
> With this new config option:
>
> - The server can update fields (like tokens) for known remotes,
> provided their URL matches the allowlist, even if
> `acceptFromServer` is set to `none`.
>
> - Unknown remotes advertised by the server can be automatically
> configured on the client if their URL matches the allowlist.
>
> - If there is no `<name>` prefix before the glob pattern matched, the
> auto-configured remote is named using the
> "promisor-auto-<sanitized-url>" format. So the same auto-configured
> remote config entry will be reused for the same URL.
>
> - If a `<name>` prefix is provided, it will be used for the
> auto-configured remote config entry.
>
> - If the chosen name (auto-generated or prefixed) already exists but
> points to a different URL, overwriting the existing config is
> prevented by appending a numeric suffix (e.g., -1, -2) to the name
> and auto-configuring using that name.
>
> - The server's originally advertised name is always saved in the
> `remote.<name>.advertisedAs` config variable of the auto-configured
> remote for tracing and debugging.
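
The naming rules in the list above can be sketched as follows. This is only a minimal illustration, not the code from the series: sketch_auto_name(), sketch_pick_name(), sketch_demo_lookup() and the MAX_SUFFIX bound are hypothetical names, the sanitization rule is the one described under "Security considerations" in this cover letter, and reusing a suffixed name that already points at the same URL is an assumption.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define AUTO_PREFIX "promisor-auto-"
#define MAX_SUFFIX 100 /* hypothetical bound on the numeric suffix */

/*
 * Write AUTO_PREFIX followed by a sanitized copy of 'url' into 'out':
 * every non-alphanumeric byte becomes '-' and runs of '-' are collapsed.
 * Returns 0 on success, -1 if 'out' is too small.
 */
static int sketch_auto_name(const char *url, char *out, size_t outlen)
{
	size_t n = strlen(AUTO_PREFIX);

	if (outlen <= n)
		return -1;
	memcpy(out, AUTO_PREFIX, n);

	for (; *url; url++) {
		char c = isalnum((unsigned char)*url) ? *url : '-';

		if (c == '-' && out[n - 1] == '-')
			continue; /* collapse runs of '-' */
		if (n + 1 >= outlen)
			return -1;
		out[n++] = c;
	}
	out[n] = '\0';
	return 0;
}

/* Hypothetical lookup: URL configured for 'name', or NULL if unused. */
typedef const char *(*sketch_lookup_fn)(const char *name);

/*
 * Choose a remote name for 'url' starting from 'base'. A candidate is
 * taken if it is unused or already points at 'url'; otherwise "-1",
 * "-2", ... are tried, up to MAX_SUFFIX.
 */
static int sketch_pick_name(const char *base, const char *url,
			    sketch_lookup_fn lookup, char *out, size_t outlen)
{
	int i;

	for (i = 0; i <= MAX_SUFFIX; i++) {
		const char *existing;
		int len = i ? snprintf(out, outlen, "%s-%d", base, i)
			    : snprintf(out, outlen, "%s", base);

		if (len < 0 || (size_t)len >= outlen)
			return -1;
		existing = lookup(out);
		if (!existing || !strcmp(existing, url))
			return 0;
	}
	return -1; /* bounded: give up instead of looping forever */
}

/* Toy stand-in for the client's config: one existing remote. */
static const char *sketch_demo_lookup(const char *name)
{
	if (!strcmp(name, "promisor-auto-https-my-org-com-repo"))
		return "https://my-org.com/repo";
	return NULL;
}
```

With this toy config, "https://my-org.com/repo" reuses the existing entry, while "https://my.org.com/repo" sanitizes to the same base name but has a different URL, so it would get the "-1" suffix rather than overwrite the existing remote.
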
>
> Security considerations:
>
> - Advertised URLs and glob patterns are routed through
> url_normalize() / url_normalize_pattern() before matching, to
> prevent percent-encoding, case variation, or path-traversal (..)
> bypasses.
>
> - URL matching is done component by component: scheme and port
> must match exactly (no wildcards), the host is matched with
> WM_PATHNAME so a '*' cannot cross the '/' boundary into the
> path, and the path is matched without WM_PATHNAME so '*' can
> still span multi-level paths.
>
> - Auto-generated remote names are sanitized (non-alphanumeric
> characters are replaced with '-', runs of '-' are collapsed)
> and prefixed with 'promisor-auto-'. User-supplied names (from
> the 'name=<pattern>' syntax) are validated with
> valid_remote_name(). Together, these prevent a server from
> maliciously overwriting standard remotes (like 'origin').
>
> - If the auto-generated or user-supplied name collides with an
> existing remote configured to a different URL, a numeric
> suffix ('-1', '-2', ...) is appended, up to a bounded limit,
> so a server cannot hijack an existing remote by name.
>
> - Known remotes are still subject to URL consistency checks:
> even if an advertised URL matches the allowlist, it is only
> accepted for a known remote if it matches the URL already
> configured locally for that remote.
>
> - The documentation explains in detail how to write secure glob
> patterns in `promisor.acceptFromServerUrl`, and highlights the
> risks of overly broad patterns on shared hosting platforms.
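
To illustrate why component-wise matching matters, here is a minimal, self-contained sketch. It does not use Git's wildmatch() or `struct url_info`: sketch_glob() is a toy matcher where only '*' is special (so '**' behaves like '*', unlike wildmatch()), and `struct sketch_url` is a hypothetical, already-normalized and pre-split URL.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Toy glob: only '*' is special. When 'star_crosses_slash' is false,
 * '*' never matches a '/', which roughly mimics what WM_PATHNAME
 * gives wildmatch().
 */
static bool sketch_glob(const char *pat, const char *s, bool star_crosses_slash)
{
	if (*pat == '*') {
		while (*pat == '*')
			pat++; /* consecutive stars behave like one */
		for (;; s++) {
			if (sketch_glob(pat, s, star_crosses_slash))
				return true;
			if (!*s || (*s == '/' && !star_crosses_slash))
				return false;
		}
	}
	if (!*pat)
		return !*s;
	if (*pat != *s) /* also covers reaching the end of 's' */
		return false;
	return sketch_glob(pat + 1, s + 1, star_crosses_slash);
}

/* Hypothetical pre-split, already-normalized URL or pattern. */
struct sketch_url {
	const char *scheme; /* e.g. "https" */
	const char *host;   /* e.g. "cdn.example.com" */
	const char *port;   /* "" when a default port was stripped */
	const char *path;   /* e.g. "/org/repo" */
};

static bool sketch_match_url(const struct sketch_url *pat,
			     const struct sketch_url *url)
{
	/* Scheme and port: exact comparison, no wildcards. */
	if (strcmp(pat->scheme, url->scheme) || strcmp(pat->port, url->port))
		return false;
	/* Host: '*' must not cross a '/'. */
	if (!sketch_glob(pat->host, url->host, false))
		return false;
	/* Path: '*' may span several levels. */
	return sketch_glob(pat->path, url->path, true);
}
```

Matched as one string, the pattern "*.example.com/*" accepts "evil-hacker.net/fake.example.com/repo" because the first '*' swallows the '/'. Split into components, the host pattern "*.example.com" is compared only against "evil-hacker.net" and is rejected.
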
>
> High level description of the patches
> =====================================
>
> - Patch 1/8 is new. It is a very small preparatory patch that
> simplifies some tests a bit.
>
> - Patches 2/8 and 3/8 expose and adapt a url_normalize_pattern()
> helper function in the urlmatch API.
>
> - Patch 4/8 adapts `struct promisor_info` by adding a new
> `local_name` member to it to prepare for the next patches.
>
> - Patches 5/8 to 7/8 implement the core feature. They introduce the
> parsing machinery, add the additive allowlist for known remotes
> (with url_normalize() security), and finally implement the
> auto-creation and collision resolution for unknown remotes.
>
> - Patch 8/8 cleans up and modernizes the existing
> `promisor.acceptFromServer` documentation.
>
> Changes compared to v1
> ======================
>
> Thanks to Patrick and Junio for reviewing the previous versions of
> this series and of the preparatory series.
>
> - A lot of preparatory patches have been moved to a preparatory series
> that has already been merged. See:
>
> https://lore.kernel.org/git/20260407115243.358642-1-christian.couder@gmail.com/
>
> This is why this v2 contains only 8 patches compared to 16 patches
> in v1.
>
> - Everywhere in this series "whitelist" has been replaced with
> "allowlist".
>
> - In the tests added in this series, the new $TRASH_DIRECTORY_URL and
> $ENCODED_TRASH_DIRECTORY_URL introduced by the preparatory series
> are used instead of the previous $PWD_URL and $ENCODED_PWD_URL.
>
> - Patch 1/8 ("t5710: simplify 'mkdir X' followed by 'git -C X init'")
> is new.
>
> - Patch 3/8 ("urlmatch: add url_normalize_pattern() helper") replaces
> patch 3/16 ("urlmatch: add url_is_valid_pattern() helper") because
> in subsequent patches we now normalize patterns to validate them
> and match them component by component against URLs.
>
> - In patch 5/8, previously 13/16, ("promisor-remote: introduce
> promisor.acceptFromServerUrl"):
>
> - We add a `struct url_info pattern_info;` to `struct allowed_url`,
> so we can validate patterns using url_normalize_pattern() and, in
> a subsequent patch, match URLs component by component. This
> requires a new allowed_url_free() function that is passed to
> string_list_clear_func() to clear the `struct allowed_url`
> instances.
>
> - We don't use a `static struct string_list` to store the URL
> patterns we accept. Instead we load them from the config into a
> `struct string_list` passed as argument. The function doing this
> is renamed accordingly from accept_from_server_url() to
> load_accept_from_server_url().
>
> - A "clone with invalid promisor.acceptFromServerUrl" test is moved
> from patch 15/16 to this patch as it's more relevant in this
> patch (where we validate the content of the
> `promisor.acceptFromServerUrl` config option).
>
> - In patch 6/8, previously 14/16, ("promisor-remote: trust known
> remotes matching acceptFromServerUrl"):
>
> - In the commit message, an example, which shows how the new
> "acceptFromServerUrl" config option can be useful, is added.
>
> - The matching of URLs advertised by the server to URL patterns
> from the config is now performed component by component. This is
> reflected in the commit message, the documentation and the
> code. This ensures a `*` in the host pattern cannot cross into
> the path.
>
> - In the code, we add a new match_one_url() function to perform the
> matching.
>
> - In patch 7/8, previously 15/16 ("promisor-remote: auto-configure
> unknown remotes"):
>
> - In the doc, the unclear "considered trusted by the client" is
> clarified using "a client is allowed to act on" and subsequent
> explanations. In general the doc is also improved a bit.
>
> - In the tests, parsing the "remote.<name>.advertisedAs" config
> option is now more careful about the possibility that more than
> one such option exists.
>
> - The test that was moved to patch 5/8 is still enhanced a bit in
> this commit by checking that no "remote.<name>.advertisedAs"
> config option has been added.
>
> CI tests
> ========
>
> They all pass, see:
>
> https://github.com/chriscool/git/actions/runs/24992478331
>
> Range diff since v1
> ===================
>
> 1: b2894eb33a < -: ---------- promisor-remote: try accepted remotes before others in get_direct()
> -: ---------- > 1: 44e9a16455 t5710: simplify 'mkdir X' followed by 'git -C X init'
> 2: a3206a6ae9 = 2: 42f174910c urlmatch: change 'allow_globs' arg to bool
> 3: 51bbf65c52 < -: ---------- urlmatch: add url_is_valid_pattern() helper
> 4: f367beef72 < -: ---------- promisor-remote: clarify that a remote is ignored
> 5: 1faf74cb3f < -: ---------- promisor-remote: refactor has_control_char()
> 6: 40cf0af639 < -: ---------- promisor-remote: refactor accept_from_server()
> 7: b75dca8037 < -: ---------- promisor-remote: keep accepted promisor_info structs alive
> 8: f5e55dc407 < -: ---------- promisor-remote: remove the 'accepted' strvec
> -: ---------- > 3: 8088374458 urlmatch: add url_normalize_pattern() helper
> 9: 63c1db30de ! 4: 6bfda89a79 promisor-remote: add 'local_name' to 'struct promisor_info'
> @@ Commit message
> In a following commit, we will store promisor remote information under
> a remote name different than the one the server advertised.
>
> - To prepare for this change, let's add a new 'char* local_name' member
> + To prepare for this change, let's add a new 'char *local_name' member
> to 'struct promisor_info', and let's update the related functions.
>
> While at it, let's also add a small promisor_info_internal_name()
> @@ Commit message
>
> ## promisor-remote.c ##
> @@ promisor-remote.c: static struct string_list *fields_stored(void)
> -
> - /*
> * Struct for promisor remotes involved in the "promisor-remote"
> -- * protocol capability.
> -+ * protocol capability:
> + * protocol capability.
> *
> - * Except for "name", each <member> in this struct and its <value>
> - * should correspond (either on the client side or on the server side)
> - * to a "remote.<name>.<member>" config variable set to <value> where
> - * "<name>" is a promisor remote name.
> -+ * - "name" is the name the server advertised.
> -+ * - "local_name" is the name we use locally (may be auto-generated).
> -+ *
> + * Except for "name" and "local_name", each <member> in this struct
> + * and its <value> should correspond (either on the client side or on
> + * the server side) to a "remote.<name>.<member>" config variable set
> + * to <value> where "<name>" is a promisor remote name.
> */
> struct promisor_info {
> - const char *name;
> -+ const char *local_name;
> +- const char *name;
> ++ const char *name; /* name the server advertised */
> ++ const char *local_name; /* name used locally (may be auto-generated) */
> const char *url;
> const char *filter;
> const char *token;
> 10: e9b8a64ab8 < -: ---------- promisor-remote: pass config entry to all_fields_match() directly
> 11: 2e1260190a < -: ---------- promisor-remote: refactor should_accept_remote() control flow
> 12: b33f06173a < -: ---------- t5710: use proper file:// URIs for absolute paths
> 13: 681b03e248 ! 5: fefa17e6dd promisor-remote: introduce promisor.acceptFromServerUrl
> @@ promisor-remote.c: static bool has_control_char(const char *s)
> +struct allowed_url {
> + char *remote_name;
> + char *url_pattern;
> ++ struct url_info pattern_info;
> +};
> +
> ++static void allowed_url_free(void *util, const char *str UNUSED)
> ++{
> ++ struct allowed_url *allowed = util;
> ++
> ++ if (!allowed)
> ++ return;
> ++
> ++ /* Depending on prefix, free either remote_name or url_pattern */
> ++ free(allowed->remote_name ? allowed->remote_name : allowed->url_pattern);
> ++ free(allowed->pattern_info.url);
> ++ free(allowed);
> ++}
> ++
> +static struct allowed_url *valid_accept_url(const char *url)
> +{
> + char *dup, *p;
> @@ promisor-remote.c: static bool has_control_char(const char *s)
> + p = dup;
> + }
> +
> -+ if (has_control_char(p) || !url_is_valid_pattern(p)) {
> ++ if (has_control_char(p)) {
> + warning(_("invalid url pattern '%s' "
> + "in '%s' from promisor.acceptFromServerUrl config"), p, url);
> + free(dup);
> @@ promisor-remote.c: static bool has_control_char(const char *s)
> + allowed = xmalloc(sizeof(*allowed));
> + allowed->remote_name = (p == dup) ? NULL : dup;
> + allowed->url_pattern = p;
> ++ allowed->pattern_info.url = url_normalize_pattern(p, &allowed->pattern_info);
> ++ if (!allowed->pattern_info.url) {
> ++ warning(_("invalid url pattern '%s' "
> ++ "in '%s' from promisor.acceptFromServerUrl config"), p, url);
> ++ free(dup);
> ++ free(allowed);
> ++ return NULL;
> ++ }
> +
> + return allowed;
> +}
> +
> -+static struct string_list *accept_from_server_url(struct repository *repo)
> ++static void load_accept_from_server_url(struct repository *repo,
> ++ struct string_list *accept_urls)
> +{
> -+ static struct string_list accept_urls = STRING_LIST_INIT_DUP;
> -+ static int initialized;
> + const struct string_list *config_urls;
> +
> -+ if (initialized)
> -+ return &accept_urls;
> -+
> -+ initialized = 1;
> -+
> + if (!repo_config_get_string_multi(repo, "promisor.acceptfromserverurl", &config_urls)) {
> + struct string_list_item *item;
> +
> @@ promisor-remote.c: static bool has_control_char(const char *s)
> + struct allowed_url *allowed = valid_accept_url(item->string);
> + if (allowed) {
> + struct string_list_item *new;
> -+ new = string_list_append(&accept_urls, item->string);
> ++ new = string_list_append(accept_urls, item->string);
> + new->util = allowed;
> + }
> + }
> + }
> -+
> -+ return &accept_urls;
> +}
> +
> static int should_accept_remote(enum accept_promisor accept,
> @@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
> struct string_list_item *item;
> bool reload_config = false;
> enum accept_promisor accept = accept_from_server(repo);
> -+ /* Pre-load and validate the acceptFromServerUrl config */
> -+ (void)accept_from_server_url(repo);
> ++ struct string_list accept_urls = STRING_LIST_INIT_DUP;
> ++
> ++ /* Load and validate the acceptFromServerUrl config */
> ++ load_accept_from_server_url(repo, &accept_urls);
>
> if (accept == ACCEPT_NONE)
> return;
> +@@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
> + }
> + }
> +
> ++ string_list_clear_func(&accept_urls, allowed_url_free);
> + promisor_info_list_clear(&config_info);
> + string_list_clear(&remote_info, 0);
> + store_info_free(store_info);
> +
> + ## t/t5710-promisor-remote-capability.sh ##
> +@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl' and empty url, so not advertised" '
> + check_missing_objects server 1 "$oid"
> + '
> +
> ++test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> ++ git -C server config promisor.advertise true &&
> ++ test_when_finished "rm -rf client" &&
> ++
> ++ # As "bad name" contains a space, which is not a valid remote name,
> ++ # the pattern should be rejected with a warning and no remote created.
> ++ GIT_NO_LAZY_FETCH=0 git clone \
> ++ -c promisor.acceptfromserver=None \
> ++ -c "promisor.acceptFromServerUrl=bad name=https://example.com/*" \
> ++ --no-local --filter="blob:limit=5k" server client 2>err &&
> ++
> ++ # Check that a warning was emitted
> ++ test_grep "invalid remote name '\''bad name'\''" err &&
> ++
> ++ # Check that the largest object is not missing on the server
> ++ check_missing_objects server 0 "" &&
> ++
> ++ # Reinitialize server so that the largest object is missing again
> ++ initialize_server 1 "$oid"
> ++'
> ++
> + test_expect_success "clone with promisor.sendFields" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> 14: 8c04e48d66 ! 6: 2f238d0a7a promisor-remote: trust known remotes matching acceptFromServerUrl
> @@ Commit message
>
> To enable such targeted updates for trusted URLs, let's use the URL
> patterns from `promisor.acceptFromServerUrl` as an additional URL
> - based whitelist.
> + based allowlist.
>
> Concretely, let's check the advertised URLs against the URL glob
> patterns by introducing a new small helper function called
> url_matches_accept_list(), which iterates over the glob patterns and
> returns the first matching allowed_url entry (or NULL).
>
> - (Before matching, the advertised URL is passed through url_normalize()
> - so that case variations in the scheme/host, percent-encoding tricks,
> - and ".." path segments cannot bypass the whitelist.)
> + The URL matching is done component by component: scheme and port are
> + compared exactly, the host is matched with wildmatch() using the
> + WM_PATHNAME flag (so '*' cannot cross the '/' boundary into the path),
> + and the path is matched with wildmatch() without WM_PATHNAME (so '*'
> + can still match multi-level paths). Before matching, the advertised
> + URL is passed through url_normalize() so that case variations in the
> + scheme/host, percent-encoding tricks, and ".." path segments cannot
> + bypass the allowlist.
>
> Let's then use this helper at the tail of should_accept_remote() so
> that, when `accept == ACCEPT_NONE`, a known remote whose URL matches
> - the whitelist is still accepted.
> + the allowlist is still accepted.
>
> To prepare for this new logic, let's also:
>
> @@ Commit message
> and relax its early return so that the function is entered when
> `accept_urls` has entries even if `accept == ACCEPT_NONE`.
>
> + With this, many organizations may only need something like:
> +
> + git config set --global \
> + promisor.acceptFromServerUrl "https://my-org.com/*"
> +
> + to accept only their own remotes. And if they need to accept additional
> + remotes in some specific repos, they can also set:
> +
> + git config set promisor.acceptFromServer knownUrl
> +
> + and configure the additional remote manually only in the repos where
> + they are needed.
> +
> Let's then properly document `promisor.acceptFromServerUrl` in
> - "promisor.adoc" as an additive security whitelist for known remotes,
> - including the URL normalization behavior, and let's mention it in
> - "gitprotocol-v2.adoc".
> + "promisor.adoc" as an additive security allowlist for known remotes,
> + including the URL normalization behavior and the component-wise
> + matching, and let's mention it in "gitprotocol-v2.adoc".
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>
> @@ Documentation/config/promisor.adoc: promisor.acceptFromServer::
> comparisons are case sensitive. See linkgit:gitprotocol-v2[5].
>
> +promisor.acceptFromServerUrl::
> -+ A glob pattern to specify which URLs advertised by a server
> -+ are considered trusted by the client. This option acts as an
> -+ additive security whitelist that works in conjunction with
> -+ `promisor.acceptFromServer`.
> ++ A glob pattern to specify which server-advertised URLs a
> ++ client is allowed to act on. When a URL matches, the client
> ++ will accept the advertised remote as a promisor remote and may
> ++ automatically accept field updates (such as authentication
> ++ tokens) from the server, even if `promisor.acceptFromServer`
> ++ is set to `none` (the default).
> ++
> +This option can appear multiple times in config files. An advertised
> +URL will be accepted if it matches _ANY_ glob pattern specified by
> +this option in _ANY_ config file read by Git.
> ++
> -+Be _VERY_ careful with these glob patterns, as it can be a big
> -+security hole to allow any advertised remote to be auto-configured!
> ++Be _VERY_ careful with these patterns: `*` matches any sequence of
> ++characters within the 'host' and 'path' parts of a URL (but cannot
> ++cross part boundaries). An overly broad pattern is a major security
> ++risk, as a matching URL allows a server to update fields (such as
> ++authentication tokens) on known remotes without further confirmation.
> +To minimize security risks, follow these guidelines:
> ++
> +1. Start with a secure protocol scheme, like `https://` or `ssh://`.
> @@ Documentation/config/promisor.adoc: promisor.acceptFromServer::
> + your specific organization or namespace (e.g.,
> + `https://gitlab.com/your-org/*`).
> ++
> -+3. Don't use globs (`*`) in the domain name. For example
> -+ `https://cdn.example.com/*` is much safer than
> -+ `https://*.example.com/*`, because the latter matches
> -+ `https://evil-hacker.net/fake.example.com/repo`.
> ++3. Never use globs at the end of domain names. For example,
> ++ `https://cdn.your-org.com/*` might be safe, but
> ++ `https://cdn.your-org.com*/*` is a major security risk because
> ++ the latter matches `https://cdn.your-org.com.hacker.net/repo`.
> ++
> -+4. Make sure to have a `/` at the end of the domain name (or the end
> -+ of specific directories). For example `https://cdn.example.com/*`
> -+ is much safer than `https://cdn.example.com*`, because the latter
> -+ matches `https://cdn.example.com.hacker.net/repo`.
> ++4. Be careful using globs at the beginning of domain names. While the
> ++ code ensures a `*` in the host cannot cross into the path, a
> ++ pattern like `https://*.example.com/*` will still match any
> ++ subdomain. This is extremely dangerous on shared hosting platforms
> ++ (e.g., `https://*.github.io/*` trusts every user's site on the
> ++ entire platform).
> ++
> -+Before matching, the advertised URL is normalized: the scheme and
> -+host are lowercased, percent-encoded characters are decoded where
> -+possible, and path segments like `..` are resolved. Glob patterns
> -+are matched against this normalized URL as-is, so patterns should
> -+be written in normalized form (e.g., lowercase scheme and host).
> ++Before matching, both the advertised URL and the pattern are
> ++normalized: the scheme and host are lowercased, percent-encoded
> ++characters are decoded where possible, and path segments like `..`
> ++are resolved. The port must also match exactly (e.g.,
> ++`https://example.com:8080/*` will not match a URL advertised on
> ++port 9999).
> ++
> -+Even if `promisor.acceptFromServer` is set to `None` (the default),
> -+Git will still accept field updates (like tokens) for known remotes,
> -+provided their URLs match a pattern in
> -+`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
> -+details on the protocol.
> ++For the security implications of accepting a promisor remote, see the
> ++documentation of `promisor.acceptFromServer`. For details on the
> ++protocol, see linkgit:gitprotocol-v2[5].
> +
> promisor.checkFields::
> A comma or space separated list of additional remote related
> @@ promisor-remote.c
>
> struct promisor_remote_config {
> struct promisor_remote *promisors;
> -@@ promisor-remote.c: static struct string_list *accept_from_server_url(struct repository *repo)
> - return &accept_urls;
> +@@ promisor-remote.c: static void load_accept_from_server_url(struct repository *repo,
> + }
> }
>
> ++static bool match_one_url(const struct url_info *pi, const struct url_info *ui)
> ++{
> ++ const char *pat = pi->url;
> ++ const char *url = ui->url;
> ++ char *p_str, *u_str;
> ++ bool res;
> ++
> ++ /*
> ++ * Schemes must match exactly. They are case-folded by
> ++ * url_normalize(), so strncmp() suffices.
> ++ */
> ++ if (pi->scheme_len != ui->scheme_len || strncmp(pat, url, pi->scheme_len))
> ++ return false;
> ++
> ++ /*
> ++ * Ports must match exactly. url_normalize() strips default
> ++ * ports (like 443 for https), so length and content
> ++ * comparisons are sufficient.
> ++ */
> ++ if (pi->port_len != ui->port_len ||
> ++ strncmp(pat + pi->port_off, url + ui->port_off, pi->port_len))
> ++ return false;
> ++
> ++ /*
> ++ * Match host and path separately to prevent a '*' in the host
> ++ * portion of the pattern from matching across the '/'
> ++ * boundary into the path. Use WM_PATHNAME for the host so '*'
> ++ * cannot cross '/' there, and 0 for the path so '*' can still
> ++ * match multi-level paths.
> ++ */
> ++
> ++ p_str = xstrndup(pat + pi->host_off, pi->host_len);
> ++ u_str = xstrndup(url + ui->host_off, ui->host_len);
> ++ res = !wildmatch(p_str, u_str, WM_PATHNAME);
> ++ free(p_str);
> ++ free(u_str);
> ++
> ++ if (!res)
> ++ return false;
> ++
> ++ p_str = xstrndup(pat + pi->path_off, pi->path_len);
> ++ u_str = xstrndup(url + ui->path_off, ui->path_len);
> ++ res = !wildmatch(p_str, u_str, 0);
> ++ free(p_str);
> ++ free(u_str);
> ++
> ++ return res;
> ++}
> ++
> +static struct allowed_url *url_matches_accept_list(
> + struct string_list *accept_urls, const char *url)
> +{
> + struct string_list_item *item;
> -+ char *normalized = url_normalize(url, NULL);
> ++ struct url_info url_info;
> ++
> ++ url_info.url = url_normalize(url, &url_info);
> +
> -+ if (!normalized)
> ++ if (!url_info.url)
> + return NULL;
> +
> + for_each_string_list_item(item, accept_urls) {
> + struct allowed_url *allowed = item->util;
> +
> -+ if (!wildmatch(allowed->url_pattern, normalized, 0)) {
> -+ free(normalized);
> ++ if (match_one_url(&allowed->pattern_info, &url_info)) {
> ++ free(url_info.url);
> + return allowed;
> + }
> + }
> +
> -+ free(normalized);
> ++ free(url_info.url);
> + return NULL;
> +}
> +
> @@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
> + /*
> + * Even if accept == ACCEPT_NONE, we MUST trust this known
> + * remote to update its token or other such fields if its URL
> -+ * matches the acceptFromServerUrl whitelist!
> ++ * matches the acceptFromServerUrl allowlist!
> + */
> + if (url_matches_accept_list(accept_urls, remote_url))
> + return all_fields_match(advertised, config_info, p);
> @@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
>
> static int skip_field_name_prefix(const char *elem, const char *field_name, const char **value)
> @@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
> - struct string_list_item *item;
> - bool reload_config = false;
> - enum accept_promisor accept = accept_from_server(repo);
> -- /* Pre-load and validate the acceptFromServerUrl config */
> -- (void)accept_from_server_url(repo);
> -+ struct string_list *accept_urls = accept_from_server_url(repo);
> + /* Load and validate the acceptFromServerUrl config */
> + load_accept_from_server_url(repo, &accept_urls);
>
> - if (accept == ACCEPT_NONE)
> -+ if (accept == ACCEPT_NONE && !accept_urls->nr)
> ++ if (accept == ACCEPT_NONE && !accept_urls.nr)
> return;
>
> /* Parse remote info received */
> @@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
> }
>
> - if (should_accept_remote(accept, advertised, &config_info)) {
> -+ if (should_accept_remote(accept, advertised, accept_urls, &config_info)) {
> ++ if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
> if (!store_info)
> store_info = store_info_new(repo);
> if (promisor_store_advertised_fields(advertised, store_info))
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl
> check_missing_objects server 1 "$oid"
> '
>
> -+test_expect_success "clone with 'None' but URL whitelisted" '
> ++test_expect_success "clone with 'None' but URL allowlisted" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> -+ -c remote.lop.url="$PWD_URL/lop" \
> ++ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> -+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with 'None' but URL not in whitelist" '
> ++test_expect_success "clone with 'None' but URL not in allowlist" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> -+ -c remote.lop.url="$PWD_URL/lop" \
> ++ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="https://example.com/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl
> + initialize_server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with 'None' but URL whitelisted in one pattern out of two" '
> ++test_expect_success "clone with 'None' but URL allowlisted in one pattern out of two" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> -+ -c remote.lop.url="$PWD_URL/lop" \
> ++ -c remote.lop.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> + -c promisor.acceptFromServerUrl="https://example.com/*" \
> -+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is still missing on the server
> + check_missing_objects server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with 'None', URL whitelisted, but client has different URL" '
> ++test_expect_success "clone with 'None', URL allowlisted, but client has different URL" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + # The client configures "lop" with a different URL (serverTwo) than
> + # what the server advertises (lop). Even though the advertised URL
> -+ # matches the whitelist, the remote is rejected because the
> ++ # matches the allowlist, the remote is rejected because the
> + # configured URL does not match the advertised one.
> + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
> + -c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
> -+ -c remote.lop.url="$PWD_URL/serverTwo" \
> ++ -c remote.lop.url="$TRASH_DIRECTORY_URL/serverTwo" \
> + -c promisor.acceptfromserver=None \
> -+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the largest object is not missing on the server
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'KnownUrl
> + initialize_server 1 "$oid"
> +'
> +
> - test_expect_success "clone with promisor.sendFields" '
> + test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> git -C server config promisor.advertise true &&
> test_when_finished "rm -rf client" &&
> -@@ t/t5710-promisor-remote-capability.sh: test_expect_success "subsequent fetch from a client when promisor.advertise is f
> - check_missing_objects server 1 "$oid"
> - '
> -
> -+
> -+
> - test_done
> 15: 314150a860 ! 7: a077f33df4 promisor-remote: auto-configure unknown remotes
> @@ Commit message
> promisor-remote: auto-configure unknown remotes
>
> Previous commits have introduced the `promisor.acceptFromServerUrl`
> - config variable to whitelist some URLs advertised by a server through
> + config variable to allowlist some URLs advertised by a server through
> the "promisor-remote" protocol capability.
>
> However the new `promisor.acceptFromServerUrl` mechanism, like the old
> @@ Commit message
>
> ## Documentation/config/promisor.adoc ##
> @@ Documentation/config/promisor.adoc: promisor.acceptFromServer::
> -
> promisor.acceptFromServerUrl::
> - A glob pattern to specify which URLs advertised by a server
> -- are considered trusted by the client. This option acts as an
> -- additive security whitelist that works in conjunction with
> -- `promisor.acceptFromServer`.
> -+ are allowed to be auto-configured (created and persisted) on
> -+ the client side. Unlike `promisor.acceptFromServer`, which
> -+ only accepts already configured remotes, a match against this
> -+ option instructs Git to write a new `[remote "<name>"]`
> -+ section to the client's configuration.
> + A glob pattern to specify which server-advertised URLs a
> + client is allowed to act on. When a URL matches, the client
> +- will accept the advertised remote as a promisor remote and may
> ++ will accept the advertised remote as a promisor remote, may
> ++ automatically create a new remote configuration for it and may
> + automatically accept field updates (such as authentication
> + tokens) from the server, even if `promisor.acceptFromServer`
> + is set to `none` (the default).
> +@@ Documentation/config/promisor.adoc: this option in _ANY_ config file read by Git.
> + Be _VERY_ careful with these patterns: `*` matches any sequence of
> + characters within the 'host' and 'path' parts of a URL (but cannot
> + cross part boundaries). An overly broad pattern is a major security
> +-risk, as a matching URL allows a server to update fields (such as
> +-authentication tokens) on known remotes without further confirmation.
> +-To minimize security risks, follow these guidelines:
> ++risk, as a matching URL allows a server to auto-configure new remotes
> ++and to update fields (such as authentication tokens) on known remotes
> ++without further confirmation. To minimize security risks, follow these
> ++guidelines:
> + +
> + 1. Start with a secure protocol scheme, like `https://` or `ssh://`.
> +
> - This option can appear multiple times in config files. An advertised
> - URL will be accepted if it matches _ANY_ glob pattern specified by
> -@@ Documentation/config/promisor.adoc: possible, and path segments like `..` are resolved. Glob patterns
> - are matched against this normalized URL as-is, so patterns should
> - be written in normalized form (e.g., lowercase scheme and host).
> +@@ Documentation/config/promisor.adoc: are resolved. The port must also match exactly (e.g.,
> + `https://example.com:8080/*` will not match a URL advertised on
> + port 9999).
> +
> --Even if `promisor.acceptFromServer` is set to `None` (the default),
> --Git will still accept field updates (like tokens) for known remotes,
> --provided their URLs match a pattern in
> --`promisor.acceptFromServerUrl`. See linkgit:gitprotocol-v2[5] for
> --details on the protocol.
> +The glob pattern can optionally be prefixed with a remote name and an
> +equals sign (e.g., `cdn=https://cdn.example.com/*`). If such a prefix
> +is provided, accepted remotes will be saved under that name. If no
> +such prefix is provided, a safe remote name will be automatically
> +generated by sanitizing the URL and prefixing it with
> -+`promisor-auto-`. If a remote with the chosen name already exists but
> -+points to a different URL, Git will append a numeric suffix (e.g.,
> -+`-1`, `-2`) to the name to prevent overwriting existing
> -+configurations. You should make sure that this doesn't happen often
> -+though, as remotes will be rejected if the numeric suffix increases
> -+too much. In all cases, the original name advertised by the server is
> -+recorded in the `remote.<name>.advertisedAs` configuration variable
> -+for tracing and debugging purposes.
> ++`promisor-auto-`.
> ++
> -+Note that this option acts as an additive security whitelist. It works
> -+in conjunction with `promisor.acceptFromServer` (see the documentation
> -+of that option for the implications of accepting a promisor
> -+remote). Even if `promisor.acceptFromServer` is set to `None` (the
> -+default), Git will still automatically configure new remotes, and
> -+accept field updates (like tokens) for known remotes, provided their
> -+URLs match a pattern in `promisor.acceptFromServerUrl`. See
> -+linkgit:gitprotocol-v2[5] for details on the protocol.
> -
> - promisor.checkFields::
> - A comma or space separated list of additional remote related
> ++If a remote with the chosen name already exists but points to a
> ++different URL, Git will append a numeric suffix (e.g., `-1`, `-2`) to
> ++the name to prevent overwriting existing configurations. You should
> ++make sure that this doesn't happen often though, as remotes will be
> ++rejected if the numeric suffix increases too much. In all cases, the
> ++original name advertised by the server is recorded in the
> ++`remote.<name>.advertisedAs` configuration variable for tracing and
> ++debugging purposes.
> +++
> + For the security implications of accepting a promisor remote, see the
> + documentation of `promisor.acceptFromServer`. For details on the
> + protocol, see linkgit:gitprotocol-v2[5].
>
> ## Documentation/config/remote.adoc ##
> @@ Documentation/config/remote.adoc: remote.<name>.promisor::
> @@ promisor-remote.c: static void filter_promisor_remote(struct repository *repo,
> string_list_sort(&config_info);
> }
>
> -- if (should_accept_remote(accept, advertised, accept_urls, &config_info)) {
> -+ if (should_accept_remote(repo, accept, advertised, accept_urls,
> +- if (should_accept_remote(accept, advertised, &accept_urls, &config_info)) {
> ++ if (should_accept_remote(repo, accept, advertised, &accept_urls,
> + &config_info, &reload_config)) {
> if (!store_info)
> store_info = store_info_new(repo);
> if (promisor_store_advertised_fields(advertised, store_info))
>
> ## t/t5710-promisor-remote-capability.sh ##
> -@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', URL whitelisted, but client has differen
> +@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', URL allowlisted, but client has differen
> initialize_server 1 "$oid"
> '
>
> -+test_expect_success "clone with URL whitelisted and no remote already configured" '
> ++test_expect_success "clone with URL allowlisted and no remote already configured" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> ++ test_when_finished "rm -f full_names" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone \
> + -c promisor.acceptfromserver=None \
> -+ -c promisor.acceptFromServerUrl="$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> -+ # Check that a remote has been auto-created with the right fields.
> -+ # The remote is identified by "remote.<name>.advertisedAs" == "lop".
> -+ FULL_NAME=$(git -C client config --name-only --get-regexp "remote\..*\.advertisedas" "^lop$") &&
> -+ REMOTE_NAME=$(echo "$FULL_NAME" | sed "s/remote\.\(.*\)\.advertisedas/\1/") &&
> ++ # Check that exactly one remote has been auto-created, identified
> ++ # by "remote.<name>.advertisedAs" == "lop".
> ++ git -C client config get --all --show-names --regexp \
> ++ "remote\..*\.advertisedas" >full_names &&
> ++ test_line_count = 1 full_names &&
> ++ REMOTE_NAME=$(sed "s/^remote\.\(.*\)\.advertisedas .*$/\1/" full_names) &&
> +
> + # Check ".url" and ".promisor" values
> -+ printf "%s\n" "$PWD_URL/lop" "true" >expect &&
> ++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" >expect &&
> + git -C client config "remote.$REMOTE_NAME.url" >actual &&
> + git -C client config "remote.$REMOTE_NAME.promisor" >>actual &&
> + test_cmp expect actual &&
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
> + check_missing_objects server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with named URL whitelisted and no pre-configured remote" '
> ++test_expect_success "clone with named URL allowlisted and no pre-configured remote" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone \
> + -c promisor.acceptfromserver=None \
> -+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that a remote has been auto-created with the right "cdn" name and fields.
> -+ printf "%s\n" "$PWD_URL/lop" "true" "lop" >expect &&
> ++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
> + git -C client config "remote.cdn.url" >actual &&
> + git -C client config "remote.cdn.promisor" >>actual &&
> + git -C client config "remote.cdn.advertisedAs" >>actual &&
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
> + check_missing_objects server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with URL whitelisted but colliding name" '
> ++test_expect_success "clone with URL allowlisted but colliding name" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
> + -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
> + -c remote.cdn.url="https://example.com/cdn" \
> + -c promisor.acceptfromserver=None \
> -+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that a remote has been auto-created with the right "cdn-1" name and fields.
> -+ printf "%s\n" "$PWD_URL/lop" "true" "lop" >expect &&
> ++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" >expect &&
> + git -C client config "remote.cdn-1.url" >actual &&
> + git -C client config "remote.cdn-1.promisor" >>actual &&
> + git -C client config "remote.cdn-1.advertisedAs" >>actual &&
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
> + check_missing_objects server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with URL whitelisted and reusable remote" '
> ++test_expect_success "clone with URL allowlisted and reusable remote" '
> + git -C server config promisor.advertise true &&
> + test_when_finished "rm -rf client" &&
> +
> + GIT_NO_LAZY_FETCH=0 git clone \
> + -c remote.cdn.fetch="+refs/heads/*:refs/remotes/lop/*" \
> -+ -c remote.cdn.url="$PWD_URL/lop" \
> ++ -c remote.cdn.url="$TRASH_DIRECTORY_URL/lop" \
> + -c promisor.acceptfromserver=None \
> -+ -c promisor.acceptFromServerUrl="cdn=$ENCODED_PWD_URL/*" \
> ++ -c promisor.acceptFromServerUrl="cdn=$ENCODED_TRASH_DIRECTORY_URL/*" \
> + --no-local --filter="blob:limit=5k" server client &&
> +
> + # Check that the existing "cdn" remote has been properly updated.
> -+ printf "%s\n" "$PWD_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
> ++ printf "%s\n" "$TRASH_DIRECTORY_URL/lop" "true" "lop" "+refs/heads/*:refs/remotes/lop/*" >expect &&
> + git -C client config "remote.cdn.url" >actual &&
> + git -C client config "remote.cdn.promisor" >>actual &&
> + git -C client config "remote.cdn.advertisedAs" >>actual &&
> @@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with 'None', U
> + check_missing_objects server 1 "$oid"
> +'
> +
> -+test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> -+ git -C server config promisor.advertise true &&
> -+ test_when_finished "rm -rf client" &&
> -+
> -+ # As "bad name" contains a space, which is not a valid remote name,
> -+ # the pattern should be rejected with a warning and no remote created.
> -+ GIT_NO_LAZY_FETCH=0 git clone \
> -+ -c promisor.acceptfromserver=None \
> -+ -c "promisor.acceptFromServerUrl=bad name=https://example.com/*" \
> -+ --no-local --filter="blob:limit=5k" server client 2>err &&
> -+
> -+ # Check that a warning was emitted
> -+ test_grep "invalid remote name '\''bad name'\''" err &&
> -+
> -+ # Check that no remote was auto-created
> -+ test_must_fail git -C client config --get-regexp "remote\..*\.advertisedas" &&
> -+
> -+ # Check that the largest object is not missing on the server
> -+ check_missing_objects server 0 "" &&
> -+
> -+ # Reinitialize server so that the largest object is missing again
> -+ initialize_server 1 "$oid"
> -+'
> -+
> - test_expect_success "clone with promisor.sendFields" '
> + test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> git -C server config promisor.advertise true &&
> test_when_finished "rm -rf client" &&
> +@@ t/t5710-promisor-remote-capability.sh: test_expect_success "clone with invalid promisor.acceptFromServerUrl" '
> + # Check that a warning was emitted
> + test_grep "invalid remote name '\''bad name'\''" err &&
> +
> ++ # Check that no remote was auto-created
> ++ test_must_fail git -C client config get --regexp "remote\..*\.advertisedas" &&
> ++
> + # Check that the largest object is not missing on the server
> + check_missing_objects server 0 "" &&
> +
> 16: 20f70b52bb ! 8: b68b9497aa doc: promisor: improve acceptFromServer entry
> @@ Documentation/config/promisor.adoc: variable is set to "true", and the "name" an
> +for protocol details.
>
> promisor.acceptFromServerUrl::
> - A glob pattern to specify which URLs advertised by a server
> + A glob pattern to specify which server-advertised URLs a
>
>
> Christian Couder (8):
> t5710: simplify 'mkdir X' followed by 'git -C X init'
> urlmatch: change 'allow_globs' arg to bool
> urlmatch: add url_normalize_pattern() helper
> promisor-remote: add 'local_name' to 'struct promisor_info'
> promisor-remote: introduce promisor.acceptFromServerUrl
> promisor-remote: trust known remotes matching acceptFromServerUrl
> promisor-remote: auto-configure unknown remotes
> doc: promisor: improve acceptFromServer entry
>
> Documentation/config/promisor.adoc | 123 ++++++--
> Documentation/config/remote.adoc | 9 +
> Documentation/gitprotocol-v2.adoc | 9 +-
> promisor-remote.c | 410 ++++++++++++++++++++++++--
> t/t5710-promisor-remote-capability.sh | 202 ++++++++++++-
> urlmatch.c | 11 +-
> urlmatch.h | 12 +
> 7 files changed, 730 insertions(+), 46 deletions(-)
>
> --
> 2.54.0.19.gb68b9497aa
>