Git development
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Christian Couder <christian.couder@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Taylor Blau <me@ttaylorr.com>,
	Karthik Nayak <karthik.188@gmail.com>,
	Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH 00/10] Prepare for advertised remotes auto-configure via URL allowlist
Date: Thu, 2 Apr 2026 09:47:38 +0200	[thread overview]
Message-ID: <ac4fGpG9WLZqnH7f@pks.im> (raw)
In-Reply-To: <20260402070613.85934-1-christian.couder@gmail.com>

On Thu, Apr 02, 2026 at 09:06:03AM +0200, Christian Couder wrote:
> Recently, I sent a 16 patch long series that makes it possible for
> promisor remotes advertised through the "promisor-remote" protocol
> capability to be auto-configured on the client side via a URL
> allowlist configured using a new `promisor.acceptFromServerUrl`
> configuration variable:
> 
> https://lore.kernel.org/git/20260323080520.887550-1-christian.couder@gmail.com/
> 
> I got the suggestion to split the 16 patches series into two smaller
> series, starting with a preparatory series. So here is this
> preparatory series. It's a mix of mostly small fixes, refactorings and
> cleanups.
> 
> High level description of the patches
> =====================================
> 
>  - Patches 1-4/10 are fixes:
> 
>    - Patch 1/10 is the most significant. The others are relatively
>      small.
>      
>    - Patch 4/10 is the only new in this series (while all the others
>      were in the previous series). It prepares for Patch 5/10.
> 
>  - Patches 5-10/10 are refactorings and cleanups:
> 
>    - Patches 5-7/10 are relatively small independents cleanups or
>      refactorings.
> 
>    - Patches 8-9/10 are related refactorings simplifying the data
>      structures used in filter_promisor_remote() and
>      promisor_remote_reply().
> 
>    - Patch 10/10 is cleaning up the 'file://' URIs with absolute paths
>      in the test script.

Thanks for splitting out these changes. In case anybody else cares,
I've attached the range-diff below.

In any case, I'm happy with these changes. There's a single nit, but I
don't think it's worth rerolling over. Thanks!

Patrick

 1:  aeae4b9971 !  1:  30d0aa8a24 promisor-remote: try accepted remotes before others in get_direct()
    @@ Commit message
     
         Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
     
    + ## Documentation/gitprotocol-v2.adoc ##
    +@@ Documentation/gitprotocol-v2.adoc: advertised, it can reply with "promisor-remote=<pr-names>" where
    + where `pr-name` is the urlencoded name of a promisor remote the server
    + advertised and the client accepts.
    + 
    ++The promisor remotes that the client accepted will be tried before the
    ++other configured promisor remotes when the client attempts to fetch
    ++missing objects.
    ++
    + Note that, everywhere in this document, the ';' and ',' characters
    + MUST be encoded if they appear in `pr-name` or `field-value`.
    + 
    +
      ## promisor-remote.c ##
     @@ promisor-remote.c: static int remove_fetched_oids(struct repository *repo,
      	return remaining_nr;
    @@ promisor-remote.c: static int remove_fetched_oids(struct repository *repo,
     +	struct promisor_remote *r = repo->promisor_remote_config->promisors;
     +
     +	for (; r; r = r->next) {
    -+		if (accepted_only && !r->accepted)
    -+			continue;
    -+		if (!accepted_only && r->accepted)
    ++		if (accepted_only != r->accepted)
     +			continue;
     +		if (fetch_objects(repo, r->name, *remaining_oids, *remaining_nr) < 0) {
     +			if (*remaining_nr == 1)
    @@ promisor-remote.c: void promisor_remote_get_direct(struct repository *repo,
      
      	for (i = 0; i < remaining_nr; i++) {
      		if (is_promisor_object(repo, &remaining_oids[i]))
    +
    + ## t/t5710-promisor-remote-capability.sh ##
    +@@ t/t5710-promisor-remote-capability.sh: test_expect_success "init + fetch with promisor.advertise set to 'true'" '
    + 	check_missing_objects server 1 "$oid"
    + '
    + 
    ++test_expect_success "clone with two promisors but only one advertised" '
    ++	git -C server config promisor.advertise true &&
    ++	test_when_finished "rm -rf client unused_lop" &&
    ++
    ++	# Create a promisor that will be configured but not be used
    ++	git init --bare unused_lop &&
    ++
    ++	# Clone from server to create a client
    ++	GIT_TRACE="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git clone \
    ++		-c remote.unused_lop.promisor=true \
    ++		-c remote.unused_lop.fetch="+refs/heads/*:refs/remotes/unused_lop/*" \
    ++		-c remote.unused_lop.url="file://$(pwd)/unused_lop" \
    ++		-c remote.lop.promisor=true \
    ++		-c remote.lop.fetch="+refs/heads/*:refs/remotes/lop/*" \
    ++		-c remote.lop.url="file://$(pwd)/lop" \
    ++		-c promisor.acceptfromserver=All \
    ++		--no-local --filter="blob:limit=5k" server client &&
    ++
    ++	# Check that "unused_lop" appears before "lop" in the config
    ++	printf "remote.%s.promisor true\n" "unused_lop" "lop" "origin" >expect &&
    ++	git -C client config get --all --show-names --regexp "^remote\..*\.promisor$" >actual &&
    ++	test_cmp expect actual &&
    ++
    ++	# Check that "lop" was tried
    ++	test_grep " fetch lop " trace &&
    ++	# Check that "unused_lop" was not contacted
    ++	# This means "lop", the accepted promisor, was tried first
    ++	test_grep ! " fetch unused_lop " trace &&
    ++
    ++	# Check that the largest object is still missing on the server
    ++	check_missing_objects server 1 "$oid"
    ++'
    ++
    ++test_expect_success "init + fetch two promisors but only one advertised" '
    ++	git -C server config promisor.advertise true &&
    ++	test_when_finished "rm -rf client unused_lop" &&
    ++
    ++	# Create a promisor that will be configured but not be used
    ++	git init --bare unused_lop &&
    ++
    ++	mkdir client &&
    ++	git -C client init &&
    ++	git -C client config remote.unused_lop.promisor true &&
    ++	git -C client config remote.unused_lop.fetch "+refs/heads/*:refs/remotes/unused_lop/*" &&
    ++	git -C client config remote.unused_lop.url "file://$(pwd)/unused_lop" &&
    ++	git -C client config remote.lop.promisor true &&
    ++	git -C client config remote.lop.fetch "+refs/heads/*:refs/remotes/lop/*" &&
    ++	git -C client config remote.lop.url "file://$(pwd)/lop" &&
    ++	git -C client config remote.server.url "file://$(pwd)/server" &&
    ++	git -C client config remote.server.fetch "+refs/heads/*:refs/remotes/server/*" &&
    ++	git -C client config promisor.acceptfromserver All &&
    ++
    ++	# Check that "unused_lop" appears before "lop" in the config
    ++	printf "remote.%s.promisor true\n" "unused_lop" "lop" >expect &&
    ++	git -C client config get --all --show-names --regexp "^remote\..*\.promisor$" >actual &&
    ++	test_cmp expect actual &&
    ++
    ++	GIT_TRACE="$(pwd)/trace" GIT_NO_LAZY_FETCH=0 git -C client fetch --filter="blob:limit=5k" server &&
    ++
    ++	# Check that "lop" was tried
    ++	test_grep " fetch lop " trace &&
    ++	# Check that "unused_lop" was not contacted
    ++	# This means "lop", the accepted promisor, was tried first
    ++	test_grep ! " fetch unused_lop " trace &&
    ++
    ++	# Check that the largest object is still missing on the server
    ++	check_missing_objects server 1 "$oid"
    ++'
    ++
    + test_expect_success "clone with promisor.acceptfromserver set to 'KnownName'" '
    + 	git -C server config promisor.advertise true &&
    + 	test_when_finished "rm -rf client" &&
 2:  853051db60 <  -:  ---------- urlmatch: change 'allow_globs' arg to bool
 3:  e17cc86c37 <  -:  ---------- urlmatch: add url_is_valid_pattern() helper
10:  ac7742f5ba !  2:  60ba726cda promisor-remote: pass config entry to all_fields_match() directly
    @@ Metadata
      ## Commit message ##
         promisor-remote: pass config entry to all_fields_match() directly
     
    -    The `in_list == 0` path of all_fields_match() re-looks up the
    -    remote in config_info by advertised->name, even though every
    +    The `in_list == 0` path of all_fields_match() looks up the remote in
    +    `config_info` by `advertised->name` repeatedly, even though every
         caller in should_accept_remote() has already performed this
    -    lookup and holds the result in 'p'.
    +    lookup and holds the result in `p`.
     
         To avoid this useless work, let's replace the `int in_list`
         parameter with a `struct promisor_info *config_entry` pointer:
    @@ Commit message
            avoiding the redundant string_list_lookup() call.
     
         This removes the hidden dependency on `advertised->name` inside
    -    all_fields_match(), which would be wrong in the following commits when
    -    auto-configured remotes will be implemented as the local config name
    -    may differ from the server's advertised name.
    +    all_fields_match(), which would be wrong if in the future
    +    auto-configured remotes are implemented, as the local config name may
    +    differ from the server's advertised name.
    +
    +    While at it, let's also add a comment before all_fields_match() and
    +    match_field_against_config() to help understand how things work and
    +    help avoid similar issues.
     
         Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
     
      ## promisor-remote.c ##
    +@@ promisor-remote.c: enum accept_promisor {
    + 	ACCEPT_ALL
    + };
    + 
    ++/*
    ++ * Check if a specific field and its advertised value match the local
    ++ * configuration of a given promisor remote.
    ++ *
    ++ * Returns 1 if they match, 0 otherwise.
    ++ */
    + static int match_field_against_config(const char *field, const char *value,
    + 				      struct promisor_info *config_info)
    + {
     @@ promisor-remote.c: static int match_field_against_config(const char *field, const char *value,
      	return 0;
      }
    @@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
     -		return all_fields_match(advertised, config_info, 0);
     +		return all_fields_match(advertised, config_info, p);
      
    - 	warning(_("known remote named '%s' but with URL '%s' instead of '%s', "
    - 		  "ignoring this remote"), remote_name, p->url, remote_url);
    + 	warning(_("known remote named '%s' but with URL '%s' instead of '%s'"),
    + 		remote_name, p->url, remote_url);
 4:  c2f9f3ebef !  3:  d69391fd61 promisor-remote: clarify that a remote is ignored
    @@ Metadata
      ## Commit message ##
         promisor-remote: clarify that a remote is ignored
     
    -    In should_accept_remote() when a remote is ignored, we might tell users
    -    why it is ignored in a warning, but we don't tell them that the remote
    -    is actually ignored.
    +    In should_accept_remote() and parse_one_advertised_remote(), when a
    +    remote is ignored, we tell users why it is ignored in a warning, but we
    +    don't tell them that the remote is actually ignored.
     
         Let's clarify that, so users have a better idea of what's actually
         happening.
    @@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
      	}
      
      	if (!strcmp(p->url, remote_url))
    - 		return all_fields_match(advertised, config_info, 0);
    + 		return all_fields_match(advertised, config_info, p);
      
     -	warning(_("known remote named '%s' but with URL '%s' instead of '%s'"),
     -		remote_name, p->url, remote_url);
    @@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
      
      	return 0;
      }
    +@@ promisor-remote.c: static struct promisor_info *parse_one_advertised_remote(const char *remote_info
    + 	string_list_clear(&elem_list, 0);
    + 
    + 	if (!info->name || !info->url) {
    +-		warning(_("server advertised a promisor remote without a name or URL: %s"),
    +-			remote_info);
    ++		warning(_("server advertised a promisor remote without a name or URL: '%s', "
    ++			  "ignoring this remote"), remote_info);
    + 		promisor_info_free(info);
    + 		return NULL;
    + 	}
 -:  ---------- >  4:  f4815d68ea promisor-remote: reject empty name or URL in advertised remote
11:  ebc6f2dbeb !  5:  48605c4612 promisor-remote: refactor should_accept_remote() control flow
    @@ Metadata
      ## Commit message ##
         promisor-remote: refactor should_accept_remote() control flow
     
    -    In following commits, we are going to add URL-based acceptance logic
    -    into should_accept_remote().
    +    A previous commit made sure we now reject empty URLs early at parse
    +    time. This makes the existing warning() in case a remote URL is NULL
    +    or empty very unlikely to be useful.
     
    -    To prepare for the upcoming changes, let's restructure the control flow
    -    in should_accept_remote().
    +    In future work, we also plan to add URL-based acceptance logic into
    +    should_accept_remote().
    +
    +    To adapt to previous changes and prepare for upcoming changes, let's
    +    restructure the control flow in should_accept_remote().
     
         Concretely, let's:
     
    -     - Move the empty-URL check to the very top of the function, so that
    -       every acceptance mode, instead of only ACCEPT_KNOWN_URL, rejects
    -       remotes with a missing or blank URL early.
    +     - Replace the warning() in case of an empty URL with a BUG(), as a
    +       previous commit made sure empty URLs are rejected early at parse
    +       time.
    +
    +     - Move that modified empty-URL check to the very top of the function,
    +       so that every acceptance mode, instead of only ACCEPT_KNOWN_URL, is
    +       covered.
     
          - Invert the URL comparison: instead of returning on match and
            warning on mismatch, return early on mismatch and let the match
            case fall through. This opens a single exit path at the bottom of
    -       the function for following commits to extend.
    -
    -    Anyway the server shouldn't send any empty URL in the first place, so
    -    this shouldn't change any behavior in practice.
    +       the function for future commits to extend.
     
         Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
     
    @@ promisor-remote.c: static int should_accept_remote(enum accept_promisor accept,
      	const char *remote_name = advertised->name;
      	const char *remote_url = advertised->url;
      
    -+	if (!remote_url || !*remote_url) {
    -+		warning(_("no or empty URL advertised for remote '%s', "
    -+			  "ignoring this remote"), remote_name);
    -+		return 0;
    -+	}
    ++	if (!remote_url || !*remote_url)
    ++		BUG("no or empty URL advertised for remote '%s'; "
    ++		    "this remote should have been rejected earlier",
    ++		    remote_name);
     +
      	if (accept == ACCEPT_ALL)
      		return all_fields_match(advertised, config_info, NULL);
 5:  bdd62c20a4 !  6:  632e3bfe25 promisor-remote: refactor has_control_char()
    @@ Metadata
      ## Commit message ##
         promisor-remote: refactor has_control_char()
     
    -    In a following commit we are going to check if some strings contain
    -    control character, so let's refactor the logic to do that in a new
    +    In a future commit we are going to check if some strings contain
    +    control characters, so let's refactor the logic to do that in a new
         has_control_char() helper function.
     
    +    It cleans up the code a bit anyway.
    +
         Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
     
      ## promisor-remote.c ##
 6:  962db2ee6c !  7:  12dc262514 promisor-remote: refactor accept_from_server()
    @@ Metadata
      ## Commit message ##
         promisor-remote: refactor accept_from_server()
     
    -    In a following commit, we are going to add more logic to
    +    In future commits, we are going to add more logic to
         filter_promisor_remote() which is already doing a lot of things.
     
         Let's alleviate that by moving the logic that checks and validates the
 7:  b4d125bd36 !  8:  9d47ed27e8 promisor-remote: keep accepted promisor_info structs alive
    @@ Commit message
         protocol reply, apply advertised filters, and mark remotes as
         accepted, rather than iterating three separate structures.
     
    -    This refactoring also prepares for a following commit that will add a
    +    This refactoring also prepares for a future commit that will add a
         'local_name' member to 'struct promisor_info'. Since struct instances
         stay alive, downstream code will be able to simply read both names
         from them rather than needing yet another parallel strvec.
 8:  3252e216f1 =  9:  b6c020696d promisor-remote: remove the 'accepted' strvec
 9:  60e17ccb99 <  -:  ---------- promisor-remote: add 'local_name' to 'struct promisor_info'
12:  467ee9eff3 ! 10:  d47aa7a31a t5710: use proper file:// URIs for absolute paths
    @@ Commit message
         (`file://D:/a/repo`). Standard URI parsers misinterpret this format,
         treating `D:` as the host rather than part of the absolute path.
     
    +    This is to be expected because RFC 8089 says that the `//` prefix with
    +    an empty local host must be followed by an absolute path starting with
    +    a slash.
    +
         While this hasn't broken the existing tests (because the old
    -    `promisor.acceptFromServer` logic relies entirely on strict `strcmp`
    +    `promisor.acceptFromServer` logic relies entirely on strict `strcmp()`
         without normalizing the URLs), it will break future commits that pass
    -    these URLs through `url_normalize()`.
    +    these URLs through `url_normalize()` or similar functions.
     
         To future-proof the tests and ensure cross-platform URI compliance,
         let's introduce a $PWD_URL helper that explicitly guarantees a leading
13:  637db2b4d8 <  -:  ---------- promisor-remote: introduce promisor.acceptFromServerUrl
14:  c03c5261fb <  -:  ---------- promisor-remote: trust known remotes matching acceptFromServerUrl
15:  0d8acf10f8 <  -:  ---------- promisor-remote: auto-configure unknown remotes
16:  a27565223f <  -:  ---------- doc: promisor: improve acceptFromServer entry

  parent reply	other threads:[~2026-04-02  7:47 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-02  7:06 [PATCH 00/10] Prepare for advertised remotes auto-configure via URL allowlist Christian Couder
2026-04-02  7:06 ` [PATCH 01/10] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
2026-04-02  7:46   ` Patrick Steinhardt
2026-04-07 12:05     ` Christian Couder
2026-04-07 14:49       ` Junio C Hamano
2026-04-02  7:06 ` [PATCH 02/10] promisor-remote: pass config entry to all_fields_match() directly Christian Couder
2026-04-02  7:06 ` [PATCH 03/10] promisor-remote: clarify that a remote is ignored Christian Couder
2026-04-02  7:06 ` [PATCH 04/10] promisor-remote: reject empty name or URL in advertised remote Christian Couder
2026-04-02  7:46   ` Patrick Steinhardt
2026-04-02  7:06 ` [PATCH 05/10] promisor-remote: refactor should_accept_remote() control flow Christian Couder
2026-04-02  7:06 ` [PATCH 06/10] promisor-remote: refactor has_control_char() Christian Couder
2026-04-02  7:06 ` [PATCH 07/10] promisor-remote: refactor accept_from_server() Christian Couder
2026-04-02  7:06 ` [PATCH 08/10] promisor-remote: keep accepted promisor_info structs alive Christian Couder
2026-04-02  7:06 ` [PATCH 09/10] promisor-remote: remove the 'accepted' strvec Christian Couder
2026-04-02  7:06 ` [PATCH 10/10] t5710: use proper file:// URIs for absolute paths Christian Couder
2026-04-02  9:58   ` Junio C Hamano
2026-04-03  8:16     ` Christian Couder
2026-04-07 12:02       ` Christian Couder
2026-04-02  7:47 ` Patrick Steinhardt [this message]
2026-04-02 18:37 ` [PATCH 00/10] Prepare for advertised remotes auto-configure via URL allowlist Junio C Hamano
2026-04-07 11:57   ` Christian Couder
2026-04-07 11:52 ` [PATCH v2 " Christian Couder
2026-04-07 11:52   ` [PATCH v2 01/10] promisor-remote: try accepted remotes before others in get_direct() Christian Couder
2026-04-07 11:52   ` [PATCH v2 02/10] promisor-remote: pass config entry to all_fields_match() directly Christian Couder
2026-04-07 16:56     ` Junio C Hamano
2026-04-07 11:52   ` [PATCH v2 03/10] promisor-remote: clarify that a remote is ignored Christian Couder
2026-04-07 17:27     ` Junio C Hamano
2026-04-07 11:52   ` [PATCH v2 04/10] promisor-remote: reject empty name or URL in advertised remote Christian Couder
2026-04-07 17:36     ` Junio C Hamano
2026-04-07 11:52   ` [PATCH v2 05/10] promisor-remote: refactor should_accept_remote() control flow Christian Couder
2026-04-07 11:52   ` [PATCH v2 06/10] promisor-remote: refactor has_control_char() Christian Couder
2026-04-07 11:52   ` [PATCH v2 07/10] promisor-remote: refactor accept_from_server() Christian Couder
2026-04-07 11:52   ` [PATCH v2 08/10] promisor-remote: keep accepted promisor_info structs alive Christian Couder
2026-04-07 11:52   ` [PATCH v2 09/10] promisor-remote: remove the 'accepted' strvec Christian Couder
2026-04-07 11:52   ` [PATCH v2 10/10] t5710: use proper file:// URIs for absolute paths Christian Couder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ac4fGpG9WLZqnH7f@pks.im \
    --to=ps@pks.im \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=karthik.188@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox