git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Christian Couder <christian.couder@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	John Cai <johncai86@gmail.com>, Taylor Blau <me@ttaylorr.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [PATCH v2 3/4] Add 'promisor-remote' capability to protocol v2
Date: Mon, 30 Sep 2024 09:56:59 +0200	[thread overview]
Message-ID: <ZvpZy9XMpLI0DQQg@pks.im> (raw)
In-Reply-To: <20240910163000.1985723-4-christian.couder@gmail.com>

On Tue, Sep 10, 2024 at 06:29:59PM +0200, Christian Couder wrote:
> diff --git a/Documentation/config/promisor.txt b/Documentation/config/promisor.txt
> index 98c5cb2ec2..9cbfe3e59e 100644
> --- a/Documentation/config/promisor.txt
> +++ b/Documentation/config/promisor.txt
> @@ -1,3 +1,20 @@
>  promisor.quiet::
>  	If set to "true" assume `--quiet` when fetching additional
>  	objects for a partial clone.
> +
> +promisor.advertise::
> +	If set to "true", a server will use the "promisor-remote"
> +	capability, see linkgit:gitprotocol-v2[5], to advertise the
> +	promisor remotes it is using, if it uses some. Default is
> +	"false", which means the "promisor-remote" capability is not
> +	advertised.
> +
> +promisor.acceptFromServer::
> +	If set to "all", a client will accept all the promisor remotes
> +	a server might advertise using the "promisor-remote"
> +	capability. Default is "none", which means no promisor remote
> +	advertised by a server will be accepted. By accepting a
> +	promisor remote, the client agrees that the server might omit
> +	objects that are lazily fetchable from this promisor remote
> +	from its responses to "fetch" and "clone" requests from the
> +	client. See linkgit:gitprotocol-v2[5].

I wonder a bit about whether making this an option is all that sensible,
because that would of course apply globally to every server that you
might want to clone from. Wouldn't it be more sensible to make this
configurabe per server?

Another question: servers may advertise bogus addresses to us, and as
far as I can see there are currently no precautions in place against
malicious cases. The server might for example use this to redirect us to
a remote that uses no encryption, the Git protocol or even the "file://"
protocol. I guess the sane thing here would be to default to allow
clones via "https://" only, but make the set of accepted protocols
configurable.

> diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> index 414bc625d5..65d5256baf 100644
> --- a/Documentation/gitprotocol-v2.txt
> +++ b/Documentation/gitprotocol-v2.txt
> @@ -781,6 +781,60 @@ retrieving the header from a bundle at the indicated URI, and thus
>  save themselves and the server(s) the request(s) needed to inspect the
>  headers of that bundle or bundles.
>  
> +promisor-remote=<pr-infos>
> +~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The server may advertise some promisor remotes it is using or knows
> +about to a client which may want to use them as its promisor remotes,
> +instead of this repository. In this case <pr-infos> should be of the
> +form:
> +
> +	pr-infos = pr-info | pr-infos ";" pr-info

Wouldn't it be preferable to make this multiple lines so that we cannot
ever burst through the pktline limits?

> +	pr-info = "name=" pr-name | "name=" pr-name "," "url=" pr-url
> +
> +where `pr-name` is the urlencoded name of a promisor remote, and
> +`pr-url` the urlencoded URL of that promisor remote.
> +In this case, if the client decides to use one or more promisor
> +remotes the server advertised, it can reply with
> +"promisor-remote=<pr-names>" where <pr-names> should be of the form:

One of the things that LFS provides is custom transfer types. It is for
example possible to use NFS or some other arbitrary protocol to fetch or
upload data. It should be possible to provide similar functionality on
the Git side via custom transport helpers, too, and if we make the
accepted set of helpers configurable as proposed further up this could
be made safe, too.

But one thing I'm missing here is any documentation around how the
client would know which promisor-remote to pick when the remote
advertises multiple of them. The easiest schema would of course be to
pick the first one whose transport helper the client understands and
considers to be safe. But given that we're talking about offloading of
large blobs, would we have usecases for advertising e.g. region-scoped
remotes that require more information on the client-side?

Also, are the promisor remotes promising to each contain all objects? Or
would the client have to ask each promisor remote until it finds a
desired object?

> +	pr-names = pr-name | pr-names ";" pr-name
> +
> +where `pr-name` is the urlencoded name of a promisor remote the server
> +advertised and the client accepts.
> +
> +Note that, everywhere in this document, `pr-name` MUST be a valid
> +remote name, and the ';' and ',' characters MUST be encoded if they
> +appear in `pr-name` or `pr-url`.

So I assume the intent here is to let the client add that promisor
remote with that exact, server-provided name? That makes me wonder about
two different scenarios:

  - We must keep the remote from announcing "origin".

  - What if we eventually decide to allow users to provide their own
    names for remotes during git-clone(1)?

Overall, I don't think that it's a good idea to let the remote dictate
which name a client's remotes have.

> +If the server doesn't know any promisor remote that could be good for
> +a client to use, or prefers a client not to use any promisor remote it
> +uses or knows about, it shouldn't advertise the "promisor-remote"
> +capability at all.
> +
> +In this case, or if the client doesn't want to use any promisor remote
> +the server advertised, the client shouldn't advertise the
> +"promisor-remote" capability at all in its reply.
> +
> +The "promisor.advertise" and "promisor.acceptFromServer" configuration
> +options can be used on the server and client side respectively to
> +control what they advertise or accept respectively. See the
> +documentation of these configuration options for more information.
> +
> +Note that in the future it would be nice if the "promisor-remote"
> +protocol capability could be used by the server, when responding to
> +`git fetch` or `git clone`, to advertise better-connected remotes that
> +the client can use as promisor remotes, instead of this repository, so
> +that the client can lazily fetch objects from these other
> +better-connected remotes. This would require the server to omit in its
> +response the objects available on the better-connected remotes that
> +the client has accepted. This hasn't been implemented yet though. So
> +for now this "promisor-remote" capability is useful only when the
> +server advertises some promisor remotes it already uses to borrow
> +objects from.

In the cover letter you mention that the server may not even have some
objects at all in the future. I wonder how that is supposed to interact
with clients that do not know about the "promisor-remote" capability at
all though.

From my point of view the server should be able tot handle that just
fine and provide a full packfile to the client. That would of course
require the server to fetch missing objects from its own promisor
remotes. Do we want to state explicitly that this is a MUST for servers
so that we don't end up in a future where clients wouldn't be able to
fetch from some forges anymore?

Patrick

  reply	other threads:[~2024-09-30  7:57 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-31 13:40 [PATCH 0/4] Introduce a "promisor-remote" capability Christian Couder
2024-07-31 13:40 ` [PATCH 1/4] version: refactor strbuf_sanitize() Christian Couder
2024-07-31 17:18   ` Junio C Hamano
2024-08-20 11:29     ` Christian Couder
2024-07-31 13:40 ` [PATCH 2/4] strbuf: refactor strbuf_trim_trailing_ch() Christian Couder
2024-07-31 17:29   ` Junio C Hamano
2024-07-31 21:49     ` Taylor Blau
2024-08-20 11:29       ` Christian Couder
2024-08-20 11:29     ` Christian Couder
2024-07-31 13:40 ` [PATCH 3/4] Add 'promisor-remote' capability to protocol v2 Christian Couder
2024-07-31 15:40   ` Taylor Blau
2024-08-20 11:32     ` Christian Couder
2024-08-20 17:01       ` Junio C Hamano
2024-09-10 16:32         ` Christian Couder
2024-07-31 16:16   ` Taylor Blau
2024-08-20 11:32     ` Christian Couder
2024-08-20 16:55       ` Junio C Hamano
2024-09-10 16:32       ` Christian Couder
2024-09-10 17:46         ` Junio C Hamano
2024-07-31 18:25   ` Junio C Hamano
2024-07-31 19:34     ` Junio C Hamano
2024-08-20 12:21     ` Christian Couder
2024-08-05 13:48   ` Patrick Steinhardt
2024-08-19 20:00     ` Junio C Hamano
2024-09-10 16:31     ` Christian Couder
2024-07-31 13:40 ` [PATCH 4/4] promisor-remote: check advertised name or URL Christian Couder
2024-07-31 18:35   ` Junio C Hamano
2024-09-10 16:32     ` Christian Couder
2024-07-31 16:01 ` [PATCH 0/4] Introduce a "promisor-remote" capability Junio C Hamano
2024-07-31 16:17 ` Taylor Blau
2024-09-10 16:29 ` [PATCH v2 " Christian Couder
2024-09-10 16:29   ` [PATCH v2 1/4] version: refactor strbuf_sanitize() Christian Couder
2024-09-10 16:29   ` [PATCH v2 2/4] strbuf: refactor strbuf_trim_trailing_ch() Christian Couder
2024-09-10 16:29   ` [PATCH v2 3/4] Add 'promisor-remote' capability to protocol v2 Christian Couder
2024-09-30  7:56     ` Patrick Steinhardt [this message]
2024-09-30 13:28       ` Christian Couder
2024-10-01 10:14         ` Patrick Steinhardt
2024-10-01 18:47           ` Junio C Hamano
2024-11-06 14:04     ` Patrick Steinhardt
2024-11-28  5:47     ` Junio C Hamano
2024-11-28 15:31       ` Christian Couder
2024-11-29  1:31         ` Junio C Hamano
2024-09-10 16:30   ` [PATCH v2 4/4] promisor-remote: check advertised name or URL Christian Couder
2024-09-30  7:57     ` Patrick Steinhardt
2024-09-26 18:09   ` [PATCH v2 0/4] Introduce a "promisor-remote" capability Junio C Hamano
2024-09-27  9:15     ` Christian Couder
2024-09-27 22:48       ` Junio C Hamano
2024-09-27 23:31         ` rsbecker
2024-09-28 10:56           ` Kristoffer Haugsbakk
2024-09-30  7:57         ` Patrick Steinhardt
2024-09-30  9:17           ` Christian Couder
2024-09-30 16:52             ` Junio C Hamano
2024-10-01 10:14             ` Patrick Steinhardt
2024-09-30 16:34           ` Junio C Hamano
2024-09-30 21:26           ` brian m. carlson
2024-09-30 22:27             ` Junio C Hamano
2024-10-01 10:13               ` Patrick Steinhardt
2024-12-06 12:42   ` [PATCH v3 0/5] " Christian Couder
2024-12-06 12:42     ` [PATCH v3 1/5] version: refactor strbuf_sanitize() Christian Couder
2024-12-07  6:21       ` Junio C Hamano
2025-01-27 15:07         ` Christian Couder
2024-12-06 12:42     ` [PATCH v3 2/5] strbuf: refactor strbuf_trim_trailing_ch() Christian Couder
2024-12-07  6:35       ` Junio C Hamano
2025-01-27 15:07         ` Christian Couder
2024-12-16 11:47       ` karthik nayak
2024-12-06 12:42     ` [PATCH v3 3/5] Add 'promisor-remote' capability to protocol v2 Christian Couder
2024-12-07  7:59       ` Junio C Hamano
2025-01-27 15:08         ` Christian Couder
2024-12-06 12:42     ` [PATCH v3 4/5] promisor-remote: check advertised name or URL Christian Couder
2024-12-06 12:42     ` [PATCH v3 5/5] doc: add technical design doc for large object promisors Christian Couder
2024-12-10  1:28       ` Junio C Hamano
2025-01-27 15:12         ` Christian Couder
2024-12-10 11:43       ` Junio C Hamano
2024-12-16  9:00         ` Patrick Steinhardt
2025-01-27 15:11         ` Christian Couder
2025-01-27 18:02           ` Junio C Hamano
2025-02-18 11:42             ` Christian Couder
2024-12-09  8:04     ` [PATCH v3 0/5] Introduce a "promisor-remote" capability Junio C Hamano
2024-12-09 10:40       ` Christian Couder
2024-12-09 10:42         ` Christian Couder
2024-12-09 23:01         ` Junio C Hamano
2025-01-27 15:05           ` Christian Couder
2025-01-27 19:38             ` Junio C Hamano
2025-01-27 15:16     ` [PATCH v4 0/6] " Christian Couder
2025-01-27 15:16       ` [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity Christian Couder
2025-01-27 15:16       ` [PATCH v4 2/6] version: refactor redact_non_printables() Christian Couder
2025-01-27 15:16       ` [PATCH v4 3/6] version: make redact_non_printables() non-static Christian Couder
2025-01-30 10:51         ` Patrick Steinhardt
2025-02-18 11:42           ` Christian Couder
2025-01-27 15:16       ` [PATCH v4 4/6] Add 'promisor-remote' capability to protocol v2 Christian Couder
2025-01-30 10:51         ` Patrick Steinhardt
2025-02-18 11:41           ` Christian Couder
2025-01-27 15:17       ` [PATCH v4 5/6] promisor-remote: check advertised name or URL Christian Couder
2025-01-27 23:48         ` Junio C Hamano
2025-01-28  0:01           ` Junio C Hamano
2025-01-30 10:51           ` Patrick Steinhardt
2025-02-18 11:41             ` Christian Couder
2025-02-18 11:42           ` Christian Couder
2025-01-27 15:17       ` [PATCH v4 6/6] doc: add technical design doc for large object promisors Christian Couder
2025-01-27 21:14       ` [PATCH v4 0/6] Introduce a "promisor-remote" capability Junio C Hamano
2025-02-18 11:40         ` Christian Couder
2025-02-18 11:32       ` [PATCH v5 0/3] " Christian Couder
2025-02-18 11:32         ` [PATCH v5 1/3] Add 'promisor-remote' capability to protocol v2 Christian Couder
2025-02-18 11:32         ` [PATCH v5 2/3] promisor-remote: check advertised name or URL Christian Couder
2025-02-18 11:32         ` [PATCH v5 3/3] doc: add technical design doc for large object promisors Christian Couder
2025-02-21  8:33           ` Patrick Steinhardt
2025-03-03 16:58             ` Junio C Hamano
2025-02-18 19:07         ` [PATCH v5 0/3] Introduce a "promisor-remote" capability Junio C Hamano
2025-02-21  8:34         ` Patrick Steinhardt
2025-02-21 18:40           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZvpZy9XMpLI0DQQg@pks.im \
    --to=ps@pks.im \
    --cc=chriscool@tuxfamily.org \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).