From: Junio C Hamano <gitster@pobox.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: Patrick Steinhardt <ps@pks.im>,
git@vger.kernel.org, John Cai <johncai86@gmail.com>,
Taylor Blau <me@ttaylorr.com>,
Eric Sunshine <sunshine@sunshineco.com>,
Michael Haggerty <mhagger@alum.mit.edu>,
"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: [PATCH v2 0/4] Introduce a "promisor-remote" capability
Date: Mon, 30 Sep 2024 09:52:58 -0700 [thread overview]
Message-ID: <xmqqa5fp9j85.fsf@gitster.g> (raw)
In-Reply-To: <CAP8UFD1yK0vwHYh+jVNUEV0mRnbjyDTOrRjuwjOwSCn-fngVzw@mail.gmail.com> (Christian Couder's message of "Mon, 30 Sep 2024 11:17:48 +0200")
Christian Couder <christian.couder@gmail.com> writes:
>> But there are still a couple of pieces missing in the bigger puzzle:
>>
>> - How would a client know to omit certain objects? Right now it only
>> knows that there are promisor remotes, but it doesn't know that it
>> e.g. should omit every blob larger than X megabytes. The answer
>> could of course be that the client should just know to do a partial
>> clone by themselves.
>
> If we add a "filter" field to the "promisor-remote" capability in a
> future patch series, then the server could pass information like a
> filter-spec that the client could use to omit some large blobs.
Yes, but at that point, is the current scheme to mark a promisor
pack with a single bit, the fact that the pack came from a promisor
remote (which one?, and for what filter settings does the remote
used?) becomes insufficient, isn't it? Chipping away one by one is
fine, but we'd at least need to be aware that it is one of the
things we need to upgrade in the scope of the bigger picture.
It may even be OK to upgrade the on-the-wire protocol side before
the code on the both ends learn to take advantage of the feature
(e.g., to add "promisor-remote" capability itself, or to add the
capability that can also convey the associated filter specification
to that remote), but without even the design (let alone the
implementation) of what runs on both ends of the connection to to
make use of what is communicated via the capability, it is rather
hard to get the details of the protocol design right.
As on-the-wire protocol is harder to upgrade due to compatibility
constraints, it smells like it is a better order to do things if it
is left as the _last_ piece to be designed and implemented, if we
were to chip away one-by-one. That may, for example, go like this:
(0) We want to ensure that the projects can specify what kind of
objects are to be offloaded to other transports.
(1) We design the client end first. We may want to be able to
choose what remote to run a lazy fetch against, based on a
filter spec, for example. We realize and make a mental note
that our new "capability" needs to tell the client enough
information to make such a decision.
(2) We design the server end to supply the above pieces of
information to the client end. During this process, we may
realize that some pieces of information cannot be prepared on
the server end and (1) may need to get adjusted.
(3) There may be tons of other things that need to be designed and
implemented before we know what pieces of information our new
"capability" needs to convey, and what these pieces of
information mean by iterating (1) and (2).
(4) Once we nail (3) down, we can add a new protocol capability,
knowing how it should work, and knowing that the client and the
server ends would work well once it is implemented.
>> At GitLab, we're thinking
>> about the ability to use rolling hash functions to chunk such big
>> objects into smaller parts to also allow for somewhat efficient
>> deduplication. We're also thinking about how to make the overall ODB
>> pluggable such that we can eventually make it more scalable in this
>> context. But that's of course thinking into the future quite a bit.
Reminds me of rsync and bup ;-).
Thanks.
next prev parent reply other threads:[~2024-09-30 16:53 UTC|newest]
Thread overview: 110+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-31 13:40 [PATCH 0/4] Introduce a "promisor-remote" capability Christian Couder
2024-07-31 13:40 ` [PATCH 1/4] version: refactor strbuf_sanitize() Christian Couder
2024-07-31 17:18 ` Junio C Hamano
2024-08-20 11:29 ` Christian Couder
2024-07-31 13:40 ` [PATCH 2/4] strbuf: refactor strbuf_trim_trailing_ch() Christian Couder
2024-07-31 17:29 ` Junio C Hamano
2024-07-31 21:49 ` Taylor Blau
2024-08-20 11:29 ` Christian Couder
2024-08-20 11:29 ` Christian Couder
2024-07-31 13:40 ` [PATCH 3/4] Add 'promisor-remote' capability to protocol v2 Christian Couder
2024-07-31 15:40 ` Taylor Blau
2024-08-20 11:32 ` Christian Couder
2024-08-20 17:01 ` Junio C Hamano
2024-09-10 16:32 ` Christian Couder
2024-07-31 16:16 ` Taylor Blau
2024-08-20 11:32 ` Christian Couder
2024-08-20 16:55 ` Junio C Hamano
2024-09-10 16:32 ` Christian Couder
2024-09-10 17:46 ` Junio C Hamano
2024-07-31 18:25 ` Junio C Hamano
2024-07-31 19:34 ` Junio C Hamano
2024-08-20 12:21 ` Christian Couder
2024-08-05 13:48 ` Patrick Steinhardt
2024-08-19 20:00 ` Junio C Hamano
2024-09-10 16:31 ` Christian Couder
2024-07-31 13:40 ` [PATCH 4/4] promisor-remote: check advertised name or URL Christian Couder
2024-07-31 18:35 ` Junio C Hamano
2024-09-10 16:32 ` Christian Couder
2024-07-31 16:01 ` [PATCH 0/4] Introduce a "promisor-remote" capability Junio C Hamano
2024-07-31 16:17 ` Taylor Blau
2024-09-10 16:29 ` [PATCH v2 " Christian Couder
2024-09-10 16:29 ` [PATCH v2 1/4] version: refactor strbuf_sanitize() Christian Couder
2024-09-10 16:29 ` [PATCH v2 2/4] strbuf: refactor strbuf_trim_trailing_ch() Christian Couder
2024-09-10 16:29 ` [PATCH v2 3/4] Add 'promisor-remote' capability to protocol v2 Christian Couder
2024-09-30 7:56 ` Patrick Steinhardt
2024-09-30 13:28 ` Christian Couder
2024-10-01 10:14 ` Patrick Steinhardt
2024-10-01 18:47 ` Junio C Hamano
2024-11-06 14:04 ` Patrick Steinhardt
2024-11-28 5:47 ` Junio C Hamano
2024-11-28 15:31 ` Christian Couder
2024-11-29 1:31 ` Junio C Hamano
2024-09-10 16:30 ` [PATCH v2 4/4] promisor-remote: check advertised name or URL Christian Couder
2024-09-30 7:57 ` Patrick Steinhardt
2024-09-26 18:09 ` [PATCH v2 0/4] Introduce a "promisor-remote" capability Junio C Hamano
2024-09-27 9:15 ` Christian Couder
2024-09-27 22:48 ` Junio C Hamano
2024-09-27 23:31 ` rsbecker
2024-09-28 10:56 ` Kristoffer Haugsbakk
2024-09-30 7:57 ` Patrick Steinhardt
2024-09-30 9:17 ` Christian Couder
2024-09-30 16:52 ` Junio C Hamano [this message]
2024-10-01 10:14 ` Patrick Steinhardt
2024-09-30 16:34 ` Junio C Hamano
2024-09-30 21:26 ` brian m. carlson
2024-09-30 22:27 ` Junio C Hamano
2024-10-01 10:13 ` Patrick Steinhardt
2024-12-06 12:42 ` [PATCH v3 0/5] " Christian Couder
2024-12-06 12:42 ` [PATCH v3 1/5] version: refactor strbuf_sanitize() Christian Couder
2024-12-07 6:21 ` Junio C Hamano
2025-01-27 15:07 ` Christian Couder
2024-12-06 12:42 ` [PATCH v3 2/5] strbuf: refactor strbuf_trim_trailing_ch() Christian Couder
2024-12-07 6:35 ` Junio C Hamano
2025-01-27 15:07 ` Christian Couder
2024-12-16 11:47 ` karthik nayak
2024-12-06 12:42 ` [PATCH v3 3/5] Add 'promisor-remote' capability to protocol v2 Christian Couder
2024-12-07 7:59 ` Junio C Hamano
2025-01-27 15:08 ` Christian Couder
2024-12-06 12:42 ` [PATCH v3 4/5] promisor-remote: check advertised name or URL Christian Couder
2024-12-06 12:42 ` [PATCH v3 5/5] doc: add technical design doc for large object promisors Christian Couder
2024-12-10 1:28 ` Junio C Hamano
2025-01-27 15:12 ` Christian Couder
2024-12-10 11:43 ` Junio C Hamano
2024-12-16 9:00 ` Patrick Steinhardt
2025-01-27 15:11 ` Christian Couder
2025-01-27 18:02 ` Junio C Hamano
2025-02-18 11:42 ` Christian Couder
2024-12-09 8:04 ` [PATCH v3 0/5] Introduce a "promisor-remote" capability Junio C Hamano
2024-12-09 10:40 ` Christian Couder
2024-12-09 10:42 ` Christian Couder
2024-12-09 23:01 ` Junio C Hamano
2025-01-27 15:05 ` Christian Couder
2025-01-27 19:38 ` Junio C Hamano
2025-01-27 15:16 ` [PATCH v4 0/6] " Christian Couder
2025-01-27 15:16 ` [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity Christian Couder
2025-01-27 15:16 ` [PATCH v4 2/6] version: refactor redact_non_printables() Christian Couder
2025-01-27 15:16 ` [PATCH v4 3/6] version: make redact_non_printables() non-static Christian Couder
2025-01-30 10:51 ` Patrick Steinhardt
2025-02-18 11:42 ` Christian Couder
2025-01-27 15:16 ` [PATCH v4 4/6] Add 'promisor-remote' capability to protocol v2 Christian Couder
2025-01-30 10:51 ` Patrick Steinhardt
2025-02-18 11:41 ` Christian Couder
2025-01-27 15:17 ` [PATCH v4 5/6] promisor-remote: check advertised name or URL Christian Couder
2025-01-27 23:48 ` Junio C Hamano
2025-01-28 0:01 ` Junio C Hamano
2025-01-30 10:51 ` Patrick Steinhardt
2025-02-18 11:41 ` Christian Couder
2025-02-18 11:42 ` Christian Couder
2025-01-27 15:17 ` [PATCH v4 6/6] doc: add technical design doc for large object promisors Christian Couder
2025-01-27 21:14 ` [PATCH v4 0/6] Introduce a "promisor-remote" capability Junio C Hamano
2025-02-18 11:40 ` Christian Couder
2025-02-18 11:32 ` [PATCH v5 0/3] " Christian Couder
2025-02-18 11:32 ` [PATCH v5 1/3] Add 'promisor-remote' capability to protocol v2 Christian Couder
2025-02-18 11:32 ` [PATCH v5 2/3] promisor-remote: check advertised name or URL Christian Couder
2025-02-18 11:32 ` [PATCH v5 3/3] doc: add technical design doc for large object promisors Christian Couder
2025-02-21 8:33 ` Patrick Steinhardt
2025-03-03 16:58 ` Junio C Hamano
2025-02-18 19:07 ` [PATCH v5 0/3] Introduce a "promisor-remote" capability Junio C Hamano
2025-02-21 8:34 ` Patrick Steinhardt
2025-02-21 18:40 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqa5fp9j85.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=johncai86@gmail.com \
--cc=me@ttaylorr.com \
--cc=mhagger@alum.mit.edu \
--cc=ps@pks.im \
--cc=sandals@crustytoothpaste.net \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).