From: Derrick Stolee <stolee@gmail.com>
To: Matthew John Cheetham <mjcheetham@outlook.com>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: gitster@pobox.com, ps@pks.im
Subject: Re: [PATCH v3 5/7] fetch: add --negotiation-include option for negotiation
Date: Tue, 12 May 2026 12:54:35 -0400
Message-ID: <e33ed325-4533-4f83-bcf3-acea2188f208@gmail.com>
In-Reply-To: <VI0PR03MB1163403743E62E0FDC4AFE52DC0392@VI0PR03MB11634.eurprd03.prod.outlook.com>
On 5/12/26 10:38 AM, Matthew John Cheetham wrote:
> On 2026-04-22 16:25, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <stolee@gmail.com>
>>
>> Add a new --negotiation-include option to 'git fetch', which ensures
>> that certain ref tips are always sent as 'have' lines during fetch
>> negotiation, regardless of what the negotiation algorithm selects.
>>
>> This is useful when the repository has a large number of references, so
>> the normal negotiation algorithm truncates the list. This is especially
>> important in repositories with long parallel commit histories. For
>> example, a repo could have a 'dev' branch for development and a
>> 'release' branch for released versions. If the 'dev' branch isn't
>> selected for negotiation, then it's not a big deal because there are
>> many in-progress development branches with a shared history. However, if
>> 'release' is not selected for negotiation, then the server may think
>> that this is the first time the client has asked for that reference,
>> causing a full download of its parallel commit history (and any extra
>> data that may be unique to that branch). This is based on a real example
>> where certain fetches would grow to 60+ GB when a release branch
>> updated.
>>
>> This option is a complement to --negotiation-restrict, which reduces the
>> negotiation ref set to a specific list. In the earlier example, using
>> --negotiation-restrict to focus the negotiation to 'dev' and 'release'
>> would avoid those problematic downloads, but would still not allow
>> advertising potentially-relevant user brances. In this way, the
>> 'include' version solves the problem I mention while allowing
>> negotiation to pick other references opportunistically. The two options
>> can also be combined to allow the best of both worlds.
>
> Nice explanation and motivation for the need of such a feature.
>
> One small typo: s/brances/branches/
Thanks.
>> --- a/Documentation/fetch-options.adoc
>> +++ b/Documentation/fetch-options.adoc
>> @@ -73,6 +73,25 @@ See also the `fetch.negotiationAlgorithm` and `push.negotiate`
>> configuration variables documented in linkgit:git-config[1], and the
>> `--negotiate-only` option below.
>> +`--negotiation-include=<revision>`::
>> + Ensure that the given ref tip is always sent as a "have" line
>> + during fetch negotiation, regardless of what the negotiation
>> + algorithm selects. This is useful to guarantee that common
>> + history reachable from specific refs is always considered, even
>> + when `--negotiation-restrict` restricts the set of tips or when
>> + the negotiation algorithm would otherwise skip them.
>> ++
>> +This option may be specified more than once; if so, each ref is sent
>> +unconditionally.
>> ++
>> +The argument may be an exact ref name (e.g. `refs/heads/release`) or a
>> +glob pattern (e.g. `refs/heads/release/{asterisk}`). The pattern syntax
>> +is the same as for `--negotiation-restrict`.
>> ++
>> +If `--negotiation-restrict` is used, the have set is first restricted by
>> +that option and then increased to include the tips specified by
>> +`--negotiation-include`.
>> +
>
> The placeholder `<revision>` and the description in the body of "ref
> name or glob" slightly disagree with each other. The `--negotiation-restrict`
> docs use `(<commit>|<glob>)` in the syntax definition and
> "a glob on ref names, a ref, or .. SHA-1 of a commit".
Good eye.
> `resolve_negotiation_include()` calls `repo_get_oid()` for non-globs
> so bare OIDs and abbreviated SHAs work too. Perhaps consider aligning the
> syntaxes, and mention that OIDs work too.
Will do.
>> @@ -1547,10 +1548,14 @@ static void add_negotiation_restrict_tips(struct
>> git_transport_options *smart_op
>> int old_nr;
>> if (!has_glob_specials(s)) {
>> struct object_id oid;
>> +
>> + /* Ignore missing reference. */
>> if (repo_get_oid(the_repository, s, &oid))
>> - die(_("%s is not a valid object"), s);
>> + continue;
>> + /* Fail on missing object pointed by ref. */
>> if (!odb_has_object(the_repository->objects, &oid, 0))
>> die(_("the object %s does not exist"), s);
>> +
>> oid_array_append(oids, &oid);
>> continue;
>> }
>
> This is the change in behaviour - unresolvable revs were a fatal error
> and are now silently ignored.
>
> Note that t5510 '--negotiation-tip rejects missing OIDs' still passes
> because it uses an all-zero OID, which parses as a valid hex string,
> and dies on the second check "object does not exist". Using something
> like `--negotiation-tip=notreal` that previously would error will now
> silently be ignored.
>
> Is it worth another test? (invalid object vs not exists)?
Yes, let's add a test to guarantee this behavior works.
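
For reference, the two failure modes such a test needs to distinguish can be
modeled with a small standalone sketch. This is not Git code: `toy_ref`,
`classify_tip()` and the `tip_outcome` values are hypothetical names, and the
table lookup only stands in for `repo_get_oid()` followed by
`odb_has_object()`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical outcomes for one --negotiation-* argument. */
enum tip_outcome {
	TIP_IGNORED, /* name does not resolve: silently skipped */
	TIP_FATAL,   /* name resolves, but the object is missing: die() */
	TIP_ADDED    /* name resolves and the object exists */
};

/* Toy stand-in for a name that resolves to an object ID. */
struct toy_ref {
	const char *name;
	int object_present; /* nonzero if the object is in the object store */
};

/*
 * Mirrors the order of the two checks in the quoted hunk:
 * first "does the name resolve?", then "does the object exist?".
 */
static enum tip_outcome classify_tip(const struct toy_ref *refs, size_t nr,
				     const char *name)
{
	assert(refs && name);
	for (size_t i = 0; i < nr; i++) {
		if (strcmp(refs[i].name, name))
			continue;
		return refs[i].object_present ? TIP_ADDED : TIP_FATAL;
	}
	return TIP_IGNORED;
}
```

In this model, `notreal` falls through to `TIP_IGNORED`, while an all-zero OID
resolves as a name and then fails the existence check, which is why the
existing t5510 test still passes.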
>> @@ -1615,6 +1620,13 @@ static struct transport *prepare_transport(struct
>> remote *remote, int deepen,
>> strbuf_release(&config_name);
>> }
>> }
>> + if (negotiation_include.nr) {
>> + if (transport->smart_options)
>> + transport->smart_options->negotiation_include =
>> &negotiation_include;
>> + else
>> + warning(_("ignoring %s because the protocol does not support it"),
>> + "--negotiation-include");
>> + }
>> return transport;
>> }
>
> There is a difference between the existing `--negotiation-restrict`
> option and the new `--negotiation-include` option. Patch 3's commit
> message says:
>
> "The 'tips' part is kept because this is an oid_array in the transport
> layer. This requires the builtin to handle parsing refs into
> collections of oids so the transport layer can handle this cleaner
> form of the data."
>
> The new option passes the raw `string_list` to the transport layer and
> lets it resolve it instead. If the transport layer now learns how to
> resolve refs to oids, why not for tips/restrict?
>
> Would it be easier for future readers for these complementary options
> to resolve their inputs at the same layer? Or at least call out why:
> "would prefer raw tips but for back-compat we resolve in the built-in"
> for example.
This is a really key observation. It's a bit of work to unravel, but I
think it's better for unifying these things. Look forward to a better
organization in the next version.
>> +static void resolve_negotiation_include(const struct string_list
>> *negotiation_include,
>> + struct oidset *result)
>> +{
>> + struct string_list_item *item;
>> +
>> + if (!negotiation_include || !negotiation_include->nr)
>> + return;
>> +
>> + for_each_string_list_item(item, negotiation_include) {
>> + if (!has_glob_specials(item->string)) {
>> + struct object_id oid;
>> +
>> + /* Ignore missing reference. */
>> + if (repo_get_oid(the_repository, item->string, &oid))
>> + continue;
>> +
>> + /* Fail on missing object pointed by ref. */
>> + if (!odb_has_object(the_repository->objects, &oid, 0))
>> + die(_("the object %s does not exist"),
>> + item->string);
>> +
>> + oidset_insert(result, &oid);
>> + } else {
>> + struct refs_for_each_ref_options opts = {
>> + .pattern = item->string,
>> + };
>> + refs_for_each_ref_ext(
>> + get_main_ref_store(the_repository),
>> + add_oid_to_oidset, result, &opts);
>> + }
>> + }
>> +}
>> +
>
> `resolve_negotiation_include()` is basically doing the same as
> `add_negotiation_restrict_tips()` except outputting to an `oidset`
> vs `oid_array`. This is a result of the difference in ref resolution
> layer between `--negotiation-restrict/tip` and `-include`.
Yes, this code will be replaced with a unified approach in the
next version.
>> static int find_common(struct fetch_negotiator *negotiator,
>> struct fetch_pack_args *args,
>> int fd[2], struct object_id *result_oid,
>> @@ -347,6 +390,7 @@ static int find_common(struct fetch_negotiator *negotiator,
>> struct strbuf req_buf = STRBUF_INIT;
>> size_t state_len = 0;
>> struct packet_reader reader;
>> + struct oidset negotiation_include_oids = OIDSET_INIT;
>> if (args->stateless_rpc && multi_ack == 1)
>> die(_("the option '%s' requires '%s'"), "--stateless-rpc",
>> "multi_ack_detailed");
>> @@ -474,6 +518,33 @@ static int find_common(struct fetch_negotiator *negotiator,
>> trace2_region_enter("fetch-pack", "negotiation_v0_v1", the_repository);
>> flushes = 0;
>> retval = -1;
>> +
>> + /* Send unconditional haves from --negotiation-include */
>> + resolve_negotiation_include(args->negotiation_include,
>> + &negotiation_include_oids);
>> + if (oidset_size(&negotiation_include_oids)) {
>> + struct oidset_iter iter;
>> + oidset_iter_init(&negotiation_include_oids, &iter);
>> +
>> + while ((oid = oidset_iter_next(&iter))) {
>> + struct commit *commit;
>> + packet_buf_write(&req_buf, "have %s\n",
>> + oid_to_hex(oid));
>> + print_verbose(args, "have %s", oid_to_hex(oid));
>> + count++;
>> +
>> + /*
>> + * If this is a commit, then mark as COMMON to
>> + * avoid the negotiator also outputting it as
>> + * a have.
>> + */
>> + commit = lookup_commit(the_repository, oid);
>> + if (commit &&
>> + !repo_parse_commit(the_repository, commit))
>> + commit->object.flags |= COMMON;
>> + }
>> + }
>> +
>
> I want to make sure I understand the COMMON pre-marking before
> commenting further on this patch. My understanding is there are actually
> two different COMMON bits in the tree, one defined in fetch-pack.c
> (bit 6) and one in negotiator/default.c (bit 2):
>
> - fetch-pack.c's COMMON (bit 6) is set after a server ACK confirms an
> OID is common with us and is read to decide when we've established
> enough common ground to terminate negotiation. This is not consulted
> in find_common().
>
> - negotiator/default.c's COMMON (bit 2) is a book-keeping flag used by
> `get_rev()` to decide if we skip emitting a commit as a 'have'.
>
> Since we're in fetch-pack.c here, the `commit->object.flags |= COMMON`
> line is setting bit 6. The `get_rev()` call in negotiator/default.c
> never checks bit 6, only bit 2. As far as I can tell, this mark won't
> suppress the negotiator from emitting another 'have' line in the
> protocol v0/v1 paths in `find_common()`.
>
> The v2 path doesn't touch the flags; `add_haves` dedups via `oidset_contains()`:
>
> while ((oid = negotiator->next(negotiator))) {
> if (negotiation_include_oids &&
> oidset_contains(negotiation_include_oids, oid))
> continue;
> packet_buf_write(req_buf, "have %s\n", ...);
> }
>
> This works, and is what the new 'avoids duplicates with negotiator' test
> runs against, on protocol v2. If we run on protocol v0/v1, and if my
> assessment is correct, then we'd see a duplicate I think?
>
> Sorry if I've not understood correctly or am missing something, which is
> entirely possible :-)
This is a great catch! It shows that I'm breaking some abstractions here,
and thus it's easy to make such a mistake. It's worse that I don't catch
this problem in the tests that I am adding. I'll add a test that
demonstrates the difference.
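
To make the mismatch concrete, here is a minimal standalone sketch, not Git's
actual code. The bit positions follow the review above; `toy_commit`,
`premark_in_fetch_pack()` and `negotiator_would_emit()` are hypothetical names
used only for illustration.

```c
#include <assert.h>

/* Two files each define their own "COMMON" flag on the same flags word. */
#define FETCH_PACK_COMMON (1u << 6) /* bit 6, per the review of fetch-pack.c */
#define NEGOTIATOR_COMMON (1u << 2) /* bit 2, per the review of negotiator/default.c */

/* Toy stand-in for the flags stored on a commit object. */
struct toy_commit {
	unsigned flags;
};

/* What the v3 hunk does in find_common(): it sets fetch-pack's bit. */
static void premark_in_fetch_pack(struct toy_commit *c)
{
	assert(c);
	c->flags |= FETCH_PACK_COMMON;
}

/* What the negotiator consults: only its own bit suppresses a 'have'. */
static int negotiator_would_emit(const struct toy_commit *c)
{
	assert(c);
	return !(c->flags & NEGOTIATOR_COMMON);
}
```

After `premark_in_fetch_pack()`, `negotiator_would_emit()` still returns
nonzero, which corresponds to the duplicate 'have' suspected on the v0/v1 path.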
But beyond that, I think the biggest issue is that the consumer of an
abstract 'negotiator' is assuming something about its implementation. This
means that I should update the negotiator struct to have a function
pointer dedicated to "I chose to send this 'have'" and then the negotiator
can control how to prevent sending more 'have's reachable from those tips.
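
A rough sketch of the shape I have in mind, as a standalone toy rather than a
patch. Everything here is an assumption for illustration: the `toy_negotiator`
struct, the `have_sent` member name, and the use of plain integers in place of
object IDs. A real negotiator would presumably also stop walking history
reachable from the reported tip, which this toy does not model.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

/* Toy negotiator: hands out ids from a queue, skipping reported ones. */
struct toy_negotiator {
	int queue[TOY_MAX];
	size_t nr, pos;
	int sent[TOY_MAX];
	size_t sent_nr;
	/* Caller reports: "I chose to send this 'have'". */
	void (*have_sent)(struct toy_negotiator *n, int id);
	/* Returns the next id to send, or -1 when exhausted. */
	int (*next)(struct toy_negotiator *n);
};

static int toy_was_sent(const struct toy_negotiator *n, int id)
{
	for (size_t i = 0; i < n->sent_nr; i++)
		if (n->sent[i] == id)
			return 1;
	return 0;
}

/* The negotiator, not its consumer, decides how to remember the id. */
static void toy_have_sent(struct toy_negotiator *n, int id)
{
	assert(n);
	if (n->sent_nr < TOY_MAX && !toy_was_sent(n, id))
		n->sent[n->sent_nr++] = id;
}

static int toy_next(struct toy_negotiator *n)
{
	assert(n);
	while (n->pos < n->nr) {
		int id = n->queue[n->pos++];
		if (!toy_was_sent(n, id))
			return id;
	}
	return -1;
}

static void toy_negotiator_init(struct toy_negotiator *n,
				const int *ids, size_t nr)
{
	assert(n && ids && nr <= TOY_MAX);
	n->nr = nr;
	n->pos = 0;
	n->sent_nr = 0;
	for (size_t i = 0; i < nr; i++)
		n->queue[i] = ids[i];
	n->have_sent = toy_have_sent;
	n->next = toy_next;
}
```

With this shape, the caller never touches object flags; it only calls
`have_sent()` for each unconditional 'have' and then loops over `next()`.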
Thanks,
-Stolee