Git development
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Matthew John Cheetham <mjcheetham@outlook.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: gitster@pobox.com, ps@pks.im
Subject: Re: [PATCH v3 5/7] fetch: add --negotiation-include option for negotiation
Date: Tue, 12 May 2026 12:54:35 -0400	[thread overview]
Message-ID: <e33ed325-4533-4f83-bcf3-acea2188f208@gmail.com> (raw)
In-Reply-To: <VI0PR03MB1163403743E62E0FDC4AFE52DC0392@VI0PR03MB11634.eurprd03.prod.outlook.com>

On 5/12/26 10:38 AM, Matthew John Cheetham wrote:
> On 2026-04-22 16:25, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <stolee@gmail.com>
>>
>> Add a new --negotiation-include option to 'git fetch', which ensures
>> that certain ref tips are always sent as 'have' lines during fetch
>> negotiation, regardless of what the negotiation algorithm selects.
>>
>> This is useful when the repository has a large number of references, so
>> the normal negotiation algorithm truncates the list. This is especially
>> important in repositories with long parallel commit histories. For
>> example, a repo could have a 'dev' branch for development and a
>> 'release' branch for released versions. If the 'dev' branch isn't
>> selected for negotiation, then it's not a big deal because there are
>> many in-progress development branches with a shared history. However, if
>> 'release' is not selected for negotiation, then the server may think
>> that this is the first time the client has asked for that reference,
>> causing a full download of its parallel commit history (and any extra
>> data that may be unique to that branch). This is based on a real example
>> where certain fetches would grow to 60+ GB when a release branch
>> updated.
>>
>> This option is a complement to --negotiation-restrict, which reduces the
>> negotiation ref set to a specific list. In the earlier example, using
>> --negotiation-restrict to focus the negotiation to 'dev' and 'release'
>> would avoid those problematic downloads, but would still not allow
>> advertising potentially-relevant user brances. In this way, the
>> 'include' version solves the problem I mention while allowing
>> negotiation to pick other references opportunistically. The two options
>> can also be combined to allow the best of both worlds.
> 
> Nice explanation and motivation for the need of such as feature.
> 
> One small typo: s/brances/branches/

Thanks.

>> --- a/Documentation/fetch-options.adoc
>> +++ b/Documentation/fetch-options.adoc
>> @@ -73,6 +73,25 @@ See also the `fetch.negotiationAlgorithm` and `push.negotiate`
>>   configuration variables documented in linkgit:git-config[1], and the
>>   `--negotiate-only` option below.
>> +`--negotiation-include=<revision>`::
>> +    Ensure that the given ref tip is always sent as a "have" line
>> +    during fetch negotiation, regardless of what the negotiation
>> +    algorithm selects.  This is useful to guarantee that common
>> +    history reachable from specific refs is always considered, even
>> +    when `--negotiation-restrict` restricts the set of tips or when
>> +    the negotiation algorithm would otherwise skip them.
>> ++
>> +This option may be specified more than once; if so, each ref is sent
>> +unconditionally.
>> ++
>> +The argument may be an exact ref name (e.g. `refs/heads/release`) or a
>> +glob pattern (e.g. `refs/heads/release/{asterisk}`).  The pattern syntax
>> +is the same as for `--negotiation-restrict`.
>> ++
>> +If `--negotiation-restrict` is used, the have set is first restricted by
>> +that option and then increased to include the tips specified by
>> +`--negotiation-include`.
>> +
> 
> The placeholder `<revision>` and the description in the body of "ref
> name or glob" slightly disagree with each other. The `--negotiation-restrict` 
> docs use `(<commit>|<glob>)` in the syntax definition and
> "a glob on ref names, a ref, or .. SHA-1 of a commit".

Good eye.

> `resolve_negotiation_include()` calls `repo_get_oid()` for non-globs
> so bare OIDs and abbreviated SHAs work too. Perhaps consider aligning the 
> syntaxes, and mention that OIDs work too.

Will do.

>> @@ -1547,10 +1548,14 @@ static void add_negotiation_restrict_tips(struct 
>> git_transport_options *smart_op
>>           int old_nr;
>>           if (!has_glob_specials(s)) {
>>               struct object_id oid;
>> +
>> +            /* Ignore missing reference. */
>>               if (repo_get_oid(the_repository, s, &oid))
>> -                die(_("%s is not a valid object"), s);
>> +                continue;
>> +            /* Fail on missing object pointed by ref. */
>>               if (!odb_has_object(the_repository->objects, &oid, 0))
>>                   die(_("the object %s does not exist"), s);
>> +
>>               oid_array_append(oids, &oid);
>>               continue;
>>           }
> 
> This is the change in behaviour - unresolvable revs were a fatal error
> and are now silently ignored.
> 
> Note that t5510 '--negotiation-tip rejects missing OIDs' still passes
> because it uses an all-zero OID, which parses as a valid hex string,
> and dies on the second check "object does not exist". Using something
> like `--negotiation-tip=notreal` that previously would error will now
> silently be ignored.
> 
> Is it worth another test? (invalid object vs not exists)?

Yes, let's add a test to guarantee this behavior works.

>> @@ -1615,6 +1620,13 @@ static struct transport *prepare_transport(struct 
>> remote *remote, int deepen,
>>               strbuf_release(&config_name);
>>           }
>>       }
>> +    if (negotiation_include.nr) {
>> +        if (transport->smart_options)
>> +            transport->smart_options->negotiation_include = 
>> &negotiation_include;
>> +        else
>> +            warning(_("ignoring %s because the protocol does not support it"),
>> +                "--negotiation-include");
>> +    }
>>       return transport;
>>   }
> 
> There is a difference between the existing `--negotiation-restrict`
> option and the new `--negotiation-include` option. Patch 3's commit
> message says:
> 
>    "The 'tips' part is kept because this is an oid_array in the transport
>    layer. This requires the builtin to handle parsing refs into
>    collections of oids so the transport layer can handle this cleaner
>    form of the data."
> 
> The new option passes the raw `string_list` to the transport layer and
> lets it resolve it instead. If the transport layer now learns how to
> resolve refs to oids, why not for tips/restrict?
> 
> Would it be easier for future readers for these complementary options
> to resolve their inputs at the same layer? Or at least call out why:
> "would prefer raw tips but for back-compat we resolve in the built-in"
> for example.

This is a really key observation. It's a bit of work to unravel, but I
think it's better for unifying these things. Look forward to a better
organization in the next version.

>> +static void resolve_negotiation_include(const struct string_list 
>> *negotiation_include,
>> +                    struct oidset *result)
>> +{
>> +    struct string_list_item *item;
>> +
>> +    if (!negotiation_include || !negotiation_include->nr)
>> +        return;
>> +
>> +    for_each_string_list_item(item, negotiation_include) {
>> +        if (!has_glob_specials(item->string)) {
>> +            struct object_id oid;
>> +
>> +            /* Ignore missing reference. */
>> +            if (repo_get_oid(the_repository, item->string, &oid))
>> +                continue;
>> +
>> +            /* Fail on missing object pointed by ref. */
>> +            if (!odb_has_object(the_repository->objects, &oid, 0))
>> +                die(_("the object %s does not exist"),
>> +                    item->string);
>> +
>> +            oidset_insert(result, &oid);
>> +        } else {
>> +            struct refs_for_each_ref_options opts = {
>> +                .pattern = item->string,
>> +            };
>> +            refs_for_each_ref_ext(
>> +                get_main_ref_store(the_repository),
>> +                add_oid_to_oidset, result, &opts);
>> +        }
>> +    }
>> +}
>> +
> 
> `resolve_negotiation_include()` is basically doing the same as
> `add_negotiation_restrict_tips()` except outputting to an `oidset`
> vs `oid_array`. This is a result of the difference in ref resolution
> layer between `--negotiation-restrict/tip` and `-include`.

Yes, this code will be replaced with a unified approach in the
next version.

>>   static int find_common(struct fetch_negotiator *negotiator,
>>                  struct fetch_pack_args *args,
>>                  int fd[2], struct object_id *result_oid,
>> @@ -347,6 +390,7 @@ static int find_common(struct fetch_negotiator *negotiator,
>>       struct strbuf req_buf = STRBUF_INIT;
>>       size_t state_len = 0;
>>       struct packet_reader reader;
>> +    struct oidset negotiation_include_oids = OIDSET_INIT;
>>       if (args->stateless_rpc && multi_ack == 1)
>>           die(_("the option '%s' requires '%s'"), "--stateless-rpc", 
>> "multi_ack_detailed");
>> @@ -474,6 +518,33 @@ static int find_common(struct fetch_negotiator *negotiator,
>>       trace2_region_enter("fetch-pack", "negotiation_v0_v1", the_repository);
>>       flushes = 0;
>>       retval = -1;
>> +
>> +    /* Send unconditional haves from --negotiation-include */
>> +    resolve_negotiation_include(args->negotiation_include,
>> +                    &negotiation_include_oids);
>> +    if (oidset_size(&negotiation_include_oids)) {
>> +        struct oidset_iter iter;
>> +        oidset_iter_init(&negotiation_include_oids, &iter);
>> +
>> +        while ((oid = oidset_iter_next(&iter))) {
>> +            struct commit *commit;
>> +            packet_buf_write(&req_buf, "have %s\n",
>> +                     oid_to_hex(oid));
>> +            print_verbose(args, "have %s", oid_to_hex(oid));
>> +            count++;
>> +
>> +            /*
>> +             * If this is a commit, then mark as COMMON to
>> +             * avoid the negotiator also outputting it as
>> +             * a have.
>> +             */
>> +            commit = lookup_commit(the_repository, oid);
>> +            if (commit &&
>> +                !repo_parse_commit(the_repository, commit))
>> +                commit->object.flags |= COMMON;
>> +        }
>> +    }
>> +
> 
> I want to make sure I understand the COMMON pre-marking before
> commenting further on this patch. My understanding is there are actually
> two different COMMON bits in the tree, one defined in fetch-pack.c
> (bit 6) and one in negotiator/default.c (bit 2):
> 
> - fetch-pack.c's COMMON (bit 6) is set after a server ACK confirms an
>    OID is common with us and is read to decide when we've established
>    enough common ground to terminate negotiation. This is not consulted
>    in find_common().
> 
> - negotiator/default.c's COMMON (bit 2) is a book-keeping flag used by
>    `get_rev()` to decide if we skip emitting a commit as a 'have'.
> 
> Since we're in fetch-pack.c here, the `commit->object.flags |= COMMON`
> line is setting bit 6. The `get_rev()` call in negotiator/default.c
> never checks bit 6, only bit 2. As far as I can tell, this mark won't
> suppress the negotiator from emitting another 'have' line in the
> protocol v0/v1 paths in `find_common()`.
> 
> The v2 path doesn't touch the flags.. `add_haves` dedups via `oidset_contains()`:
> 
>    while ((oid = negotiator->next(negotiator))) {
>        if (negotiation_include_oids &&
>            oidset_contains(negotiation_include_oids, oid))
>            continue;
>        packet_buf_write(req_buf, "have %s\n", ...);
>    }
> 
> This works, and is what the new 'avoids duplicates with negotiator' test
> runs against, on protocol v2. If we run on protocol v0/v1, and if my
> assessment is correct, then we'd see a duplicate I think?
> 
> Sorry if I've not understood correctly or am missing something, which is
> entirely possible :-)

This is a great catch! It shows that I'm breaking some abstractions here,
and thus it's easy to make such a mistake. It's worse that I don't catch
this problem in the tests that I am adding. I'll add a test that
demonstrates the difference.

But beyond that, I think the biggest issue is that the consumer of an
abstract 'negotiator' is assuming something about its implementation. This
means that I should update the negotiator struct to have a function
pointer dedicated to "I chose to send this 'have'" and then the negotiator
can control how to prevent sending more 'have's reachable from those tips.

Thanks,
-Stolee


  reply	other threads:[~2026-05-12 16:54 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-08 14:36 [PATCH 0/4] fetch: add --must-have and remote.*.mustHave Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 1/4] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 2/4] fetch: add --must-have option for negotiation Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 3/4] remote: add mustHave config as default for --must-have Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 4/4] send-pack: pass --must-have for push negotiation Derrick Stolee via GitGitGadget
2026-04-08 18:59 ` [PATCH 0/4] fetch: add --must-have and remote.*.mustHave Junio C Hamano
2026-04-09 12:53   ` Derrick Stolee
2026-04-15 15:14 ` [PATCH v2 0/7] fetch: rework negotiation tip options Derrick Stolee via GitGitGadget
2026-04-15 15:14   ` [PATCH v2 1/7] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-04-15 15:14   ` [PATCH v2 2/7] fetch: add --negotiation-restrict option Derrick Stolee via GitGitGadget
2026-04-15 21:57     ` Junio C Hamano
2026-04-19 23:00       ` Derrick Stolee
2026-04-20 10:32         ` Junio C Hamano
2026-04-20 11:35           ` Derrick Stolee
2026-04-15 15:14   ` [PATCH v2 3/7] transport: rename negotiation_tips Derrick Stolee via GitGitGadget
2026-04-20  8:11     ` Patrick Steinhardt
2026-04-15 15:14   ` [PATCH v2 4/7] remote: add remote.*.negotiationRestrict config Derrick Stolee via GitGitGadget
2026-04-15 19:16     ` Junio C Hamano
2026-04-15 15:14   ` [PATCH v2 5/7] fetch: add --negotiation-require option for negotiation Derrick Stolee via GitGitGadget
2026-04-15 19:50     ` Junio C Hamano
2026-04-21 18:06       ` Derrick Stolee
2026-04-20  8:11     ` Patrick Steinhardt
2026-04-20 11:41       ` Derrick Stolee
2026-04-15 15:14   ` [PATCH v2 6/7] remote: add negotiationRequire config as default for --negotiation-require Derrick Stolee via GitGitGadget
2026-04-15 15:14   ` [PATCH v2 7/7] send-pack: pass negotiation config in push Derrick Stolee via GitGitGadget
2026-04-22 15:25   ` [PATCH v3 0/7] fetch: rework negotiation tip options Derrick Stolee via GitGitGadget
2026-04-22 15:25     ` [PATCH v3 1/7] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-05-12 10:50       ` Matthew John Cheetham
2026-04-22 15:25     ` [PATCH v3 2/7] fetch: add --negotiation-restrict option Derrick Stolee via GitGitGadget
2026-05-12 11:11       ` Matthew John Cheetham
2026-05-12 14:23         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 3/7] transport: rename negotiation_tips Derrick Stolee via GitGitGadget
2026-05-12 11:30       ` Matthew John Cheetham
2026-05-12 14:33         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 4/7] remote: add remote.*.negotiationRestrict config Derrick Stolee via GitGitGadget
2026-05-12 12:29       ` Matthew John Cheetham
2026-05-12 14:52         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 5/7] fetch: add --negotiation-include option for negotiation Derrick Stolee via GitGitGadget
2026-05-12 14:38       ` Matthew John Cheetham
2026-05-12 16:54         ` Derrick Stolee [this message]
2026-04-22 15:25     ` [PATCH v3 6/7] remote: add remote.*.negotiationInclude config Derrick Stolee via GitGitGadget
2026-05-12 14:54       ` Matthew John Cheetham
2026-05-12 17:55         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 7/7] send-pack: pass negotiation config in push Derrick Stolee via GitGitGadget
2026-05-12 15:14       ` Matthew John Cheetham
2026-05-14 12:41     ` [PATCH v4 0/8] fetch: rework negotiation tip options Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 1/8] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 2/8] fetch: add --negotiation-restrict option Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 3/8] transport: rename negotiation_tips Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 4/8] remote: add remote.*.negotiationRestrict config Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 5/8] negotiator: add have_sent() interface Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 6/8] fetch: add --negotiation-include option for negotiation Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 7/8] remote: add remote.*.negotiationInclude config Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 8/8] send-pack: pass negotiation config in push Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e33ed325-4533-4f83-bcf3-acea2188f208@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=mjcheetham@outlook.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox