All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Matthew John Cheetham <mjcheetham@outlook.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: gitster@pobox.com, ps@pks.im
Subject: Re: [PATCH v3 5/7] fetch: add --negotiation-include option for negotiation
Date: Tue, 12 May 2026 12:54:35 -0400	[thread overview]
Message-ID: <e33ed325-4533-4f83-bcf3-acea2188f208@gmail.com> (raw)
In-Reply-To: <VI0PR03MB1163403743E62E0FDC4AFE52DC0392@VI0PR03MB11634.eurprd03.prod.outlook.com>

On 5/12/26 10:38 AM, Matthew John Cheetham wrote:
> On 2026-04-22 16:25, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <stolee@gmail.com>
>>
>> Add a new --negotiation-include option to 'git fetch', which ensures
>> that certain ref tips are always sent as 'have' lines during fetch
>> negotiation, regardless of what the negotiation algorithm selects.
>>
>> This is useful when the repository has a large number of references, so
>> the normal negotiation algorithm truncates the list. This is especially
>> important in repositories with long parallel commit histories. For
>> example, a repo could have a 'dev' branch for development and a
>> 'release' branch for released versions. If the 'dev' branch isn't
>> selected for negotiation, then it's not a big deal because there are
>> many in-progress development branches with a shared history. However, if
>> 'release' is not selected for negotiation, then the server may think
>> that this is the first time the client has asked for that reference,
>> causing a full download of its parallel commit history (and any extra
>> data that may be unique to that branch). This is based on a real example
>> where certain fetches would grow to 60+ GB when a release branch
>> updated.
>>
>> This option is a complement to --negotiation-restrict, which reduces the
>> negotiation ref set to a specific list. In the earlier example, using
>> --negotiation-restrict to focus the negotiation to 'dev' and 'release'
>> would avoid those problematic downloads, but would still not allow
>> advertising potentially-relevant user brances. In this way, the
>> 'include' version solves the problem I mention while allowing
>> negotiation to pick other references opportunistically. The two options
>> can also be combined to allow the best of both worlds.
> 
> Nice explanation and motivation for the need of such as feature.
> 
> One small typo: s/brances/branches/

Thanks.

>> --- a/Documentation/fetch-options.adoc
>> +++ b/Documentation/fetch-options.adoc
>> @@ -73,6 +73,25 @@ See also the `fetch.negotiationAlgorithm` and `push.negotiate`
>>   configuration variables documented in linkgit:git-config[1], and the
>>   `--negotiate-only` option below.
>> +`--negotiation-include=<revision>`::
>> +    Ensure that the given ref tip is always sent as a "have" line
>> +    during fetch negotiation, regardless of what the negotiation
>> +    algorithm selects.  This is useful to guarantee that common
>> +    history reachable from specific refs is always considered, even
>> +    when `--negotiation-restrict` restricts the set of tips or when
>> +    the negotiation algorithm would otherwise skip them.
>> ++
>> +This option may be specified more than once; if so, each ref is sent
>> +unconditionally.
>> ++
>> +The argument may be an exact ref name (e.g. `refs/heads/release`) or a
>> +glob pattern (e.g. `refs/heads/release/{asterisk}`).  The pattern syntax
>> +is the same as for `--negotiation-restrict`.
>> ++
>> +If `--negotiation-restrict` is used, the have set is first restricted by
>> +that option and then increased to include the tips specified by
>> +`--negotiation-include`.
>> +
> 
> The placeholder `<revision>` and the description in the body of "ref
> name or glob" slightly disagree with each other. The `--negotiation-restrict` 
> docs use `(<commit>|<glob>)` in the syntax definition and
> "a glob on ref names, a ref, or .. SHA-1 of a commit".

Good eye.

> `resolve_negotiation_include()` calls `repo_get_oid()` for non-globs
> so bare OIDs and abbreviated SHAs work too. Perhaps consider aligning the 
> syntaxes, and mention that OIDs work too.

Will do.

>> @@ -1547,10 +1548,14 @@ static void add_negotiation_restrict_tips(struct 
>> git_transport_options *smart_op
>>           int old_nr;
>>           if (!has_glob_specials(s)) {
>>               struct object_id oid;
>> +
>> +            /* Ignore missing reference. */
>>               if (repo_get_oid(the_repository, s, &oid))
>> -                die(_("%s is not a valid object"), s);
>> +                continue;
>> +            /* Fail on missing object pointed by ref. */
>>               if (!odb_has_object(the_repository->objects, &oid, 0))
>>                   die(_("the object %s does not exist"), s);
>> +
>>               oid_array_append(oids, &oid);
>>               continue;
>>           }
> 
> This is the change in behaviour - unresolvable revs were a fatal error
> and are now silently ignored.
> 
> Note that t5510 '--negotiation-tip rejects missing OIDs' still passes
> because it uses an all-zero OID, which parses as a valid hex string,
> and dies on the second check "object does not exist". Using something
> like `--negotiation-tip=notreal` that previously would error will now
> silently be ignored.
> 
> Is it worth another test? (invalid object vs not exists)?

Yes, let's add a test to guarantee this behavior works.

>> @@ -1615,6 +1620,13 @@ static struct transport *prepare_transport(struct 
>> remote *remote, int deepen,
>>               strbuf_release(&config_name);
>>           }
>>       }
>> +    if (negotiation_include.nr) {
>> +        if (transport->smart_options)
>> +            transport->smart_options->negotiation_include = 
>> &negotiation_include;
>> +        else
>> +            warning(_("ignoring %s because the protocol does not support it"),
>> +                "--negotiation-include");
>> +    }
>>       return transport;
>>   }
> 
> There is a difference between the existing `--negotiation-restrict`
> option and the new `--negotiation-include` option. Patch 3's commit
> message says:
> 
>    "The 'tips' part is kept because this is an oid_array in the transport
>    layer. This requires the builtin to handle parsing refs into
>    collections of oids so the transport layer can handle this cleaner
>    form of the data."
> 
> The new option passes the raw `string_list` to the transport layer and
> lets it resolve it instead. If the transport layer now learns how to
> resolve refs to oids, why not for tips/restrict?
> 
> Would it be easier for future readers for these complementary options
> to resolve their inputs at the same layer? Or at least call out why:
> "would prefer raw tips but for back-compat we resolve in the built-in"
> for example.

This is a really key observation. It's a bit of work to unravel, but I
think it's better for unifying these things. Look forward to a better
organization in the next version.

>> +static void resolve_negotiation_include(const struct string_list 
>> *negotiation_include,
>> +                    struct oidset *result)
>> +{
>> +    struct string_list_item *item;
>> +
>> +    if (!negotiation_include || !negotiation_include->nr)
>> +        return;
>> +
>> +    for_each_string_list_item(item, negotiation_include) {
>> +        if (!has_glob_specials(item->string)) {
>> +            struct object_id oid;
>> +
>> +            /* Ignore missing reference. */
>> +            if (repo_get_oid(the_repository, item->string, &oid))
>> +                continue;
>> +
>> +            /* Fail on missing object pointed by ref. */
>> +            if (!odb_has_object(the_repository->objects, &oid, 0))
>> +                die(_("the object %s does not exist"),
>> +                    item->string);
>> +
>> +            oidset_insert(result, &oid);
>> +        } else {
>> +            struct refs_for_each_ref_options opts = {
>> +                .pattern = item->string,
>> +            };
>> +            refs_for_each_ref_ext(
>> +                get_main_ref_store(the_repository),
>> +                add_oid_to_oidset, result, &opts);
>> +        }
>> +    }
>> +}
>> +
> 
> `resolve_negotiation_include()` is basically doing the same as
> `add_negotiation_restrict_tips()` except outputting to an `oidset`
> vs `oid_array`. This is a result of the difference in ref resolution
> layer between `--negotiation-restrict/tip` and `-include`.

Yes, this code will be replaced with a unified approach in the
next version.

>>   static int find_common(struct fetch_negotiator *negotiator,
>>                  struct fetch_pack_args *args,
>>                  int fd[2], struct object_id *result_oid,
>> @@ -347,6 +390,7 @@ static int find_common(struct fetch_negotiator *negotiator,
>>       struct strbuf req_buf = STRBUF_INIT;
>>       size_t state_len = 0;
>>       struct packet_reader reader;
>> +    struct oidset negotiation_include_oids = OIDSET_INIT;
>>       if (args->stateless_rpc && multi_ack == 1)
>>           die(_("the option '%s' requires '%s'"), "--stateless-rpc", 
>> "multi_ack_detailed");
>> @@ -474,6 +518,33 @@ static int find_common(struct fetch_negotiator *negotiator,
>>       trace2_region_enter("fetch-pack", "negotiation_v0_v1", the_repository);
>>       flushes = 0;
>>       retval = -1;
>> +
>> +    /* Send unconditional haves from --negotiation-include */
>> +    resolve_negotiation_include(args->negotiation_include,
>> +                    &negotiation_include_oids);
>> +    if (oidset_size(&negotiation_include_oids)) {
>> +        struct oidset_iter iter;
>> +        oidset_iter_init(&negotiation_include_oids, &iter);
>> +
>> +        while ((oid = oidset_iter_next(&iter))) {
>> +            struct commit *commit;
>> +            packet_buf_write(&req_buf, "have %s\n",
>> +                     oid_to_hex(oid));
>> +            print_verbose(args, "have %s", oid_to_hex(oid));
>> +            count++;
>> +
>> +            /*
>> +             * If this is a commit, then mark as COMMON to
>> +             * avoid the negotiator also outputting it as
>> +             * a have.
>> +             */
>> +            commit = lookup_commit(the_repository, oid);
>> +            if (commit &&
>> +                !repo_parse_commit(the_repository, commit))
>> +                commit->object.flags |= COMMON;
>> +        }
>> +    }
>> +
> 
> I want to make sure I understand the COMMON pre-marking before
> commenting further on this patch. My understanding is there are actually
> two different COMMON bits in the tree, one defined in fetch-pack.c
> (bit 6) and one in negotiator/default.c (bit 2):
> 
> - fetch-pack.c's COMMON (bit 6) is set after a server ACK confirms an
>    OID is common with us and is read to decide when we've established
>    enough common ground to terminate negotiation. This is not consulted
>    in find_common().
> 
> - negotiator/default.c's COMMON (bit 2) is a book-keeping flag used by
>    `get_rev()` to decide if we skip emitting a commit as a 'have'.
> 
> Since we're in fetch-pack.c here, the `commit->object.flags |= COMMON`
> line is setting bit 6. The `get_rev()` call in negotiator/default.c
> never checks bit 6, only bit 2. As far as I can tell, this mark won't
> suppress the negotiator from emitting another 'have' line in the
> protocol v0/v1 paths in `find_common()`.
> 
> The v2 path doesn't touch the flags.. `add_haves` dedups via `oidset_contains()`:
> 
>    while ((oid = negotiator->next(negotiator))) {
>        if (negotiation_include_oids &&
>            oidset_contains(negotiation_include_oids, oid))
>            continue;
>        packet_buf_write(req_buf, "have %s\n", ...);
>    }
> 
> This works, and is what the new 'avoids duplicates with negotiator' test
> runs against, on protocol v2. If we run on protocol v0/v1, and if my
> assessment is correct, then we'd see a duplicate I think?
> 
> Sorry if I've not understood correctly or am missing something, which is
> entirely possible :-)

This is a great catch! It shows that I'm breaking some abstractions here,
and thus it's easy to make such a mistake. It's worse that I don't catch
this problem in the tests that I am adding. I'll add a test that
demonstrates the difference.

But beyond that, I think the biggest issue is that the consumer of an
abstract 'negotiator' is assuming something about its implementation. This
means that I should update the negotiator struct to have a function
pointer dedicated to "I chose to send this 'have'" and then the negotiator
can control how to prevent sending more 'have's reachable from those tips.

Thanks,
-Stolee


  reply	other threads:[~2026-05-12 16:54 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-08 14:36 [PATCH 0/4] fetch: add --must-have and remote.*.mustHave Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 1/4] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 2/4] fetch: add --must-have option for negotiation Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 3/4] remote: add mustHave config as default for --must-have Derrick Stolee via GitGitGadget
2026-04-08 14:36 ` [PATCH 4/4] send-pack: pass --must-have for push negotiation Derrick Stolee via GitGitGadget
2026-04-08 18:59 ` [PATCH 0/4] fetch: add --must-have and remote.*.mustHave Junio C Hamano
2026-04-09 12:53   ` Derrick Stolee
2026-04-15 15:14 ` [PATCH v2 0/7] fetch: rework negotiation tip options Derrick Stolee via GitGitGadget
2026-04-15 15:14   ` [PATCH v2 1/7] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-04-15 15:14   ` [PATCH v2 2/7] fetch: add --negotiation-restrict option Derrick Stolee via GitGitGadget
2026-04-15 21:57     ` Junio C Hamano
2026-04-19 23:00       ` Derrick Stolee
2026-04-20 10:32         ` Junio C Hamano
2026-04-20 11:35           ` Derrick Stolee
2026-04-15 15:14   ` [PATCH v2 3/7] transport: rename negotiation_tips Derrick Stolee via GitGitGadget
2026-04-20  8:11     ` Patrick Steinhardt
2026-04-15 15:14   ` [PATCH v2 4/7] remote: add remote.*.negotiationRestrict config Derrick Stolee via GitGitGadget
2026-04-15 19:16     ` Junio C Hamano
2026-04-15 15:14   ` [PATCH v2 5/7] fetch: add --negotiation-require option for negotiation Derrick Stolee via GitGitGadget
2026-04-15 19:50     ` Junio C Hamano
2026-04-21 18:06       ` Derrick Stolee
2026-04-20  8:11     ` Patrick Steinhardt
2026-04-20 11:41       ` Derrick Stolee
2026-04-15 15:14   ` [PATCH v2 6/7] remote: add negotiationRequire config as default for --negotiation-require Derrick Stolee via GitGitGadget
2026-04-15 15:14   ` [PATCH v2 7/7] send-pack: pass negotiation config in push Derrick Stolee via GitGitGadget
2026-04-22 15:25   ` [PATCH v3 0/7] fetch: rework negotiation tip options Derrick Stolee via GitGitGadget
2026-04-22 15:25     ` [PATCH v3 1/7] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-05-12 10:50       ` Matthew John Cheetham
2026-04-22 15:25     ` [PATCH v3 2/7] fetch: add --negotiation-restrict option Derrick Stolee via GitGitGadget
2026-05-12 11:11       ` Matthew John Cheetham
2026-05-12 14:23         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 3/7] transport: rename negotiation_tips Derrick Stolee via GitGitGadget
2026-05-12 11:30       ` Matthew John Cheetham
2026-05-12 14:33         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 4/7] remote: add remote.*.negotiationRestrict config Derrick Stolee via GitGitGadget
2026-05-12 12:29       ` Matthew John Cheetham
2026-05-12 14:52         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 5/7] fetch: add --negotiation-include option for negotiation Derrick Stolee via GitGitGadget
2026-05-12 14:38       ` Matthew John Cheetham
2026-05-12 16:54         ` Derrick Stolee [this message]
2026-04-22 15:25     ` [PATCH v3 6/7] remote: add remote.*.negotiationInclude config Derrick Stolee via GitGitGadget
2026-05-12 14:54       ` Matthew John Cheetham
2026-05-12 17:55         ` Derrick Stolee
2026-04-22 15:25     ` [PATCH v3 7/7] send-pack: pass negotiation config in push Derrick Stolee via GitGitGadget
2026-05-12 15:14       ` Matthew John Cheetham
2026-05-14 12:41     ` [PATCH v4 0/8] fetch: rework negotiation tip options Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 1/8] t5516: fix test order flakiness Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 2/8] fetch: add --negotiation-restrict option Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 3/8] transport: rename negotiation_tips Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 4/8] remote: add remote.*.negotiationRestrict config Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 5/8] negotiator: add have_sent() interface Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 6/8] fetch: add --negotiation-include option for negotiation Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 7/8] remote: add remote.*.negotiationInclude config Derrick Stolee via GitGitGadget
2026-05-14 12:41       ` [PATCH v4 8/8] send-pack: pass negotiation config in push Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e33ed325-4533-4f83-bcf3-acea2188f208@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=mjcheetham@outlook.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.