From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
stolee@gmail.com, gitster@pobox.com
Subject: [PATCH v3 0/7] Lazy fetch with subprocess
Date: Mon, 17 Aug 2020 21:01:30 -0700 [thread overview]
Message-ID: <cover.1597722941.git.jonathantanmy@google.com> (raw)
In-Reply-To: <20200724223844.2723397-1-jonathantanmy@google.com>
These patches are based on jc/no-update-fetch-head, once again.
I've addressed most reviewer comments by expanding commit messages and
fixing the cascading "if" issue in patch 3.
> Would "git fetch" be the only user potentially benefit from being
> able to dereference a tag that already exists locally and ignore
> missing ones? I wonder if teaching deref_tag() a new flag would be
> a better alternative to keep the separation of concerns at different
> layers.
Right now, I think so - "fetch" is a special case of having to avoid
infinite loops when lazy fetching, and of being able to tolerate missing
ref targets. I don't think most other commands are like that.
Jonathan Tan (7):
negotiator/noop: add noop fetch negotiator
fetch: allow refspecs specified through stdin
fetch: avoid reading submodule config until needed
fetch: only populate existing_refs if needed
fetch-pack: do not lazy-fetch during ref iteration
promisor-remote: lazy-fetch objects in subprocess
fetch-pack: remove no_dependents code
Documentation/config/fetch.txt | 5 +-
Documentation/git-fetch.txt | 4 +
Documentation/technical/partial-clone.txt | 13 +-
Makefile | 1 +
builtin/fetch-pack.c | 4 -
builtin/fetch.c | 42 ++++-
fetch-negotiator.c | 5 +
fetch-pack.c | 189 +++++++++-------------
fetch-pack.h | 14 --
negotiator/noop.c | 44 +++++
negotiator/noop.h | 8 +
promisor-remote.c | 46 +++---
remote-curl.c | 6 -
repo-settings.c | 2 +
repository.h | 1 +
submodule-config.c | 8 +-
t/t0410-partial-clone.sh | 2 +-
t/t4067-diff-partial-clone.sh | 8 +-
t/t5554-noop-fetch-negotiator.sh | 22 +++
t/t5601-clone.sh | 2 +-
t/t5616-partial-clone.sh | 20 +++
transport.c | 4 -
transport.h | 7 -
23 files changed, 255 insertions(+), 202 deletions(-)
create mode 100644 negotiator/noop.c
create mode 100644 negotiator/noop.h
create mode 100755 t/t5554-noop-fetch-negotiator.sh
Range-diff against v2:
1: 35bdd372ae ! 1: 5b3b49d123 negotiator/null: add null fetch negotiator
@@ Metadata
Author: Jonathan Tan <jonathantanmy@google.com>
## Commit message ##
- negotiator/null: add null fetch negotiator
+ negotiator/noop: add noop fetch negotiator
- Add a null fetch negotiator. This is introduced to allow partial clones
+ Add a noop fetch negotiator. This is introduced to allow partial clones
to skip the unneeded negotiation step when fetching missing objects
using a "git fetch" subprocess. (The implementation of spawning a "git
fetch" subprocess will be done in a subsequent patch.) But this can also
@@ Documentation/config/fetch.txt: fetch.negotiationAlgorithm::
server. Set to "skipping" to use an algorithm that skips commits in an
effort to converge faster, but may result in a larger-than-necessary
- packfile; The default is "default" which instructs Git to use the default algorithm
-+ packfile; or set to "null" to not send any information at all, which
++ packfile; or set to "noop" to not send any information at all, which
+ will almost certainly result in a larger-than-necessary packfile, but
+ will skip the negotiation step.
+ The default is "default" which instructs Git to use the default algorithm
@@ Makefile: LIB_OBJS += mergesort.o
LIB_OBJS += midx.o
LIB_OBJS += name-hash.o
LIB_OBJS += negotiator/default.o
-+LIB_OBJS += negotiator/null.o
++LIB_OBJS += negotiator/noop.o
LIB_OBJS += negotiator/skipping.o
LIB_OBJS += notes-cache.o
LIB_OBJS += notes-merge.o
@@ fetch-negotiator.c
#include "fetch-negotiator.h"
#include "negotiator/default.h"
#include "negotiator/skipping.h"
-+#include "negotiator/null.h"
++#include "negotiator/noop.h"
#include "repository.h"
void fetch_negotiator_init(struct repository *r,
@@ fetch-negotiator.c: void fetch_negotiator_init(struct repository *r,
skipping_negotiator_init(negotiator);
return;
-+ case FETCH_NEGOTIATION_NULL:
-+ null_negotiator_init(negotiator);
++ case FETCH_NEGOTIATION_NOOP:
++ noop_negotiator_init(negotiator);
+ return;
+
case FETCH_NEGOTIATION_DEFAULT:
default:
default_negotiator_init(negotiator);
- ## negotiator/null.c (new) ##
+ ## negotiator/noop.c (new) ##
@@
+#include "cache.h"
-+#include "null.h"
++#include "noop.h"
+#include "../commit.h"
+#include "../fetch-negotiator.h"
+
@@ negotiator/null.c (new)
+ * This negotiator does not emit any commits, so there is no commit to
+ * be acknowledged. If there is any ack, there is a bug.
+ */
-+ BUG("ack with null negotiator, which does not emit any commits");
++ BUG("ack with noop negotiator, which does not emit any commits");
+ return 0;
+}
+
@@ negotiator/null.c (new)
+ /* nothing to release */
+}
+
-+void null_negotiator_init(struct fetch_negotiator *negotiator)
++void noop_negotiator_init(struct fetch_negotiator *negotiator)
+{
+ negotiator->known_common = known_common;
+ negotiator->add_tip = add_tip;
@@ negotiator/null.c (new)
+ negotiator->data = NULL;
+}
- ## negotiator/null.h (new) ##
+ ## negotiator/noop.h (new) ##
@@
-+#ifndef NEGOTIATOR_NULL_H
-+#define NEGOTIATOR_NULL_H
++#ifndef NEGOTIATOR_NOOP_H
++#define NEGOTIATOR_NOOP_H
+
+struct fetch_negotiator;
+
-+void null_negotiator_init(struct fetch_negotiator *negotiator);
++void noop_negotiator_init(struct fetch_negotiator *negotiator);
+
+#endif
@@ repo-settings.c: void prepare_repo_settings(struct repository *r)
if (!repo_config_get_string(r, "fetch.negotiationalgorithm", &strval)) {
if (!strcasecmp(strval, "skipping"))
r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
-+ else if (!strcasecmp(strval, "null"))
-+ r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NULL;
++ else if (!strcasecmp(strval, "noop"))
++ r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NOOP;
else
r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
}
@@ repository.h: enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NONE = 0,
FETCH_NEGOTIATION_DEFAULT = 1,
FETCH_NEGOTIATION_SKIPPING = 2,
-+ FETCH_NEGOTIATION_NULL = 3,
++ FETCH_NEGOTIATION_NOOP = 3,
};
struct repo_settings {
- ## t/t5554-null-fetch-negotiator.sh (new) ##
+ ## t/t5554-noop-fetch-negotiator.sh (new) ##
@@
+#!/bin/sh
+
-+test_description='test null fetch negotiator'
++test_description='test noop fetch negotiator'
+. ./test-lib.sh
+
-+test_expect_success 'null negotiator does not emit any "have"' '
++test_expect_success 'noop negotiator does not emit any "have"' '
+ rm -f trace &&
+
+ test_create_repo server &&
@@ t/t5554-null-fetch-negotiator.sh (new)
+ test_create_repo client &&
+ test_commit -C client we_have &&
+
-+ test_config -C client fetch.negotiationalgorithm null &&
++ test_config -C client fetch.negotiationalgorithm noop &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+
+ ! grep "fetch> have" trace &&
2: 00ad7dd875 = 2: 9f277f1631 fetch: allow refspecs specified through stdin
3: 8b4a522a13 ! 3: fda9f834f6 fetch: avoid reading submodule config until needed
@@ Metadata
## Commit message ##
fetch: avoid reading submodule config until needed
- Teach "git fetch" to avoid reading the submodule config until necessary.
- This allows users to avoid the lazy-fetching of this potentially missing
- config file by specifying the --recurse-submodules=no command line
- option.
+ In "fetch", there are two parameters submodule_fetch_jobs_config and
+ recurse_submodules that can be set in a variety of ways: through
+ .gitmodules, through .git/config, and through the command line.
+ Currently "fetch" handles this by first reading .gitmodules, then
+ reading .git/config (allowing it to overwrite existing values), then
+ reading the command line (allowing it to overwrite existing values).
+
+ Notice that we can avoid reading .gitmodules if .git/config and/or the
+ command line already provides us with what we need. In addition, if
+ recurse_submodules is found to be "no", we do not need the value of
+ submodule_fetch_jobs_config.
+
+ Avoiding reading .gitmodules is especially important when we use "git
+ fetch" to perform lazy fetches in a partial clone because the
+ .gitmodules file itself might need to be lazy fetched (and otherwise
+ causing an infinite loop).
+
+ In light of all this, avoid reading .gitmodules until necessary. When
+ reading it, we may only need one of the two parameters it provides, so
+ teach fetch_config_from_gitmodules() to support NULL arguments. With
+ this patch, users (including Git itself when invoking "git fetch" to
+ lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
+ passing --recurse-submodules=no.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
if (deepen_relative < 0)
## submodule-config.c ##
-@@ submodule-config.c: struct fetch_config {
- static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
+@@ submodule-config.c: static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
{
struct fetch_config *config = cb;
-- if (!strcmp(var, "submodule.fetchjobs")) {
-+ if (!strcmp(var, "submodule.fetchjobs") && config->max_children) {
- *(config->max_children) = parse_submodule_fetchjobs(var, value);
+ if (!strcmp(var, "submodule.fetchjobs")) {
+- *(config->max_children) = parse_submodule_fetchjobs(var, value);
++ if (config->max_children)
++ *(config->max_children) =
++ parse_submodule_fetchjobs(var, value);
return 0;
-- } else if (!strcmp(var, "fetch.recursesubmodules")) {
-+ } else if (!strcmp(var, "fetch.recursesubmodules") &&
-+ config->recurse_submodules) {
- *(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
+ } else if (!strcmp(var, "fetch.recursesubmodules")) {
+- *(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
++ if (config->recurse_submodules)
++ *(config->recurse_submodules) =
++ parse_fetch_recurse_submodules_arg(var, value);
return 0;
}
+
4: 77bc83e7f2 ! 4: a5554cd27f fetch: only populate existing_refs if needed
@@ Metadata
## Commit message ##
fetch: only populate existing_refs if needed
- When fetching tags, Git only writes tags that do not already exist in
- the client repository. This necessitates an iteration over all the refs,
- but fetch performs this iteration even if no tags are fetched.
+ In "fetch", get_ref_map() iterates over all refs to populate
+ "existing_refs" in order to populate peer_ref->old_oid in the returned
+ refmap, even if the refmap has no peer_ref set - which is the case when
+ only literal hashes (i.e. no refs by name) are fetched.
- This issue is more severe in a partial clone because the iteration over
- refs also checks that the targets of those refs are present,
- necessitating a lazy fetch if the target is missing.
+ Iterating over refs causes the targets of those refs to be checked for
+ existence. Avoiding this is especially important when we use "git fetch"
+ to perform lazy fetches in a partial clone because a target of such a
+ ref may need to be itself lazy-fetched (and otherwise causing an
+ infinite loop).
- Therefore, iterate over the refs only when necessary. The user can avoid
- this iteration by refraining from fetching tags, for example, by passing
- --no-tags as an argument. A subsequent patch will also teach Git to use
- "git fetch" to lazy-fetch missing objects in a partial clone, thus also
- making use of this change.
+ Therefore, avoid populating "existing_refs" until necessary. With this
+ patch, because Git lazy-fetches objects by literal hashes (to be done in
+ a subsequent commit), it will then be able to guarantee avoiding reading
+ targets of refs.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
5: d42e98ff07 ! 5: f56c24dd1b fetch-pack: do not lazy-fetch during ref iteration
@@ Metadata
## Commit message ##
fetch-pack: do not lazy-fetch during ref iteration
- Teach "fetch-pack" not to lazy fetch whenever iterating over refs. This
- is done by using the raw form of ref iteration and by dereferencing tags
- ourselves.
+ In order to determine negotiation tips, "fetch-pack" iterates over all
+ refs and dereferences all annotated tags found. This causes the
+ existence of targets of refs and annotated tags to be checked. Avoiding
+ this is especially important when we use "git fetch" (which invokes
+ "fetch-pack") to perform lazy fetches in a partial clone because a
+ target of such a ref or annotated tag may need to be itself lazy-fetched
+ (and otherwise causing an infinite loop).
+
+ Therefore, teach "fetch-pack" not to lazy fetch whenever iterating over
+ refs. This is done by using the raw form of ref iteration and by
+ dereferencing tags ourselves.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
6: 55d2e5a4cc = 6: 4e72ca6258 promisor-remote: lazy-fetch objects in subprocess
7: e8f16d6908 = 7: 3ff9d034e9 fetch-pack: remove no_dependents code
--
2.28.0.220.ged08abb693-goog
next prev parent reply other threads:[~2020-08-18 4:02 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
2020-07-24 23:07 ` Junio C Hamano
2020-07-24 23:11 ` Junio C Hamano
2020-07-25 21:41 ` Jeff King
2020-07-25 23:01 ` Junio C Hamano
2020-07-27 17:11 ` Jeff King
2020-07-28 19:23 ` Jonathan Tan
2020-07-28 20:08 ` Jeff King
2020-07-29 18:53 ` Jonathan Tan
2020-07-29 19:29 ` Jeff King
2020-07-29 19:02 ` Junio C Hamano
2020-07-29 22:55 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
2020-08-05 19:53 ` Junio C Hamano
2020-08-07 20:53 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common Jonathan Tan
2020-08-05 20:08 ` Junio C Hamano
2020-08-05 22:11 ` Junio C Hamano
2020-08-07 20:59 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 3/7] negotiator/null: add null fetch negotiator Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 4/7] fetch: --stdin Jonathan Tan
2020-08-05 20:33 ` Junio C Hamano
2020-08-07 21:10 ` Jonathan Tan
2020-08-07 21:58 ` Junio C Hamano
2020-08-07 21:10 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 5/7] fetch: submodule config Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 6/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 7/7] promisor-remote: use subprocess to fetch Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
2020-08-12 12:55 ` Derrick Stolee
2020-08-12 16:44 ` Junio C Hamano
2020-08-12 17:29 ` Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
2020-08-12 17:34 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 4/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-12 18:06 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
2020-08-12 18:25 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-12 18:28 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 7/7] fetch-pack: remove no_dependents code Jonathan Tan
2020-08-12 12:51 ` [PATCH v2 0/7] Lazy fetch with subprocess Derrick Stolee
2020-08-18 4:01 ` Jonathan Tan [this message]
2020-08-18 4:01 ` [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 4/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 7/7] fetch-pack: remove no_dependents code Jonathan Tan
2020-08-18 19:56 ` [PATCH v3 0/7] Lazy fetch with subprocess Junio C Hamano
2020-08-18 22:32 ` Junio C Hamano
2020-08-18 23:36 ` [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-18 23:57 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1597722941.git.jonathantanmy@google.com \
--to=jonathantanmy@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.