From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: Benedek Kozma <cyberbeni@gmail.com>, Glen Choo <chooglen@google.com>
Subject: [PATCH v2] fetch: do not run a redundant fetch from submodule
Date: Mon, 16 May 2022 16:53:40 -0700 [thread overview]
Message-ID: <xmqqk0alyqyj.fsf_-_@gitster.g> (raw)
In-Reply-To: <xmqqczgd16wx.fsf_-_@gitster.g> (Junio C. Hamano's message of "Mon, 16 May 2022 14:53:02 -0700")
When 7dce19d3 (fetch/pull: Add the --recurse-submodules option,
2010-11-12) introduced the "--recurse-submodule" option, the
approach taken was to perform fetches in submodules only once, after
all the main fetching (it may usually be a fetch from a single
remote, but it could be fetching from a group of remotes using
fetch_multiple()) succeeded. Later we added "--all" to fetch from
all defined remotes, which complicated things even more.
If your project has a submodule, and you try to run "git fetch
--recurse-submodule --all", you'd see a fetch for the top-level,
which invokes another fetch for the submodule, followed by another
fetch for the same submodule. All but the last fetch for the
submodule come from a "git fetch --recurse-submodules" subprocess
that is spawned via the fetch_multiple() interface for the remotes,
and the last fetch comes from the code at the end.
Because recursive fetching from submodules is done in each fetch for
the top-level in fetch_multiple(), the last fetch in the submodule
is redundant. It only matters when fetch_one() interacts with a
single remote at the top-level.
While we are at it, there is one optimization that exists in dealing
with a group of remote, but is missing when "--all" is used. In the
former, when the group turns out to be a group of one, instead of
spawning "git fetch" as a subprocess via the fetch_multiple()
interface, we use the normal fetch_one() code path. Do the same
when handing "--all", if it turns out that we have only one remote
defined.
Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
So here is a second attempt. It demonstrates a bit interesting
funny in range-diff where similar changes from the previous round
gets applied to a different target.
t5617 is much cleanly organized than t5526, and we may want to clean
up the latter after dust settles.
1: 006fe43da1 ! 1: da0a4e341b fetch: do not run a redundant fetch from submodule
@@ Commit message
when handing "--all", if it turns out that we have only one remote
defined.
+ Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
## builtin/fetch.c ##
@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, &list);
+
-+ /* no point doing fetch_multiple() of one */
++ /* do not do fetch_multiple() of one */
+ if (list.nr == 1)
+ remote = remote_get(list.items[0].string);
} else if (argc == 0) {
@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
}
- if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
++
++ /*
++ * This is only needed after fetch_one(), which does not fetch
++ * submodules by itself.
++ *
++ * When we fetch from multiple remotes, fetch_multiple() has
++ * already updated submodules to grab commits necessary for
++ * the fetched history from each remote, so there is no need
++ * to fetch submodules from here.
++ */
+ if (!result && remote && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct strvec options = STRVEC_INIT;
int max_children = max_jobs;
- ## t/t5617-clone-submodules-remote.sh ##
-@@ t/t5617-clone-submodules-remote.sh: test_expect_success 'clone with --single-branch' '
+ ## t/t5526-fetch-submodules.sh ##
+@@ t/t5526-fetch-submodules.sh: test_expect_success 'fetch --recurse-submodules updates name-conflicted, unpopul
)
'
+test_expect_success 'fetch --all with --recurse-submodules' '
-+ test_when_finished "rm -fr super_clone" &&
-+ git clone --recurse-submodules srv.bare super_clone &&
++ test_when_finished "rm -fr src_clone" &&
++ git clone --recurse-submodules src src_clone &&
+ (
-+ cd super_clone &&
++ cd src_clone &&
+ git config submodule.recurse true &&
+ git config fetch.parallel 0 &&
+ git fetch --all 2>../fetch-log
+ ) &&
-+ grep "Fetching sub" fetch-log >fetch-subs &&
++ grep "^Fetching submodule sub$" fetch-log >fetch-subs &&
+ test_line_count = 1 fetch-subs
+'
+
- # do basic partial clone from "srv.bare"
- # confirm partial clone was registered in the local config for super and sub.
- test_expect_success 'clone with --filter' '
++test_expect_success 'fetch --all with --recurse-submodules with multiple' '
++ test_when_finished "rm -fr src_clone" &&
++ git clone --recurse-submodules src src_clone &&
++ (
++ cd src_clone &&
++ git remote add secondary ../src &&
++ git config submodule.recurse true &&
++ git config fetch.parallel 0 &&
++ git fetch --all 2>../fetch-log
++ ) &&
++ grep "Fetching submodule sub" fetch-log >fetch-subs &&
++ test_line_count = 2 fetch-subs
++'
++
+ test_done
builtin/fetch.c | 16 +++++++++++++++-
t/t5526-fetch-submodules.sh | 27 +++++++++++++++++++++++++++
2 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index e3791f09ed..8b15c40bb2 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -2187,6 +2187,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
else if (argc > 1)
die(_("fetch --all does not make sense with refspecs"));
(void) for_each_remote(get_one_remote_for_fetch, &list);
+
+ /* do not do fetch_multiple() of one */
+ if (list.nr == 1)
+ remote = remote_get(list.items[0].string);
} else if (argc == 0) {
/* No arguments -- use default remote */
remote = remote_get(NULL);
@@ -2261,7 +2265,17 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
result = fetch_multiple(&list, max_children);
}
- if (!result && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
+
+ /*
+ * This is only needed after fetch_one(), which does not fetch
+ * submodules by itself.
+ *
+ * When we fetch from multiple remotes, fetch_multiple() has
+ * already updated submodules to grab commits necessary for
+ * the fetched history from each remote, so there is no need
+ * to fetch submodules from here.
+ */
+ if (!result && remote && (recurse_submodules != RECURSE_SUBMODULES_OFF)) {
struct strvec options = STRVEC_INIT;
int max_children = max_jobs;
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 43dada8544..a301b56db8 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -1125,4 +1125,31 @@ test_expect_success 'fetch --recurse-submodules updates name-conflicted, unpopul
)
'
+test_expect_success 'fetch --all with --recurse-submodules' '
+ test_when_finished "rm -fr src_clone" &&
+ git clone --recurse-submodules src src_clone &&
+ (
+ cd src_clone &&
+ git config submodule.recurse true &&
+ git config fetch.parallel 0 &&
+ git fetch --all 2>../fetch-log
+ ) &&
+ grep "^Fetching submodule sub$" fetch-log >fetch-subs &&
+ test_line_count = 1 fetch-subs
+'
+
+test_expect_success 'fetch --all with --recurse-submodules with multiple' '
+ test_when_finished "rm -fr src_clone" &&
+ git clone --recurse-submodules src src_clone &&
+ (
+ cd src_clone &&
+ git remote add secondary ../src &&
+ git config submodule.recurse true &&
+ git config fetch.parallel 0 &&
+ git fetch --all 2>../fetch-log
+ ) &&
+ grep "Fetching submodule sub" fetch-log >fetch-subs &&
+ test_line_count = 2 fetch-subs
+'
+
test_done
--
2.36.1-384-ga138d62aa5
next prev parent reply other threads:[~2022-05-16 23:53 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-29 14:46 Bugreport - submodules are fetched twice in some cases Benedek Kozma
2022-04-29 17:39 ` Junio C Hamano
2022-04-29 19:05 ` Glen Choo
2022-04-29 20:02 ` Junio C Hamano
2022-04-29 20:37 ` Glen Choo
2022-05-14 0:07 ` Glen Choo
2022-05-14 5:24 ` Junio C Hamano
2022-05-16 17:45 ` Glen Choo
2022-05-16 18:25 ` Junio C Hamano
2022-05-16 19:04 ` Junio C Hamano
2022-05-16 21:53 ` [PATCH] fetch: do not run a redundant fetch from submodule Junio C Hamano
2022-05-16 22:56 ` Glen Choo
2022-05-16 23:33 ` Junio C Hamano
2022-05-16 23:53 ` Junio C Hamano [this message]
2022-05-17 16:47 ` [PATCH v2] " Glen Choo
2022-05-18 15:53 ` Junio C Hamano
2022-05-14 0:15 ` Bugreport - submodules are fetched twice in some cases Glen Choo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqk0alyqyj.fsf_-_@gitster.g \
--to=gitster@pobox.com \
--cc=chooglen@google.com \
--cc=cyberbeni@gmail.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.