From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: git@vger.kernel.org, hvoigt@hvoigt.net, jens.lehmann@web.de
Subject: Re: [PATCH] Add fetch.recurseSubmoduleParallelism config option
Date: Mon, 12 Oct 2015 16:14:46 -0700 [thread overview]
Message-ID: <xmqqeggzbrx5.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <1444690350-6486-1-git-send-email-sbeller@google.com> (Stefan Beller's message of "Mon, 12 Oct 2015 15:52:30 -0700")
Stefan Beller <sbeller@google.com> writes:
> This allows to configure fetching in parallel without having the annoying
> command line option.
s/annoying//;
I think this is a sane thing to do, but the name of the variable may
want to be bikeshedded a bit.
> This moved the responsibility to determine how many parallel processes
> to start from builtin/fetch to submodule.c as we need a way to communicate
> "The user did not specify the number of parallel processes in the command
> line options" in the builtin fetch. The submodule code takes care of
> the precedence (CLI > config > default)
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> Documentation/config.txt | 6 ++++++
> builtin/fetch.c | 2 +-
> submodule.c | 14 ++++++++++++++
> 3 files changed, 21 insertions(+), 1 deletion(-)
>
> I just monkey tested the code and it worked once! The problem with testing
> this parallelizing option is that the expected behavior doesn't change
> except for being faster. And I don't want to add timing tests to the test
> suite because they are unreliable.
>
> Any idea how to test this properly?
I agree that a test in t/ would catch bugs in the functionality. If
your parallel implementation is somehow broken in the future and
stops functioning correctly, fetching all submodules with a single
task and fetching them with N tasks will produce different results
;-).
But it would not help you much in seeing if the parallelism is
really taking place. Adding t/perf/ tests to show how much benefit
you are getting may be of more value.
The parallel_process API could learn a new "verbose" feature that it
by itself shows some messages like
"processing the 'frotz' job with N tasks"
"M tasks finished (N still running)"
in the output stream from strategic places. For example, the first
message will come at the end of pp_init(), and the second message
will be appended at the end of buffered output of a task that has
just been finished. Once you have something like that, you could
check for them in a test in t/.
Just a thought.
>
> This applies on top of sb/submodule-parallel-fetch
>
> Thanks,
> Stefan
>
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 315f271..1172db0 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1140,6 +1140,12 @@ fetch.recurseSubmodules::
> when its superproject retrieves a commit that updates the submodule's
> reference.
>
> +fetch.recurseSubmoduleParallelism
> + This is used to determine how many submodules can be fetched in
> + parallel. Specifying a positive integer allows up to that number
> + of submodules being fetched in parallel. Specifying 0 the number
> + of cpus will be taken as the maximum number.
> +
> fetch.fsckObjects::
> If it is set to true, git-fetch-pack will check all fetched
> objects. It will abort in the case of a malformed object or a
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index f28eac6..b1399dc 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -37,7 +37,7 @@ static int prune = -1; /* unspecified */
> static int all, append, dry_run, force, keep, multiple, update_head_ok, verbosity;
> static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
> static int tags = TAGS_DEFAULT, unshallow, update_shallow;
> -static int max_children = 1;
> +static int max_children = -1;
> static const char *depth;
> static const char *upload_pack;
> static struct strbuf default_rla = STRBUF_INIT;
> diff --git a/submodule.c b/submodule.c
> index c21b265..c85d3ef 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -15,6 +15,7 @@
> #include "thread-utils.h"
>
> static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
> +static int config_fetch_parallel_submodules = -1;
> static struct string_list changed_submodule_paths;
> static int initialized_fetch_ref_tips;
> static struct sha1_array ref_tips_before_fetch;
> @@ -179,6 +180,14 @@ int submodule_config(const char *var, const char *value, void *cb)
> else if (!strcmp(var, "fetch.recursesubmodules")) {
> config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
> return 0;
> + } else if (!strcmp(var, "fetch.recursesubmoduleparallelism")) {
> + char *end;
> + int ret;
> + config_fetch_parallel_submodules = strtol(value, &end, 0);
> + ret = (*end == '\0');
> + if (!ret)
> + warning("value for fetch.recurseSubmoduleParallelism not recognized");
> + return ret;
> }
> return 0;
> }
> @@ -759,6 +768,11 @@ int fetch_populated_submodules(const struct argv_array *options,
> argv_array_push(&spf.args, "--recurse-submodules-default");
> /* default value, "--submodule-prefix" and its value are added later */
>
> + if (max_parallel_jobs < 0)
> + max_parallel_jobs = config_fetch_parallel_submodules;
> + if (max_parallel_jobs < 0)
> + max_parallel_jobs = 1;
> +
> calculate_changed_submodule_paths();
> run_processes_parallel(max_parallel_jobs,
> get_next_submodule,
next prev parent reply other threads:[~2015-10-12 23:14 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-12 22:52 [PATCH] Add fetch.recurseSubmoduleParallelism config option Stefan Beller
2015-10-12 23:14 ` Junio C Hamano [this message]
2015-10-12 23:31 ` Stefan Beller
2015-10-12 23:50 ` Junio C Hamano
2015-10-16 17:04 ` Stefan Beller
2015-10-16 17:26 ` Junio C Hamano
2015-10-13 7:32 ` Junio C Hamano
2015-10-13 16:03 ` Stefan Beller
2015-10-13 21:07 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqeggzbrx5.fsf@gitster.mtv.corp.google.com \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=hvoigt@hvoigt.net \
--cc=jens.lehmann@web.de \
--cc=sbeller@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.