From: <rsbecker@nexbridge.com>
To: "'Zitzmann, Christian'" <Christian.Zitzmann@vitesco.com>,
<git@vger.kernel.org>
Subject: RE: Parallelism for submodule update
Date: Mon, 2 Jan 2023 11:54:24 -0500 [thread overview]
Message-ID: <009801d91eca$e1646360$a42d2a20$@nexbridge.com> (raw)
In-Reply-To: <DB5PR02MB100691E6422F5E94228F0E0EC8AF79@DB5PR02MB10069.eurprd02.prod.outlook.com>
>-----Original Message-----
>From: <Christian.Zitzmann@vitesco.com>
On January 2, 2023 11:45 AM Christian Zitzmann wrote:
>we are using git since many years with also heavily using submodules.
>
>When updating the submodules, only the fetching part is done in parallel (with
>config submodule.fetchjobs or --jobs) but the checkout is done sequentially
>
>What I’ve recognized when cloning with
>- scalar clone --full-clone --recurse-submodules <URL> or
>- git clone --filter=blob:none --also-filter-submodules --recurse-submodules
><URL>
>
>We loose performance, as the fetch of the blobs is done in the sequential
>checkout part, instead of in the parallel part.
>
>Furthermore, the utilization - without partial clone - of network and harddisk is not
>always good, as first the network is utilized (fetch) and then the harddisk
>(checkout)
>
>As the checkout part is local to the submodule (no shared resources to block), it
>would be great if we could move the checkout into the parallelized part.
>E.g. by doing fetch and checkout (with blob fetching) in one step with e.g.
>run_processes_parallel_tr2
>
>I expect that this significantly improves the performance, especially when using
>partial clones.
>
>Do you think this is possible? Do I miss anything in my thoughts?
Since this is a platform-specific request, if it happens, this should be a configuration switch that defaults off. On my platform, the file system itself is fairly fast, but the name service traversals and resolutions (what happens in the name service) is a performance problem. Doing the checkout/switch in parallel would actually be counter-productive in my case. So I would keep it off, but I get that other platforms could benefit.
Regards,
Randall
--
Brief whoami: NonStop&UNIX developer since approximately
UNIX(421664400)
NonStop(211288444200000000)
-- In real life, I talk too much.
next prev parent reply other threads:[~2023-01-02 16:54 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-02 16:44 Parallelism for submodule update Zitzmann, Christian
2023-01-02 16:54 ` rsbecker [this message]
2023-01-13 10:49 ` Zitzmann, Christian
2023-01-19 21:39 ` Calvin Wan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='009801d91eca$e1646360$a42d2a20$@nexbridge.com' \
--to=rsbecker@nexbridge.com \
--cc=Christian.Zitzmann@vitesco.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).