From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: "git\@vger.kernel.org" <git@vger.kernel.org>,
Jacob Keller <jacob.keller@gmail.com>, Jeff King <peff@peff.net>,
Jonathan Nieder <jrnieder@gmail.com>,
Johannes Schindelin <johannes.schindelin@gmail.com>,
Jens Lehmann <Jens.Lehmann@web.de>,
Vitali Lovich <vlovich@gmail.com>
Subject: Re: [PATCHv3 06/13] run-command: add an asynchronous parallel child processor
Date: Tue, 22 Sep 2015 15:23:04 -0700 [thread overview]
Message-ID: <xmqq4mim14c7.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <CAGZ79kZFoTUugFdOuikOBUbg0DG+TJ3tNTuZaCs7WSaa2r7=Bg@mail.gmail.com> (Stefan Beller's message of "Tue, 22 Sep 2015 14:54:16 -0700")
Stefan Beller <sbeller@google.com> writes:
> On Tue, Sep 22, 2015 at 2:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Stefan Beller <sbeller@google.com> writes:
>>
>>> On Tue, Sep 22, 2015 at 12:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> Stefan Beller <sbeller@google.com> writes:
>>>>
>>>>> So how would you find out when we are done?
>>>>
>>>> while (1) {
>>>> if (we have available slot)
>>>> ask to start a new one;
>>>> if (nobody is running anymore)
>>>> break;
>>>> collect the output from running ones;
>>>> drain the output from the foreground one;
>>>> cull the finished process;
>>>> }
>>>>
>>>
>>> Personally I do not like the break; in the middle of
>>> the loop, but that's personal preference, I'd guess.
>>> I'll also move the condition for (we have available slot)
>>> back inside the called function.
>>>
>>> So I am thinking about using this in the reroll instead:
>>>
>>> run_processes_parallel_start_as_needed(&pp);
>>> while (pp.nr_processes > 0) {
>>> run_processes_parallel_buffer_stderr(&pp);
>>> run_processes_parallel_output(&pp);
>>> run_processes_parallel_collect_finished(&pp);
>>> run_processes_parallel_start_as_needed(&pp);
>>> }
>>
>> If you truly think the latter is easier to follow its logic (with
>> the duplicated call to the same function only to avoid break that
>> makes it clear why we are quitting the loop,
>
> I dislike having the call twice, too.
> ...
>> and without the
>> explicit "start only if we can afford to"), then I have to say that
>> our design tastes are fundamentally incompatible.
> ...
> I'll think about that.
Don't waste too much time on it ;-) This is largely a taste thing,
and taste is very hard to reason about, understand, teach and learn.
Having said that, if I can pick one thing that I foresee to be
problematic the most (aside from these overlong names of the
functions that are private and do not need such long names), it is
that "start as many without giving any control to the caller"
interface. I wrote "start *a* new one" in the outline above for a
reason.
Even if you want to utilize a moderate number of processes, say 16,
working at the steady state, I suspect that we would find it
suboptimal from perceived latency point of view, if we spin up 16
processes all at once at the very beginning before we start to
collect output from the designated foreground process and relay its
first line of output. We may want to be able to tweak the caller to
spin up just a few first and let the loop ramp up to the full blast
gradually so that we can give an early feedback to the end user.
That is not something easy to arrange with "start as many without
giving control to the caller" interface. We probably will discover
other kinds of scheduling issues once we get familiar with this
machinery and would find need for finer grained control.
And I consider such a ramp-up logic shouldn't be hidden inside the
"start_as_needed()" helper function---it is the way how the calling
loop wants its processes started, and I'd prefer to have the logic
visible in the caller's loop.
But that is also largely a "taste" thing.
next prev parent reply other threads:[~2015-09-22 22:23 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-21 22:39 [PATCHv3 00/13] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
2015-09-21 22:39 ` [PATCHv3 01/13] Sending "Fetching submodule <foo>" output to stderr Stefan Beller
2015-09-21 23:47 ` Junio C Hamano
2015-09-21 22:39 ` [PATCHv3 02/13] xread: poll on non blocking fds Stefan Beller
2015-09-21 23:55 ` Junio C Hamano
2015-09-22 4:55 ` Torsten Bögershausen
2015-09-22 6:23 ` Jacob Keller
2015-09-22 18:40 ` Torsten Bögershausen
2015-09-22 19:45 ` Junio C Hamano
2015-09-22 19:49 ` Jeff King
2015-09-22 20:00 ` Junio C Hamano
2015-09-23 0:14 ` Stefan Beller
2015-09-23 0:43 ` Junio C Hamano
2015-09-23 1:51 ` Jeff King
2015-09-21 23:56 ` Eric Sunshine
2015-09-22 15:58 ` Junio C Hamano
2015-09-22 17:38 ` Stefan Beller
2015-09-22 18:21 ` Junio C Hamano
2015-09-22 18:41 ` Stefan Beller
2015-09-21 22:39 ` [PATCHv3 03/13] xread_nonblock: add functionality to read from fds nonblockingly Stefan Beller
2015-09-22 0:02 ` Junio C Hamano
2015-09-22 0:10 ` Junio C Hamano
2015-09-22 6:26 ` Jacob Keller
2015-09-22 6:27 ` Jacob Keller
2015-09-22 15:59 ` Junio C Hamano
2015-09-21 22:39 ` [PATCHv3 04/13] strbuf: add strbuf_read_once to read without blocking Stefan Beller
2015-09-22 0:17 ` Junio C Hamano
2015-09-22 6:29 ` Jacob Keller
2015-09-21 22:39 ` [PATCHv3 05/13] run-command: factor out return value computation Stefan Beller
2015-09-22 0:38 ` Junio C Hamano
2015-09-21 22:39 ` [PATCHv3 06/13] run-command: add an asynchronous parallel child processor Stefan Beller
2015-09-22 1:08 ` Junio C Hamano
2015-09-22 18:28 ` Stefan Beller
2015-09-22 19:53 ` Junio C Hamano
2015-09-22 21:31 ` Stefan Beller
2015-09-22 21:41 ` Junio C Hamano
2015-09-22 21:54 ` Stefan Beller
2015-09-22 22:23 ` Junio C Hamano [this message]
2015-09-21 22:39 ` [PATCHv3 07/13] fetch_populated_submodules: use new parallel job processing Stefan Beller
2015-09-22 16:28 ` Junio C Hamano
2015-09-21 22:39 ` [PATCHv3 08/13] submodules: allow parallel fetching, add tests and documentation Stefan Beller
2015-09-21 22:39 ` [PATCHv3 09/13] submodule config: keep update strategy around Stefan Beller
2015-09-22 0:56 ` Eric Sunshine
2015-09-22 15:50 ` Stefan Beller
2015-09-21 22:39 ` [PATCHv3 10/13] git submodule update: cmd_update_recursive Stefan Beller
2015-09-21 22:39 ` [PATCHv3 11/13] git submodule update: cmd_update_clone Stefan Beller
2015-09-21 22:39 ` [PATCHv3 12/13] git submodule update: cmd_update_fetch Stefan Beller
2015-09-21 22:39 ` [PATCHv3 13/13] Rewrite submodule update in C Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq4mim14c7.fsf@gitster.mtv.corp.google.com \
--to=gitster@pobox.com \
--cc=Jens.Lehmann@web.de \
--cc=git@vger.kernel.org \
--cc=jacob.keller@gmail.com \
--cc=johannes.schindelin@gmail.com \
--cc=jrnieder@gmail.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
--cc=vlovich@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.