From: Johannes Sixt <j6t@kdbg.org>
To: Stefan Beller <sbeller@google.com>, git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, jrnieder@gmail.com,
johannes.schindelin@gmail.com, Jens.Lehmann@web.de,
ericsunshine@gmail.com
Subject: Re: [PATCH 6/8] run-command: add an asynchronous parallel child processor
Date: Mon, 14 Dec 2015 21:39:06 +0100
Message-ID: <566F28EA.3080802@kdbg.org>
In-Reply-To: <1450121838-7069-7-git-send-email-sbeller@google.com>
On 14.12.2015 at 20:37, Stefan Beller wrote:
> This allows running external commands in parallel with ordered output
> on stderr.
>
> If we run external commands in parallel we cannot pipe the output directly
> to our stdout/err as it would mix up. So each process's output will
> flow through a pipe, which we buffer. One subprocess can be directly
> piped to our stdout/err for low-latency feedback to the user.
>
> Example:
> Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
> different amount of time as the different submodules vary in size, then
> the output of fetches in sequential order might look like this:
>
> time -->
> output: |---A---| |-B-| |-------C-------| |-D-| |-E-|
>
> When we schedule these submodules into at most two parallel processes,
> a schedule and sample output over time may look like this:
>
> process 1: |---A---|   |-D-|   |-E-|
>
> process 2: |-B-|   |-------C-------|
>
> output:    |---A---|B|---C-------|DE
>
> So A will be perceived as if it were running normally in the single-child
> version. As B has finished by the time A is done, we can dump its whole
> progress buffer on stderr, such that it looks like it finished in no
> time. Once that is done, C is determined to be the visible child and
> its progress will be reported in real time.
>
> So this way of output is really good for human consumption, as it only
> changes the timing, not the actual output.
>
> For machine consumption the output needs to be prepared in the tasks,
> by either having a prefix per line or per block to indicate which task's
> output is displayed, because the output order may not follow the
> original sequential ordering:
>
> |----A----| |--B--| |-C-|
>
> will be scheduled to be all parallel:
>
> process 1: |----A----|
> process 2: |--B--|
> process 3: |-C-|
> output:    |----A----|CB
>
> This happens because C finished before B did, so it will be queued for
> output before B.
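
To make the ordering in this example concrete, here is a toy model of it.
It is only a sketch under simplifying assumptions: all tasks start at time 0
(no limit on parallelism), the visible child is always the lowest-numbered
task not yet printed, and output_order() and MAX_TASKS are hypothetical
names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 16 /* hypothetical limit for this sketch */

/*
 * Toy model: all n tasks start at time 0 and task i ends at finish[i].
 * Stores task indices in order[] in the sequence their output appears.
 */
static void output_order(const int *finish, size_t n, size_t *order)
{
	int emitted[MAX_TASKS] = {0};
	size_t count = 0;

	assert(n <= MAX_TASKS);
	while (count < n) {
		size_t owner = 0;

		/* The visible child is the first task not yet printed. */
		while (emitted[owner])
			owner++;
		order[count++] = owner;
		emitted[owner] = 1;

		/*
		 * Once the visible child is done, dump the buffers of all
		 * tasks that ended in the meantime, earliest finisher first.
		 */
		for (;;) {
			size_t best = n;

			for (size_t i = 0; i < n; i++) {
				if (emitted[i] || finish[i] > finish[owner])
					continue;
				if (best == n || finish[i] < finish[best])
					best = i;
			}
			if (best == n)
				break;
			order[count++] = best;
			emitted[best] = 1;
		}
	}
}
```

With finish times of 10, 6 and 3 for A, B and C, the model yields the
order A, C, B, which matches the diagram above.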
>
> The detection when a child has finished executing is done the same way as
> two fold. First we check regularly if the stderr pipe still exists in an
> interleaved manner with other actions such as checking other children
> for their liveliness or starting new children. Once a child closed their
> stderr stream, we assume it is stopping very soon, such that we can use
> the `finish_command` code borrowed from the single external process
> execution interface.

I can't quite parse the first sentence in this paragraph. Perhaps
something like this:

    To detect when a child has finished executing, we check interleaved
    with other actions (such as checking the liveliness of children or
    starting new processes) whether the stderr pipe still exists. Once a
    child closed its stderr stream, we assume it is terminating very soon,
    and use finish_command() from the single external process execution
    interface to collect the exit status.

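
For illustration, the detection scheme could be sketched roughly as below.
This is a POSIX-only approximation, not the code from the patch: fork() and
waitpid() on one specific pid stand in for start_command() and
finish_command(), and run_child_until_eof() is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: the child writes msg to its stderr and exits with
 * exit_code. The parent treats EOF on the pipe as "child is about to
 * terminate" and only then waits for that one pid.
 * Returns the child's exit status, or -1 on error.
 */
static int run_child_until_eof(const char *msg, size_t len, int exit_code,
			       size_t *bytes_seen)
{
	int fds[2], status;
	pid_t pid;

	assert(bytes_seen != NULL);
	*bytes_seen = 0;
	if (pipe(fds) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: stderr is the pipe and stays open until exit. */
		close(fds[0]);
		if (dup2(fds[1], STDERR_FILENO) < 0)
			_exit(127);
		close(fds[1]);
		if (write(STDERR_FILENO, msg, len) < 0)
			_exit(127);
		_exit(exit_code);
	}
	close(fds[1]); /* parent keeps only the read end */

	for (;;) {
		struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
		char buf[256];
		ssize_t n;

		/* A real loop would also start or check other children here. */
		if (poll(&pfd, 1, 100) < 0 && errno != EINTR)
			break;
		if (!pfd.revents)
			continue;
		n = read(fds[0], buf, sizeof(buf));
		if (n == 0)
			break; /* EOF: the child closed its stderr */
		if (n < 0 && errno != EINTR)
			break;
		if (n > 0)
			*bytes_seen += (size_t)n;
	}
	close(fds[0]);

	/* Wait for this specific child only, never waitpid(-1). */
	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_child_until_eof("hello\n", 6, 3, &seen) should return 3
and leave seen at 6. The point of the sketch is that the parent only ever
waits on a pid it already believes to be finishing.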
>
> By maintaining the strong assumption of stderr being open until the
> very end of a child process, we can avoid other hassle such as an
> implementation using `waitpid(-1)`, which is not implemented in Windows.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>