From: Johannes Sixt <j6t@kdbg.org>
To: Stefan Beller <sbeller@google.com>, git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, jrnieder@gmail.com,
	johannes.schindelin@gmail.com, Jens.Lehmann@web.de,
	ericsunshine@gmail.com
Subject: Re: [PATCH 6/8] run-command: add an asynchronous parallel child processor
Date: Mon, 14 Dec 2015 21:39:06 +0100
Message-ID: <566F28EA.3080802@kdbg.org>
In-Reply-To: <1450121838-7069-7-git-send-email-sbeller@google.com>
On 14.12.2015 at 20:37, Stefan Beller wrote:
> This allows to run external commands in parallel with ordered output
> on stderr.
>
> If we run external commands in parallel we cannot pipe the output directly
> to our stdout/err as it would mix up. So each process's output will
> flow through a pipe, which we buffer. One subprocess can be directly
> piped to our stdout/err for a low latency feedback to the user.
>
> Example:
> Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
> different amount of time as the different submodules vary in size, then
> the output of fetches in sequential order might look like this:
>
> time -->
> output: |---A---| |-B-| |-------C-------| |-D-| |-E-|
>
> When we schedule these submodules into maximal two parallel processes,
> a schedule and sample output over time may look like this:
>
> process 1: |---A---| |-D-| |-E-|
>
> process 2: |-B-| |-------C-------|
>
> output:    |---A---|B|---C-------|DE
>
> So A will be perceived as it would run normally in the single child
> version. As B has finished by the time A is done, we can dump its whole
> progress buffer on stderr, such that it looks like it finished in no
> time. Once that is done, C is determined to be the visible child and
> its progress will be reported in real time.
>
> So this way of output is really good for human consumption, as it only
> changes the timing, not the actual output.
>
> For machine consumption the output needs to be prepared in the tasks,
> by either having a prefix per line or per block to indicate which task's
> output is displayed, because the output order may not follow the
> original sequential ordering:
>
> |----A----| |--B--| |-C-|
>
> will be scheduled to be all parallel:
>
> process 1: |----A----|
> process 2: |--B--|
> process 3: |-C-|
> output:    |----A----|CB
>
> This happens because C finished before B did, so it will be queued for
> output before B.
>
> The detection when a child has finished executing is done the same way as
> two fold. First we check regularly if the stderr pipe still exists in an
> interleaved manner with other actions such as checking other children
> for their liveliness or starting new children. Once a child closed their
> stderr stream, we assume it is stopping very soon, such that we can use
> the `finish_command` code borrowed from the single external process
> execution interface.
I can't quite parse the first sentence in this paragraph. Perhaps
something like this:

    To detect when a child has finished executing, we check whether its
    stderr pipe still exists, interleaved with other actions (such as
    checking the liveliness of other children or starting new processes).
    Once a child has closed its stderr stream, we assume it is terminating
    very soon, and use finish_command() from the single external process
    execution interface to collect the exit status.
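
To make the detection scheme concrete, here is a rough sketch in Python
(not the C code from the patch). It treats EOF on a child's stderr pipe as
the signal that the child is about to exit, and only then waits for the
exit status. The name run_parallel() and its parameters are made up for
illustration, and subprocess.Popen.wait() merely stands in for
finish_command(). It assumes a POSIX system, where pipes can be polled.

```python
import os
import selectors
import subprocess
import sys


def run_parallel(commands, max_jobs=2):
    """Run argv lists with at most max_jobs children at a time.

    Returns (index, exit_code, stderr_bytes) tuples in the order the
    children finished. Hypothetical helper, not git's run-command API.
    """
    sel = selectors.DefaultSelector()
    pending = list(enumerate(commands))
    results = []
    running = 0

    while pending or running:
        # Start new children while there are free slots.
        while pending and running < max_jobs:
            index, argv = pending.pop(0)
            child = subprocess.Popen(argv, stderr=subprocess.PIPE)
            data = (index, child, bytearray())
            sel.register(child.stderr, selectors.EVENT_READ, data)
            running += 1

        # Block until some stderr pipe has data or has reached EOF.
        for key, _ in sel.select():
            index, child, buf = key.data
            chunk = os.read(key.fd, 4096)
            if chunk:
                buf.extend(chunk)  # buffer the child's output
                continue
            # EOF: the child closed stderr, so we assume it terminates
            # very soon and collect its exit status now.
            sel.unregister(key.fileobj)
            key.fileobj.close()
            results.append((index, child.wait(), bytes(buf)))
            running -= 1
    return results


if __name__ == "__main__":
    demo = [
        [sys.executable, "-c", "import sys; sys.stderr.write('one')"],
        [sys.executable, "-c", "import sys; sys.exit(2)"],
    ]
    for index, code, err in run_parallel(demo):
        print(index, code, err)
```

Note that no wait-for-any-child call is needed here: the only blocking
wait is on a specific child whose stderr has already hit EOF.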
>
> By maintaining the strong assumption of stderr being open until the
> very end of a child process, we can avoid other hassle such as an
> implementation using `waitpid(-1)`, which is not implemented in Windows.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
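
The output ordering in the two examples of the commit message can be
reproduced with a small simulation. This is only a simplified model in
Python, not the patch's logic: the durations are invented to match the
diagrams, tasks are assigned greedily to whichever slot frees up first,
and the next visible child is assumed to be the earliest-started task that
has not been printed yet (the commit message does not say how it is
chosen). The name output_order() is hypothetical.

```python
import heapq


def output_order(tasks, max_jobs):
    """tasks: list of (name, duration). Returns names in output order."""
    # Greedy schedule: each task starts on the slot that frees up first.
    slots = [(0, slot) for slot in range(max_jobs)]
    heapq.heapify(slots)
    spans = []  # (start, end, name), in submission order
    for name, duration in tasks:
        start, slot = heapq.heappop(slots)
        spans.append((start, start + duration, name))
        heapq.heappush(slots, (start + duration, slot))

    remaining = list(spans)
    order = []
    while remaining:
        # Visible child: earliest-started task not yet printed
        # (an assumption of this model). Its output is shown live.
        owner = min(remaining, key=lambda span: span[0])
        remaining.remove(owner)
        order.append(owner[2])
        now = owner[1]
        # Children that finished in the meantime have their buffers
        # dumped in the order they finished.
        done = sorted((s for s in remaining if s[1] <= now),
                      key=lambda span: span[1])
        for span in done:
            remaining.remove(span)
            order.append(span[2])
    return order


if __name__ == "__main__":
    # Five submodules on two processes, as in the first diagram.
    print(output_order(
        [("A", 8), ("B", 4), ("C", 16), ("D", 4), ("E", 4)], 2))
    # Three tasks all in parallel, as in the second diagram.
    print(output_order([("A", 10), ("B", 6), ("C", 4)], 3))
```

With these made-up durations the model yields A, B, C, D, E for the first
case and A, C, B for the second, matching the "output:" lines above.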