From: Johannes Sixt <j6t@kdbg.org>
To: Stefan Beller <sbeller@google.com>, git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, jrnieder@gmail.com,
	johannes.schindelin@gmail.com, Jens.Lehmann@web.de,
	ericsunshine@gmail.com
Subject: Re: [PATCH 6/8] run-command: add an asynchronous parallel child processor
Date: Mon, 14 Dec 2015 21:39:06 +0100
Message-ID: <566F28EA.3080802@kdbg.org>
In-Reply-To: <1450121838-7069-7-git-send-email-sbeller@google.com>

On 14.12.2015 at 20:37, Stefan Beller wrote:
> This allows running external commands in parallel with ordered output
> on stderr.
>
> If we run external commands in parallel, we cannot pipe their output
> directly to our stdout/stderr, as it would get mixed up. So each
> process's output will flow through a pipe, which we buffer. One
> subprocess can be piped directly to our stdout/stderr for low-latency
> feedback to the user.
>
> Example:
> Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
> different amount of time as the different submodules vary in size, then
> the output of fetches in sequential order might look like this:
>
>   time -->
>   output: |---A---| |-B-| |-------C-------| |-D-| |-E-|
>
> When we schedule these submodules onto at most two parallel processes,
> a schedule and sample output over time may look like this:
>
> process 1: |---A---| |-D-| |-E-|
>
> process 2: |-B-| |-------C-------|
>
> output:    |---A---|B|---C-------|DE
>
> So A will be perceived as if it were running normally in the
> single-child version. As B has finished by the time A is done, we can
> dump its whole progress buffer on stderr, so that it looks like it
> finished in no time. Once that is done, C is determined to be the
> visible child and its progress will be reported in real time.
>
> So this way of output is really good for human consumption, as it only
> changes the timing, not the actual output.
>
> For machine consumption the output needs to be prepared in the tasks,
> by having a prefix either per line or per block to indicate which
> task's output is displayed, because the output order may not follow the
> original sequential ordering:
>
>   |----A----| |--B--| |-C-|
>
> will be scheduled to be all parallel:
>
> process 1: |----A----|
> process 2: |--B--|
> process 3: |-C-|
> output:    |----A----|CB
>
> This happens because C finished before B did, so it will be queued for
> output before B.
>
> The detection when a child has finished executing is done the same way as
> two fold. First we check regularly if the stderr pipe still exists in an
> interleaved manner with other actions such as checking other children
> for their liveliness or starting new children. Once a child closed their
> stderr stream, we assume it is stopping very soon, such that we can use
> the `finish_command` code borrowed from the single external process
> execution interface.

I can't quite parse the first sentence in this paragraph. Perhaps 
something like this:

To detect when a child has finished executing, we check interleaved
with other actions (such as checking the liveness of children or
starting new processes) whether the stderr pipe still exists. Once a
child has closed its stderr stream, we assume it is terminating very
soon, and use finish_command() from the single external process
execution interface to collect the exit status.
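
For illustration only, here is a minimal sketch of that detection loop.
It is written in Python rather than C and is not the patch's code:
run_children() is a hypothetical helper, proc.wait() merely stands in
for finish_command(), and the sketch assumes a POSIX system where pipes
can be polled.

```python
import os
import selectors
import subprocess
import sys


def run_children(commands):
    """Run all commands in parallel.

    Returns {index: (exit_code, stderr_bytes)}. Hypothetical helper,
    not part of Git.
    """
    sel = selectors.DefaultSelector()
    buffers = {}
    results = {}

    for index, argv in enumerate(commands):
        proc = subprocess.Popen(argv, stderr=subprocess.PIPE)
        buffers[index] = bytearray()
        sel.register(proc.stderr, selectors.EVENT_READ, (index, proc))

    # Keep polling until every stderr pipe has been unregistered.
    while sel.get_map():
        for key, _ in sel.select(timeout=0.1):
            index, proc = key.data
            chunk = os.read(key.fd, 4096)
            if chunk:
                # Child is still talking: buffer its output.
                buffers[index] += chunk
                continue
            # Empty read means the child closed stderr. Assume it is
            # about to exit and wait for this specific child only, so
            # no waitpid(-1)-style "any child" call is needed.
            sel.unregister(key.fileobj)
            key.fileobj.close()
            results[index] = (proc.wait(), bytes(buffers[index]))

    return results
```

Calling run_children() with a list of argv lists (for example two
invocations of sys.executable with "-c") yields each child's exit code
together with everything it wrote to stderr. Note that the wait only
happens after end-of-file on the pipe, which is exactly the "strong
assumption" the commit message relies on.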

>
> By maintaining the strong assumption of stderr being open until the
> very end of a child process, we can avoid other hassles such as an
> implementation using `waitpid(-1)`, which is not implemented on Windows.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
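
To make the ordering in the two examples of the quoted message
concrete, here is a small simulation in Python. It is only a sketch
under stated assumptions: simulate_output_order() is a hypothetical
name, the durations used with it are made-up numbers chosen to match
the relative bar lengths in the diagrams, and picking the
earliest-started running child as the next visible one is my
assumption; the patch may choose differently.

```python
import heapq


def simulate_output_order(durations, max_jobs):
    """Return task names in the order their output reaches the terminal.

    durations: dict mapping task name to run time, in submission order.
    max_jobs:  maximum number of children running at once.
    """
    pending = list(durations.items())
    running = []        # heap of (finish_time, start_seq, name)
    buffered_done = []  # finished non-visible children, in finish order
    order = []
    visible = None      # child whose output is streamed live
    now = 0
    seq = 0

    def start_children():
        nonlocal seq
        while pending and len(running) < max_jobs:
            name, duration = pending.pop(0)
            heapq.heappush(running, (now + duration, seq, name))
            seq += 1

    start_children()
    while running:
        if visible is None:
            # Assumption: the running child that started first becomes
            # the visible one.
            visible = min(running, key=lambda child: child[1])[2]
        now, _, name = heapq.heappop(running)
        if name == visible:
            # The live output ends here; then dump the buffers of all
            # children that finished in the meantime.
            order.append(name)
            order.extend(buffered_done)
            buffered_done.clear()
            visible = None
        else:
            buffered_done.append(name)
        start_children()
    return order
```

With durations such as A=9, B=5, C=17, D=5, E=5 and two jobs, this
reproduces the A, B, C, D, E order of the first diagram; with A=10,
B=6, C=4 and three jobs, C is queued before B, as in the second one.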
