All of lore.kernel.org
 help / color / mirror / Atom feed
From: Emily Shaffer <emilyshaffer@google.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v7 12/17] hook: allow parallel hook execution
Date: Mon, 22 Feb 2021 13:46:51 -0800	[thread overview]
Message-ID: <YDQmS2qN/lrm547L@google.com> (raw)
In-Reply-To: <20210201060422.1313603-1-jonathantanmy@google.com>

On Sun, Jan 31, 2021 at 10:04:22PM -0800, Jonathan Tan wrote:
> 
> > In many cases, there's no reason not to allow hooks to execute in
> > parallel. run_processes_parallel() is well-suited - it's a task queue
> > that runs its housekeeping in series, which means users don't
> > need to worry about thread safety on their callback data. True
> > multithreaded execution with the async_* functions isn't necessary here.
> > Synchronous hook execution can be achieved by only allowing 1 job to run
> > at a time.
> > 
> > Teach run_hooks() to use that function for simple hooks which don't
> > require stdin or capture of stderr.
> 
> Which hooks would be run in parallel, and which hooks in series? I don't
> see code that distinguishes between them.

It's up to the caller, who can set run_hooks_opt.jobs. In part II of
this series I made a guess at which ones should run in parallel or in
series and specified it in Documentation/githooks.txt.

> 
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> > 
> > Notes:
> >     Per AEvar's request - parallel hook execution on day zero.
> >     
> >     In most ways run_processes_parallel() worked great for me - but it didn't
> >     have great support for hooks where we pipe to and from. I had to add this
> >     support later in the series.
> >     
> >     Since I modified an existing and in-use library I'd appreciate a keen look on
> >     these patches.
> 
> What is the existing and in-use library that you're modifying?

Hm, this note wasn't super specific. From this point onwards in the
series I make changes to the run-command.h:run_processes_parallel()
library, although not in this commit itself. I think I meant "from here
on out, help me look at run-command.h".

I'll try to make the note a little better next series, sorry for the
confusion :) :)

> 
> > @@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
> >  	strvec_clear(&o->args);
> >  }
> >  
> > +
> > +static int pick_next_hook(struct child_process *cp,
> > +			  struct strbuf *out,
> > +			  void *pp_cb,
> > +			  void **pp_task_cb)
> > +{
> > +	struct hook_cb_data *hook_cb = pp_cb;
> > +
> > +	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
> > +
> > +	if (hook_cb->head == hook_cb->run_me)
> > +		return 0;
> > +
> > +	cp->env = hook_cb->options->env.v;
> > +	cp->stdout_to_stderr = 1;
> > +	cp->trace2_hook_name = hook->command.buf;
> > +
> > +	/* reopen the file for stdin; run_command closes it. */
> > +	if (hook_cb->options->path_to_stdin) {
> > +		cp->no_stdin = 0;
> > +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
> > +	} else {
> > +		cp->no_stdin = 1;
> > +	}
> > +
> > +	/*
> > +	 * Commands from the config could be oneliners, but we know
> > +	 * for certain that hookdir commands are not.
> > +	 */
> > +	if (hook->from_hookdir)
> > +		cp->use_shell = 0;
> > +	else
> > +		cp->use_shell = 1;
> > +
> > +	/* add command */
> > +	strvec_push(&cp->args, hook->command.buf);
> > +
> > +	/*
> > +	 * add passed-in argv, without expanding - let the user get back
> > +	 * exactly what they put in
> > +	 */
> > +	strvec_pushv(&cp->args, hook_cb->options->args.v);
> 
> I just skimmed over this setup-process-for-hook part - it would have
> been much clearer if it was refactored into its own function before this
> patch (or better yet, written as its own function in the first place).
> As it is, there are some unnecessary rewritings - e.g. setting stdin
> after env, and the use_shell setup.

Yeah, that makes sense. Will see if I can change it for next round :)

> 
> > diff --git a/hook.h b/hook.h
> 
> [snip]
> 
> > +/*
> > + * Callback provided to feed_pipe_fn and consume_sideband_fn.
> > + */
> > +struct hook_cb_data {
> > +	int rc;
> > +	struct list_head *head;
> > +	struct list_head *run_me;
> > +	struct run_hooks_opt *options;
> > +};
> 
> Could this be in hook.c instead?

It ends up being needed publicly by
https://lore.kernel.org/git/20201222000435.1529768-17-emilyshaffer@google.com
(receive-pack: convert receive hooks to hook.h), which writes its own
stdin provider callback. (In a later commit, a "void* options" gets
added to this struct.)

At that point it's needed because the run-command callback structure can
provide one context pointer for the overall work queue, and one context
pointer for the individual task; this one is the "overall work queue"
pointer.

From hook.h's perspective, the entire hook_cb_data is needed for
pick_next_hook; but run-command.h:run_processes_parallel() doesn't have
a way to tease out a smaller amount of the context pointer for various
callbacks. If we wanted to obfuscate "hook_cb_data" we'd need to add
another indirection and call back to hook.h first, who could then tease
out the client-provided context and then call the client callback, but
to me it sounds unnecessarily complex.

> 
> Also, I think it's clearer if run_me was a struct hook, and set to NULL
> when iteration reaches the end. If you disagree, I think it needs some
> documentation (e.g. "the embedded linked list part of the hook that must
> be run next; if equal to head, then iteration has ended" or something
> like that).

Yeah, I don't see a huge reason not to do that, sure.

> 
> > +#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
> >  	.env = STRVEC_INIT, 			\
> >  	.args = STRVEC_INIT, 			\
> >  	.path_to_stdin = NULL,			\
> > +	.jobs = 1,				\
> >  	.run_hookdir = configured_hookdir_opt()	\
> >  }
> 
> This is not used anywhere.

It is used in part II by hooks which are not able to be parallelized.

 - Emily

  reply	other threads:[~2021-02-22 21:47 UTC|newest]

Thread overview: 170+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
2020-05-22 10:13   ` Phillip Wood
2020-06-09 20:26     ` Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
2020-05-22 10:27   ` Phillip Wood
2020-06-09 21:49     ` Emily Shaffer
2020-08-17 13:36       ` Phillip Wood
2020-05-24 23:00   ` Johannes Schindelin
2020-05-27 23:37     ` Emily Shaffer
2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
2020-05-24 23:00   ` Johannes Schindelin
2020-05-25  0:29     ` Johannes Schindelin
2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
2020-07-29 19:33     ` Junio C Hamano
2020-07-30 23:41       ` Junio C Hamano
2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
2020-09-23 22:59       ` Jonathan Tan
2020-09-24 21:54         ` Emily Shaffer
2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
2020-10-22  0:58         ` Emily Shaffer
2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
2020-10-29 15:38             ` Emily Shaffer
2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-10-05 23:24       ` Jonathan Nieder
2020-10-06 19:06         ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
2020-09-11 13:27       ` Phillip Wood
2020-09-11 16:51         ` Emily Shaffer
2020-09-23 23:04       ` Jonathan Tan
2020-10-06 20:46         ` Emily Shaffer
2020-09-27 19:23       ` Martin Ågren
2020-10-06 20:20         ` Emily Shaffer
2020-10-05 23:27       ` Jonathan Nieder
2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
2020-09-28 19:29       ` Josh Steadmon
2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
2020-10-05 23:30       ` Jonathan Nieder
2020-10-06  4:49         ` Junio C Hamano
2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
2020-09-11 13:30       ` Phillip Wood
2020-09-28 19:29       ` Josh Steadmon
2020-10-05 23:39       ` Jonathan Nieder
2020-10-06 22:57         ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
2020-09-09 20:32       ` Junio C Hamano
2020-09-10 19:08         ` Emily Shaffer
2020-09-23 23:20       ` Jonathan Tan
2020-10-05 23:42       ` Jonathan Nieder
2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
2020-09-10 13:50       ` Phillip Wood
2020-09-10 22:21         ` Junio C Hamano
2020-09-23 23:47       ` Jonathan Tan
2020-10-05 21:27         ` Emily Shaffer
2020-10-05 23:48           ` Jonathan Nieder
2020-10-06 19:08             ` Emily Shaffer
2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
2020-09-10 14:16       ` Phillip Wood
2020-09-11 13:20         ` Phillip Wood
2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
2020-10-16 17:29           ` Junio C Hamano
2020-10-21 23:37           ` Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
2020-12-11 10:15           ` Phillip Wood
2020-12-15 21:41             ` Emily Shaffer
2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
2020-12-08 21:09           ` SZEDER Gábor
2020-12-08 22:11             ` Emily Shaffer
2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
2020-12-16  0:56           ` Junio C Hamano
2020-12-16 20:16             ` Emily Shaffer
2020-12-16 23:32               ` Junio C Hamano
2020-12-18  2:07                 ` Emily Shaffer
2020-12-18  5:29                   ` Junio C Hamano
2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
2021-01-29 23:52               ` Emily Shaffer
2021-02-01 22:11             ` Junio C Hamano
2021-03-10 19:30               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
2021-01-31  3:10             ` Jonathan Tan
2021-02-09 21:06               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
2021-01-31  3:20             ` Jonathan Tan
2021-02-09 22:05               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
2021-01-31  3:35             ` Jonathan Tan
2021-02-09 22:31               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
2021-01-31  3:40             ` Jonathan Tan
2021-02-09 22:57               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 07/17] parse-options: parse into strvec Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 08/17] hook: add 'run' subcommand Emily Shaffer
2021-01-31  4:22             ` Jonathan Tan
2021-02-11 22:44               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
2021-01-31  4:39             ` Jonathan Tan
2021-02-12 22:15               ` Emily Shaffer
2021-02-18 22:23               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 10/17] hook: support passing stdin to hooks Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
2021-02-01  5:38             ` Jonathan Tan
2021-02-19 20:23               ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 12/17] hook: allow parallel hook execution Emily Shaffer
2021-02-01  6:04             ` Jonathan Tan
2021-02-22 21:46               ` Emily Shaffer [this message]
2020-12-22  0:02           ` [PATCH v7 13/17] hook: allow specifying working directory for hooks Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 14/17] run-command: add stdin callback for parallelization Emily Shaffer
2021-02-01  6:51             ` Jonathan Tan
2021-02-22 23:38               ` Emily Shaffer
2021-02-23 19:33                 ` Jonathan Tan
2021-03-10 18:24                   ` Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 15/17] hook: provide stdin by string_list or callback Emily Shaffer
2021-02-01  7:04             ` Jonathan Tan
2021-02-23 19:52               ` Emily Shaffer
2021-02-25 20:56                 ` Jonathan Tan
2021-03-02  1:47                   ` Emily Shaffer
2021-03-02 23:33                     ` Jonathan Tan
2020-12-22  0:02           ` [PATCH v7 16/17] run-command: allow capturing of collated output Emily Shaffer
2020-12-22  0:02           ` [PATCH v7 17/17] hooks: allow callers to capture output Emily Shaffer
2020-12-22  2:11           ` [PATCH v7 00/17] propose config-based hooks (part I) Junio C Hamano
2020-12-28 18:34             ` Emily Shaffer
2020-12-28 22:50               ` Junio C Hamano
2020-12-28 22:37           ` [PATCH v3 18/17] doc: make git-hook.txt point of truth Emily Shaffer
2020-12-28 22:39             ` Emily Shaffer
2021-01-29 23:59           ` [PATCH v7 00/17] propose config-based hooks (part I) Emily Shaffer
2021-02-16 19:46           ` Josh Steadmon
2021-02-16 22:47             ` Junio C Hamano
2021-02-17 21:21               ` Josh Steadmon
2021-02-17 23:07                 ` Junio C Hamano
2021-02-25 19:50                   ` Junio C Hamano
2021-03-01 21:51                     ` Emily Shaffer
2021-03-01 22:19                       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDQmS2qN/lrm547L@google.com \
    --to=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.