public inbox for git@vger.kernel.org
From: Patrick Steinhardt <ps@pks.im>
To: Adrian Ratiu <adrian.ratiu@collabora.com>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Emily Shaffer" <emilyshaffer@google.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Josh Steadmon" <steadmon@google.com>,
	"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH 2/4] hook: allow parallel hook execution
Date: Wed, 11 Feb 2026 13:41:43 +0100
Message-ID: <aYx5B-nf4dlFpw3v@pks.im>
In-Reply-To: <20260204173328.1601807-3-adrian.ratiu@collabora.com>

On Wed, Feb 04, 2026 at 07:33:26PM +0200, Adrian Ratiu wrote:
> From: Emily Shaffer <emilyshaffer@google.com>
> 
> In many cases, there's no reason not to allow hooks to execute in
> parallel, if more than one was provided.
> 
> hook.c already calls run_processes_parallel() so all we need to do is
> allow its job count to be greater than 1.
> 
> Serial execution is achieved by setting .jobs == 1 at compile time via
> RUN_HOOKS_OPT_INIT_SERIAL or by setting the 'hook.jobs' config to 1.
> This matches the behavior prior to this commit.
> 
> The compile-time 'struct run_hooks_opt.jobs' parameter has the highest
> priority if non-zero, followed by the 'hook.jobs' user config, then the
> processor count from online_cpus() is the last fallback.

Wait, the compile-time parameter overrides the user configuration? That
doesn't seem right to me.

I'm also a bit sceptical whether we should really default to
`online_cpus()`. If we do, we start to assume semantics of the hooks
themselves, namely that they cannot conflict with one another. But this
is not something we can guarantee. Multiple hooks may want to modify the
same data structure, and running them in parallel would then lead to
races.

So I wonder whether we should make this behaviour opt-in rather than
opt-out.

> The above ordering ensures hooks unsafe to run in parallel are always
> executed sequentially (RUN_HOOKS_OPT_INIT_SERIAL) while allowing users
> to control parallelism with an efficient default.

Ah, okay, we only let the compile-time parameter override the config in
case we know that hooks must run in serial. That makes a bit more sense.
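
To spell out the ordering as I read it from the commit message, here is
a minimal sketch. The function name and the convention that 0 means
"not set" are assumptions for illustration, not code from the patch, and
`ncpus` stands in for the result of online_cpus():

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch. Resolves the effective
 * job count from the three sources named in the commit message. A value
 * of 0 is assumed to mean "not set" for opt_jobs and config_jobs.
 */
unsigned int resolve_hook_jobs(unsigned int opt_jobs,
			       unsigned int config_jobs,
			       unsigned int ncpus)
{
	if (opt_jobs)		/* e.g. 1 via RUN_HOOKS_OPT_INIT_SERIAL */
		return opt_jobs;
	if (config_jobs)	/* the user's hook.jobs setting */
		return config_jobs;
	return ncpus ? ncpus : 1; /* last fallback: processor count */
}
```

With this shape, a caller that passes 1 as `opt_jobs` always gets serial
execution regardless of the config. The opt-in alternative mentioned
above would amount to returning 1 instead of `ncpus` in the last line.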

> diff --git a/Documentation/config/hook.adoc b/Documentation/config/hook.adoc
> index 49c7ffd82e..c394756328 100644
> --- a/Documentation/config/hook.adoc
> +++ b/Documentation/config/hook.adoc
> @@ -15,3 +15,8 @@ hook.<name>.event::
>  	On the specified event, the associated `hook.<name>.command` will be
>  	executed. More than one event can be specified if you wish for
>  	`hook.<name>` to execute on multiple events. See linkgit:git-hook[1].
> +
> +hook.jobs::
> +	Specifies how many hooks can be run simultaneously during parallelized
> +	hook execution. If unspecified, defaults to the number of processors on
> +	the current system.

We should probably note that some hooks will run sequentially regardless
of this setting. Maybe we should even document which ones? I expect it's
not going to be that many.

> diff --git a/Documentation/git-hook.adoc b/Documentation/git-hook.adoc
> index 5f339dc48b..72c6c6d1ee 100644
> --- a/Documentation/git-hook.adoc
> +++ b/Documentation/git-hook.adoc
> @@ -128,6 +129,16 @@ OPTIONS
>  	tools that want to do a blind one-shot run of a hook that may
>  	or may not be present.
>  
> +-j::
> +--jobs::
> +	Only valid for `run`.
> ++
> +Specify how many hooks to run simultaneously. If this flag is not specified,
> +the value of the `hook.jobs` config is used, see linkgit:git-config[1]. If the
> +config is not specified, the number of CPUs on the current system is used. Some
> +hooks may be ineligible for parallelization: for example, 'commit-msg' hooks
> +typically modify the commit message body and cannot be parallelized.

Yeah, this info is probably what I was searching for in the "hook.jobs"
description.

> diff --git a/builtin/hook.c b/builtin/hook.c
> index 4cc6dac45a..cd1f4ebe6a 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -76,7 +77,7 @@ static int run(int argc, const char **argv, const char *prefix,
>  	       struct repository *repo UNUSED)
>  {
>  	int i;
> -	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
> +	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_PARALLEL;
>  	int ignore_missing = 0;
>  	const char *hook_name;
>  	struct option run_options[] = {

Hm. Assuming that the user executes `git hook run prepare-commit-msg`
with "--jobs=2", should we really honor that request? We know that the
hook cannot run in parallel, so we might want to refuse such requests.
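
A rough sketch of what such a check could look like, assuming a fixed
list of serial-only hook names. Both the list and the function name are
made up for illustration. Only 'commit-msg' and "prepare-commit-msg" are
mentioned in this thread, so the real set may well differ:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list: only these two names come up in this thread. */
static const char *const serial_only_hooks[] = {
	"commit-msg",
	"prepare-commit-msg",
};

/*
 * Hypothetical check, not part of the patch. Returns 1 if the requested
 * job count is acceptable for the given hook and 0 if the request should
 * be refused because the hook is known to require serial execution.
 */
int hook_jobs_request_ok(const char *hook_name, unsigned int requested_jobs)
{
	size_t i;
	size_t nr = sizeof(serial_only_hooks) / sizeof(serial_only_hooks[0]);

	if (requested_jobs <= 1)
		return 1;	/* serial requests are always fine */
	for (i = 0; i < nr; i++)
		if (!strcmp(hook_name, serial_only_hooks[i]))
			return 0;
	return 1;
}
```

The caller in `run()` could then die with an error instead of silently
downgrading the job count, so that the user notices the conflict.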

Taking a step back, I wonder whether it really is sensible to declare
complete classes of hooks as parallelizable or non-parallelizable. We
have to assume semantics of the hook scripts themselves to be able to
answer whether or not they can be parallelized. For some classes of
hooks like "prepare-commit-msg" we can assume that it's almost never
correct to parallelize them. But for others we cannot assume anything.

Which makes me wonder whether the design here is really the right one.
Shouldn't we stop worrying about classes of hooks, but rather worry
about the user's intent? The user will know whether two hooks can run in
parallel or not, so let them tell us that this is the case.

I think this could be achieved via the configuration:

    [hook "my-parallelizable-hook-a"]
    path = /some/script-a.sh
    parallel = true

    [hook "my-parallelizable-hook-b"]
    path = /some/script-b.sh
    parallel = true

    [hook "serial-hook"]
    path = /some/script-c.sh
    parallel = false

This would tell us that we can safely run two of the hooks in parallel,
but not the third one. So we'd first execute all serial hooks one after
another, and then in a second phase we'd execute the remaining hooks in
parallel.
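
The two phases could be planned with a simple stable partition. This is
only a sketch under the assumption that each configured hook carries the
proposed `parallel` flag. The struct and function names are hypothetical
and do not exist in hook.c:

```c
#include <stddef.h>

/* Hypothetical representation of one configured hook. */
struct hook_entry {
	const char *name;
	int parallel;	/* value of the proposed hook.<name>.parallel key */
};

/*
 * Writes pointers to the serial hooks into `out` first, followed by the
 * parallel ones, keeping configuration order within each group. `out`
 * must have room for `nr` pointers. Returns the number of serial hooks,
 * which is also the index in `out` where the parallel phase starts.
 */
size_t plan_hook_phases(const struct hook_entry *hooks, size_t nr,
			const struct hook_entry **out)
{
	size_t i, pos = 0, nr_serial;

	for (i = 0; i < nr; i++)
		if (!hooks[i].parallel)
			out[pos++] = &hooks[i];
	nr_serial = pos;
	for (i = 0; i < nr; i++)
		if (hooks[i].parallel)
			out[pos++] = &hooks[i];
	return nr_serial;
}
```

Entries before the returned index would be run one at a time, and the
rest could be handed to run_processes_parallel() with a job count
greater than 1.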

Sure, this puts more responsibility on the user. But I think this is a
more flexible approach as it also empowers the user and caters to more
use cases.

Please let me know what you think.

Thanks!

Patrick
