From: Patrick Steinhardt <ps@pks.im>
To: Adrian Ratiu <adrian.ratiu@collabora.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Emily Shaffer <emilyshaffer@google.com>,
Junio C Hamano <gitster@pobox.com>,
Josh Steadmon <steadmon@google.com>,
Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Subject: Re: [PATCH 1/4] hook: run a list of hooks
Date: Mon, 9 Feb 2026 15:27:57 +0100 [thread overview]
Message-ID: <aYnu7e_hbSBwu9_N@pks.im> (raw)
In-Reply-To: <20260204165126.1548805-2-adrian.ratiu@collabora.com>
On Wed, Feb 04, 2026 at 06:51:23PM +0200, Adrian Ratiu wrote:
> From: Emily Shaffer <emilyshaffer@google.com>
>
> Teach hook.[hc] to run lists of hooks to prepare for multihook support.
>
> Currently, the hook list contains only one entry representing the
> "legacy" hook from the hookdir, but next commits will allow users
> to supply more than one executable/command for a single hook event
> in addition to these default "legacy" hooks.
>
> All hook commands still run sequentially. A further patch series will
> enable running them in parallel by increasing .jobs > 1 where possible.
>
> Each hook command requires its own internal state copy, even when
> running sequentially, so add an API to allow hooks to duplicate/free
> their internal void *pp_task_cb state.
I think the order in this commit message is a bit off. We typically
first state the problem that we aim to solve before presenting the
solution.
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index b5379a4895..72fde2207c 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -849,7 +849,8 @@ struct receive_hook_feed_state {
>
> static int feed_receive_hook_cb(int hook_stdin_fd, void *pp_cb UNUSED, void *pp_task_cb)
> {
> - struct receive_hook_feed_state *state = pp_task_cb;
> + struct string_list_item *h = pp_task_cb;
> + struct receive_hook_feed_state *state = h->util;
> struct command *cmd = state->cmd;
>
> strbuf_reset(&state->buf);
> @@ -901,6 +902,24 @@ static int feed_receive_hook_cb(int hook_stdin_fd, void *pp_cb UNUSED, void *pp_
> return state->cmd ? 0 : 1; /* 0 = more to come, 1 = EOF */
> }
>
> +static void *copy_receive_hook_feed_state(const void *data)
Nit: our coding guidelines say that the name of the struct should come
first. So this would be `receive_hook_feed_state_copy()`.
> +{
> + const struct receive_hook_feed_state *orig = data;
> + struct receive_hook_feed_state *new_data = xmalloc(sizeof(*new_data));
> + memcpy(new_data, orig, sizeof(*new_data));
> + strbuf_init(&new_data->buf, 0);
> + return new_data;
> +}
> +
> +static void free_receive_hook_feed_state(void *data)
Likewise, this would be `receive_hook_feed_state_free()`.
> diff --git a/hook.c b/hook.c
> index cde7198412..fb90f91f3b 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -47,9 +47,49 @@ const char *find_hook(struct repository *r, const char *name)
> return path.buf;
> }
>
> +/*
> + * Provides a list of hook commands to run for the 'hookname' event.
> + *
> + * This function consolidates hooks from two sources:
> + * 1. The config-based hooks (not yet implemented).
> + * 2. The "traditional" hook found in the repository hooks directory
> + * (e.g., .git/hooks/pre-commit).
> + *
> + * The list is ordered by execution priority.
> + *
> + * The caller is responsible for freeing the memory of the returned list
> + * using string_list_clear() and free().
> + */
> +static struct string_list *list_hooks(struct repository *r, const char *hookname)
> +{
> + struct string_list *hook_head;
> +
> + if (!hookname)
> + BUG("null hookname was provided to hook_list()!");
> +
> + hook_head = xmalloc(sizeof(struct string_list));
> + string_list_init_dup(hook_head);
> +
> + /*
> + * Add the default hook from hookdir. It does not have a friendly name
> + * like the hooks specified via configs, so add it with an empty name.
> + */
> + if (r->gitdir && find_hook(r, hookname))
> + string_list_append(hook_head, "");
Why is there a check for `r->gitdir` here? Do we ever execute hooks
outside of a fully-initialized repository?
Other than that, we now insert hooks into the list. It's somewhat
surprising that we insert hook "names" here, instead of for example
adding the full hook path to the list. Is there any specific reason for
this decision?
> + return hook_head;
> +}
> +
> int hook_exists(struct repository *r, const char *name)
> {
> - return !!find_hook(r, name);
> + int exists = 0;
> + struct string_list *hooks = list_hooks(r, name);
> +
> + exists = hooks->nr > 0;
> +
> + string_list_clear(hooks, 1);
> + free(hooks);
> + return exists;
> }
>
> static int pick_next_hook(struct child_process *cp,
And here we verify that there's at least one such hook that we found.
Which would currently mean that the hook in ".git/hooks/" exists.
> @@ -58,10 +98,11 @@ static int pick_next_hook(struct child_process *cp,
> void **pp_task_cb)
> {
> struct hook_cb_data *hook_cb = pp_cb;
> - const char *hook_path = hook_cb->hook_path;
> + struct string_list *hook_list = hook_cb->hook_command_list;
> + struct string_list_item *to_run = hook_cb->next_hook++;
>
> - if (!hook_path)
> - return 0;
> + if (!to_run || to_run >= hook_list->items + hook_list->nr)
> + return 0; /* no hook left to run */
Hm, okay. Wouldn't it be a bit more ergonomic to instead store the index
of the current hook instead of storing the string list item itself? In
that case we could verify whether `hook_cb->hook_cur < hook_list.nr` or
something like this.
> @@ -85,33 +126,50 @@ static int pick_next_hook(struct child_process *cp,
> cp->trace2_hook_name = hook_cb->hook_name;
> cp->dir = hook_cb->options->dir;
>
> - strvec_push(&cp->args, hook_path);
> + /* find hook commands */
> + if (!*to_run->string) {
> + /* ...from hookdir signified by empty name */
> + const char *hook_path = find_hook(hook_cb->repository,
> + hook_cb->hook_name);
> + if (!hook_path)
> + BUG("hookdir in hook list but no hook present in filesystem");
> +
> + if (hook_cb->options->dir)
> + hook_path = absolute_path(hook_path);
> +
> + strvec_push(&cp->args, hook_path);
> + }
We could avoid this special casing if we stored the absolute paths in
the hook list right away. But maybe there is a good reason you don't.
> + if (!cp->args.nr)
> + BUG("configured hook must have at least one command");
> +
> strvec_pushv(&cp->args, hook_cb->options->args.v);
>
> /*
> * Provide per-hook internal state via task_cb for easy access, so
> * hook callbacks don't have to go through hook_cb->options.
> */
> - *pp_task_cb = hook_cb->options->feed_pipe_cb_data;
> -
> - /*
> - * This pick_next_hook() will be called again, we're only
> - * running one hook, so indicate that no more work will be
> - * done.
> - */
> - hook_cb->hook_path = NULL;
> + *pp_task_cb = to_run;
>
> return 1;
> }
>
> -static int notify_start_failure(struct strbuf *out UNUSED,
> +static int notify_start_failure(struct strbuf *out,
> void *pp_cb,
> - void *pp_task_cp UNUSED)
> + void *pp_task_cb)
> {
> struct hook_cb_data *hook_cb = pp_cb;
> + struct string_list_item *hook = pp_task_cb;
>
> hook_cb->rc |= 1;
>
> + if (out && hook) {
> + if (*hook->string)
> + strbuf_addf(out, _("Couldn't start hook '%s'\n"), hook->string);
> + else
> + strbuf_addstr(out, _("Couldn't start hook from hooks directory\n"));
> + }
> +
> return 1;
> }
Okay, here we can give a bit more detail in the error message. But that
alone doesn't quite feel worth the additional complexity.
> @@ -172,26 +231,50 @@ int run_hooks_opt(struct repository *r, const char *hook_name,
> if (!options->jobs)
> BUG("run_hooks_opt must be called with options.jobs >= 1");
>
> + /*
> + * Ensure cb_data copy and free functions are either provided together,
> + * or neither one is provided.
> + */
> + if ((options->copy_feed_pipe_cb_data && !options->free_feed_pipe_cb_data) ||
> + (!options->copy_feed_pipe_cb_data && options->free_feed_pipe_cb_data))
> + BUG("copy_feed_pipe_cb_data and free_feed_pipe_cb_data must be set together");
I wonder whether it would make the commit a bit easier to review if you
split it up into two:
- One commit that introduces the copy/free callback functions.
- One commit that introduces the hook list.
Then the changes would be a bit more focussed.
> if (options->invoked_hook)
> *options->invoked_hook = 0;
>
> - if (!hook_path && !options->error_if_missing)
> - goto cleanup;
> -
> - if (!hook_path) {
> - ret = error("cannot find a hook named %s", hook_name);
> + if (!cb_data.hook_command_list->nr) {
> + if (options->error_if_missing)
> + ret = error("cannot find a hook named %s", hook_name);
> goto cleanup;
> }
>
> - cb_data.hook_path = hook_path;
> - if (options->dir) {
> - strbuf_add_absolute_path(&abs_path, hook_path);
> - cb_data.hook_path = abs_path.buf;
> + /*
> + * Initialize the iterator/cursor which holds the next hook to run.
> + * run_process_parallel() calls pick_next_hook() which increments it for
> + * each hook command in the list until all hooks have been run.
> + */
> + cb_data.next_hook = cb_data.hook_command_list->items;
> +
> + /*
> + * Give each hook its own copy of the initial void *pp_task_cb state, if
> + * a copy callback was provided.
> + */
> + if (options->copy_feed_pipe_cb_data) {
> + struct string_list_item *item;
> + for_each_string_list_item(item, cb_data.hook_command_list)
> + item->util = options->copy_feed_pipe_cb_data(options->feed_pipe_cb_data);
> }
Makes sense. We cannot just peform a shallow copy of the callback data
as it may contain data itself that needs to be copied. So we have to
handle the copying ourselves.
One thing I wondered is whether it would lead to a cleaner design if
this was instead provided as a callback to allocate and initialize a
completely _new_ callback data. In that case the caller wouldn't have to
first initialize the data only to copy it in a different place, all the
logic would be self-contained in that one callback.
The downside would of course be that we cannot use for example the
stack, and it might make it harder to reuse state across multiple hook
invocations, if that's a feature we care about.
Patrick
next prev parent reply other threads:[~2026-02-09 14:28 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-04 16:51 [PATCH 0/4] Specify hooks via configs Adrian Ratiu
2026-02-04 16:51 ` [PATCH 1/4] hook: run a list of hooks Adrian Ratiu
2026-02-05 21:59 ` Junio C Hamano
2026-02-06 11:21 ` Adrian Ratiu
2026-02-09 14:27 ` Patrick Steinhardt [this message]
2026-02-09 18:16 ` Adrian Ratiu
2026-02-10 13:43 ` Patrick Steinhardt
2026-02-04 16:51 ` [PATCH 2/4] hook: introduce "git hook list" Adrian Ratiu
2026-02-09 14:28 ` Patrick Steinhardt
2026-02-09 18:26 ` Adrian Ratiu
2026-02-04 16:51 ` [PATCH 3/4] hook: include hooks from the config Adrian Ratiu
2026-02-09 14:28 ` Patrick Steinhardt
2026-02-09 19:10 ` Adrian Ratiu
2026-02-10 13:43 ` Patrick Steinhardt
2026-02-10 13:56 ` Adrian Ratiu
2026-02-04 16:51 ` [PATCH 4/4] hook: allow out-of-repo 'git hook' invocations Adrian Ratiu
2026-02-06 16:26 ` [PATCH 0/4] Specify hooks via configs Junio C Hamano
2026-02-18 22:23 ` [PATCH v2 0/8] " Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 1/8] hook: add internal state alloc/free callbacks Adrian Ratiu
2026-02-19 21:47 ` Junio C Hamano
2026-02-20 12:35 ` Adrian Ratiu
2026-02-20 17:21 ` Junio C Hamano
2026-02-20 12:42 ` Adrian Ratiu
2026-02-20 12:45 ` Patrick Steinhardt
2026-02-20 13:40 ` Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 2/8] hook: run a list of hooks to prepare for multihook support Adrian Ratiu
2026-02-20 12:46 ` Patrick Steinhardt
2026-02-20 13:51 ` Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 3/8] hook: add "git hook list" command Adrian Ratiu
2026-02-20 12:46 ` Patrick Steinhardt
2026-02-20 13:53 ` Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 4/8] hook: include hooks from the config Adrian Ratiu
2026-02-19 22:16 ` Junio C Hamano
2026-02-20 12:27 ` Adrian Ratiu
2026-02-20 12:46 ` Patrick Steinhardt
2026-02-20 14:31 ` Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 5/8] hook: allow disabling config hooks Adrian Ratiu
2026-02-20 12:46 ` Patrick Steinhardt
2026-02-20 14:47 ` Adrian Ratiu
2026-02-20 18:40 ` Patrick Steinhardt
2026-02-20 18:45 ` Junio C Hamano
2026-02-18 22:23 ` [PATCH v2 6/8] hook: allow event = "" to overwrite previous values Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 7/8] hook: allow out-of-repo 'git hook' invocations Adrian Ratiu
2026-02-18 22:23 ` [PATCH v2 8/8] hook: add -z option to "git hook list" Adrian Ratiu
2026-02-19 21:34 ` [PATCH v2 0/8] Specify hooks via configs Junio C Hamano
2026-02-20 12:51 ` Adrian Ratiu
2026-02-20 23:29 ` brian m. carlson
2026-02-21 14:27 ` Adrian Ratiu
2026-02-22 0:39 ` Adrian Ratiu
2026-02-25 18:37 ` Junio C Hamano
2026-02-26 12:21 ` Adrian Ratiu
2026-02-25 22:30 ` brian m. carlson
2026-02-26 12:41 ` Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 00/12][next] " Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 01/12] hook: add internal state alloc/free callbacks Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 02/12] hook: run a list of hooks to prepare for multihook support Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 03/12] hook: add "git hook list" command Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 04/12] string-list: add unsorted_string_list_remove() Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 05/12] hook: include hooks from the config Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 06/12] hook: allow disabling config hooks Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 07/12] hook: allow event = "" to overwrite previous values Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 08/12] hook: allow out-of-repo 'git hook' invocations Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 09/12] hook: add -z option to "git hook list" Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 10/12] hook: refactor hook_config_cache from strmap to named struct Adrian Ratiu
2026-03-01 18:44 ` [PATCH v3 11/12] hook: store and display scope for configured hooks in git hook list Adrian Ratiu
2026-03-01 18:45 ` [PATCH v3 12/12] hook: show disabled hooks in "git hook list" Adrian Ratiu
2026-03-02 16:48 ` [PATCH v3 00/12][next] Specify hooks via configs Junio C Hamano
2026-03-02 17:04 ` Adrian Ratiu
2026-03-02 18:48 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aYnu7e_hbSBwu9_N@pks.im \
--to=ps@pks.im \
--cc=adrian.ratiu@collabora.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=kristofferhaugsbakk@fastmail.com \
--cc=peff@peff.net \
--cc=steadmon@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox