* Re: [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach
@ 2025-05-30 12:56 Ben Knoble
2025-05-30 14:05 ` Patrick Steinhardt
0 siblings, 1 reply; 5+ messages in thread
From: Ben Knoble @ 2025-05-30 12:56 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Yonatan Roth, david asraf
> Le 27 mai 2025 à 10:05, Patrick Steinhardt <ps@pks.im> a écrit :
>
> Both git-gc(1) and git-maintenance(1) have logic to daemonize so that
> the maintenance tasks are performed in the background. git-gc(1) has
> some special logic though to not perform _all_ housekeeping tasks in the
> background: both references and reflogs are still handled synchronously
> ni the foreground.
>
> This split exists because otherwise it may easily happen that git-gc(1)
> keeps for the "packed-refs" file locked for an extended amount of time,
> where the next Git command that wants to modify any reference could now
> fail. This was especially important in the past, where git-gc(1) was
> still executed directly as part of our automatic maintenance: git-gc(1)
> was invoked via `git gc --auto --detach`, so we knew to handle most of
> the maintenance tasks in the background while doing those parts that may
> cause locking issues in the foreground.
>
> We have since moved to git-maintenance(1), which is a more flexible
> replacement for git-gc(1). By default this command runs git-gc(1), only,
> but it can be configured to run different tasks, as well. This command
> does not know about the split between maintenance tasks that should run
> before and after detach though, and this has led to several bug reports
> about spurious locking errors for the "packed-refs" file.
>
> Prepare for a fix by introducing this split for maintenance tasks. Note
> that this commit does not yet change any of the tasks, so there should
> not (yet) be a change in behaviour.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/gc.c | 40 +++++++++++++++++++++++++---------------
> 1 file changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 447e5800846..57f3bbf5344 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1545,53 +1545,54 @@ typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
>
> struct maintenance_task {
> const char *name;
> - maintenance_task_fn fn;
> + maintenance_task_fn before_detach;
> + maintenance_task_fn after_detach;
> maintenance_auto_fn auto_condition;
> };
>
> static const struct maintenance_task tasks[] = {
> [TASK_PREFETCH] = {
> .name = "prefetch",
> - .fn = maintenance_task_prefetch,
> + .after_detach = maintenance_task_prefetch,
> },
> [TASK_LOOSE_OBJECTS] = {
> .name = "loose-objects",
> - .fn = maintenance_task_loose_objects,
> + .after_detach = maintenance_task_loose_objects,
> .auto_condition = loose_object_auto_condition,
> },
> [TASK_INCREMENTAL_REPACK] = {
> .name = "incremental-repack",
> - .fn = maintenance_task_incremental_repack,
> + .after_detach = maintenance_task_incremental_repack,
> .auto_condition = incremental_repack_auto_condition,
> },
> [TASK_GC] = {
> .name = "gc",
> - .fn = maintenance_task_gc,
> + .after_detach = maintenance_task_gc,
> .auto_condition = need_to_gc,
> },
> [TASK_COMMIT_GRAPH] = {
> .name = "commit-graph",
> - .fn = maintenance_task_commit_graph,
> + .after_detach = maintenance_task_commit_graph,
> .auto_condition = should_write_commit_graph,
> },
> [TASK_PACK_REFS] = {
> .name = "pack-refs",
> - .fn = maintenance_task_pack_refs,
> + .after_detach = maintenance_task_pack_refs,
> .auto_condition = pack_refs_condition,
> },
> [TASK_REFLOG_EXPIRE] = {
> .name = "reflog-expire",
> - .fn = maintenance_task_reflog_expire,
> + .after_detach = maintenance_task_reflog_expire,
> .auto_condition = reflog_expire_condition,
> },
> [TASK_WORKTREE_PRUNE] = {
> .name = "worktree-prune",
> - .fn = maintenance_task_worktree_prune,
> + .after_detach = maintenance_task_worktree_prune,
> .auto_condition = worktree_prune_condition,
> },
> [TASK_RERERE_GC] = {
> .name = "rerere-gc",
> - .fn = maintenance_task_rerere_gc,
> + .after_detach = maintenance_task_rerere_gc,
> .auto_condition = rerere_gc_condition,
> },
> };
> @@ -1599,20 +1600,25 @@ static const struct maintenance_task tasks[] = {
> static int maybe_run_task(const struct maintenance_task *task,
> struct repository *repo,
> struct maintenance_run_opts *opts,
> - struct gc_config *cfg)
> + struct gc_config *cfg,
> + int before)
Perhaps we can use a small enum…
> {
> + maintenance_task_fn fn = before ? task->before_detach : task->after_detach;
> + const char *region = before ? "maintenance before" : "maintenance";
> int ret = 0;
>
> + if (!fn)
> + return 0;
> if (opts->auto_flag &&
> (!task->auto_condition || !task->auto_condition(cfg)))
> return 0;
>
> - trace2_region_enter("maintenance", task->name, repo);
> - if (task->fn(opts, cfg)) {
> + trace2_region_enter(region, task->name, repo);
> + if (fn(opts, cfg)) {
> error(_("task '%s' failed"), task->name);
> ret = 1;
> }
> - trace2_region_leave("maintenance", task->name, repo);
> + trace2_region_leave(region, task->name, repo);
>
> return ret;
> }
> @@ -1641,6 +1647,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
> }
> free(lock_path);
>
> + for (size_t i = 0; i < opts->tasks_nr; i++)
> + if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg, 1))
So that is clear what « 1 » (« BEFORE ») does here…
> + result = 1;
> +
> /* Failure to daemonize is ok, we'll continue in foreground. */
> if (opts->detach > 0) {
> trace2_region_enter("maintenance", "detach", the_repository);
> @@ -1649,7 +1659,7 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
> }
>
> for (size_t i = 0; i < opts->tasks_nr; i++)
> - if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg))
> + if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg, 0))
And « 0 » (« AFTER ») does here?
> result = 1;
>
> rollback_lock_file(&lk);
>
> --
> 2.49.0.1266.g31b7d2e469.dirty
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach
2025-05-30 12:56 [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach Ben Knoble
@ 2025-05-30 14:05 ` Patrick Steinhardt
0 siblings, 0 replies; 5+ messages in thread
From: Patrick Steinhardt @ 2025-05-30 14:05 UTC (permalink / raw)
To: Ben Knoble; +Cc: git, Yonatan Roth, david asraf
On Fri, May 30, 2025 at 08:56:38AM -0400, Ben Knoble wrote:
> > @@ -1599,20 +1600,25 @@ static const struct maintenance_task tasks[] = {
> > static int maybe_run_task(const struct maintenance_task *task,
> > struct repository *repo,
> > struct maintenance_run_opts *opts,
> > - struct gc_config *cfg)
> > + struct gc_config *cfg,
> > + int before)
>
> Perhaps we can use a small enum…
> > @@ -1641,6 +1647,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
> > }
> > free(lock_path);
> >
> > + for (size_t i = 0; i < opts->tasks_nr; i++)
> > + if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg, 1))
>
> So that is clear what « 1 » (« BEFORE ») does here…
>
> > + result = 1;
> > +
> > /* Failure to daemonize is ok, we'll continue in foreground. */
> > if (opts->detach > 0) {
> > trace2_region_enter("maintenance", "detach", the_repository);
> > @@ -1649,7 +1659,7 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
> > }
> >
> > for (size_t i = 0; i < opts->tasks_nr; i++)
> > - if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg))
> > + if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg, 0))
>
> And « 0 » (« AFTER ») does here?
Yeah, I was very close to adding one, but then decided against it to not
produce additional churn. But agreed, it would be way easier to read
this way, so let me add one.
Patrick
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 00/11] builtin/maintenance: fix ref lock races when detaching
@ 2025-05-27 14:04 Patrick Steinhardt
2025-05-27 14:04 ` [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach Patrick Steinhardt
0 siblings, 1 reply; 5+ messages in thread
From: Patrick Steinhardt @ 2025-05-27 14:04 UTC (permalink / raw)
To: git; +Cc: Yonatan Roth, david asraf
Hi,
this patch series fixes races around locking the "packed-refs" file when
auto-maintenance decides to repack it. This issue has been reported e.g.
via [1] and [2].
The root cause is that git-gc(1) used to know to detach _after_ having
repacked references. As such, callers wouldn't continue with their thing
until we have already packed refs, and thus the race does not exist
there. git-maintenance(1) didn't have the same split though, so this
patch series retrofits that logic.
The series is structured as follows:
- Patches 1 and 2 do some light refactorings.
- Patches 3 to 5 refactor how we set up the list of tasks to not rely
on globals anymore. Instead, we now have a single source of truth
for which tasks exactly will be run.
- The remaining patches introduce the split of before/after-detach
tasks and wire them up for "pack-refs", "reflog-expire" and "gc"
tasks.
Thanks!
Patrick
[1]: <CAJR-fbZ4X1+gN75m2dUvocR6NkowLOZ9F26cjBy8w1qd181OoQ@mail.gmail.com>
[2]: <CANi7bVAkNc+gY1NoXfJuDRjxjZLTgL8Lfn8_ZmWsvLAoiLPkNg@mail.gmail.com>
---
Patrick Steinhardt (11):
builtin/gc: use designated field initializers for maintenance tasks
builtin/gc: drop redundant local variable
builtin/maintenance: centralize configuration of explicit tasks
builtin/maintenance: mark "--task=" and "--schedule=" as incompatible
builtin/maintenance: stop modifying global array of tasks
builtin/maintenance: extract function to run tasks
builtin/maintenance: fix typedef for function pointers
builtin/maintenance: let tasks do maintenance before and after detach
builtin/maintenance: fix locking race when packing refs and reflogs
builtin/gc: avoid global state in `gc_before_repack()`
builtin/maintenance: fix locking race when handling "gc" task
builtin/gc.c | 386 +++++++++++++++++++++++++++----------------------
t/t7900-maintenance.sh | 19 ++-
2 files changed, 229 insertions(+), 176 deletions(-)
---
base-commit: 845c48a16a7f7b2c44d8cb137b16a4a1f0140229
change-id: 20250527-b4-pks-maintenance-ref-lock-race-11ae5d68e06f
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach
2025-05-27 14:04 [PATCH 00/11] builtin/maintenance: fix ref lock races when detaching Patrick Steinhardt
@ 2025-05-27 14:04 ` Patrick Steinhardt
2025-05-27 17:01 ` Ramsay Jones
0 siblings, 1 reply; 5+ messages in thread
From: Patrick Steinhardt @ 2025-05-27 14:04 UTC (permalink / raw)
To: git; +Cc: Yonatan Roth, david asraf
Both git-gc(1) and git-maintenance(1) have logic to daemonize so that
the maintenance tasks are performed in the background. git-gc(1) has
some special logic though to not perform _all_ housekeeping tasks in the
background: both references and reflogs are still handled synchronously
ni the foreground.
This split exists because otherwise it may easily happen that git-gc(1)
keeps for the "packed-refs" file locked for an extended amount of time,
where the next Git command that wants to modify any reference could now
fail. This was especially important in the past, where git-gc(1) was
still executed directly as part of our automatic maintenance: git-gc(1)
was invoked via `git gc --auto --detach`, so we knew to handle most of
the maintenance tasks in the background while doing those parts that may
cause locking issues in the foreground.
We have since moved to git-maintenance(1), which is a more flexible
replacement for git-gc(1). By default this command runs git-gc(1), only,
but it can be configured to run different tasks, as well. This command
does not know about the split between maintenance tasks that should run
before and after detach though, and this has led to several bug reports
about spurious locking errors for the "packed-refs" file.
Prepare for a fix by introducing this split for maintenance tasks. Note
that this commit does not yet change any of the tasks, so there should
not (yet) be a change in behaviour.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
builtin/gc.c | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/builtin/gc.c b/builtin/gc.c
index 447e5800846..57f3bbf5344 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1545,53 +1545,54 @@ typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
struct maintenance_task {
const char *name;
- maintenance_task_fn fn;
+ maintenance_task_fn before_detach;
+ maintenance_task_fn after_detach;
maintenance_auto_fn auto_condition;
};
static const struct maintenance_task tasks[] = {
[TASK_PREFETCH] = {
.name = "prefetch",
- .fn = maintenance_task_prefetch,
+ .after_detach = maintenance_task_prefetch,
},
[TASK_LOOSE_OBJECTS] = {
.name = "loose-objects",
- .fn = maintenance_task_loose_objects,
+ .after_detach = maintenance_task_loose_objects,
.auto_condition = loose_object_auto_condition,
},
[TASK_INCREMENTAL_REPACK] = {
.name = "incremental-repack",
- .fn = maintenance_task_incremental_repack,
+ .after_detach = maintenance_task_incremental_repack,
.auto_condition = incremental_repack_auto_condition,
},
[TASK_GC] = {
.name = "gc",
- .fn = maintenance_task_gc,
+ .after_detach = maintenance_task_gc,
.auto_condition = need_to_gc,
},
[TASK_COMMIT_GRAPH] = {
.name = "commit-graph",
- .fn = maintenance_task_commit_graph,
+ .after_detach = maintenance_task_commit_graph,
.auto_condition = should_write_commit_graph,
},
[TASK_PACK_REFS] = {
.name = "pack-refs",
- .fn = maintenance_task_pack_refs,
+ .after_detach = maintenance_task_pack_refs,
.auto_condition = pack_refs_condition,
},
[TASK_REFLOG_EXPIRE] = {
.name = "reflog-expire",
- .fn = maintenance_task_reflog_expire,
+ .after_detach = maintenance_task_reflog_expire,
.auto_condition = reflog_expire_condition,
},
[TASK_WORKTREE_PRUNE] = {
.name = "worktree-prune",
- .fn = maintenance_task_worktree_prune,
+ .after_detach = maintenance_task_worktree_prune,
.auto_condition = worktree_prune_condition,
},
[TASK_RERERE_GC] = {
.name = "rerere-gc",
- .fn = maintenance_task_rerere_gc,
+ .after_detach = maintenance_task_rerere_gc,
.auto_condition = rerere_gc_condition,
},
};
@@ -1599,20 +1600,25 @@ static const struct maintenance_task tasks[] = {
static int maybe_run_task(const struct maintenance_task *task,
struct repository *repo,
struct maintenance_run_opts *opts,
- struct gc_config *cfg)
+ struct gc_config *cfg,
+ int before)
{
+ maintenance_task_fn fn = before ? task->before_detach : task->after_detach;
+ const char *region = before ? "maintenance before" : "maintenance";
int ret = 0;
+ if (!fn)
+ return 0;
if (opts->auto_flag &&
(!task->auto_condition || !task->auto_condition(cfg)))
return 0;
- trace2_region_enter("maintenance", task->name, repo);
- if (task->fn(opts, cfg)) {
+ trace2_region_enter(region, task->name, repo);
+ if (fn(opts, cfg)) {
error(_("task '%s' failed"), task->name);
ret = 1;
}
- trace2_region_leave("maintenance", task->name, repo);
+ trace2_region_leave(region, task->name, repo);
return ret;
}
@@ -1641,6 +1647,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
}
free(lock_path);
+ for (size_t i = 0; i < opts->tasks_nr; i++)
+ if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg, 1))
+ result = 1;
+
/* Failure to daemonize is ok, we'll continue in foreground. */
if (opts->detach > 0) {
trace2_region_enter("maintenance", "detach", the_repository);
@@ -1649,7 +1659,7 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
}
for (size_t i = 0; i < opts->tasks_nr; i++)
- if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg))
+ if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg, 0))
result = 1;
rollback_lock_file(&lk);
--
2.49.0.1266.g31b7d2e469.dirty
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach
2025-05-27 14:04 ` [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach Patrick Steinhardt
@ 2025-05-27 17:01 ` Ramsay Jones
2025-05-28 7:02 ` Patrick Steinhardt
0 siblings, 1 reply; 5+ messages in thread
From: Ramsay Jones @ 2025-05-27 17:01 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: Yonatan Roth, david asraf
On 27/05/2025 15:04, Patrick Steinhardt wrote:
> Both git-gc(1) and git-maintenance(1) have logic to daemonize so that
> the maintenance tasks are performed in the background. git-gc(1) has
> some special logic though to not perform _all_ housekeeping tasks in the
> background: both references and reflogs are still handled synchronously
> ni the foreground.
s/ni/in/
>
> This split exists because otherwise it may easily happen that git-gc(1)
> keeps for the "packed-refs" file locked for an extended amount of time,
s/keeps for the/keeps the/
> where the next Git command that wants to modify any reference could now
> fail. This was especially important in the past, where git-gc(1) was
> still executed directly as part of our automatic maintenance: git-gc(1)
> was invoked via `git gc --auto --detach`, so we knew to handle most of
> the maintenance tasks in the background while doing those parts that may
> cause locking issues in the foreground.
>
> We have since moved to git-maintenance(1), which is a more flexible
> replacement for git-gc(1). By default this command runs git-gc(1), only,
> but it can be configured to run different tasks, as well. This command
> does not know about the split between maintenance tasks that should run
> before and after detach though, and this has led to several bug reports
> about spurious locking errors for the "packed-refs" file.
>
> Prepare for a fix by introducing this split for maintenance tasks. Note
> that this commit does not yet change any of the tasks, so there should
> not (yet) be a change in behaviour.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
[snip]
ATB,
Ramsay Jones
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach
2025-05-27 17:01 ` Ramsay Jones
@ 2025-05-28 7:02 ` Patrick Steinhardt
0 siblings, 0 replies; 5+ messages in thread
From: Patrick Steinhardt @ 2025-05-28 7:02 UTC (permalink / raw)
To: Ramsay Jones; +Cc: git, Yonatan Roth, david asraf
On Tue, May 27, 2025 at 06:01:21PM +0100, Ramsay Jones wrote:
> On 27/05/2025 15:04, Patrick Steinhardt wrote:
> > Both git-gc(1) and git-maintenance(1) have logic to daemonize so that
> > the maintenance tasks are performed in the background. git-gc(1) has
> > some special logic though to not perform _all_ housekeeping tasks in the
> > background: both references and reflogs are still handled synchronously
> > ni the foreground.
>
> s/ni/in/
>
> >
> > This split exists because otherwise it may easily happen that git-gc(1)
> > keeps for the "packed-refs" file locked for an extended amount of time,
>
> s/keeps for the/keeps the/
And these I've got fixed now, as well. I'll wait a bit for more feedback
though before sending out these fixes.
Thanks!
Patrick
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-05-30 14:05 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-30 12:56 [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach Ben Knoble
2025-05-30 14:05 ` Patrick Steinhardt
-- strict thread matches above, loose matches on Subject: below --
2025-05-27 14:04 [PATCH 00/11] builtin/maintenance: fix ref lock races when detaching Patrick Steinhardt
2025-05-27 14:04 ` [PATCH 08/11] builtin/maintenance: let tasks do maintenance before and after detach Patrick Steinhardt
2025-05-27 17:01 ` Ramsay Jones
2025-05-28 7:02 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).