git.vger.kernel.org archive mirror
* [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
@ 2024-08-13  7:17 Patrick Steinhardt
  2024-08-13  7:17 ` [PATCH 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
                   ` (10 more replies)
  0 siblings, 11 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC
  To: git

Hi,

I recently configured git-maintenance(1) to not use git-gc(1) anymore,
but instead to use git-multi-pack-index(1). I quickly noticed that the
behaviour here is somewhat broken because instead of auto-detaching when
`git maintenance run --auto` executes, we wait for the process to run to
completion.

The root cause is that git-maintenance(1), probably by accident,
continues to rely on the auto-detaching mechanism in git-gc(1). So
instead of having git-maintenance(1) detach, it is git-gc(1) that
detaches and thus causes git-maintenance(1) to exit early. That of
course falls flat once any maintenance task other than git-gc(1)
executes, because these won't detach.

Besides being a usability issue, this may also cause git-gc(1) to run
concurrently with any other enabled maintenance tasks. This shouldn't
lead to data loss, but it can certainly lead to processes stomping on
each other's feet.

This patch series fixes this by wiring up new `--detach` flags for both
git-gc(1) and git-maintenance(1). With these in place,
git-maintenance(1) now knows to execute `git gc --auto --no-detach`,
while our auto-maintenance will execute
`git maintenance run --auto --detach`.

Patrick

Patrick Steinhardt (7):
  config: fix constness of out parameter for `git_config_get_expiry()`
  builtin/gc: refactor to read config into structure
  builtin/gc: fix leaking config values
  builtin/gc: stop processing log file on signal
  builtin/gc: add a `--detach` flag
  builtin/maintenance: add a `--detach` flag
  builtin/maintenance: fix auto-detach with non-standard tasks

 Documentation/git-gc.txt |   5 +-
 builtin/gc.c             | 384 ++++++++++++++++++++++++---------------
 config.c                 |   4 +-
 config.h                 |   2 +-
 read-cache.c             |  12 +-
 run-command.c            |  12 +-
 t/t5304-prune.sh         |   1 +
 t/t5616-partial-clone.sh |   6 +-
 t/t6500-gc.sh            |  39 ++++
 t/t7900-maintenance.sh   |  82 ++++++++-
 10 files changed, 381 insertions(+), 166 deletions(-)

-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 1/7] config: fix constness of out parameter for `git_config_get_expiry()`
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
@ 2024-08-13  7:17 ` Patrick Steinhardt
  2024-08-13  7:17 ` [PATCH 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC
  To: git

The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which creates the impression that ownership of the
returned data wasn't transferred to the caller. This isn't true though
and thus quite misleading.

Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.
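
As a rough illustration of the ownership convention described above, the
following sketch shows a `char **` out parameter handing an allocated
string to the caller, who falls back to a literal default and frees the
configured value afterwards. `demo_get_expiry()` and
`demo_effective_expiry_len()` are hypothetical stand-ins written for this
example; they are not Git's functions, and the string length merely
stands in for the call to `approxidate()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for git_config_get_expiry(): on success it
 * stores a freshly allocated string in *out and returns 0; the caller
 * owns that string. It returns 1 and leaves *out untouched when the
 * key is "unset" (modelled here by a NULL configured value).
 */
static int demo_get_expiry(const char *configured, char **out)
{
	char *copy;

	if (!configured)
		return 1;
	copy = malloc(strlen(configured) + 1);
	if (!copy)
		abort();
	strcpy(copy, configured);
	*out = copy;
	return 0;
}

/*
 * Same shape as the refactored get_shared_index_expire_date(): the
 * default is a string literal, the configured value is owned by us
 * and freed once we are done with it. Returns the length of the
 * effective value as a stand-in for approxidate().
 */
static size_t demo_effective_expiry_len(const char *configured)
{
	const char *expire = "2.weeks.ago";
	char *value = NULL;
	size_t len;

	demo_get_expiry(configured, &value);
	if (value)
		expire = value;

	len = strlen(expire);
	free(value); /* free(NULL) is a no-op */
	return len;
}
```

Because the out parameter is a plain `char **`, the compiler no longer
needs a cast at the call site, and it is visible from the type alone
that the caller is expected to free the result.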

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c |  6 +++---
 config.c     |  4 ++--
 config.h     |  2 +-
 read-cache.c | 12 +++++++++---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 72bac2554f..e7406bf667 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -167,9 +167,9 @@ static void gc_config(void)
 	git_config_get_bool("gc.autodetach", &detach_auto);
 	git_config_get_bool("gc.cruftpacks", &cruft_packs);
 	git_config_get_ulong("gc.maxcruftsize", &max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", &prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", &gc_log_expire);
+	git_config_get_expiry("gc.pruneexpire", (char **) &prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", (char **) &prune_worktrees_expire);
+	git_config_get_expiry("gc.logexpiry", (char **) &gc_log_expire);
 
 	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
diff --git a/config.c b/config.c
index 6421894614..dfa4df1417 100644
--- a/config.c
+++ b/config.c
@@ -2766,9 +2766,9 @@ int git_config_get_pathname(const char *key, char **dest)
 	return repo_config_get_pathname(the_repository, key, dest);
 }
 
-int git_config_get_expiry(const char *key, const char **output)
+int git_config_get_expiry(const char *key, char **output)
 {
-	int ret = git_config_get_string(key, (char **)output);
+	int ret = git_config_get_string(key, output);
 	if (ret)
 		return ret;
 	if (strcmp(*output, "now")) {
diff --git a/config.h b/config.h
index 54b47dec9e..4801391c32 100644
--- a/config.h
+++ b/config.h
@@ -701,7 +701,7 @@ int git_config_get_split_index(void);
 int git_config_get_max_percent_split_change(void);
 
 /* This dies if the configured or default date is in the future */
-int git_config_get_expiry(const char *key, const char **output);
+int git_config_get_expiry(const char *key, char **output);
 
 /* parse either "this many days" integer, or "5.days.ago" approxidate */
 int git_config_get_expiry_in_days(const char *key, timestamp_t *, timestamp_t now);
diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..7f393ee687 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3176,18 +3176,24 @@ static int write_split_index(struct index_state *istate,
 	return ret;
 }
 
-static const char *shared_index_expire = "2.weeks.ago";
-
 static unsigned long get_shared_index_expire_date(void)
 {
 	static unsigned long shared_index_expire_date;
 	static int shared_index_expire_date_prepared;
 
 	if (!shared_index_expire_date_prepared) {
+		const char *shared_index_expire = "2.weeks.ago";
+		char *value = NULL;
+
 		git_config_get_expiry("splitindex.sharedindexexpire",
-				      &shared_index_expire);
+				      &value);
+		if (value)
+			shared_index_expire = value;
+
 		shared_index_expire_date = approxidate(shared_index_expire);
 		shared_index_expire_date_prepared = 1;
+
+		free(value);
 	}
 
 	return shared_index_expire_date;
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 2/7] builtin/gc: refactor to read config into structure
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
  2024-08-13  7:17 ` [PATCH 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
@ 2024-08-13  7:17 ` Patrick Steinhardt
  2024-08-15  5:24   ` James Liu
  2024-08-15 13:46   ` Derrick Stolee
  2024-08-13  7:17 ` [PATCH 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC
  To: git

The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.

Refactor the code to use a `struct gc_config` that gets populated and
passed around. First, this makes previously-implicit dependencies on
these config values explicit. Second, it will allow us to properly
manage the lifecycle in the next commit.
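
A minimal sketch of the pattern, using a hypothetical trimmed-down
`struct demo_config` rather than the real `struct gc_config`: defaults
live in a single initializer macro, and every function that depends on a
value says so in its signature.

```c
#include <assert.h>

/* Hypothetical, trimmed-down analogue of `struct gc_config`. */
struct demo_config {
	int auto_threshold;
	int auto_pack_limit;
};

/* Defaults live in one place; fields not named here are zeroed. */
#define DEMO_CONFIG_INIT { \
	.auto_threshold = 6700, \
	.auto_pack_limit = 50, \
}

/* The dependency on the config is visible in the signature. */
static int demo_too_many_packs(const struct demo_config *cfg, int nr_packs)
{
	if (cfg->auto_pack_limit <= 0)
		return 0;
	return cfg->auto_pack_limit < nr_packs;
}
```

A caller would declare `struct demo_config cfg = DEMO_CONFIG_INIT;`,
optionally override fields from configuration, and then pass `&cfg`
down to each helper instead of relying on file-scope variables.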

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c | 255 +++++++++++++++++++++++++++++----------------------
 1 file changed, 143 insertions(+), 112 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e7406bf667..eee7401647 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -49,23 +49,7 @@ static const char * const builtin_gc_usage[] = {
 	NULL
 };
 
-static int pack_refs = 1;
-static int prune_reflogs = 1;
-static int cruft_packs = 1;
-static unsigned long max_cruft_size;
-static int aggressive_depth = 50;
-static int aggressive_window = 250;
-static int gc_auto_threshold = 6700;
-static int gc_auto_pack_limit = 50;
-static int detach_auto = 1;
 static timestamp_t gc_log_expire_time;
-static const char *gc_log_expire = "1.day.ago";
-static const char *prune_expire = "2.weeks.ago";
-static const char *prune_worktrees_expire = "3.months.ago";
-static char *repack_filter;
-static char *repack_filter_to;
-static unsigned long big_pack_threshold;
-static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
 static struct strvec reflog = STRVEC_INIT;
 static struct strvec repack = STRVEC_INIT;
@@ -145,37 +129,71 @@ static int gc_config_is_timestamp_never(const char *var)
 	return 0;
 }
 
-static void gc_config(void)
+struct gc_config {
+	int pack_refs;
+	int prune_reflogs;
+	int cruft_packs;
+	unsigned long max_cruft_size;
+	int aggressive_depth;
+	int aggressive_window;
+	int gc_auto_threshold;
+	int gc_auto_pack_limit;
+	int detach_auto;
+	const char *gc_log_expire;
+	const char *prune_expire;
+	const char *prune_worktrees_expire;
+	char *repack_filter;
+	char *repack_filter_to;
+	unsigned long big_pack_threshold;
+	unsigned long max_delta_cache_size;
+};
+
+#define GC_CONFIG_INIT { \
+	.pack_refs = 1, \
+	.prune_reflogs = 1, \
+	.cruft_packs = 1, \
+	.aggressive_depth = 50, \
+	.aggressive_window = 250, \
+	.gc_auto_threshold = 6700, \
+	.gc_auto_pack_limit = 50, \
+	.detach_auto = 1, \
+	.gc_log_expire = "1.day.ago", \
+	.prune_expire = "2.weeks.ago", \
+	.prune_worktrees_expire = "3.months.ago", \
+	.max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE, \
+}
+
+static void gc_config(struct gc_config *cfg)
 {
 	const char *value;
 
 	if (!git_config_get_value("gc.packrefs", &value)) {
 		if (value && !strcmp(value, "notbare"))
-			pack_refs = -1;
+			cfg->pack_refs = -1;
 		else
-			pack_refs = git_config_bool("gc.packrefs", value);
+			cfg->pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
 	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
 	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
-		prune_reflogs = 0;
+		cfg->prune_reflogs = 0;
 
-	git_config_get_int("gc.aggressivewindow", &aggressive_window);
-	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
-	git_config_get_int("gc.auto", &gc_auto_threshold);
-	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
-	git_config_get_bool("gc.autodetach", &detach_auto);
-	git_config_get_bool("gc.cruftpacks", &cruft_packs);
-	git_config_get_ulong("gc.maxcruftsize", &max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", (char **) &prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", (char **) &prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", (char **) &gc_log_expire);
+	git_config_get_int("gc.aggressivewindow", &cfg->aggressive_window);
+	git_config_get_int("gc.aggressivedepth", &cfg->aggressive_depth);
+	git_config_get_int("gc.auto", &cfg->gc_auto_threshold);
+	git_config_get_int("gc.autopacklimit", &cfg->gc_auto_pack_limit);
+	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
+	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
+	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
+	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
+	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
 
-	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
-	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
+	git_config_get_ulong("gc.bigpackthreshold", &cfg->big_pack_threshold);
+	git_config_get_ulong("pack.deltacachesize", &cfg->max_delta_cache_size);
 
-	git_config_get_string("gc.repackfilter", &repack_filter);
-	git_config_get_string("gc.repackfilterto", &repack_filter_to);
+	git_config_get_string("gc.repackfilter", &cfg->repack_filter);
+	git_config_get_string("gc.repackfilterto", &cfg->repack_filter_to);
 
 	git_config(git_default_config, NULL);
 }
@@ -206,7 +224,7 @@ struct maintenance_run_opts {
 	enum schedule_priority schedule;
 };
 
-static int pack_refs_condition(void)
+static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
 	/*
 	 * The auto-repacking logic for refs is handled by the ref backends and
@@ -216,7 +234,8 @@ static int pack_refs_condition(void)
 	return 1;
 }
 
-static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
+static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts,
+				      UNUSED struct gc_config *cfg)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
@@ -228,7 +247,7 @@ static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *
 	return run_command(&cmd);
 }
 
-static int too_many_loose_objects(void)
+static int too_many_loose_objects(struct gc_config *cfg)
 {
 	/*
 	 * Quickly check if a "gc" is needed, by estimating how
@@ -247,7 +266,7 @@ static int too_many_loose_objects(void)
 	if (!dir)
 		return 0;
 
-	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
+	auto_threshold = DIV_ROUND_UP(cfg->gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
 		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
 		    ent->d_name[hexsz_loose] != '\0')
@@ -283,12 +302,12 @@ static struct packed_git *find_base_packs(struct string_list *packs,
 	return base;
 }
 
-static int too_many_packs(void)
+static int too_many_packs(struct gc_config *cfg)
 {
 	struct packed_git *p;
 	int cnt;
 
-	if (gc_auto_pack_limit <= 0)
+	if (cfg->gc_auto_pack_limit <= 0)
 		return 0;
 
 	for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
@@ -302,7 +321,7 @@ static int too_many_packs(void)
 		 */
 		cnt++;
 	}
-	return gc_auto_pack_limit < cnt;
+	return cfg->gc_auto_pack_limit < cnt;
 }
 
 static uint64_t total_ram(void)
@@ -336,7 +355,8 @@ static uint64_t total_ram(void)
 	return 0;
 }
 
-static uint64_t estimate_repack_memory(struct packed_git *pack)
+static uint64_t estimate_repack_memory(struct gc_config *cfg,
+				       struct packed_git *pack)
 {
 	unsigned long nr_objects = repo_approximate_object_count(the_repository);
 	size_t os_cache, heap;
@@ -373,7 +393,7 @@ static uint64_t estimate_repack_memory(struct packed_git *pack)
 	 */
 	heap += delta_base_cache_limit;
 	/* and of course pack-objects has its own delta cache */
-	heap += max_delta_cache_size;
+	heap += cfg->max_delta_cache_size;
 
 	return os_cache + heap;
 }
@@ -384,30 +404,31 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 	return 0;
 }
 
-static void add_repack_all_option(struct string_list *keep_pack)
+static void add_repack_all_option(struct gc_config *cfg,
+				  struct string_list *keep_pack)
 {
-	if (prune_expire && !strcmp(prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
 		strvec_push(&repack, "-a");
-	else if (cruft_packs) {
+	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");
-		if (prune_expire)
-			strvec_pushf(&repack, "--cruft-expiration=%s", prune_expire);
-		if (max_cruft_size)
+		if (cfg->prune_expire)
+			strvec_pushf(&repack, "--cruft-expiration=%s", cfg->prune_expire);
+		if (cfg->max_cruft_size)
 			strvec_pushf(&repack, "--max-cruft-size=%lu",
-				     max_cruft_size);
+				     cfg->max_cruft_size);
 	} else {
 		strvec_push(&repack, "-A");
-		if (prune_expire)
-			strvec_pushf(&repack, "--unpack-unreachable=%s", prune_expire);
+		if (cfg->prune_expire)
+			strvec_pushf(&repack, "--unpack-unreachable=%s", cfg->prune_expire);
 	}
 
 	if (keep_pack)
 		for_each_string_list(keep_pack, keep_one_pack, NULL);
 
-	if (repack_filter && *repack_filter)
-		strvec_pushf(&repack, "--filter=%s", repack_filter);
-	if (repack_filter_to && *repack_filter_to)
-		strvec_pushf(&repack, "--filter-to=%s", repack_filter_to);
+	if (cfg->repack_filter && *cfg->repack_filter)
+		strvec_pushf(&repack, "--filter=%s", cfg->repack_filter);
+	if (cfg->repack_filter_to && *cfg->repack_filter_to)
+		strvec_pushf(&repack, "--filter-to=%s", cfg->repack_filter_to);
 }
 
 static void add_repack_incremental_option(void)
@@ -415,13 +436,13 @@ static void add_repack_incremental_option(void)
 	strvec_push(&repack, "--no-write-bitmap-index");
 }
 
-static int need_to_gc(void)
+static int need_to_gc(struct gc_config *cfg)
 {
 	/*
 	 * Setting gc.auto to 0 or negative can disable the
 	 * automatic gc.
 	 */
-	if (gc_auto_threshold <= 0)
+	if (cfg->gc_auto_threshold <= 0)
 		return 0;
 
 	/*
@@ -430,13 +451,13 @@ static int need_to_gc(void)
 	 * we run "repack -A -d -l".  Otherwise we tell the caller
 	 * there is no need.
 	 */
-	if (too_many_packs()) {
+	if (too_many_packs(cfg)) {
 		struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
-		if (big_pack_threshold) {
-			find_base_packs(&keep_pack, big_pack_threshold);
-			if (keep_pack.nr >= gc_auto_pack_limit) {
-				big_pack_threshold = 0;
+		if (cfg->big_pack_threshold) {
+			find_base_packs(&keep_pack, cfg->big_pack_threshold);
+			if (keep_pack.nr >= cfg->gc_auto_pack_limit) {
+				cfg->big_pack_threshold = 0;
 				string_list_clear(&keep_pack, 0);
 				find_base_packs(&keep_pack, 0);
 			}
@@ -445,7 +466,7 @@ static int need_to_gc(void)
 			uint64_t mem_have, mem_want;
 
 			mem_have = total_ram();
-			mem_want = estimate_repack_memory(p);
+			mem_want = estimate_repack_memory(cfg, p);
 
 			/*
 			 * Only allow 1/2 of memory for pack-objects, leave
@@ -456,9 +477,9 @@ static int need_to_gc(void)
 				string_list_clear(&keep_pack, 0);
 		}
 
-		add_repack_all_option(&keep_pack);
+		add_repack_all_option(cfg, &keep_pack);
 		string_list_clear(&keep_pack, 0);
-	} else if (too_many_loose_objects())
+	} else if (too_many_loose_objects(cfg))
 		add_repack_incremental_option();
 	else
 		return 0;
@@ -585,7 +606,8 @@ static int report_last_gc_error(void)
 	return ret;
 }
 
-static void gc_before_repack(struct maintenance_run_opts *opts)
+static void gc_before_repack(struct maintenance_run_opts *opts,
+			     struct gc_config *cfg)
 {
 	/*
 	 * We may be called twice, as both the pre- and
@@ -596,10 +618,10 @@ static void gc_before_repack(struct maintenance_run_opts *opts)
 	if (done++)
 		return;
 
-	if (pack_refs && maintenance_task_pack_refs(opts))
+	if (cfg->pack_refs && maintenance_task_pack_refs(opts, cfg))
 		die(FAILED_RUN, "pack-refs");
 
-	if (prune_reflogs) {
+	if (cfg->prune_reflogs) {
 		struct child_process cmd = CHILD_PROCESS_INIT;
 
 		cmd.git_cmd = 1;
@@ -621,14 +643,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	timestamp_t dummy;
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
 	struct maintenance_run_opts opts = {0};
+	struct gc_config cfg = GC_CONFIG_INIT;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
-		{ OPTION_STRING, 0, "prune", &prune_expire, N_("date"),
+		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
 			N_("prune unreferenced objects"),
-			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire },
-		OPT_BOOL(0, "cruft", &cruft_packs, N_("pack unreferenced objects separately")),
-		OPT_MAGNITUDE(0, "max-cruft-size", &max_cruft_size,
+			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
+		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
+		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
 			      N_("with --cruft, limit the size of new cruft packs")),
 		OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")),
 		OPT_BOOL_F(0, "auto", &opts.auto_flag, N_("enable auto-gc mode"),
@@ -651,27 +674,27 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	strvec_pushl(&rerere, "rerere", "gc", NULL);
 
 	/* default expiry time, overwritten in gc_config */
-	gc_config();
-	if (parse_expiry_date(gc_log_expire, &gc_log_expire_time))
-		die(_("failed to parse gc.logExpiry value %s"), gc_log_expire);
+	gc_config(&cfg);
+	if (parse_expiry_date(cfg.gc_log_expire, &gc_log_expire_time))
+		die(_("failed to parse gc.logExpiry value %s"), cfg.gc_log_expire);
 
-	if (pack_refs < 0)
-		pack_refs = !is_bare_repository();
+	if (cfg.pack_refs < 0)
+		cfg.pack_refs = !is_bare_repository();
 
 	argc = parse_options(argc, argv, prefix, builtin_gc_options,
 			     builtin_gc_usage, 0);
 	if (argc > 0)
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
-	if (prune_expire && parse_expiry_date(prune_expire, &dummy))
-		die(_("failed to parse prune expiry value %s"), prune_expire);
+	if (cfg.prune_expire && parse_expiry_date(cfg.prune_expire, &dummy))
+		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
 	if (aggressive) {
 		strvec_push(&repack, "-f");
-		if (aggressive_depth > 0)
-			strvec_pushf(&repack, "--depth=%d", aggressive_depth);
-		if (aggressive_window > 0)
-			strvec_pushf(&repack, "--window=%d", aggressive_window);
+		if (cfg.aggressive_depth > 0)
+			strvec_pushf(&repack, "--depth=%d", cfg.aggressive_depth);
+		if (cfg.aggressive_window > 0)
+			strvec_pushf(&repack, "--window=%d", cfg.aggressive_window);
 	}
 	if (quiet)
 		strvec_push(&repack, "-q");
@@ -680,16 +703,16 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc())
+		if (!need_to_gc(&cfg))
 			return 0;
 		if (!quiet) {
-			if (detach_auto)
+			if (cfg.detach_auto)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
 			else
 				fprintf(stderr, _("Auto packing the repository for optimum performance.\n"));
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
-		if (detach_auto) {
+		if (cfg.detach_auto) {
 			int ret = report_last_gc_error();
 
 			if (ret == 1)
@@ -701,7 +724,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 			if (lock_repo_for_gc(force, &pid))
 				return 0;
-			gc_before_repack(&opts); /* dies on failure */
+			gc_before_repack(&opts, &cfg); /* dies on failure */
 			delete_tempfile(&pidfile);
 
 			/*
@@ -716,11 +739,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (keep_largest_pack != -1) {
 			if (keep_largest_pack)
 				find_base_packs(&keep_pack, 0);
-		} else if (big_pack_threshold) {
-			find_base_packs(&keep_pack, big_pack_threshold);
+		} else if (cfg.big_pack_threshold) {
+			find_base_packs(&keep_pack, cfg.big_pack_threshold);
 		}
 
-		add_repack_all_option(&keep_pack);
+		add_repack_all_option(&cfg, &keep_pack);
 		string_list_clear(&keep_pack, 0);
 	}
 
@@ -741,7 +764,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		atexit(process_log_file_at_exit);
 	}
 
-	gc_before_repack(&opts);
+	gc_before_repack(&opts, &cfg);
 
 	if (!repository_format_precious_objects) {
 		struct child_process repack_cmd = CHILD_PROCESS_INIT;
@@ -752,11 +775,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (run_command(&repack_cmd))
 			die(FAILED_RUN, repack.v[0]);
 
-		if (prune_expire) {
+		if (cfg.prune_expire) {
 			struct child_process prune_cmd = CHILD_PROCESS_INIT;
 
 			/* run `git prune` even if using cruft packs */
-			strvec_push(&prune, prune_expire);
+			strvec_push(&prune, cfg.prune_expire);
 			if (quiet)
 				strvec_push(&prune, "--no-progress");
 			if (repo_has_promisor_remote(the_repository))
@@ -769,10 +792,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (prune_worktrees_expire) {
+	if (cfg.prune_worktrees_expire) {
 		struct child_process prune_worktrees_cmd = CHILD_PROCESS_INIT;
 
-		strvec_push(&prune_worktrees, prune_worktrees_expire);
+		strvec_push(&prune_worktrees, cfg.prune_worktrees_expire);
 		prune_worktrees_cmd.git_cmd = 1;
 		strvec_pushv(&prune_worktrees_cmd.args, prune_worktrees.v);
 		if (run_command(&prune_worktrees_cmd))
@@ -796,7 +819,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					     !quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
 					     NULL);
 
-	if (opts.auto_flag && too_many_loose_objects())
+	if (opts.auto_flag && too_many_loose_objects(&cfg))
 		warning(_("There are too many unreachable loose objects; "
 			"run 'git prune' to remove them."));
 
@@ -892,7 +915,7 @@ static int dfs_on_ref(const char *refname UNUSED,
 	return result;
 }
 
-static int should_write_commit_graph(void)
+static int should_write_commit_graph(struct gc_config *cfg)
 {
 	int result;
 	struct cg_auto_data data;
@@ -929,7 +952,8 @@ static int run_write_commit_graph(struct maintenance_run_opts *opts)
 	return !!run_command(&child);
 }
 
-static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
+static int maintenance_task_commit_graph(struct maintenance_run_opts *opts,
+					 struct gc_config *cfg)
 {
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_commit_graph)
@@ -963,7 +987,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	return !!run_command(&child);
 }
 
-static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
+static int maintenance_task_prefetch(struct maintenance_run_opts *opts,
+				     struct gc_config *cfg)
 {
 	if (for_each_remote(fetch_remote, opts)) {
 		error(_("failed to prefetch remotes"));
@@ -973,7 +998,8 @@ static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int maintenance_task_gc(struct maintenance_run_opts *opts)
+static int maintenance_task_gc(struct maintenance_run_opts *opts,
+			       struct gc_config *cfg)
 {
 	struct child_process child = CHILD_PROCESS_INIT;
 
@@ -1021,7 +1047,7 @@ static int loose_object_count(const struct object_id *oid UNUSED,
 	return 0;
 }
 
-static int loose_object_auto_condition(void)
+static int loose_object_auto_condition(struct gc_config *cfg)
 {
 	int count = 0;
 
@@ -1106,12 +1132,13 @@ static int pack_loose(struct maintenance_run_opts *opts)
 	return result;
 }
 
-static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
+static int maintenance_task_loose_objects(struct maintenance_run_opts *opts,
+					  struct gc_config *cfg)
 {
 	return prune_packed(opts) || pack_loose(opts);
 }
 
-static int incremental_repack_auto_condition(void)
+static int incremental_repack_auto_condition(struct gc_config *cfg)
 {
 	struct packed_git *p;
 	int incremental_repack_auto_limit = 10;
@@ -1230,7 +1257,8 @@ static int multi_pack_index_repack(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts)
+static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts,
+					       struct gc_config *cfg)
 {
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_multi_pack_index) {
@@ -1247,14 +1275,15 @@ static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts
 	return 0;
 }
 
-typedef int maintenance_task_fn(struct maintenance_run_opts *opts);
+typedef int maintenance_task_fn(struct maintenance_run_opts *opts,
+				struct gc_config *cfg);
 
 /*
  * An auto condition function returns 1 if the task should run
  * and 0 if the task should NOT run. See needs_to_gc() for an
  * example.
  */
-typedef int maintenance_auto_fn(void);
+typedef int maintenance_auto_fn(struct gc_config *cfg);
 
 struct maintenance_task {
 	const char *name;
@@ -1321,7 +1350,8 @@ static int compare_tasks_by_selection(const void *a_, const void *b_)
 	return b->selected_order - a->selected_order;
 }
 
-static int maintenance_run_tasks(struct maintenance_run_opts *opts)
+static int maintenance_run_tasks(struct maintenance_run_opts *opts,
+				 struct gc_config *cfg)
 {
 	int i, found_selected = 0;
 	int result = 0;
@@ -1360,14 +1390,14 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 
 		if (opts->auto_flag &&
 		    (!tasks[i].auto_condition ||
-		     !tasks[i].auto_condition()))
+		     !tasks[i].auto_condition(cfg)))
 			continue;
 
 		if (opts->schedule && tasks[i].schedule < opts->schedule)
 			continue;
 
 		trace2_region_enter("maintenance", tasks[i].name, r);
-		if (tasks[i].fn(opts)) {
+		if (tasks[i].fn(opts, cfg)) {
 			error(_("task '%s' failed"), tasks[i].name);
 			result = 1;
 		}
@@ -1404,7 +1434,6 @@ static void initialize_task_config(int schedule)
 {
 	int i;
 	struct strbuf config_name = STRBUF_INIT;
-	gc_config();
 
 	if (schedule)
 		initialize_maintenance_strategy();
@@ -1468,6 +1497,7 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	struct maintenance_run_opts opts;
+	struct gc_config cfg = GC_CONFIG_INIT;
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
@@ -1496,12 +1526,13 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (opts.auto_flag && opts.schedule)
 		die(_("use at most one of --auto and --schedule=<frequency>"));
 
+	gc_config(&cfg);
 	initialize_task_config(opts.schedule);
 
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
-	return maintenance_run_tasks(&opts);
+	return maintenance_run_tasks(&opts, &cfg);
 }
 
 static char *get_maintpath(void)
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 3/7] builtin/gc: fix leaking config values
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
  2024-08-13  7:17 ` [PATCH 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
  2024-08-13  7:17 ` [PATCH 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
@ 2024-08-13  7:17 ` Patrick Steinhardt
  2024-08-15  5:22   ` James Liu
  2024-08-15 13:50   ` Derrick Stolee
  2024-08-13  7:17 ` [PATCH 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC
  To: git

We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.
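
The release-before-replace idea can be sketched as follows. All names
are hypothetical and chosen for this example; in particular
`demo_strdup()` only approximates Git's `xstrdup()`, which dies on
allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper standing in for xstrdup(). */
static char *demo_strdup(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	if (!copy)
		abort();
	return strcpy(copy, s);
}

struct demo_config {
	char *prune_expire; /* always heap-allocated or NULL */
};

/* The default is duplicated so that every value can be freed uniformly. */
static void demo_config_init(struct demo_config *cfg)
{
	cfg->prune_expire = demo_strdup("2.weeks.ago");
}

/* Replace the current value, freeing the old one first. */
static void demo_config_set_prune_expire(struct demo_config *cfg,
					 const char *configured)
{
	char *owned = demo_strdup(configured);

	free(cfg->prune_expire);
	cfg->prune_expire = owned;
}

static void demo_config_release(struct demo_config *cfg)
{
	free(cfg->prune_expire);
	cfg->prune_expire = NULL;
}
```

Keeping the invariant "every string field is heap-allocated or NULL"
is what makes a single release function sufficient, at the cost of
duplicating the default literals up front.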

Note that there is one small gotcha here with the "--prune" option.
Besides accepting a string, this option also has a negated form,
"--no-prune", that overrides the default or configured value. We thus
need to distinguish between the option not having been passed by the
user and its negated variant. This is done by using a simple sentinel
value that lets us tell these cases apart.
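
One way such a sentinel could work is sketched below. This is an
assumption made for illustration, not necessarily how the patch
implements it: the parser's target pointer starts out pointing at a
private object, and only its address is compared. The names
`demo_sentinel` and `demo_resolve_prune()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Only the address matters; the contents are never interpreted. */
static const char demo_sentinel[] = "sentinel";

/*
 * `arg` is what a hypothetical option parser left behind:
 *   - demo_sentinel: the option was not given at all
 *   - NULL:          the negated form ("--no-prune") was given
 *   - anything else: an explicit value
 * Returns the effective value; NULL means "do not prune".
 */
static const char *demo_resolve_prune(const char *arg, const char *configured)
{
	if (arg == demo_sentinel)
		return configured;
	return arg;
}
```

Comparing by address rather than by content means a user-supplied
value that happens to spell "sentinel" would still be treated as an
explicit value.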

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c     | 108 +++++++++++++++++++++++++++++++++++------------
 t/t5304-prune.sh |   1 +
 2 files changed, 82 insertions(+), 27 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index eee7401647..a93cfa147e 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -139,9 +139,9 @@ struct gc_config {
 	int gc_auto_threshold;
 	int gc_auto_pack_limit;
 	int detach_auto;
-	const char *gc_log_expire;
-	const char *prune_expire;
-	const char *prune_worktrees_expire;
+	char *gc_log_expire;
+	char *prune_expire;
+	char *prune_worktrees_expire;
 	char *repack_filter;
 	char *repack_filter_to;
 	unsigned long big_pack_threshold;
@@ -157,15 +157,25 @@ struct gc_config {
 	.gc_auto_threshold = 6700, \
 	.gc_auto_pack_limit = 50, \
 	.detach_auto = 1, \
-	.gc_log_expire = "1.day.ago", \
-	.prune_expire = "2.weeks.ago", \
-	.prune_worktrees_expire = "3.months.ago", \
+	.gc_log_expire = xstrdup("1.day.ago"), \
+	.prune_expire = xstrdup("2.weeks.ago"), \
+	.prune_worktrees_expire = xstrdup("3.months.ago"), \
 	.max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE, \
 }
 
+static void gc_config_release(struct gc_config *cfg)
+{
+	free(cfg->gc_log_expire);
+	free(cfg->prune_expire);
+	free(cfg->prune_worktrees_expire);
+	free(cfg->repack_filter);
+	free(cfg->repack_filter_to);
+}
+
 static void gc_config(struct gc_config *cfg)
 {
 	const char *value;
+	char *owned = NULL;
 
 	if (!git_config_get_value("gc.packrefs", &value)) {
 		if (value && !strcmp(value, "notbare"))
@@ -185,15 +195,34 @@ static void gc_config(struct gc_config *cfg)
 	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
 	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
 	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
+
+	if (!git_config_get_expiry("gc.pruneexpire", &owned)) {
+		free(cfg->prune_expire);
+		cfg->prune_expire = owned;
+	}
+
+	if (!git_config_get_expiry("gc.worktreepruneexpire", &owned)) {
+		free(cfg->prune_worktrees_expire);
+		cfg->prune_worktrees_expire = owned;
+	}
+
+	if (!git_config_get_expiry("gc.logexpiry", &owned)) {
+		free(cfg->gc_log_expire);
+		cfg->gc_log_expire = owned;
+	}
 
 	git_config_get_ulong("gc.bigpackthreshold", &cfg->big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &cfg->max_delta_cache_size);
 
-	git_config_get_string("gc.repackfilter", &cfg->repack_filter);
-	git_config_get_string("gc.repackfilterto", &cfg->repack_filter_to);
+	if (!git_config_get_string("gc.repackfilter", &owned)) {
+		free(cfg->repack_filter);
+		cfg->repack_filter = owned;
+	}
+
+	if (!git_config_get_string("gc.repackfilterto", &owned)) {
+		free(cfg->repack_filter_to);
+		cfg->repack_filter_to = owned;
+	}
 
 	git_config(git_default_config, NULL);
 }
@@ -644,12 +673,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
 	struct maintenance_run_opts opts = {0};
 	struct gc_config cfg = GC_CONFIG_INIT;
+	const char *prune_expire_sentinel = "sentinel";
+	const char *prune_expire_arg = prune_expire_sentinel;
+	int ret;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
-		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
+		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
 			N_("prune unreferenced objects"),
-			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
+			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire_arg },
 		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
 		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
 			      N_("with --cruft, limit the size of new cruft packs")),
@@ -673,8 +705,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	strvec_pushl(&prune_worktrees, "worktree", "prune", "--expire", NULL);
 	strvec_pushl(&rerere, "rerere", "gc", NULL);
 
-	/* default expiry time, overwritten in gc_config */
 	gc_config(&cfg);
+
 	if (parse_expiry_date(cfg.gc_log_expire, &gc_log_expire_time))
 		die(_("failed to parse gc.logExpiry value %s"), cfg.gc_log_expire);
 
@@ -686,6 +718,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (argc > 0)
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
+	if (prune_expire_arg != prune_expire_sentinel) {
+		free(cfg.prune_expire);
+		cfg.prune_expire = xstrdup_or_null(prune_expire_arg);
+	}
 	if (cfg.prune_expire && parse_expiry_date(cfg.prune_expire, &dummy))
 		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
@@ -703,8 +739,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc(&cfg))
-			return 0;
+		if (!need_to_gc(&cfg)) {
+			ret = 0;
+			goto out;
+		}
+
 		if (!quiet) {
 			if (cfg.detach_auto)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
@@ -713,17 +752,22 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (cfg.detach_auto) {
-			int ret = report_last_gc_error();
-
-			if (ret == 1)
+			ret = report_last_gc_error();
+			if (ret == 1) {
 				/* Last gc --auto failed. Skip this one. */
-				return 0;
-			else if (ret)
+				ret = 0;
+				goto out;
+
+			} else if (ret) {
 				/* an I/O error occurred, already reported */
-				return ret;
+				goto out;
+			}
+
+			if (lock_repo_for_gc(force, &pid)) {
+				ret = 0;
+				goto out;
+			}
 
-			if (lock_repo_for_gc(force, &pid))
-				return 0;
 			gc_before_repack(&opts, &cfg); /* dies on failure */
 			delete_tempfile(&pidfile);
 
@@ -749,8 +793,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 	name = lock_repo_for_gc(force, &pid);
 	if (name) {
-		if (opts.auto_flag)
-			return 0; /* be quiet on --auto */
+		if (opts.auto_flag) {
+			ret = 0;
+			goto out; /* be quiet on --auto */
+		}
+
 		die(_("gc is already running on machine '%s' pid %"PRIuMAX" (use --force if not)"),
 		    name, (uintmax_t)pid);
 	}
@@ -826,6 +873,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (!daemonized)
 		unlink(git_path("gc.log"));
 
+out:
+	gc_config_release(&cfg);
 	return 0;
 }
 
@@ -1511,6 +1560,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			PARSE_OPT_NONEG, task_option_parse),
 		OPT_END()
 	};
+	int ret;
+
 	memset(&opts, 0, sizeof(opts));
 
 	opts.quiet = !isatty(2);
@@ -1532,7 +1583,10 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
-	return maintenance_run_tasks(&opts, &cfg);
+
+	ret = maintenance_run_tasks(&opts, &cfg);
+	gc_config_release(&cfg);
+	return ret;
 }
 
 static char *get_maintpath(void)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1f1f664871..e641df0116 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -7,6 +7,7 @@ test_description='prune'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 day=$((60*60*24))
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 4/7] builtin/gc: stop processing log file on signal
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-08-13  7:17 ` [PATCH 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
@ 2024-08-13  7:17 ` Patrick Steinhardt
  2024-08-15  6:01   ` James Liu
  2024-08-13  7:17 ` [PATCH 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC (permalink / raw)
  To: git

When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severely broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c | 8 --------
 1 file changed, 8 deletions(-)
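
Not part of the commit: for readers wondering what a handler may
safely do, here is a minimal sketch assuming only standard C. It is
not what this patch does (the patch drops the handler altogether) and
the names `got_signal`, `note_signal()`, `watch_signal()` and
`pending_signal()` are hypothetical. The handler only stores to a
`volatile sig_atomic_t`; everything that needs malloc or stdio has to
happen later, outside the handler, which is the kind of larger
refactoring the commit message alludes to.

```c
#include <assert.h>
#include <signal.h>

/* The only object type standard C lets a handler write to portably. */
static volatile sig_atomic_t got_signal;

/* Async-signal-safe: a single store, no allocation, no stdio. */
static void note_signal(int signo)
{
	got_signal = signo;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int watch_signal(int signo)
{
	assert(signo > 0);
	return signal(signo, note_signal) == SIG_ERR ? -1 : 0;
}

/*
 * Meant to be polled from normal (non-handler) context, where it is
 * fine to call functions that are not signal safe afterwards.
 */
static int pending_signal(void)
{
	return got_signal;
}
```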

diff --git a/builtin/gc.c b/builtin/gc.c
index a93cfa147e..f815557081 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -109,13 +109,6 @@ static void process_log_file_at_exit(void)
 	process_log_file();
 }
 
-static void process_log_file_on_signal(int signo)
-{
-	process_log_file();
-	sigchain_pop(signo);
-	raise(signo);
-}
-
 static int gc_config_is_timestamp_never(const char *var)
 {
 	const char *value;
@@ -807,7 +800,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					  git_path("gc.log"),
 					  LOCK_DIE_ON_ERROR);
 		dup2(get_lock_file_fd(&log_lock), 2);
-		sigchain_push_common(process_log_file_on_signal);
 		atexit(process_log_file_at_exit);
 	}
 
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 5/7] builtin/gc: add a `--detach` flag
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-08-13  7:17 ` [PATCH 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
@ 2024-08-13  7:17 ` Patrick Steinhardt
  2024-08-13  7:17 ` [PATCH 6/7] builtin/maintenance: " Patrick Steinhardt
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC (permalink / raw)
  To: git

When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-gc.txt |  5 ++-
 builtin/gc.c             | 70 ++++++++++++++++++++++------------------
 t/t6500-gc.sh            | 39 ++++++++++++++++++++++
 3 files changed, 82 insertions(+), 32 deletions(-)
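
Not part of the commit: the precedence between the new flag and
`gc.autoDetach` can be summarized by the following standalone sketch.
`should_detach()` is a hypothetical helper that does not exist in git;
it only mirrors the two conditions that the diff below adds to
`cmd_gc()`, with -1 meaning that neither `--detach` nor `--no-detach`
was given.

```c
#include <assert.h>

/*
 * cli_detach:  -1 if the flag was not given, 0 for --no-detach,
 *              1 for --detach
 * auto_flag:   non-zero when --auto was given
 * auto_detach: value of gc.autoDetach
 * Returns non-zero if the process should detach.
 */
static int should_detach(int cli_detach, int auto_flag, int auto_detach)
{
	assert(cli_detach >= -1 && cli_detach <= 1);

	/* The config only applies to --auto, and only if the flag is unset. */
	if (auto_flag && auto_detach && cli_detach < 0)
		cli_detach = 1;

	return cli_detach > 0;
}
```

Under this reading an explicit flag always wins over the config, and
a plain `git gc` without `--auto` stays in the foreground unless
`--detach` is passed.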

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index b5561c458a..370e22faae 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -9,7 +9,7 @@ git-gc - Cleanup unnecessary files and optimize the local repository
 SYNOPSIS
 --------
 [verse]
-'git gc' [--aggressive] [--auto] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
+'git gc' [--aggressive] [--auto] [--[no-]detach] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
 
 DESCRIPTION
 -----------
@@ -53,6 +53,9 @@ configuration options such as `gc.auto` and `gc.autoPackLimit`, all
 other housekeeping tasks (e.g. rerere, working trees, reflog...) will
 be performed as well.
 
+--[no-]detach::
+	Run in the background if the system supports it. This option overrides
+	the `gc.autoDetach` config.
 
 --[no-]cruft::
 	When expiring unreachable objects, pack them separately into a
diff --git a/builtin/gc.c b/builtin/gc.c
index f815557081..269a77960f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -242,9 +242,13 @@ static enum schedule_priority parse_schedule(const char *value)
 
 struct maintenance_run_opts {
 	int auto_flag;
+	int detach;
 	int quiet;
 	enum schedule_priority schedule;
 };
+#define MAINTENANCE_RUN_OPTS_INIT { \
+	.detach = -1, \
+}
 
 static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
@@ -664,7 +668,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	int keep_largest_pack = -1;
 	timestamp_t dummy;
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
-	struct maintenance_run_opts opts = {0};
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	const char *prune_expire_sentinel = "sentinel";
 	const char *prune_expire_arg = prune_expire_sentinel;
@@ -681,6 +685,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")),
 		OPT_BOOL_F(0, "auto", &opts.auto_flag, N_("enable auto-gc mode"),
 			   PARSE_OPT_NOCOMPLETE),
+		OPT_BOOL(0, "detach", &opts.detach,
+			 N_("perform garbage collection in the background")),
 		OPT_BOOL_F(0, "force", &force,
 			   N_("force running gc even if there may be another gc running"),
 			   PARSE_OPT_NOCOMPLETE),
@@ -729,6 +735,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		strvec_push(&repack, "-q");
 
 	if (opts.auto_flag) {
+		if (cfg.detach_auto && opts.detach < 0)
+			opts.detach = 1;
+
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
@@ -738,38 +747,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		}
 
 		if (!quiet) {
-			if (cfg.detach_auto)
+			if (opts.detach > 0)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
 			else
 				fprintf(stderr, _("Auto packing the repository for optimum performance.\n"));
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
-		if (cfg.detach_auto) {
-			ret = report_last_gc_error();
-			if (ret == 1) {
-				/* Last gc --auto failed. Skip this one. */
-				ret = 0;
-				goto out;
-
-			} else if (ret) {
-				/* an I/O error occurred, already reported */
-				goto out;
-			}
-
-			if (lock_repo_for_gc(force, &pid)) {
-				ret = 0;
-				goto out;
-			}
-
-			gc_before_repack(&opts, &cfg); /* dies on failure */
-			delete_tempfile(&pidfile);
-
-			/*
-			 * failure to daemonize is ok, we'll continue
-			 * in foreground
-			 */
-			daemonized = !daemonize();
-		}
 	} else {
 		struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
@@ -784,6 +767,33 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		string_list_clear(&keep_pack, 0);
 	}
 
+	if (opts.detach > 0) {
+		ret = report_last_gc_error();
+		if (ret == 1) {
+			/* Last gc --auto failed. Skip this one. */
+			ret = 0;
+			goto out;
+
+		} else if (ret) {
+			/* an I/O error occurred, already reported */
+			goto out;
+		}
+
+		if (lock_repo_for_gc(force, &pid)) {
+			ret = 0;
+			goto out;
+		}
+
+		gc_before_repack(&opts, &cfg); /* dies on failure */
+		delete_tempfile(&pidfile);
+
+		/*
+		 * failure to daemonize is ok, we'll continue
+		 * in foreground
+		 */
+		daemonized = !daemonize();
+	}
+
 	name = lock_repo_for_gc(force, &pid);
 	if (name) {
 		if (opts.auto_flag) {
@@ -1537,7 +1547,7 @@ static int task_option_parse(const struct option *opt UNUSED,
 static int maintenance_run(int argc, const char **argv, const char *prefix)
 {
 	int i;
-	struct maintenance_run_opts opts;
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
@@ -1554,8 +1564,6 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	};
 	int ret;
 
-	memset(&opts, 0, sizeof(opts));
-
 	opts.quiet = !isatty(2);
 
 	for (i = 0; i < TASK__COUNT; i++)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 1b5909d1b7..737c99e0f8 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -396,6 +396,45 @@ test_expect_success 'background auto gc respects lock for all operations' '
 	test_cmp expect actual
 '
 
+test_expect_success '--detach overrides gc.autoDetach=false' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-gc(1) ends up repacking.
+		test_commit "$(test_oid blob17_1)" &&
+		test_commit "$(test_oid blob17_2)" &&
+		git config gc.autodetach false &&
+		git config gc.auto 2 &&
+
+		cat >expect <<-EOF &&
+		Auto packing the repository in background for optimum performance.
+		See "git help gc" for manual housekeeping.
+		EOF
+		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success '--no-detach overrides gc.autoDetach=true' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-gc(1) ends up repacking.
+		test_commit "$(test_oid blob17_1)" &&
+		test_commit "$(test_oid blob17_2)" &&
+		git config gc.autodetach true &&
+		git config gc.auto 2 &&
+
+		GIT_PROGRESS_DELAY=0 git gc --auto --no-detach 2>output &&
+		test_grep "Auto packing the repository for optimum performance." output &&
+		test_grep "Collecting referenced commits: 2, done." output
+	)
+'
+
 # DO NOT leave a detached auto gc process running near the end of the
 # test script: it can run long enough in the background to racily
 # interfere with the cleanup in 'test_done'.
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 6/7] builtin/maintenance: add a `--detach` flag
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2024-08-13  7:17 ` [PATCH 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
@ 2024-08-13  7:17 ` Patrick Steinhardt
  2024-08-13  7:18 ` [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:17 UTC (permalink / raw)
  To: git

Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c           |  6 ++++++
 t/t7900-maintenance.sh | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index 269a77960f..63106e2028 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1426,6 +1426,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
 	}
 	free(lock_path);
 
+	/* Failure to daemonize is ok, we'll continue in foreground. */
+	if (opts->detach > 0)
+		daemonize();
+
 	for (i = 0; !found_selected && i < TASK__COUNT; i++)
 		found_selected = tasks[i].selected_order >= 0;
 
@@ -1552,6 +1556,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_BOOL(0, "detach", &opts.detach,
+			 N_("perform maintenance in the background")),
 		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
 			     N_("run tasks based on frequency"),
 			     maintenance_opt_schedule),
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8595489ceb..771525aa4b 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -908,4 +908,43 @@ test_expect_success 'failed schedule prevents config change' '
 	done
 '
 
+test_expect_success '--no-detach causes maintenance to not run in background' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-maintenance(1) ends up
+		# outputting something.
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+		git config set maintenance.incremental-repack.enabled true &&
+
+		# We have no better way to check whether or not the task ran in
+		# the background than to verify whether it output anything. The
+		# next testcase checks the reverse, making this somewhat safer.
+		git maintenance run --no-detach >out 2>&1 &&
+		test_line_count = 1 out
+	)
+'
+
+test_expect_success '--detach causes maintenance to run in background' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+		git config set maintenance.incremental-repack.enabled true &&
+
+		git maintenance run --detach >out 2>&1 &&
+		test_must_be_empty out
+	)
+'
+
 test_done
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2024-08-13  7:17 ` [PATCH 6/7] builtin/maintenance: " Patrick Steinhardt
@ 2024-08-13  7:18 ` Patrick Steinhardt
  2024-08-13 11:29   ` Phillip Wood
                     ` (2 more replies)
  2024-08-15  6:42 ` [PATCH 0/7] " James Liu
                   ` (3 subsequent siblings)
  10 siblings, 3 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13  7:18 UTC (permalink / raw)
  To: git

In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each other's feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, the latter command now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background if not told otherwise. This should
continue to behave the same for all users who only use the git-gc(1)
task. For others though, it means that we now properly perform all
tasks in the background.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c             |  1 +
 run-command.c            | 12 ++++++++++-
 t/t5616-partial-clone.sh |  6 +++---
 t/t7900-maintenance.sh   | 43 +++++++++++++++++++++++++++++++---------
 4 files changed, 49 insertions(+), 13 deletions(-)
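
Not part of the commit: the config fallback can be sketched in
isolation as below. `struct bool_setting`, `lookup_bool()` and
`auto_detach_setting()` are hypothetical stand-ins, not git APIs; the
only thing taken from the patch is the convention that the lookup
returns 0 when the key is set and non-zero otherwise, which is what
makes the chained `&&` work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a config store holding boolean settings. */
struct bool_setting {
	const char *key;
	int value;
};

/* Returns 0 and fills `out` when `key` is set, 1 when it is unset. */
static int lookup_bool(const struct bool_setting *cfg, size_t n,
		       const char *key, int *out)
{
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(cfg[i].key, key)) {
			*out = cfg[i].value;
			return 0;
		}
	}
	return 1;
}

/*
 * maintenance.autodetach wins if set; otherwise gc.autodetach is
 * honored; if neither is set we default to detaching.
 */
static int auto_detach_setting(const struct bool_setting *cfg, size_t n)
{
	int auto_detach;

	assert(cfg || !n);

	if (lookup_bool(cfg, n, "maintenance.autodetach", &auto_detach) &&
	    lookup_bool(cfg, n, "gc.autodetach", &auto_detach))
		auto_detach = 1;

	return auto_detach;
}
```

The second lookup only runs when the first one reports the key as
unset, so `gc.autodetach` can never overwrite a configured
`maintenance.autodetach`.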

diff --git a/builtin/gc.c b/builtin/gc.c
index 63106e2028..bafee330a2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1063,6 +1063,7 @@ static int maintenance_task_gc(struct maintenance_run_opts *opts,
 		strvec_push(&child.args, "--quiet");
 	else
 		strvec_push(&child.args, "--no-quiet");
+	strvec_push(&child.args, "--no-detach");
 
 	return run_command(&child);
 }
diff --git a/run-command.c b/run-command.c
index 45ba544932..94f2f3079f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 
 int prepare_auto_maintenance(int quiet, struct child_process *maint)
 {
-	int enabled;
+	int enabled, auto_detach;
 
 	if (!git_config_get_bool("maintenance.auto", &enabled) &&
 	    !enabled)
 		return 0;
 
+	/*
+	 * When `maintenance.autoDetach` isn't set, then we fall back to
+	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
+	 * retain behaviour from when we used to run git-gc(1) here.
+	 */
+	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
+	    git_config_get_bool("gc.autodetach", &auto_detach))
+		auto_detach = 1;
+
 	maint->git_cmd = 1;
 	maint->close_object_store = 1;
 	strvec_pushl(&maint->args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint->args, quiet ? "--quiet" : "--no-quiet");
+	strvec_push(&maint->args, auto_detach ? "--detach" : "--no-detach");
 
 	return 1;
 }
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 2da7291e37..8415884754 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -229,7 +229,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 
 	GIT_TRACE2_EVENT="$PWD/trace1.event" \
 	git -C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace1.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace1.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace1.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace1.event &&
 
@@ -238,7 +238,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 		-c gc.autoPackLimit=0 \
 		-c maintenance.incremental-repack.auto=1234 \
 		-C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace2.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace2.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"0\" trace2.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace2.event &&
 
@@ -247,7 +247,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 		-c gc.autoPackLimit=1234 \
 		-c maintenance.incremental-repack.auto=0 \
 		-C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace3.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace3.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace3.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"0\" trace3.event
 '
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 771525aa4b..06ab43cfb5 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -49,22 +49,47 @@ test_expect_success 'run [--auto|--quiet]' '
 		git maintenance run --auto 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
 		git maintenance run --no-quiet 2>/dev/null &&
-	test_subcommand git gc --quiet <run-no-auto.txt &&
-	test_subcommand ! git gc --auto --quiet <run-auto.txt &&
-	test_subcommand git gc --no-quiet <run-no-quiet.txt
+	test_subcommand git gc --quiet --no-detach <run-no-auto.txt &&
+	test_subcommand ! git gc --auto --quiet --no-detach <run-auto.txt &&
+	test_subcommand git gc --no-quiet --no-detach <run-no-quiet.txt
 '
 
 test_expect_success 'maintenance.auto config option' '
 	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
-	test_subcommand git maintenance run --auto --quiet <default &&
+	test_subcommand git maintenance run --auto --quiet --detach <default &&
 	GIT_TRACE2_EVENT="$(pwd)/true" \
 		git -c maintenance.auto=true \
 		commit --quiet --allow-empty -m 2 &&
-	test_subcommand git maintenance run --auto --quiet  <true &&
+	test_subcommand git maintenance run --auto --quiet --detach <true &&
 	GIT_TRACE2_EVENT="$(pwd)/false" \
 		git -c maintenance.auto=false \
 		commit --quiet --allow-empty -m 3 &&
-	test_subcommand ! git maintenance run --auto --quiet  <false
+	test_subcommand ! git maintenance run --auto --quiet --detach <false
+'
+
+for cfg in maintenance.autoDetach gc.autoDetach
+do
+	test_expect_success "$cfg=true config option" '
+		test_when_finished "rm -f trace" &&
+		test_config $cfg true &&
+		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+		test_subcommand git maintenance run --auto --quiet --detach <trace
+	'
+
+	test_expect_success "$cfg=false config option" '
+		test_when_finished "rm -f trace" &&
+		test_config $cfg false &&
+		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+		test_subcommand git maintenance run --auto --quiet --no-detach <trace
+	'
+done
+
+test_expect_success "maintenance.autoDetach overrides gc.autoDetach" '
+	test_when_finished "rm -f trace" &&
+	test_config maintenance.autoDetach false &&
+	test_config gc.autoDetach true &&
+	GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet --no-detach <trace
 '
 
 test_expect_success 'register uses XDG_CONFIG_HOME config if it exists' '
@@ -129,9 +154,9 @@ test_expect_success 'run --task=<task>' '
 		git maintenance run --task=commit-graph 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-both.txt" \
 		git maintenance run --task=commit-graph --task=gc 2>/dev/null &&
-	test_subcommand ! git gc --quiet <run-commit-graph.txt &&
-	test_subcommand git gc --quiet <run-gc.txt &&
-	test_subcommand git gc --quiet <run-both.txt &&
+	test_subcommand ! git gc --quiet --no-detach <run-commit-graph.txt &&
+	test_subcommand git gc --quiet --no-detach <run-gc.txt &&
+	test_subcommand git gc --quiet --no-detach <run-both.txt &&
 	test_subcommand git commit-graph write --split --reachable --no-progress <run-commit-graph.txt &&
 	test_subcommand ! git commit-graph write --split --reachable --no-progress <run-gc.txt &&
 	test_subcommand git commit-graph write --split --reachable --no-progress <run-both.txt
-- 
2.46.0.46.g406f326d27.dirty



* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:18 ` [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
@ 2024-08-13 11:29   ` Phillip Wood
  2024-08-13 11:59     ` Patrick Steinhardt
  2024-08-15  6:40   ` James Liu
  2024-08-15 14:00   ` Derrick Stolee
  2 siblings, 1 reply; 79+ messages in thread
From: Phillip Wood @ 2024-08-13 11:29 UTC (permalink / raw)
  To: Patrick Steinhardt, git

Hi Patrick

On 13/08/2024 08:18, Patrick Steinhardt wrote:
>
> Fix this bug by asking git-gc(1) to not detach when it is being invoked
> via git-maintenance(1). Instead, the latter command now respects a new
> config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> detaches itself into the background if not told otherwise. This should
> continue to behave the same for all users who only use the git-gc(1)
> task. For others though, it means that we now properly perform all
> tasks in the background.

I fear that users who are running "git maintenance" from a scheduler 
such as cron are likely to be surprised by this change in behavior. At 
the very least "git maintenance" will no longer return a meaningful exit 
code. Perhaps we could switch the logic to be opt-in and pass "--detach" 
(or "-c maintenance.autoDetach=true") when running "git maintenance" 
automatically from "git rebase" etc.

Best Wishes

Phillip

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>   builtin/gc.c             |  1 +
>   run-command.c            | 12 ++++++++++-
>   t/t5616-partial-clone.sh |  6 +++---
>   t/t7900-maintenance.sh   | 43 +++++++++++++++++++++++++++++++---------
>   4 files changed, 49 insertions(+), 13 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 63106e2028..bafee330a2 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1063,6 +1063,7 @@ static int maintenance_task_gc(struct maintenance_run_opts *opts,
>   		strvec_push(&child.args, "--quiet");
>   	else
>   		strvec_push(&child.args, "--no-quiet");
> +	strvec_push(&child.args, "--no-detach");
>   
>   	return run_command(&child);
>   }
> diff --git a/run-command.c b/run-command.c
> index 45ba544932..94f2f3079f 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
>   
>   int prepare_auto_maintenance(int quiet, struct child_process *maint)
>   {
> -	int enabled;
> +	int enabled, auto_detach;
>   
>   	if (!git_config_get_bool("maintenance.auto", &enabled) &&
>   	    !enabled)
>   		return 0;
>   
> +	/*
> +	 * When `maintenance.autoDetach` isn't set, then we fall back to
> +	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
> +	 * retain behaviour from when we used to run git-gc(1) here.
> +	 */
> +	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
> +	    git_config_get_bool("gc.autodetach", &auto_detach))
> +		auto_detach = 1;
> +
>   	maint->git_cmd = 1;
>   	maint->close_object_store = 1;
>   	strvec_pushl(&maint->args, "maintenance", "run", "--auto", NULL);
>   	strvec_push(&maint->args, quiet ? "--quiet" : "--no-quiet");
> +	strvec_push(&maint->args, auto_detach ? "--detach" : "--no-detach");
>   
>   	return 1;
>   }
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index 2da7291e37..8415884754 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -229,7 +229,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
>   
>   	GIT_TRACE2_EVENT="$PWD/trace1.event" \
>   	git -C pc1 fetch --refetch origin &&
> -	test_subcommand git maintenance run --auto --no-quiet <trace1.event &&
> +	test_subcommand git maintenance run --auto --no-quiet --detach <trace1.event &&
>   	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace1.event &&
>   	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace1.event &&
>   
> @@ -238,7 +238,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
>   		-c gc.autoPackLimit=0 \
>   		-c maintenance.incremental-repack.auto=1234 \
>   		-C pc1 fetch --refetch origin &&
> -	test_subcommand git maintenance run --auto --no-quiet <trace2.event &&
> +	test_subcommand git maintenance run --auto --no-quiet --detach <trace2.event &&
>   	grep \"param\":\"gc.autopacklimit\",\"value\":\"0\" trace2.event &&
>   	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace2.event &&
>   
> @@ -247,7 +247,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
>   		-c gc.autoPackLimit=1234 \
>   		-c maintenance.incremental-repack.auto=0 \
>   		-C pc1 fetch --refetch origin &&
> -	test_subcommand git maintenance run --auto --no-quiet <trace3.event &&
> +	test_subcommand git maintenance run --auto --no-quiet --detach <trace3.event &&
>   	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace3.event &&
>   	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"0\" trace3.event
>   '
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 771525aa4b..06ab43cfb5 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -49,22 +49,47 @@ test_expect_success 'run [--auto|--quiet]' '
>   		git maintenance run --auto 2>/dev/null &&
>   	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
>   		git maintenance run --no-quiet 2>/dev/null &&
> -	test_subcommand git gc --quiet <run-no-auto.txt &&
> -	test_subcommand ! git gc --auto --quiet <run-auto.txt &&
> -	test_subcommand git gc --no-quiet <run-no-quiet.txt
> +	test_subcommand git gc --quiet --no-detach <run-no-auto.txt &&
> +	test_subcommand ! git gc --auto --quiet --no-detach <run-auto.txt &&
> +	test_subcommand git gc --no-quiet --no-detach <run-no-quiet.txt
>   '
>   
>   test_expect_success 'maintenance.auto config option' '
>   	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
> -	test_subcommand git maintenance run --auto --quiet <default &&
> +	test_subcommand git maintenance run --auto --quiet --detach <default &&
>   	GIT_TRACE2_EVENT="$(pwd)/true" \
>   		git -c maintenance.auto=true \
>   		commit --quiet --allow-empty -m 2 &&
> -	test_subcommand git maintenance run --auto --quiet  <true &&
> +	test_subcommand git maintenance run --auto --quiet --detach <true &&
>   	GIT_TRACE2_EVENT="$(pwd)/false" \
>   		git -c maintenance.auto=false \
>   		commit --quiet --allow-empty -m 3 &&
> -	test_subcommand ! git maintenance run --auto --quiet  <false
> +	test_subcommand ! git maintenance run --auto --quiet --detach <false
> +'
> +
> +for cfg in maintenance.autoDetach gc.autoDetach
> +do
> +	test_expect_success "$cfg=true config option" '
> +		test_when_finished "rm -f trace" &&
> +		test_config $cfg true &&
> +		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
> +		test_subcommand git maintenance run --auto --quiet --detach <trace
> +	'
> +
> +	test_expect_success "$cfg=false config option" '
> +		test_when_finished "rm -f trace" &&
> +		test_config $cfg false &&
> +		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
> +		test_subcommand git maintenance run --auto --quiet --no-detach <trace
> +	'
> +done
> +
> +test_expect_success "maintenance.autoDetach overrides gc.autoDetach" '
> +	test_when_finished "rm -f trace" &&
> +	test_config maintenance.autoDetach false &&
> +	test_config gc.autoDetach true &&
> +	GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
> +	test_subcommand git maintenance run --auto --quiet --no-detach <trace
>   '
>   
>   test_expect_success 'register uses XDG_CONFIG_HOME config if it exists' '
> @@ -129,9 +154,9 @@ test_expect_success 'run --task=<task>' '
>   		git maintenance run --task=commit-graph 2>/dev/null &&
>   	GIT_TRACE2_EVENT="$(pwd)/run-both.txt" \
>   		git maintenance run --task=commit-graph --task=gc 2>/dev/null &&
> -	test_subcommand ! git gc --quiet <run-commit-graph.txt &&
> -	test_subcommand git gc --quiet <run-gc.txt &&
> -	test_subcommand git gc --quiet <run-both.txt &&
> +	test_subcommand ! git gc --quiet --no-detach <run-commit-graph.txt &&
> +	test_subcommand git gc --quiet --no-detach <run-gc.txt &&
> +	test_subcommand git gc --quiet --no-detach <run-both.txt &&
>   	test_subcommand git commit-graph write --split --reachable --no-progress <run-commit-graph.txt &&
>   	test_subcommand ! git commit-graph write --split --reachable --no-progress <run-gc.txt &&
>   	test_subcommand git commit-graph write --split --reachable --no-progress <run-both.txt

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13 11:29   ` Phillip Wood
@ 2024-08-13 11:59     ` Patrick Steinhardt
  2024-08-13 13:19       ` Phillip Wood
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-13 11:59 UTC (permalink / raw)
  To: phillip.wood; +Cc: git

On Tue, Aug 13, 2024 at 12:29:47PM +0100, Phillip Wood wrote:
> Hi Patrick
> 
> On 13/08/2024 08:18, Patrick Steinhardt wrote:
> > 
> > Fix this bug by asking git-gc(1) to not detach when it is being invoked
> > via git-maintenance(1). Instead, the latter command now respects a new
> > config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> > detaches itself into the background if not told otherwise. This should
> > continue to behave the same for all users who only use the git-gc(1)
> > task. For others though, it means that we now properly perform all tasks
> > in the background.
> 
> I fear that users who are running "git maintenance" from a scheduler such as
> cron are likely to be surprised by this change in behavior. At the very
> least "git maintenance" will no longer return a meaningful exit code.
> Perhaps we could switch the logic to be opt in and pass "--detach" (or "-c
> maintenance.autoDetach=true") when running "git maintenance" automatically
> from "git rebase" etc.

It's actually the reverse: the old behaviour when run via a scheduler
was to detach by default, because git-gc(1) did. We now ask it to not
detach anymore, which fixes this. Furthermore, the default behaviour of
`git maintenance run` did not change either: it stays in the foreground
unless the `--detach` flag is given. So the thing you worry about is
actually getting fixed by this series :)

What _does_ change though is when we run `git maintenance` via our
auto-maintenance framework. Here we now do detach the whole maintenance
process, instead of only git-gc(1). This logic is only being executed by
random commands (git-rebase, git-pull, git-commit etc), and I'd argue it
is the expected behaviour.
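
As a rough illustration of the decision described above, the fallback
order used when auto-maintenance picks between `--detach` and
`--no-detach` can be sketched in isolation. This is a minimal sketch,
not the code from the patch: `struct cfg_entry`, `lookup_bool()` and
`detach_flag()` are hypothetical stand-ins, and `lookup_bool()` only
mimics the "0 means found" convention that the patch's use of
`git_config_get_bool()` suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a config store: a flat list of boolean keys. */
struct cfg_entry {
	const char *key;
	int value;
};

/*
 * Assumed convention, mirroring how the patch uses git_config_get_bool():
 * return 0 and fill *out when the key exists, non-zero otherwise.
 */
static int lookup_bool(const struct cfg_entry *cfg, size_t n,
		       const char *key, int *out)
{
	assert(cfg || !n);
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(cfg[i].key, key)) {
			*out = cfg[i].value;
			return 0;
		}
	}
	return 1;
}

/*
 * Prefer maintenance.autodetach, fall back to gc.autodetach, and detach
 * by default when neither key is set.
 */
static const char *detach_flag(const struct cfg_entry *cfg, size_t n)
{
	int auto_detach;

	if (lookup_bool(cfg, n, "maintenance.autodetach", &auto_detach) &&
	    lookup_bool(cfg, n, "gc.autodetach", &auto_detach))
		auto_detach = 1;

	return auto_detach ? "--detach" : "--no-detach";
}
```

The short-circuiting `&&` is what gives `maintenance.autodetach`
precedence: the second lookup only runs when the first key is missing.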

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13 11:59     ` Patrick Steinhardt
@ 2024-08-13 13:19       ` Phillip Wood
  2024-08-14  4:15         ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Phillip Wood @ 2024-08-13 13:19 UTC (permalink / raw)
  To: Patrick Steinhardt, phillip.wood; +Cc: git

On 13/08/2024 12:59, Patrick Steinhardt wrote:
> On Tue, Aug 13, 2024 at 12:29:47PM +0100, Phillip Wood wrote:
>> Hi Patrick
>>
>> On 13/08/2024 08:18, Patrick Steinhardt wrote:
>>>
>>> Fix this bug by asking git-gc(1) to not detach when it is being invoked
>>> via git-maintenance(1). Instead, the latter command now respects a new
>>> config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
>>> detaches itself into the background if not told otherwise. This should
>>> continue to behave the same for all users who only use the git-gc(1)
>>> task. For others though, it means that we now properly perform all tasks
>>> in the background.
>>
>> I fear that users who are running "git maintenance" from a scheduler such as
>> cron are likely to be surprised by this change in behavior. At the very
>> least "git maintenance" will no longer return a meaningful exit code.
>> Perhaps we could switch the logic to be opt in and pass "--detach" (or "-c
>> maintenance.autoDetach=true") when running "git maintenance" automatically
>> from "git rebase" etc.
> 
> It's actually the reverse: the old behaviour when run via a scheduler
> was to detach by default, because git-gc(1) did.

Oh, I misunderstood what this patch is changing. So despite being 
tagged builtin/maintenance and talking about "git maintenance" it does 
not actually touch builtin/maintenance.c or change its behavior. What it 
is actually doing is changing how other git commands run "git 
maintenance --auto" so that it is always run in the background unless 
the user configures maintenance.autoDetach=false. That sounds like a 
good change.

Thanks for clarifying

Phillip

> We now ask it to not
> detach anymore, which fixes this. Furthermore, the default behaviour of
> `git maintenance run` did not change either: it stays in the foreground
> unless the `--detach` flag is given. So the thing you worry about is
> actually getting fixed by this series :)
> 
> What _does_ change though is when we run `git maintenance` via our
> auto-maintenance framework. Here we now do detach the whole maintenance
> process, instead of only git-gc(1). This logic is only being executed by
> random commands (git-rebase, git-pull, git-commit etc), and I'd argue it
> is the expected behaviour.
> 
> Patrick
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13 13:19       ` Phillip Wood
@ 2024-08-14  4:15         ` Patrick Steinhardt
  2024-08-14 15:13           ` Phillip Wood
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-14  4:15 UTC (permalink / raw)
  To: phillip.wood; +Cc: git

On Tue, Aug 13, 2024 at 02:19:20PM +0100, Phillip Wood wrote:
> On 13/08/2024 12:59, Patrick Steinhardt wrote:
> > On Tue, Aug 13, 2024 at 12:29:47PM +0100, Phillip Wood wrote:
> > > Hi Patrick
> > > 
> > > On 13/08/2024 08:18, Patrick Steinhardt wrote:
> > > > 
> > > > Fix this bug by asking git-gc(1) to not detach when it is being invoked
> > > > via git-maintenance(1). Instead, the latter command now respects a new
> > > > config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> > > > detaches itself into the background if not told otherwise. This should
> > > > continue to behave the same for all users who only use the git-gc(1)
> > > > task. For others though, it means that we now properly perform all tasks
> > > > in the background.
> > > 
> > > I fear that users who are running "git maintenance" from a scheduler such as
> > > cron are likely to be surprised by this change in behavior. At the very
> > > least "git maintenance" will no longer return a meaningful exit code.
> > > Perhaps we could switch the logic to be opt in and pass "--detach" (or "-c
> > > maintenance.autoDetach=true") when running "git maintenance" automatically
> > > from "git rebase" etc.
> > 
> > It's actually the reverse: the old behaviour when run via a scheduler
> > was to detach by default, because git-gc(1) did.
> 
> Oh, I misunderstood what this patch is changing. So despite being tagged
> builtin/maintenance and talking about "git maintenance" it does not actually
> touch builtin/maintenance.c or change its behavior. What it is actually
> doing is changing how other git commands run "git maintenance --auto" so
> that it is always run in the background unless the user configures
> maintenance.autoDetach=false. That sounds like a good change.
> 
> Thanks for clarifying

Yes. I should've probably prefixed this with "run-command:", not with
"builtin/maintenance".

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-14  4:15         ` Patrick Steinhardt
@ 2024-08-14 15:13           ` Phillip Wood
  2024-08-15  5:30             ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Phillip Wood @ 2024-08-14 15:13 UTC (permalink / raw)
  To: Patrick Steinhardt, phillip.wood; +Cc: git

On 14/08/2024 05:15, Patrick Steinhardt wrote:
> On Tue, Aug 13, 2024 at 02:19:20PM +0100, Phillip Wood wrote:
>> On 13/08/2024 12:59, Patrick Steinhardt wrote:
>>
>> Oh, I misunderstood what this patch is changing. So despite being tagged
>> builtin/maintenance and talking about "git maintenance" it does not actually
>> touch builtin/maintenance.c or change its behavior. What it is actually
>> doing is changing how other git commands run "git maintenance --auto" so
>> that it is always run in the background unless the user configures
>> maintenance.autoDetach=false. That sounds like a good change.
>>
>> Thanks for clarifying

Sorry my message sounds grumpier than I intended.

> Yes. I should've probably prefixed this with "run-command:", not with
> "builtin/maintenance".

That's a good idea, I think the important thing we want to convey is 
that we're changing the way we run "git maintenance --auto", not the 
behavior of "git maintenance" itself.

Thanks

Phillip

> Patrick
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/7] builtin/gc: fix leaking config values
  2024-08-13  7:17 ` [PATCH 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
@ 2024-08-15  5:22   ` James Liu
  2024-08-15  8:18     ` Patrick Steinhardt
  2024-08-15 13:50   ` Derrick Stolee
  1 sibling, 1 reply; 79+ messages in thread
From: James Liu @ 2024-08-15  5:22 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> Note that there is one small gotcha here with the "--prune" option. Next
> to passing a string, this option also accepts the "--no-prune" option
> that overrides the default or configured value. We thus need to discern
> between the option not having been passed by the user and the negative
> variant of it. This is done by using a simple sentinel value that lets
> us discern these cases.
>
> @@ -644,12 +673,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
>  	struct maintenance_run_opts opts = {0};
>  	struct gc_config cfg = GC_CONFIG_INIT;
> +	const char *prune_expire_sentinel = "sentinel";
> +	const char *prune_expire_arg = prune_expire_sentinel;
> +	int ret;
>  
>  	struct option builtin_gc_options[] = {
>  		OPT__QUIET(&quiet, N_("suppress progress reporting")),
> -		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
> +		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
>  			N_("prune unreferenced objects"),
> -			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
> +			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire_arg },
>  		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
>  		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
>  			      N_("with --cruft, limit the size of new cruft packs")),

I was wondering how the `no-*` options worked since they're not
explicitly defined in the `builtin_gc_options` array. I guess
they're handled internally by `parse_options()`, and if the `no-` prefix
was present, it leaves that argument unset.


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 2/7] builtin/gc: refactor to read config into structure
  2024-08-13  7:17 ` [PATCH 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
@ 2024-08-15  5:24   ` James Liu
  2024-08-15  8:18     ` Patrick Steinhardt
  2024-08-15 13:46   ` Derrick Stolee
  1 sibling, 1 reply; 79+ messages in thread
From: James Liu @ 2024-08-15  5:24 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> @@ -206,7 +224,7 @@ struct maintenance_run_opts {
>  	enum schedule_priority schedule;
>  };
>  
> -static int pack_refs_condition(void)
> +static int pack_refs_condition(UNUSED struct gc_config *cfg)
>  {
>  	/*
>  	 * The auto-repacking logic for refs is handled by the ref backends and
> @@ -216,7 +234,8 @@ static int pack_refs_condition(void)
>  	return 1;
>  }
>  
> -static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
> +static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts,
> +				      UNUSED struct gc_config *cfg)
>  {
>  	struct child_process cmd = CHILD_PROCESS_INIT;
>  

Are we defining *cfg as an unused argument to conform to the
`maintenance_task_fn` signature?


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-14 15:13           ` Phillip Wood
@ 2024-08-15  5:30             ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  5:30 UTC (permalink / raw)
  To: phillip.wood; +Cc: git

On Wed, Aug 14, 2024 at 04:13:05PM +0100, Phillip Wood wrote:
> On 14/08/2024 05:15, Patrick Steinhardt wrote:
> > On Tue, Aug 13, 2024 at 02:19:20PM +0100, Phillip Wood wrote:
> > > On 13/08/2024 12:59, Patrick Steinhardt wrote:
> > > 
> > > Oh, I misunderstood what this patch is changing. So despite being tagged
> > > builtin/maintenance and talking about "git maintenance" it does not actually
> > > touch builtin/maintenance.c or change its behavior. What it is actually
> > > doing is changing how other git commands run "git maintenance --auto" so
> > > that it is always run in the background unless the user configures
> > > maintenance.autoDetach=false. That sounds like a good change.
> > > 
> > > Thanks for clarifying
> 
> Sorry my message sounds grumpier than I intended.

No worries, I didn't perceive it as grumpy.

> > Yes. I should've probably prefixed this with "run-command:", not with
> > "builtin/maintenance".
> 
> That's a good idea, I think the important thing we want to convey is that
> we're changing the way we run "git maintenance --auto", not the behavior of
> "git maintenance" itself.

I'll have a look in case I need to reroll this series.

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 4/7] builtin/gc: stop processing log file on signal
  2024-08-13  7:17 ` [PATCH 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
@ 2024-08-15  6:01   ` James Liu
  0 siblings, 0 replies; 79+ messages in thread
From: James Liu @ 2024-08-15  6:01 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> The consequence is that "gc.log" will not be committed, and thus
> subsequent calls to `git gc --auto` won't bail out because of this.
> Arguably though, it is better to retry garbage collection rather than
> having the process run into a potentially-corrupted state.

Ahh I see, because `report_last_gc_error()` won't have anything to read.
I agree it's an appropriate tradeoff given that garbage collection is
not on the critical path, and it's not likely that GC will be
interrupted on every attempt.

> @@ -807,7 +800,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  					  git_path("gc.log"),
>  					  LOCK_DIE_ON_ERROR);
>  		dup2(get_lock_file_fd(&log_lock), 2);
> -		sigchain_push_common(process_log_file_on_signal);
>  		atexit(process_log_file_at_exit);
>  	}
>  


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:18 ` [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
  2024-08-13 11:29   ` Phillip Wood
@ 2024-08-15  6:40   ` James Liu
  2024-08-15  8:17     ` Patrick Steinhardt
  2024-08-15 14:00   ` Derrick Stolee
  2 siblings, 1 reply; 79+ messages in thread
From: James Liu @ 2024-08-15  6:40 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 13, 2024 at 5:18 PM AEST, Patrick Steinhardt wrote:
> diff --git a/run-command.c b/run-command.c
> index 45ba544932..94f2f3079f 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
>  
>  int prepare_auto_maintenance(int quiet, struct child_process *maint)
>  {
> -	int enabled;
> +	int enabled, auto_detach;
>  
>  	if (!git_config_get_bool("maintenance.auto", &enabled) &&
>  	    !enabled)
>  		return 0;
>  
> +	/*
> +	 * When `maintenance.autoDetach` isn't set, then we fall back to
> +	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
> +	 * retain behaviour from when we used to run git-gc(1) here.
> +	 */
> +	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
> +	    git_config_get_bool("gc.autodetach", &auto_detach))
> +		auto_detach = 1;
> +

Do the two `*.autodetach` values need to be camel-cased or does it not
matter? I've noticed a mix of both through the codebase so I suppose
it's not case-sensitive.


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2024-08-13  7:18 ` [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
@ 2024-08-15  6:42 ` James Liu
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 79+ messages in thread
From: James Liu @ 2024-08-15  6:42 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> Hi,
>
> I recently configured git-maintenance(1) to not use git-gc(1) anymore,
> but instead to use git-multi-pack-index(1). I quickly noticed that the
> behaviour here is somewhat broken because instead of auto-detaching when
> `git maintenance run --auto` executes, we wait for the process to run to
> completion.
>
> The root cause is that git-maintenance(1), probably by accident,
> continues to rely on the auto-detaching mechanism in git-gc(1). So
> instead of having git-maintenance(1) detach, it is git-gc(1) that
> detaches and thus causes git-maintenance(1) to exit early. That of
> course falls flat once any maintenance task other than git-gc(1)
> executes, because these won't detach.
>
> Besides being a usability issue, this may also cause git-gc(1) to run
> concurrently with any other enabled maintenance tasks. This shouldn't
> lead to data loss, but it can certainly lead to processes stomping on
> each other's feet.
>
> This patch series fixes this by wiring up new `--detach` flags for both
> git-gc(1) and git-maintenance(1). Like this, git-maintenance(1) now
> knows to execute `git gc --auto --no-detach`, while our auto-maintenance
> will execute `git maintenance run --auto --detach`.
>
> Patrick
>
> Patrick Steinhardt (7):
>   config: fix constness of out parameter for `git_config_get_expiry()`
>   builtin/gc: refactor to read config into structure
>   builtin/gc: fix leaking config values
>   builtin/gc: stop processing log file on signal
>   builtin/gc: add a `--detach` flag
>   builtin/maintenance: add a `--detach` flag
>   builtin/maintenance: fix auto-detach with non-standard tasks
>
>  Documentation/git-gc.txt |   5 +-
>  builtin/gc.c             | 384 ++++++++++++++++++++++++---------------
>  config.c                 |   4 +-
>  config.h                 |   2 +-
>  read-cache.c             |  12 +-
>  run-command.c            |  12 +-
>  t/t5304-prune.sh         |   1 +
>  t/t5616-partial-clone.sh |   6 +-
>  t/t6500-gc.sh            |  39 ++++
>  t/t7900-maintenance.sh   |  82 ++++++++-
>  10 files changed, 381 insertions(+), 166 deletions(-)

Thanks Patrick! I've left a few replies, mostly non-blocking questions.

Cheers,
James

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-15  6:40   ` James Liu
@ 2024-08-15  8:17     ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  8:17 UTC (permalink / raw)
  To: James Liu; +Cc: git

On Thu, Aug 15, 2024 at 04:40:51PM +1000, James Liu wrote:
> On Tue Aug 13, 2024 at 5:18 PM AEST, Patrick Steinhardt wrote:
> > diff --git a/run-command.c b/run-command.c
> > index 45ba544932..94f2f3079f 100644
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
> >  
> >  int prepare_auto_maintenance(int quiet, struct child_process *maint)
> >  {
> > -	int enabled;
> > +	int enabled, auto_detach;
> >  
> >  	if (!git_config_get_bool("maintenance.auto", &enabled) &&
> >  	    !enabled)
> >  		return 0;
> >  
> > +	/*
> > +	 * When `maintenance.autoDetach` isn't set, then we fall back to
> > +	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
> > +	 * retain behaviour from when we used to run git-gc(1) here.
> > +	 */
> > +	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
> > +	    git_config_get_bool("gc.autodetach", &auto_detach))
> > +		auto_detach = 1;
> > +
> 
> Do the two `*.autodetach` values need to be camel-cased or does it not
> matter? I've noticed a mix of both through the codebase so I suppose
> it's not case-sensitive.

Config keys are case-insensitive in general, and as far as I am aware we
typically use the all-lowercase variant when retrieving config keys. On
the other hand, in our docs we spell the config keys camel-cased to help
the user make sense of it.
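
To make the case rule a bit more concrete: per the git-config
documentation, the section name and the variable name are
case-insensitive, while a subsection (if present) is case-sensitive. A
minimal sketch of that normalization follows. `canonical_key()` is a
hypothetical helper written for this illustration and is not Git's own
key parser.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: write a canonical spelling of `key` into `out`,
 * which must have room for strlen(key) + 1 bytes.
 *
 * Everything up to the first dot (the section) and everything after the
 * last dot (the variable) is lowercased. Whatever sits between those two
 * dots (the subsection) is copied verbatim.
 */
static void canonical_key(const char *key, char *out)
{
	const char *first_dot, *last_dot;
	size_t i, len;

	assert(key && out);
	len = strlen(key);
	first_dot = strchr(key, '.');
	last_dot = strrchr(key, '.');

	for (i = 0; i < len; i++) {
		const char *p = key + i;
		int in_subsection = first_dot && p > first_dot && p < last_dot;

		out[i] = in_subsection ? key[i]
				       : (char)tolower((unsigned char)key[i]);
	}
	out[len] = '\0';
}
```

With this, "maintenance.autoDetach" and "maintenance.autodetach" map to
the same string, whereas the middle part of "remote.MyOrigin.url" keeps
its capitalization.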

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 2/7] builtin/gc: refactor to read config into structure
  2024-08-15  5:24   ` James Liu
@ 2024-08-15  8:18     ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  8:18 UTC (permalink / raw)
  To: James Liu; +Cc: git

On Thu, Aug 15, 2024 at 03:24:21PM +1000, James Liu wrote:
> On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> > @@ -206,7 +224,7 @@ struct maintenance_run_opts {
> >  	enum schedule_priority schedule;
> >  };
> >  
> > -static int pack_refs_condition(void)
> > +static int pack_refs_condition(UNUSED struct gc_config *cfg)
> >  {
> >  	/*
> >  	 * The auto-repacking logic for refs is handled by the ref backends and
> > @@ -216,7 +234,8 @@ static int pack_refs_condition(void)
> >  	return 1;
> >  }
> >  
> > -static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
> > +static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts,
> > +				      UNUSED struct gc_config *cfg)
> >  {
> >  	struct child_process cmd = CHILD_PROCESS_INIT;
> >  
> 
> Are we defining *cfg as an unused argument to conform to the
> `maintenance_task_fn` signature?

Yup, that's exactly it.
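
For readers unfamiliar with the pattern, here is a small standalone
sketch of why a callback may need to accept a parameter it never reads:
all functions stored behind one function-pointer type must share the
same signature. Every name below (`SKETCH_UNUSED`, `sketch_opts`,
`sketch_cfg`, `sketch_condition_fn` and the functions) is hypothetical
and only loosely modelled on the hunk quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an "unused parameter" annotation. */
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

struct sketch_opts { int quiet; };
struct sketch_cfg { int auto_pack_limit; };

/* One shared signature for every callback. */
typedef int (*sketch_condition_fn)(struct sketch_opts *opts,
				   struct sketch_cfg *cfg);

/* Reads the config, ignores the options. */
static int condition_uses_cfg(SKETCH_UNUSED struct sketch_opts *opts,
			      struct sketch_cfg *cfg)
{
	assert(cfg);
	return cfg->auto_pack_limit > 0;
}

/* Needs neither argument, but must still accept both to fit the type. */
static int condition_always(SKETCH_UNUSED struct sketch_opts *opts,
			    SKETCH_UNUSED struct sketch_cfg *cfg)
{
	return 1;
}

/* Count how many callbacks report a non-zero result. */
static int count_due(const sketch_condition_fn *fns, size_t n,
		     struct sketch_opts *opts, struct sketch_cfg *cfg)
{
	int due = 0;

	for (size_t i = 0; i < n; i++)
		if (fns[i](opts, cfg))
			due++;
	return due;
}
```

The annotation merely tells the compiler that the missing use is
intentional, so unused-parameter warnings stay meaningful elsewhere.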

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/7] builtin/gc: fix leaking config values
  2024-08-15  5:22   ` James Liu
@ 2024-08-15  8:18     ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  8:18 UTC (permalink / raw)
  To: James Liu; +Cc: git

On Thu, Aug 15, 2024 at 03:22:04PM +1000, James Liu wrote:
> On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> > Note that there is one small gotcha here with the "--prune" option. Next
> > to passing a string, this option also accepts the "--no-prune" option
> > that overrides the default or configured value. We thus need to discern
> > between the option not having been passed by the user and the negative
> > variant of it. This is done by using a simple sentinel value that lets
> > us discern these cases.
> >
> > @@ -644,12 +673,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
> >  	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
> >  	struct maintenance_run_opts opts = {0};
> >  	struct gc_config cfg = GC_CONFIG_INIT;
> > +	const char *prune_expire_sentinel = "sentinel";
> > +	const char *prune_expire_arg = prune_expire_sentinel;
> > +	int ret;
> >  
> >  	struct option builtin_gc_options[] = {
> >  		OPT__QUIET(&quiet, N_("suppress progress reporting")),
> > -		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
> > +		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
> >  			N_("prune unreferenced objects"),
> > -			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
> > +			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire_arg },
> >  		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
> >  		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
> >  			      N_("with --cruft, limit the size of new cruft packs")),
> 
> I was wondering how the `no-*` options worked since they're not
> explicitly defined in the `builtin_gc_options` array. I guess
> they're handled internally by `parse_options()`, and if the `no-` prefix
> was present, it leaves that argument unset.

Yes, this is being handled by "parse-options.c". But with the `no-`
prefix it won't leave the argument unset, but will explicitly unset it.
So if the argument was already set to some value X, that value would be
unset when parsing the `no-` variant.
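
The three states that the sentinel distinguishes (option not given,
option negated, option given with a value) can be sketched without
parse-options at all. The following is only an illustration under
simplified assumptions: `parse_prune()` and `effective_prune_expire()`
are hypothetical helpers with a hand-rolled argument loop, and the bare
`--prune` form without a value is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compared by address, never by content. */
static const char prune_sentinel[] = "sentinel";

/*
 * Hypothetical mini-parser. Returns:
 *   - prune_sentinel when neither form of the option was seen,
 *   - NULL when `--no-prune` was the last form seen,
 *   - the date string when `--prune=<date>` was the last form seen.
 */
static const char *parse_prune(int argc, const char **argv)
{
	const char *arg = prune_sentinel;

	assert(argv || !argc);
	for (int i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--no-prune"))
			arg = NULL;
		else if (!strncmp(argv[i], "--prune=", 8))
			arg = argv[i] + 8;
	}
	return arg;
}

/* Only the untouched sentinel falls back to the configured value. */
static const char *effective_prune_expire(const char *arg,
					  const char *configured)
{
	if (arg == prune_sentinel)
		return configured;
	return arg;
}
```

Because the check is a pointer comparison, a user who literally passes
`--prune=sentinel` is not mistaken for the "not given" case: that string
lives in argv and has a different address.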

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH v2 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2024-08-15  6:42 ` [PATCH 0/7] " James Liu
@ 2024-08-15  9:12 ` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
                     ` (6 more replies)
  2024-08-15 14:04 ` [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Derrick Stolee
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
  10 siblings, 7 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

Hi,

this is the second version of my patch series that fixes how
git-maintenance(1) detaches: instead of letting its child process
git-gc(1) detach, we now optionally ask git-maintenance(1) itself to
detach when running via our auto maintenance mechanism. This fixes
behaviour of git-maintenance(1) when configured to run non-standard
tasks like the "incremental" task.

There is only a single change compared to v1, namely a rewording of the
final commit message. It now clarifies that we really only fix the auto
maintenance behaviour, and the default behaviour of git-maintenance(1)
remains the same when run by the user.

Patrick Steinhardt (7):
  config: fix constness of out parameter for `git_config_get_expiry()`
  builtin/gc: refactor to read config into structure
  builtin/gc: fix leaking config values
  builtin/gc: stop processing log file on signal
  builtin/gc: add a `--detach` flag
  builtin/maintenance: add a `--detach` flag
  run-command: fix detaching when running auto maintenance

 Documentation/git-gc.txt |   5 +-
 builtin/gc.c             | 384 ++++++++++++++++++++++++---------------
 config.c                 |   4 +-
 config.h                 |   2 +-
 read-cache.c             |  12 +-
 run-command.c            |  12 +-
 t/t5304-prune.sh         |   1 +
 t/t5616-partial-clone.sh |   6 +-
 t/t6500-gc.sh            |  39 ++++
 t/t7900-maintenance.sh   |  82 ++++++++-
 10 files changed, 381 insertions(+), 166 deletions(-)

Range-diff against v1:
1:  040453f27f = 1:  040453f27f config: fix constness of out parameter for `git_config_get_expiry()`
2:  ff6aa9d7ba = 2:  ff6aa9d7ba builtin/gc: refactor to read config into structure
3:  310e361371 = 3:  310e361371 builtin/gc: fix leaking config values
4:  812c61c9b6 = 4:  812c61c9b6 builtin/gc: stop processing log file on signal
5:  ca78d3dc7c = 5:  ca78d3dc7c builtin/gc: add a `--detach` flag
6:  06dbb73425 = 6:  06dbb73425 builtin/maintenance: add a `--detach` flag
7:  8d6cbae951 ! 7:  6bc170ff05 builtin/maintenance: fix auto-detach with non-standard tasks
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    builtin/maintenance: fix auto-detach with non-standard tasks
    +    run-command: fix detaching when running auto maintenance
     
         In the past, we used to execute `git gc --auto` as part of our automatic
         housekeeping routines. As git-gc(1) may require quite some time to
    @@ Commit message
         housekeeping tasks, stomping on each others feet.
     
         Fix this bug by asking git-gc(1) to not detach when it is being invoked
    -    via git-maintenance(1). Instead, the latter command now respects a new
    +    via git-maintenance(1). Instead, git-maintenance(1) now respects a new
         config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
    -    detaches itself into the background if not told otherwise. This should
    -    continue to behave the same for all users which use the git-gc(1) task,
    -    only. For others though, it means that we now properly perform all tasks
    -    in the background.
    +    detaches itself into the background when running as part of our auto
    +    maintenance. This should continue to behave the same for all users which
    +    use the git-gc(1) task, only. For others though, it means that we now
    +    properly perform all tasks in the background. The default behaviour of
    +    git-maintenance(1) when executed by the user does not change, it will
    +    remain in the foreground unless they pass the `--detach` option.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v2 1/7] config: fix constness of out parameter for `git_config_get_expiry()`
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which creates the impression that ownership of the
returned data wasn't transferred to the caller. This isn't true though
and thus quite misleading.

Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c |  6 +++---
 config.c     |  4 ++--
 config.h     |  2 +-
 read-cache.c | 12 +++++++++---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 72bac2554f..e7406bf667 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -167,9 +167,9 @@ static void gc_config(void)
 	git_config_get_bool("gc.autodetach", &detach_auto);
 	git_config_get_bool("gc.cruftpacks", &cruft_packs);
 	git_config_get_ulong("gc.maxcruftsize", &max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", &prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", &gc_log_expire);
+	git_config_get_expiry("gc.pruneexpire", (char **) &prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", (char **) &prune_worktrees_expire);
+	git_config_get_expiry("gc.logexpiry", (char **) &gc_log_expire);
 
 	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
diff --git a/config.c b/config.c
index 6421894614..dfa4df1417 100644
--- a/config.c
+++ b/config.c
@@ -2766,9 +2766,9 @@ int git_config_get_pathname(const char *key, char **dest)
 	return repo_config_get_pathname(the_repository, key, dest);
 }
 
-int git_config_get_expiry(const char *key, const char **output)
+int git_config_get_expiry(const char *key, char **output)
 {
-	int ret = git_config_get_string(key, (char **)output);
+	int ret = git_config_get_string(key, output);
 	if (ret)
 		return ret;
 	if (strcmp(*output, "now")) {
diff --git a/config.h b/config.h
index 54b47dec9e..4801391c32 100644
--- a/config.h
+++ b/config.h
@@ -701,7 +701,7 @@ int git_config_get_split_index(void);
 int git_config_get_max_percent_split_change(void);
 
 /* This dies if the configured or default date is in the future */
-int git_config_get_expiry(const char *key, const char **output);
+int git_config_get_expiry(const char *key, char **output);
 
 /* parse either "this many days" integer, or "5.days.ago" approxidate */
 int git_config_get_expiry_in_days(const char *key, timestamp_t *, timestamp_t now);
diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..7f393ee687 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3176,18 +3176,24 @@ static int write_split_index(struct index_state *istate,
 	return ret;
 }
 
-static const char *shared_index_expire = "2.weeks.ago";
-
 static unsigned long get_shared_index_expire_date(void)
 {
 	static unsigned long shared_index_expire_date;
 	static int shared_index_expire_date_prepared;
 
 	if (!shared_index_expire_date_prepared) {
+		const char *shared_index_expire = "2.weeks.ago";
+		char *value = NULL;
+
 		git_config_get_expiry("splitindex.sharedindexexpire",
-				      &shared_index_expire);
+				      &value);
+		if (value)
+			shared_index_expire = value;
+
 		shared_index_expire_date = approxidate(shared_index_expire);
 		shared_index_expire_date_prepared = 1;
+
+		free(value);
 	}
 
 	return shared_index_expire_date;
-- 
2.46.0.46.g406f326d27.dirty
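
The ownership point made in the commit message of patch 1/7 can be
sketched in isolation. The following is not git's config API:
`toy_config_get_string()` and `toy_effective_expiry_len()` are
hypothetical helpers, and the "configured" value is passed in directly
instead of being read from a config file. The sketch only mirrors the
shape of the patched get_shared_index_expire_date(): a `char **` out
parameter makes it visible that the caller owns the result, a constant
default is used when nothing is configured, and the owned value is
freed once it has been consumed.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical lookup (an assumption for this sketch). On success it
 * stores a freshly allocated copy in *out and returns 0; the caller
 * owns *out and must free() it. If nothing is configured it returns 1
 * and leaves *out untouched.
 */
static int toy_config_get_string(const char *configured, char **out)
{
	size_t len;

	if (!configured)
		return 1;
	len = strlen(configured) + 1;
	*out = malloc(len);
	if (!*out)
		return -1;
	memcpy(*out, configured, len);
	return 0;
}

/*
 * Returns the length of the effective expiry string: the configured
 * value if there is one, otherwise the built-in default.
 */
static size_t toy_effective_expiry_len(const char *configured)
{
	const char *expire = "2.weeks.ago"; /* constant default, never freed */
	char *value = NULL;                 /* owned override, freed below */
	size_t len;

	toy_config_get_string(configured, &value);
	if (value)
		expire = value;

	len = strlen(expire);
	free(value); /* free(NULL) is a no-op when nothing was configured */
	return len;
}
```

Keeping the default in a separate `const char *` and the override in a
`char *` means the code never has to guess whether the string it holds
may be passed to free().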



* [PATCH v2 2/7] builtin/gc: refactor to read config into structure
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.

Refactor the code to use a `struct gc_config` that gets populated and
passed around. For one, this makes previously-implicit dependencies on
these config values clear. Second, it will allow us to properly manage
the lifecycle in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c | 255 +++++++++++++++++++++++++++++----------------------
 1 file changed, 143 insertions(+), 112 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e7406bf667..eee7401647 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -49,23 +49,7 @@ static const char * const builtin_gc_usage[] = {
 	NULL
 };
 
-static int pack_refs = 1;
-static int prune_reflogs = 1;
-static int cruft_packs = 1;
-static unsigned long max_cruft_size;
-static int aggressive_depth = 50;
-static int aggressive_window = 250;
-static int gc_auto_threshold = 6700;
-static int gc_auto_pack_limit = 50;
-static int detach_auto = 1;
 static timestamp_t gc_log_expire_time;
-static const char *gc_log_expire = "1.day.ago";
-static const char *prune_expire = "2.weeks.ago";
-static const char *prune_worktrees_expire = "3.months.ago";
-static char *repack_filter;
-static char *repack_filter_to;
-static unsigned long big_pack_threshold;
-static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
 static struct strvec reflog = STRVEC_INIT;
 static struct strvec repack = STRVEC_INIT;
@@ -145,37 +129,71 @@ static int gc_config_is_timestamp_never(const char *var)
 	return 0;
 }
 
-static void gc_config(void)
+struct gc_config {
+	int pack_refs;
+	int prune_reflogs;
+	int cruft_packs;
+	unsigned long max_cruft_size;
+	int aggressive_depth;
+	int aggressive_window;
+	int gc_auto_threshold;
+	int gc_auto_pack_limit;
+	int detach_auto;
+	const char *gc_log_expire;
+	const char *prune_expire;
+	const char *prune_worktrees_expire;
+	char *repack_filter;
+	char *repack_filter_to;
+	unsigned long big_pack_threshold;
+	unsigned long max_delta_cache_size;
+};
+
+#define GC_CONFIG_INIT { \
+	.pack_refs = 1, \
+	.prune_reflogs = 1, \
+	.cruft_packs = 1, \
+	.aggressive_depth = 50, \
+	.aggressive_window = 250, \
+	.gc_auto_threshold = 6700, \
+	.gc_auto_pack_limit = 50, \
+	.detach_auto = 1, \
+	.gc_log_expire = "1.day.ago", \
+	.prune_expire = "2.weeks.ago", \
+	.prune_worktrees_expire = "3.months.ago", \
+	.max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE, \
+}
+
+static void gc_config(struct gc_config *cfg)
 {
 	const char *value;
 
 	if (!git_config_get_value("gc.packrefs", &value)) {
 		if (value && !strcmp(value, "notbare"))
-			pack_refs = -1;
+			cfg->pack_refs = -1;
 		else
-			pack_refs = git_config_bool("gc.packrefs", value);
+			cfg->pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
 	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
 	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
-		prune_reflogs = 0;
+		cfg->prune_reflogs = 0;
 
-	git_config_get_int("gc.aggressivewindow", &aggressive_window);
-	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
-	git_config_get_int("gc.auto", &gc_auto_threshold);
-	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
-	git_config_get_bool("gc.autodetach", &detach_auto);
-	git_config_get_bool("gc.cruftpacks", &cruft_packs);
-	git_config_get_ulong("gc.maxcruftsize", &max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", (char **) &prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", (char **) &prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", (char **) &gc_log_expire);
+	git_config_get_int("gc.aggressivewindow", &cfg->aggressive_window);
+	git_config_get_int("gc.aggressivedepth", &cfg->aggressive_depth);
+	git_config_get_int("gc.auto", &cfg->gc_auto_threshold);
+	git_config_get_int("gc.autopacklimit", &cfg->gc_auto_pack_limit);
+	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
+	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
+	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
+	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
+	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
 
-	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
-	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
+	git_config_get_ulong("gc.bigpackthreshold", &cfg->big_pack_threshold);
+	git_config_get_ulong("pack.deltacachesize", &cfg->max_delta_cache_size);
 
-	git_config_get_string("gc.repackfilter", &repack_filter);
-	git_config_get_string("gc.repackfilterto", &repack_filter_to);
+	git_config_get_string("gc.repackfilter", &cfg->repack_filter);
+	git_config_get_string("gc.repackfilterto", &cfg->repack_filter_to);
 
 	git_config(git_default_config, NULL);
 }
@@ -206,7 +224,7 @@ struct maintenance_run_opts {
 	enum schedule_priority schedule;
 };
 
-static int pack_refs_condition(void)
+static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
 	/*
 	 * The auto-repacking logic for refs is handled by the ref backends and
@@ -216,7 +234,8 @@ static int pack_refs_condition(void)
 	return 1;
 }
 
-static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
+static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts,
+				      UNUSED struct gc_config *cfg)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
@@ -228,7 +247,7 @@ static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *
 	return run_command(&cmd);
 }
 
-static int too_many_loose_objects(void)
+static int too_many_loose_objects(struct gc_config *cfg)
 {
 	/*
 	 * Quickly check if a "gc" is needed, by estimating how
@@ -247,7 +266,7 @@ static int too_many_loose_objects(void)
 	if (!dir)
 		return 0;
 
-	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
+	auto_threshold = DIV_ROUND_UP(cfg->gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
 		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
 		    ent->d_name[hexsz_loose] != '\0')
@@ -283,12 +302,12 @@ static struct packed_git *find_base_packs(struct string_list *packs,
 	return base;
 }
 
-static int too_many_packs(void)
+static int too_many_packs(struct gc_config *cfg)
 {
 	struct packed_git *p;
 	int cnt;
 
-	if (gc_auto_pack_limit <= 0)
+	if (cfg->gc_auto_pack_limit <= 0)
 		return 0;
 
 	for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
@@ -302,7 +321,7 @@ static int too_many_packs(void)
 		 */
 		cnt++;
 	}
-	return gc_auto_pack_limit < cnt;
+	return cfg->gc_auto_pack_limit < cnt;
 }
 
 static uint64_t total_ram(void)
@@ -336,7 +355,8 @@ static uint64_t total_ram(void)
 	return 0;
 }
 
-static uint64_t estimate_repack_memory(struct packed_git *pack)
+static uint64_t estimate_repack_memory(struct gc_config *cfg,
+				       struct packed_git *pack)
 {
 	unsigned long nr_objects = repo_approximate_object_count(the_repository);
 	size_t os_cache, heap;
@@ -373,7 +393,7 @@ static uint64_t estimate_repack_memory(struct packed_git *pack)
 	 */
 	heap += delta_base_cache_limit;
 	/* and of course pack-objects has its own delta cache */
-	heap += max_delta_cache_size;
+	heap += cfg->max_delta_cache_size;
 
 	return os_cache + heap;
 }
@@ -384,30 +404,31 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 	return 0;
 }
 
-static void add_repack_all_option(struct string_list *keep_pack)
+static void add_repack_all_option(struct gc_config *cfg,
+				  struct string_list *keep_pack)
 {
-	if (prune_expire && !strcmp(prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
 		strvec_push(&repack, "-a");
-	else if (cruft_packs) {
+	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");
-		if (prune_expire)
-			strvec_pushf(&repack, "--cruft-expiration=%s", prune_expire);
-		if (max_cruft_size)
+		if (cfg->prune_expire)
+			strvec_pushf(&repack, "--cruft-expiration=%s", cfg->prune_expire);
+		if (cfg->max_cruft_size)
 			strvec_pushf(&repack, "--max-cruft-size=%lu",
-				     max_cruft_size);
+				     cfg->max_cruft_size);
 	} else {
 		strvec_push(&repack, "-A");
-		if (prune_expire)
-			strvec_pushf(&repack, "--unpack-unreachable=%s", prune_expire);
+		if (cfg->prune_expire)
+			strvec_pushf(&repack, "--unpack-unreachable=%s", cfg->prune_expire);
 	}
 
 	if (keep_pack)
 		for_each_string_list(keep_pack, keep_one_pack, NULL);
 
-	if (repack_filter && *repack_filter)
-		strvec_pushf(&repack, "--filter=%s", repack_filter);
-	if (repack_filter_to && *repack_filter_to)
-		strvec_pushf(&repack, "--filter-to=%s", repack_filter_to);
+	if (cfg->repack_filter && *cfg->repack_filter)
+		strvec_pushf(&repack, "--filter=%s", cfg->repack_filter);
+	if (cfg->repack_filter_to && *cfg->repack_filter_to)
+		strvec_pushf(&repack, "--filter-to=%s", cfg->repack_filter_to);
 }
 
 static void add_repack_incremental_option(void)
@@ -415,13 +436,13 @@ static void add_repack_incremental_option(void)
 	strvec_push(&repack, "--no-write-bitmap-index");
 }
 
-static int need_to_gc(void)
+static int need_to_gc(struct gc_config *cfg)
 {
 	/*
 	 * Setting gc.auto to 0 or negative can disable the
 	 * automatic gc.
 	 */
-	if (gc_auto_threshold <= 0)
+	if (cfg->gc_auto_threshold <= 0)
 		return 0;
 
 	/*
@@ -430,13 +451,13 @@ static int need_to_gc(void)
 	 * we run "repack -A -d -l".  Otherwise we tell the caller
 	 * there is no need.
 	 */
-	if (too_many_packs()) {
+	if (too_many_packs(cfg)) {
 		struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
-		if (big_pack_threshold) {
-			find_base_packs(&keep_pack, big_pack_threshold);
-			if (keep_pack.nr >= gc_auto_pack_limit) {
-				big_pack_threshold = 0;
+		if (cfg->big_pack_threshold) {
+			find_base_packs(&keep_pack, cfg->big_pack_threshold);
+			if (keep_pack.nr >= cfg->gc_auto_pack_limit) {
+				cfg->big_pack_threshold = 0;
 				string_list_clear(&keep_pack, 0);
 				find_base_packs(&keep_pack, 0);
 			}
@@ -445,7 +466,7 @@ static int need_to_gc(void)
 			uint64_t mem_have, mem_want;
 
 			mem_have = total_ram();
-			mem_want = estimate_repack_memory(p);
+			mem_want = estimate_repack_memory(cfg, p);
 
 			/*
 			 * Only allow 1/2 of memory for pack-objects, leave
@@ -456,9 +477,9 @@ static int need_to_gc(void)
 				string_list_clear(&keep_pack, 0);
 		}
 
-		add_repack_all_option(&keep_pack);
+		add_repack_all_option(cfg, &keep_pack);
 		string_list_clear(&keep_pack, 0);
-	} else if (too_many_loose_objects())
+	} else if (too_many_loose_objects(cfg))
 		add_repack_incremental_option();
 	else
 		return 0;
@@ -585,7 +606,8 @@ static int report_last_gc_error(void)
 	return ret;
 }
 
-static void gc_before_repack(struct maintenance_run_opts *opts)
+static void gc_before_repack(struct maintenance_run_opts *opts,
+			     struct gc_config *cfg)
 {
 	/*
 	 * We may be called twice, as both the pre- and
@@ -596,10 +618,10 @@ static void gc_before_repack(struct maintenance_run_opts *opts)
 	if (done++)
 		return;
 
-	if (pack_refs && maintenance_task_pack_refs(opts))
+	if (cfg->pack_refs && maintenance_task_pack_refs(opts, cfg))
 		die(FAILED_RUN, "pack-refs");
 
-	if (prune_reflogs) {
+	if (cfg->prune_reflogs) {
 		struct child_process cmd = CHILD_PROCESS_INIT;
 
 		cmd.git_cmd = 1;
@@ -621,14 +643,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	timestamp_t dummy;
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
 	struct maintenance_run_opts opts = {0};
+	struct gc_config cfg = GC_CONFIG_INIT;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
-		{ OPTION_STRING, 0, "prune", &prune_expire, N_("date"),
+		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
 			N_("prune unreferenced objects"),
-			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire },
-		OPT_BOOL(0, "cruft", &cruft_packs, N_("pack unreferenced objects separately")),
-		OPT_MAGNITUDE(0, "max-cruft-size", &max_cruft_size,
+			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
+		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
+		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
 			      N_("with --cruft, limit the size of new cruft packs")),
 		OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")),
 		OPT_BOOL_F(0, "auto", &opts.auto_flag, N_("enable auto-gc mode"),
@@ -651,27 +674,27 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	strvec_pushl(&rerere, "rerere", "gc", NULL);
 
 	/* default expiry time, overwritten in gc_config */
-	gc_config();
-	if (parse_expiry_date(gc_log_expire, &gc_log_expire_time))
-		die(_("failed to parse gc.logExpiry value %s"), gc_log_expire);
+	gc_config(&cfg);
+	if (parse_expiry_date(cfg.gc_log_expire, &gc_log_expire_time))
+		die(_("failed to parse gc.logExpiry value %s"), cfg.gc_log_expire);
 
-	if (pack_refs < 0)
-		pack_refs = !is_bare_repository();
+	if (cfg.pack_refs < 0)
+		cfg.pack_refs = !is_bare_repository();
 
 	argc = parse_options(argc, argv, prefix, builtin_gc_options,
 			     builtin_gc_usage, 0);
 	if (argc > 0)
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
-	if (prune_expire && parse_expiry_date(prune_expire, &dummy))
-		die(_("failed to parse prune expiry value %s"), prune_expire);
+	if (cfg.prune_expire && parse_expiry_date(cfg.prune_expire, &dummy))
+		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
 	if (aggressive) {
 		strvec_push(&repack, "-f");
-		if (aggressive_depth > 0)
-			strvec_pushf(&repack, "--depth=%d", aggressive_depth);
-		if (aggressive_window > 0)
-			strvec_pushf(&repack, "--window=%d", aggressive_window);
+		if (cfg.aggressive_depth > 0)
+			strvec_pushf(&repack, "--depth=%d", cfg.aggressive_depth);
+		if (cfg.aggressive_window > 0)
+			strvec_pushf(&repack, "--window=%d", cfg.aggressive_window);
 	}
 	if (quiet)
 		strvec_push(&repack, "-q");
@@ -680,16 +703,16 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc())
+		if (!need_to_gc(&cfg))
 			return 0;
 		if (!quiet) {
-			if (detach_auto)
+			if (cfg.detach_auto)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
 			else
 				fprintf(stderr, _("Auto packing the repository for optimum performance.\n"));
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
-		if (detach_auto) {
+		if (cfg.detach_auto) {
 			int ret = report_last_gc_error();
 
 			if (ret == 1)
@@ -701,7 +724,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 			if (lock_repo_for_gc(force, &pid))
 				return 0;
-			gc_before_repack(&opts); /* dies on failure */
+			gc_before_repack(&opts, &cfg); /* dies on failure */
 			delete_tempfile(&pidfile);
 
 			/*
@@ -716,11 +739,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (keep_largest_pack != -1) {
 			if (keep_largest_pack)
 				find_base_packs(&keep_pack, 0);
-		} else if (big_pack_threshold) {
-			find_base_packs(&keep_pack, big_pack_threshold);
+		} else if (cfg.big_pack_threshold) {
+			find_base_packs(&keep_pack, cfg.big_pack_threshold);
 		}
 
-		add_repack_all_option(&keep_pack);
+		add_repack_all_option(&cfg, &keep_pack);
 		string_list_clear(&keep_pack, 0);
 	}
 
@@ -741,7 +764,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		atexit(process_log_file_at_exit);
 	}
 
-	gc_before_repack(&opts);
+	gc_before_repack(&opts, &cfg);
 
 	if (!repository_format_precious_objects) {
 		struct child_process repack_cmd = CHILD_PROCESS_INIT;
@@ -752,11 +775,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (run_command(&repack_cmd))
 			die(FAILED_RUN, repack.v[0]);
 
-		if (prune_expire) {
+		if (cfg.prune_expire) {
 			struct child_process prune_cmd = CHILD_PROCESS_INIT;
 
 			/* run `git prune` even if using cruft packs */
-			strvec_push(&prune, prune_expire);
+			strvec_push(&prune, cfg.prune_expire);
 			if (quiet)
 				strvec_push(&prune, "--no-progress");
 			if (repo_has_promisor_remote(the_repository))
@@ -769,10 +792,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (prune_worktrees_expire) {
+	if (cfg.prune_worktrees_expire) {
 		struct child_process prune_worktrees_cmd = CHILD_PROCESS_INIT;
 
-		strvec_push(&prune_worktrees, prune_worktrees_expire);
+		strvec_push(&prune_worktrees, cfg.prune_worktrees_expire);
 		prune_worktrees_cmd.git_cmd = 1;
 		strvec_pushv(&prune_worktrees_cmd.args, prune_worktrees.v);
 		if (run_command(&prune_worktrees_cmd))
@@ -796,7 +819,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					     !quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
 					     NULL);
 
-	if (opts.auto_flag && too_many_loose_objects())
+	if (opts.auto_flag && too_many_loose_objects(&cfg))
 		warning(_("There are too many unreachable loose objects; "
 			"run 'git prune' to remove them."));
 
@@ -892,7 +915,7 @@ static int dfs_on_ref(const char *refname UNUSED,
 	return result;
 }
 
-static int should_write_commit_graph(void)
+static int should_write_commit_graph(struct gc_config *cfg)
 {
 	int result;
 	struct cg_auto_data data;
@@ -929,7 +952,8 @@ static int run_write_commit_graph(struct maintenance_run_opts *opts)
 	return !!run_command(&child);
 }
 
-static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
+static int maintenance_task_commit_graph(struct maintenance_run_opts *opts,
+					 struct gc_config *cfg)
 {
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_commit_graph)
@@ -963,7 +987,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	return !!run_command(&child);
 }
 
-static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
+static int maintenance_task_prefetch(struct maintenance_run_opts *opts,
+				     struct gc_config *cfg)
 {
 	if (for_each_remote(fetch_remote, opts)) {
 		error(_("failed to prefetch remotes"));
@@ -973,7 +998,8 @@ static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int maintenance_task_gc(struct maintenance_run_opts *opts)
+static int maintenance_task_gc(struct maintenance_run_opts *opts,
+			       struct gc_config *cfg)
 {
 	struct child_process child = CHILD_PROCESS_INIT;
 
@@ -1021,7 +1047,7 @@ static int loose_object_count(const struct object_id *oid UNUSED,
 	return 0;
 }
 
-static int loose_object_auto_condition(void)
+static int loose_object_auto_condition(struct gc_config *cfg)
 {
 	int count = 0;
 
@@ -1106,12 +1132,13 @@ static int pack_loose(struct maintenance_run_opts *opts)
 	return result;
 }
 
-static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
+static int maintenance_task_loose_objects(struct maintenance_run_opts *opts,
+					  struct gc_config *cfg)
 {
 	return prune_packed(opts) || pack_loose(opts);
 }
 
-static int incremental_repack_auto_condition(void)
+static int incremental_repack_auto_condition(struct gc_config *cfg)
 {
 	struct packed_git *p;
 	int incremental_repack_auto_limit = 10;
@@ -1230,7 +1257,8 @@ static int multi_pack_index_repack(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts)
+static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts,
+					       struct gc_config *cfg)
 {
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_multi_pack_index) {
@@ -1247,14 +1275,15 @@ static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts
 	return 0;
 }
 
-typedef int maintenance_task_fn(struct maintenance_run_opts *opts);
+typedef int maintenance_task_fn(struct maintenance_run_opts *opts,
+				struct gc_config *cfg);
 
 /*
  * An auto condition function returns 1 if the task should run
  * and 0 if the task should NOT run. See needs_to_gc() for an
  * example.
  */
-typedef int maintenance_auto_fn(void);
+typedef int maintenance_auto_fn(struct gc_config *cfg);
 
 struct maintenance_task {
 	const char *name;
@@ -1321,7 +1350,8 @@ static int compare_tasks_by_selection(const void *a_, const void *b_)
 	return b->selected_order - a->selected_order;
 }
 
-static int maintenance_run_tasks(struct maintenance_run_opts *opts)
+static int maintenance_run_tasks(struct maintenance_run_opts *opts,
+				 struct gc_config *cfg)
 {
 	int i, found_selected = 0;
 	int result = 0;
@@ -1360,14 +1390,14 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 
 		if (opts->auto_flag &&
 		    (!tasks[i].auto_condition ||
-		     !tasks[i].auto_condition()))
+		     !tasks[i].auto_condition(cfg)))
 			continue;
 
 		if (opts->schedule && tasks[i].schedule < opts->schedule)
 			continue;
 
 		trace2_region_enter("maintenance", tasks[i].name, r);
-		if (tasks[i].fn(opts)) {
+		if (tasks[i].fn(opts, cfg)) {
 			error(_("task '%s' failed"), tasks[i].name);
 			result = 1;
 		}
@@ -1404,7 +1434,6 @@ static void initialize_task_config(int schedule)
 {
 	int i;
 	struct strbuf config_name = STRBUF_INIT;
-	gc_config();
 
 	if (schedule)
 		initialize_maintenance_strategy();
@@ -1468,6 +1497,7 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	struct maintenance_run_opts opts;
+	struct gc_config cfg = GC_CONFIG_INIT;
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
@@ -1496,12 +1526,13 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (opts.auto_flag && opts.schedule)
 		die(_("use at most one of --auto and --schedule=<frequency>"));
 
+	gc_config(&cfg);
 	initialize_task_config(opts.schedule);
 
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
-	return maintenance_run_tasks(&opts);
+	return maintenance_run_tasks(&opts, &cfg);
 }
 
 static char *get_maintpath(void)
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v2 3/7] builtin/gc: fix leaking config values
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.

Note that there is one small gotcha here with the "--prune" option.
Besides accepting a string, this option also has a negated "--no-prune"
form that overrides the default or configured value. We thus need to
distinguish between the option not having been passed by the user and
its negated variant. This is done with a simple sentinel value that
lets us tell these cases apart.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
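
The sentinel handling described above can be sketched in isolation. This
is a minimal illustration and not code from the patch: the helper name
`resolve_prune_expire()` and the `prune_sentinel` array are hypothetical,
and plain malloc(3) stands in for Git's `xstrdup_or_null()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel: only its address matters, never its contents. */
static const char prune_sentinel[] = "sentinel";

/*
 * Resolve the effective prune expiry after option parsing.
 *
 * `arg` is whatever the option parser left behind:
 *   - still `prune_sentinel`: the option was not given, keep `configured`
 *   - NULL: the negated form was given, pruning is disabled
 *   - anything else: an explicit date string that overrides `configured`
 *
 * Takes ownership of `configured` and returns a heap string or NULL.
 */
static char *resolve_prune_expire(char *configured, const char *arg)
{
	char *copy;

	if (arg == prune_sentinel)
		return configured;

	/* The old value is released before it gets replaced. */
	free(configured);
	if (!arg)
		return NULL;

	copy = malloc(strlen(arg) + 1);
	assert(copy);
	strcpy(copy, arg);
	return copy;
}
```

Comparing pointers rather than string contents means a user-supplied
value can never be mistaken for the "not passed" state, whatever it
spells.
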
 builtin/gc.c     | 108 +++++++++++++++++++++++++++++++++++------------
 t/t5304-prune.sh |   1 +
 2 files changed, 82 insertions(+), 27 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index eee7401647..a93cfa147e 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -139,9 +139,9 @@ struct gc_config {
 	int gc_auto_threshold;
 	int gc_auto_pack_limit;
 	int detach_auto;
-	const char *gc_log_expire;
-	const char *prune_expire;
-	const char *prune_worktrees_expire;
+	char *gc_log_expire;
+	char *prune_expire;
+	char *prune_worktrees_expire;
 	char *repack_filter;
 	char *repack_filter_to;
 	unsigned long big_pack_threshold;
@@ -157,15 +157,25 @@ struct gc_config {
 	.gc_auto_threshold = 6700, \
 	.gc_auto_pack_limit = 50, \
 	.detach_auto = 1, \
-	.gc_log_expire = "1.day.ago", \
-	.prune_expire = "2.weeks.ago", \
-	.prune_worktrees_expire = "3.months.ago", \
+	.gc_log_expire = xstrdup("1.day.ago"), \
+	.prune_expire = xstrdup("2.weeks.ago"), \
+	.prune_worktrees_expire = xstrdup("3.months.ago"), \
 	.max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE, \
 }
 
+static void gc_config_release(struct gc_config *cfg)
+{
+	free(cfg->gc_log_expire);
+	free(cfg->prune_expire);
+	free(cfg->prune_worktrees_expire);
+	free(cfg->repack_filter);
+	free(cfg->repack_filter_to);
+}
+
 static void gc_config(struct gc_config *cfg)
 {
 	const char *value;
+	char *owned = NULL;
 
 	if (!git_config_get_value("gc.packrefs", &value)) {
 		if (value && !strcmp(value, "notbare"))
@@ -185,15 +195,34 @@ static void gc_config(struct gc_config *cfg)
 	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
 	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
 	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
+
+	if (!git_config_get_expiry("gc.pruneexpire", &owned)) {
+		free(cfg->prune_expire);
+		cfg->prune_expire = owned;
+	}
+
+	if (!git_config_get_expiry("gc.worktreepruneexpire", &owned)) {
+		free(cfg->prune_worktrees_expire);
+		cfg->prune_worktrees_expire = owned;
+	}
+
+	if (!git_config_get_expiry("gc.logexpiry", &owned)) {
+		free(cfg->gc_log_expire);
+		cfg->gc_log_expire = owned;
+	}
 
 	git_config_get_ulong("gc.bigpackthreshold", &cfg->big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &cfg->max_delta_cache_size);
 
-	git_config_get_string("gc.repackfilter", &cfg->repack_filter);
-	git_config_get_string("gc.repackfilterto", &cfg->repack_filter_to);
+	if (!git_config_get_string("gc.repackfilter", &owned)) {
+		free(cfg->repack_filter);
+		cfg->repack_filter = owned;
+	}
+
+	if (!git_config_get_string("gc.repackfilterto", &owned)) {
+		free(cfg->repack_filter_to);
+		cfg->repack_filter_to = owned;
+	}
 
 	git_config(git_default_config, NULL);
 }
@@ -644,12 +673,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
 	struct maintenance_run_opts opts = {0};
 	struct gc_config cfg = GC_CONFIG_INIT;
+	const char *prune_expire_sentinel = "sentinel";
+	const char *prune_expire_arg = prune_expire_sentinel;
+	int ret;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
-		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
+		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
 			N_("prune unreferenced objects"),
-			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
+			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire_arg },
 		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
 		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
 			      N_("with --cruft, limit the size of new cruft packs")),
@@ -673,8 +705,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	strvec_pushl(&prune_worktrees, "worktree", "prune", "--expire", NULL);
 	strvec_pushl(&rerere, "rerere", "gc", NULL);
 
-	/* default expiry time, overwritten in gc_config */
 	gc_config(&cfg);
+
 	if (parse_expiry_date(cfg.gc_log_expire, &gc_log_expire_time))
 		die(_("failed to parse gc.logExpiry value %s"), cfg.gc_log_expire);
 
@@ -686,6 +718,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (argc > 0)
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
+	if (prune_expire_arg != prune_expire_sentinel) {
+		free(cfg.prune_expire);
+		cfg.prune_expire = xstrdup_or_null(prune_expire_arg);
+	}
 	if (cfg.prune_expire && parse_expiry_date(cfg.prune_expire, &dummy))
 		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
@@ -703,8 +739,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc(&cfg))
-			return 0;
+		if (!need_to_gc(&cfg)) {
+			ret = 0;
+			goto out;
+		}
+
 		if (!quiet) {
 			if (cfg.detach_auto)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
@@ -713,17 +752,22 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (cfg.detach_auto) {
-			int ret = report_last_gc_error();
-
-			if (ret == 1)
+			ret = report_last_gc_error();
+			if (ret == 1) {
 				/* Last gc --auto failed. Skip this one. */
-				return 0;
-			else if (ret)
+				ret = 0;
+				goto out;
+
+			} else if (ret) {
 				/* an I/O error occurred, already reported */
-				return ret;
+				goto out;
+			}
+
+			if (lock_repo_for_gc(force, &pid)) {
+				ret = 0;
+				goto out;
+			}
 
-			if (lock_repo_for_gc(force, &pid))
-				return 0;
 			gc_before_repack(&opts, &cfg); /* dies on failure */
 			delete_tempfile(&pidfile);
 
@@ -749,8 +793,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 	name = lock_repo_for_gc(force, &pid);
 	if (name) {
-		if (opts.auto_flag)
-			return 0; /* be quiet on --auto */
+		if (opts.auto_flag) {
+			ret = 0;
+			goto out; /* be quiet on --auto */
+		}
+
 		die(_("gc is already running on machine '%s' pid %"PRIuMAX" (use --force if not)"),
 		    name, (uintmax_t)pid);
 	}
@@ -826,6 +873,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (!daemonized)
 		unlink(git_path("gc.log"));
 
+out:
+	gc_config_release(&cfg);
 	return 0;
 }
 
@@ -1511,6 +1560,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			PARSE_OPT_NONEG, task_option_parse),
 		OPT_END()
 	};
+	int ret;
+
 	memset(&opts, 0, sizeof(opts));
 
 	opts.quiet = !isatty(2);
@@ -1532,7 +1583,10 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
-	return maintenance_run_tasks(&opts, &cfg);
+
+	ret = maintenance_run_tasks(&opts, &cfg);
+	gc_config_release(&cfg);
+	return ret;
 }
 
 static char *get_maintpath(void)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1f1f664871..e641df0116 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -7,6 +7,7 @@ test_description='prune'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 day=$((60*60*24))
-- 
2.46.0.46.g406f326d27.dirty


* [PATCH v2 4/7] builtin/gc: stop processing log file on signal
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-08-15  9:12   ` [PATCH v2 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severely broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
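
For contrast, here is a minimal sketch of what a signal-safe handler is
generally limited to. It is not what this patch does (the patch removes
the handler altogether) and the names `note_signal()` and
`install_note_signal()` are hypothetical. The handler only stores to a
`volatile sig_atomic_t` variable and calls nothing that could allocate
or touch stdio.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the main program afterwards. */
static volatile sig_atomic_t got_signal;

/*
 * Only performs a store to a sig_atomic_t object. No malloc(3),
 * fprintf(3) or fflush(3), all of which are unsafe in this context.
 */
static void note_signal(int signo)
{
	got_signal = signo;
}

/* Install the handler for `signo`; returns 0 on success, -1 on error. */
static int install_note_signal(int signo)
{
	assert(signo > 0);
	return signal(signo, note_signal) == SIG_ERR ? -1 : 0;
}
```

Any real work, such as committing or rolling back a log file, would then
have to happen in normal program flow after checking the flag, which is
the kind of larger refactoring the commit message alludes to.
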
 builtin/gc.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index a93cfa147e..f815557081 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -109,13 +109,6 @@ static void process_log_file_at_exit(void)
 	process_log_file();
 }
 
-static void process_log_file_on_signal(int signo)
-{
-	process_log_file();
-	sigchain_pop(signo);
-	raise(signo);
-}
-
 static int gc_config_is_timestamp_never(const char *var)
 {
 	const char *value;
@@ -807,7 +800,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					  git_path("gc.log"),
 					  LOCK_DIE_ON_ERROR);
 		dup2(get_lock_file_fd(&log_lock), 2);
-		sigchain_push_common(process_log_file_on_signal);
 		atexit(process_log_file_at_exit);
 	}
 
-- 
2.46.0.46.g406f326d27.dirty


* [PATCH v2 5/7] builtin/gc: add a `--detach` flag
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-08-15  9:12   ` [PATCH v2 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15 19:11     ` Junio C Hamano
  2024-08-15  9:12   ` [PATCH v2 6/7] builtin/maintenance: " Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 7/7] run-command: fix detaching when running auto maintenance Patrick Steinhardt
  6 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
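
The precedence between the new flag and `gc.autoDetach` can be sketched
as a small pure function. This is an illustration under assumptions and
not code from the patch: `struct run_opts` and `should_detach()` are
hypothetical stand-ins, with `detach` using -1 for "not given on the
command line", 0 for `--no-detach` and 1 for `--detach`.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for the options structure. */
struct run_opts {
	int auto_flag;	/* non-zero when --auto was given */
	int detach;	/* -1 unset, 0 --no-detach, 1 --detach */
};

/*
 * Decide whether to run in the background:
 *   - an explicit --detach or --no-detach always wins
 *   - otherwise only auto mode detaches, and only if the
 *     configuration (cfg_detach_auto) allows it
 */
static int should_detach(const struct run_opts *opts, int cfg_detach_auto)
{
	assert(opts);
	if (opts->detach >= 0)
		return opts->detach;
	return opts->auto_flag && cfg_detach_auto;
}
```

Initializing the field to -1 rather than 0 is what makes it possible to
tell "the user said --no-detach" apart from "the user said nothing".
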
 Documentation/git-gc.txt |  5 ++-
 builtin/gc.c             | 70 ++++++++++++++++++++++------------------
 t/t6500-gc.sh            | 39 ++++++++++++++++++++++
 3 files changed, 82 insertions(+), 32 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index b5561c458a..370e22faae 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -9,7 +9,7 @@ git-gc - Cleanup unnecessary files and optimize the local repository
 SYNOPSIS
 --------
 [verse]
-'git gc' [--aggressive] [--auto] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
+'git gc' [--aggressive] [--auto] [--[no-]detach] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
 
 DESCRIPTION
 -----------
@@ -53,6 +53,9 @@ configuration options such as `gc.auto` and `gc.autoPackLimit`, all
 other housekeeping tasks (e.g. rerere, working trees, reflog...) will
 be performed as well.
 
+--[no-]detach::
+	Run in the background if the system supports it. This option overrides
+	the `gc.autoDetach` config.
 
 --[no-]cruft::
 	When expiring unreachable objects, pack them separately into a
diff --git a/builtin/gc.c b/builtin/gc.c
index f815557081..269a77960f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -242,9 +242,13 @@ static enum schedule_priority parse_schedule(const char *value)
 
 struct maintenance_run_opts {
 	int auto_flag;
+	int detach;
 	int quiet;
 	enum schedule_priority schedule;
 };
+#define MAINTENANCE_RUN_OPTS_INIT { \
+	.detach = -1, \
+}
 
 static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
@@ -664,7 +668,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	int keep_largest_pack = -1;
 	timestamp_t dummy;
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
-	struct maintenance_run_opts opts = {0};
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	const char *prune_expire_sentinel = "sentinel";
 	const char *prune_expire_arg = prune_expire_sentinel;
@@ -681,6 +685,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")),
 		OPT_BOOL_F(0, "auto", &opts.auto_flag, N_("enable auto-gc mode"),
 			   PARSE_OPT_NOCOMPLETE),
+		OPT_BOOL(0, "detach", &opts.detach,
+			 N_("perform garbage collection in the background")),
 		OPT_BOOL_F(0, "force", &force,
 			   N_("force running gc even if there may be another gc running"),
 			   PARSE_OPT_NOCOMPLETE),
@@ -729,6 +735,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		strvec_push(&repack, "-q");
 
 	if (opts.auto_flag) {
+		if (cfg.detach_auto && opts.detach < 0)
+			opts.detach = 1;
+
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
@@ -738,38 +747,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		}
 
 		if (!quiet) {
-			if (cfg.detach_auto)
+			if (opts.detach > 0)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
 			else
 				fprintf(stderr, _("Auto packing the repository for optimum performance.\n"));
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
-		if (cfg.detach_auto) {
-			ret = report_last_gc_error();
-			if (ret == 1) {
-				/* Last gc --auto failed. Skip this one. */
-				ret = 0;
-				goto out;
-
-			} else if (ret) {
-				/* an I/O error occurred, already reported */
-				goto out;
-			}
-
-			if (lock_repo_for_gc(force, &pid)) {
-				ret = 0;
-				goto out;
-			}
-
-			gc_before_repack(&opts, &cfg); /* dies on failure */
-			delete_tempfile(&pidfile);
-
-			/*
-			 * failure to daemonize is ok, we'll continue
-			 * in foreground
-			 */
-			daemonized = !daemonize();
-		}
 	} else {
 		struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
@@ -784,6 +767,33 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		string_list_clear(&keep_pack, 0);
 	}
 
+	if (opts.detach > 0) {
+		ret = report_last_gc_error();
+		if (ret == 1) {
+			/* Last gc --auto failed. Skip this one. */
+			ret = 0;
+			goto out;
+
+		} else if (ret) {
+			/* an I/O error occurred, already reported */
+			goto out;
+		}
+
+		if (lock_repo_for_gc(force, &pid)) {
+			ret = 0;
+			goto out;
+		}
+
+		gc_before_repack(&opts, &cfg); /* dies on failure */
+		delete_tempfile(&pidfile);
+
+		/*
+		 * failure to daemonize is ok, we'll continue
+		 * in foreground
+		 */
+		daemonized = !daemonize();
+	}
+
 	name = lock_repo_for_gc(force, &pid);
 	if (name) {
 		if (opts.auto_flag) {
@@ -1537,7 +1547,7 @@ static int task_option_parse(const struct option *opt UNUSED,
 static int maintenance_run(int argc, const char **argv, const char *prefix)
 {
 	int i;
-	struct maintenance_run_opts opts;
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
@@ -1554,8 +1564,6 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	};
 	int ret;
 
-	memset(&opts, 0, sizeof(opts));
-
 	opts.quiet = !isatty(2);
 
 	for (i = 0; i < TASK__COUNT; i++)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 1b5909d1b7..737c99e0f8 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -396,6 +396,45 @@ test_expect_success 'background auto gc respects lock for all operations' '
 	test_cmp expect actual
 '
 
+test_expect_success '--detach overrides gc.autoDetach=false' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-gc(1) ends up repacking.
+		test_commit "$(test_oid blob17_1)" &&
+		test_commit "$(test_oid blob17_2)" &&
+		git config gc.autodetach false &&
+		git config gc.auto 2 &&
+
+		cat >expect <<-EOF &&
+		Auto packing the repository in background for optimum performance.
+		See "git help gc" for manual housekeeping.
+		EOF
+		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success '--no-detach overrides gc.autoDetach=true' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-gc(1) ends up repacking.
+		test_commit "$(test_oid blob17_1)" &&
+		test_commit "$(test_oid blob17_2)" &&
+		git config gc.autodetach true &&
+		git config gc.auto 2 &&
+
+		GIT_PROGRESS_DELAY=0 git gc --auto --no-detach 2>output &&
+		test_grep "Auto packing the repository for optimum performance." output &&
+		test_grep "Collecting referenced commits: 2, done." output
+	)
+'
+
 # DO NOT leave a detached auto gc process running near the end of the
 # test script: it can run long enough in the background to racily
 # interfere with the cleanup in 'test_done'.
-- 
2.46.0.46.g406f326d27.dirty


* [PATCH v2 6/7] builtin/maintenance: add a `--detach` flag
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-08-15  9:12   ` [PATCH v2 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15  9:12   ` [PATCH v2 7/7] run-command: fix detaching when running auto maintenance Patrick Steinhardt
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c           |  6 ++++++
 t/t7900-maintenance.sh | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index 269a77960f..63106e2028 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1426,6 +1426,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
 	}
 	free(lock_path);
 
+	/* Failure to daemonize is ok, we'll continue in foreground. */
+	if (opts->detach > 0)
+		daemonize();
+
 	for (i = 0; !found_selected && i < TASK__COUNT; i++)
 		found_selected = tasks[i].selected_order >= 0;
 
@@ -1552,6 +1556,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_BOOL(0, "detach", &opts.detach,
+			 N_("perform maintenance in the background")),
 		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
 			     N_("run tasks based on frequency"),
 			     maintenance_opt_schedule),
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8595489ceb..771525aa4b 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -908,4 +908,43 @@ test_expect_success 'failed schedule prevents config change' '
 	done
 '
 
+test_expect_success '--no-detach causes maintenance to not run in background' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-maintenance(1) ends up
+		# outputting something.
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+		git config set maintenance.incremental-repack.enabled true &&
+
+		# We have no better way to check whether or not the task ran in
+		# the background than to verify whether it output anything. The
+		# next testcase checks the reverse, making this somewhat safer.
+		git maintenance run --no-detach >out 2>&1 &&
+		test_line_count = 1 out
+	)
+'
+
+test_expect_success '--detach causes maintenance to run in background' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+		git config set maintenance.incremental-repack.enabled true &&
+
+		git maintenance run --detach >out 2>&1 &&
+		test_must_be_empty out
+	)
+'
+
 test_done
-- 
2.46.0.46.g406f326d27.dirty


* [PATCH v2 7/7] run-command: fix detaching when running auto maintenance
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-08-15  9:12   ` [PATCH v2 6/7] builtin/maintenance: " Patrick Steinhardt
@ 2024-08-15  9:12   ` Patrick Steinhardt
  2024-08-15 16:13     ` Junio C Hamano
  6 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-15  9:12 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu

In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each other's feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users who
only use the git-gc(1) task. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change: it will
remain in the foreground unless they pass the `--detach` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
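
The fallback order between the two config keys can be sketched as
follows. This is a simplified illustration and not code from the patch:
`resolve_auto_detach()` is a hypothetical helper that takes already
looked-up values, where -1 means the key is unset, 0 means false and 1
means true.

```c
#include <assert.h>

/*
 * Pick the effective auto-detach setting:
 *   1. `maintenance.autoDetach` if it is set
 *   2. otherwise `gc.autoDetach` if it is set
 *   3. otherwise default to detaching
 *
 * Both arguments are tri-state: -1 unset, 0 false, 1 true.
 */
static int resolve_auto_detach(int maintenance_auto_detach,
			       int gc_auto_detach)
{
	assert(maintenance_auto_detach >= -1 && maintenance_auto_detach <= 1);
	assert(gc_auto_detach >= -1 && gc_auto_detach <= 1);

	if (maintenance_auto_detach >= 0)
		return maintenance_auto_detach;
	if (gc_auto_detach >= 0)
		return gc_auto_detach;
	return 1;
}
```

The result then selects between appending `--detach` and `--no-detach`
to the `git maintenance run --auto` invocation.
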
 builtin/gc.c             |  1 +
 run-command.c            | 12 ++++++++++-
 t/t5616-partial-clone.sh |  6 +++---
 t/t7900-maintenance.sh   | 43 +++++++++++++++++++++++++++++++---------
 4 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 63106e2028..bafee330a2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1063,6 +1063,7 @@ static int maintenance_task_gc(struct maintenance_run_opts *opts,
 		strvec_push(&child.args, "--quiet");
 	else
 		strvec_push(&child.args, "--no-quiet");
+	strvec_push(&child.args, "--no-detach");
 
 	return run_command(&child);
 }
diff --git a/run-command.c b/run-command.c
index 45ba544932..94f2f3079f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 
 int prepare_auto_maintenance(int quiet, struct child_process *maint)
 {
-	int enabled;
+	int enabled, auto_detach;
 
 	if (!git_config_get_bool("maintenance.auto", &enabled) &&
 	    !enabled)
 		return 0;
 
+	/*
+	 * When `maintenance.autoDetach` isn't set, then we fall back to
+	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
+	 * retain behaviour from when we used to run git-gc(1) here.
+	 */
+	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
+	    git_config_get_bool("gc.autodetach", &auto_detach))
+		auto_detach = 1;
+
 	maint->git_cmd = 1;
 	maint->close_object_store = 1;
 	strvec_pushl(&maint->args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint->args, quiet ? "--quiet" : "--no-quiet");
+	strvec_push(&maint->args, auto_detach ? "--detach" : "--no-detach");
 
 	return 1;
 }
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 2da7291e37..8415884754 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -229,7 +229,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 
 	GIT_TRACE2_EVENT="$PWD/trace1.event" \
 	git -C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace1.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace1.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace1.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace1.event &&
 
@@ -238,7 +238,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 		-c gc.autoPackLimit=0 \
 		-c maintenance.incremental-repack.auto=1234 \
 		-C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace2.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace2.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"0\" trace2.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace2.event &&
 
@@ -247,7 +247,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 		-c gc.autoPackLimit=1234 \
 		-c maintenance.incremental-repack.auto=0 \
 		-C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace3.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace3.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace3.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"0\" trace3.event
 '
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 771525aa4b..06ab43cfb5 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -49,22 +49,47 @@ test_expect_success 'run [--auto|--quiet]' '
 		git maintenance run --auto 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
 		git maintenance run --no-quiet 2>/dev/null &&
-	test_subcommand git gc --quiet <run-no-auto.txt &&
-	test_subcommand ! git gc --auto --quiet <run-auto.txt &&
-	test_subcommand git gc --no-quiet <run-no-quiet.txt
+	test_subcommand git gc --quiet --no-detach <run-no-auto.txt &&
+	test_subcommand ! git gc --auto --quiet --no-detach <run-auto.txt &&
+	test_subcommand git gc --no-quiet --no-detach <run-no-quiet.txt
 '
 
 test_expect_success 'maintenance.auto config option' '
 	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
-	test_subcommand git maintenance run --auto --quiet <default &&
+	test_subcommand git maintenance run --auto --quiet --detach <default &&
 	GIT_TRACE2_EVENT="$(pwd)/true" \
 		git -c maintenance.auto=true \
 		commit --quiet --allow-empty -m 2 &&
-	test_subcommand git maintenance run --auto --quiet  <true &&
+	test_subcommand git maintenance run --auto --quiet --detach <true &&
 	GIT_TRACE2_EVENT="$(pwd)/false" \
 		git -c maintenance.auto=false \
 		commit --quiet --allow-empty -m 3 &&
-	test_subcommand ! git maintenance run --auto --quiet  <false
+	test_subcommand ! git maintenance run --auto --quiet --detach <false
+'
+
+for cfg in maintenance.autoDetach gc.autoDetach
+do
+	test_expect_success "$cfg=true config option" '
+		test_when_finished "rm -f trace" &&
+		test_config $cfg true &&
+		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+		test_subcommand git maintenance run --auto --quiet --detach <trace
+	'
+
+	test_expect_success "$cfg=false config option" '
+		test_when_finished "rm -f trace" &&
+		test_config $cfg false &&
+		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+		test_subcommand git maintenance run --auto --quiet --no-detach <trace
+	'
+done
+
+test_expect_success "maintenance.autoDetach overrides gc.autoDetach" '
+	test_when_finished "rm -f trace" &&
+	test_config maintenance.autoDetach false &&
+	test_config gc.autoDetach true &&
+	GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet --no-detach <trace
 '
 
 test_expect_success 'register uses XDG_CONFIG_HOME config if it exists' '
@@ -129,9 +154,9 @@ test_expect_success 'run --task=<task>' '
 		git maintenance run --task=commit-graph 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-both.txt" \
 		git maintenance run --task=commit-graph --task=gc 2>/dev/null &&
-	test_subcommand ! git gc --quiet <run-commit-graph.txt &&
-	test_subcommand git gc --quiet <run-gc.txt &&
-	test_subcommand git gc --quiet <run-both.txt &&
+	test_subcommand ! git gc --quiet --no-detach <run-commit-graph.txt &&
+	test_subcommand git gc --quiet --no-detach <run-gc.txt &&
+	test_subcommand git gc --quiet --no-detach <run-both.txt &&
 	test_subcommand git commit-graph write --split --reachable --no-progress <run-commit-graph.txt &&
 	test_subcommand ! git commit-graph write --split --reachable --no-progress <run-gc.txt &&
 	test_subcommand git commit-graph write --split --reachable --no-progress <run-both.txt
-- 
2.46.0.46.g406f326d27.dirty



* Re: [PATCH 2/7] builtin/gc: refactor to read config into structure
  2024-08-13  7:17 ` [PATCH 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
  2024-08-15  5:24   ` James Liu
@ 2024-08-15 13:46   ` Derrick Stolee
  1 sibling, 0 replies; 79+ messages in thread
From: Derrick Stolee @ 2024-08-15 13:46 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On 8/13/24 3:17 AM, Patrick Steinhardt wrote:
> The git-gc(1) command knows to read a bunch of config keys to tweak its
> own behaviour. The values are parsed into global variables, which makes
> it hard to correctly manage the lifecycle of values that may require a
> memory allocation.
> 
> Refactor the code to use a `struct gc_config` that gets populated and
> passed around. For one, this makes previously-implicit dependencies on
> these config values clear. Second, it will allow us to properly manage
> the lifecycle in the next commit.

I think this is a valuable goal.

> -static const char *gc_log_expire = "1.day.ago";
> -static const char *prune_expire = "2.weeks.ago";
> -static const char *prune_worktrees_expire = "3.months.ago";

I was going to mention this in the previous patch where you change how
these variables are cast into git_config_get_expiry(). They aren't
changing to non-const here, either.

> +struct gc_config {
...
> +	const char *gc_log_expire;
> +	const char *prune_expire;
> +	const char *prune_worktrees_expire;

The fact that they are initialized to const strings makes it
difficult to know if they've been updated. I wonder if we need
to change them to have a "if NULL, then use a const default"
somewhere. (And maybe you do this later in the series).

> +static void gc_config(struct gc_config *cfg)

I appreciate that you are taking the step to make the structure
a process parameter and not just another global.

Thanks,
-Stolee



* Re: [PATCH 3/7] builtin/gc: fix leaking config values
  2024-08-13  7:17 ` [PATCH 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
  2024-08-15  5:22   ` James Liu
@ 2024-08-15 13:50   ` Derrick Stolee
  1 sibling, 0 replies; 79+ messages in thread
From: Derrick Stolee @ 2024-08-15 13:50 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On 8/13/24 3:17 AM, Patrick Steinhardt wrote:
> We're leaking config values in git-gc(1) when those values are tracked
> as strings. Introduce a new `gc_config_release()` function that releases
> this memory to plug those leaks and release old values before populating
> the config fields via `git_config_string()` et al.
> 
> Note that there is one small gotcha here with the "--prune" option. Next
> to passing a string, this option also accepts the "--no-prune" option
> that overrides the default or configured value. We thus need to discern
> between the option not having been passed by the user and the negative
> variant of it. This is done by using a simple sentinel value that lets
> us discern these cases.

I'm glad to see that you are correcting this in the patch right after
the one where I pointed out my concern about it. Excellent.
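
As an aside on the "--prune" / "--no-prune" gotcha from the quoted commit
message: the sketch below shows one way a sentinel can tell "option not
given" apart from "--no-prune". It is only an illustration under stated
assumptions. The names `prune_unset_sentinel` and
`effective_prune_expire()` are hypothetical and are not taken from the
patch.

```c
#include <stddef.h>

/*
 * Hypothetical sentinel: only its address matters, never its contents.
 * The option variable starts out pointing here, so we can tell later
 * whether the option parser ever touched it.
 */
static char prune_unset_sentinel[] = "unset";

/*
 * arg == sentinel  -> option not passed, use the configured/default value
 * arg == NULL      -> "--no-prune" was passed, pruning is disabled
 * anything else    -> "--prune=<arg>" was passed, use it verbatim
 */
static const char *effective_prune_expire(const char *arg,
					  const char *configured)
{
	if (arg == prune_unset_sentinel)
		return configured;
	return arg;
}
```

With this shape, `effective_prune_expire(prune_unset_sentinel, "2.weeks.ago")`
yields the configured value, while passing NULL keeps pruning disabled
regardless of the configuration.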

> -	const char *gc_log_expire;
> -	const char *prune_expire;
> -	const char *prune_worktrees_expire;
> +	char *gc_log_expire;
> +	char *prune_expire;
> +	char *prune_worktrees_expire;

> -	.gc_log_expire = "1.day.ago", \
> -	.prune_expire = "2.weeks.ago", \
> -	.prune_worktrees_expire = "3.months.ago", \
> +	.gc_log_expire = xstrdup("1.day.ago"), \
> +	.prune_expire = xstrdup("2.weeks.ago"), \
> +	.prune_worktrees_expire = xstrdup("3.months.ago"), \

Using xstrdup() is possible now that these aren't globals. Good.

>   static void gc_config(struct gc_config *cfg)
>   {
>   	const char *value;
> +	char *owned = NULL;
>   
>   	if (!git_config_get_value("gc.packrefs", &value)) {
>   		if (value && !strcmp(value, "notbare"))
> @@ -185,15 +195,34 @@ static void gc_config(struct gc_config *cfg)
>   	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
>   	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
>   	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
> -	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
> -	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
> -	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
> +
> +	if (!git_config_get_expiry("gc.pruneexpire", &owned)) {
> +		free(cfg->prune_expire);
> +		cfg->prune_expire = owned;
> +	}

Ah. Good logic of "if I get a (possibly NULL) result, then free the old
value before setting the new one".
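
For readers skimming the thread, here is a minimal standalone sketch of
that ownership pattern. It assumes a getter that returns 0 and hands
over a heap-allocated string only when a value is configured, which
mirrors how the quoted hunk treats `git_config_get_expiry()`. All
`demo_*` names and `dup_string()` are hypothetical stand-ins, not Git
API.

```c
#include <stdlib.h>
#include <string.h>

struct demo_config {
	char *prune_expire; /* always heap-allocated and owned by the struct */
};

/* Small helper so the sketch does not depend on strdup(3). */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Hypothetical getter: returns 0 and stores a caller-owned copy in *out
 * when a value is configured; returns 1 and leaves *out untouched when
 * it is not.
 */
static int demo_get_expiry(const char *configured, char **out)
{
	if (!configured)
		return 1;
	*out = dup_string(configured);
	return *out ? 0 : 1;
}

static void demo_load(struct demo_config *cfg, const char *configured)
{
	char *owned = NULL;

	/* Only replace (and free) the old value when we really got a new one. */
	if (!demo_get_expiry(configured, &owned)) {
		free(cfg->prune_expire);
		cfg->prune_expire = owned;
	}
}

static void demo_release(struct demo_config *cfg)
{
	free(cfg->prune_expire); /* free(NULL) is a no-op */
	cfg->prune_expire = NULL;
}
```

The default survives when nothing is configured, and a configured value
replaces it without leaking the old allocation.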

> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
> index 1f1f664871..e641df0116 100755
> --- a/t/t5304-prune.sh
> +++ b/t/t5304-prune.sh
> @@ -7,6 +7,7 @@ test_description='prune'
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true

Fantastic.

Thanks,
-Stolee



* Re: [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:18 ` [PATCH 7/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
  2024-08-13 11:29   ` Phillip Wood
  2024-08-15  6:40   ` James Liu
@ 2024-08-15 14:00   ` Derrick Stolee
  2 siblings, 0 replies; 79+ messages in thread
From: Derrick Stolee @ 2024-08-15 14:00 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On 8/13/24 3:18 AM, Patrick Steinhardt wrote:

> @@ -1063,6 +1063,7 @@ static int maintenance_task_gc(struct maintenance_run_opts *opts,
>   		strvec_push(&child.args, "--quiet");
>   	else
>   		strvec_push(&child.args, "--no-quiet");
> +	strvec_push(&child.args, "--no-detach");
>   
>   	return run_command(&child);
>   }

I was looking for this earlier, as it could have been placed in either of
the previous two patches. I can understand not putting it in Patch 5
because then the default of 'git maintenance run --auto' would not
daemonize the gc subprocess. But it seems like patch 6 would also have
been fine. Here is good, too.

> +	/*
> +	 * When `maintenance.autoDetach` isn't set, then we fall back to
> +	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
> +	 * retain behaviour from when we used to run git-gc(1) here.
> +	 */
> +	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
> +	    git_config_get_bool("gc.autodetach", &auto_detach))
> +		auto_detach = 1;
> +

This && caught me by surprise, but it's really "if both of these config
options are unset, then set a default." Makes sense after thinking a bit
harder.
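
To spell out the fallback, here is a minimal standalone sketch. It
assumes a stand-in `lookup_bool()` that follows the same return
convention as `git_config_get_bool()` (0 when the key is set, non-zero
when it is not). The `fake_entry` table and `resolve_auto_detach()` are
hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the config store: a flat key/value table. */
struct fake_entry {
	const char *key;
	int value;
};

/*
 * Same return convention as git_config_get_bool(): 0 if the key was
 * found (and *dest was written), non-zero if the key is unset.
 */
static int lookup_bool(const struct fake_entry *cfg, size_t nr,
		       const char *key, int *dest)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!strcmp(cfg[i].key, key)) {
			*dest = cfg[i].value;
			return 0;
		}
	}
	return 1;
}

static int resolve_auto_detach(const struct fake_entry *cfg, size_t nr)
{
	int auto_detach;

	/*
	 * && short-circuits: the gc.autodetach lookup only runs when
	 * maintenance.autodetach is unset, and the default of 1 only
	 * applies when both lookups report "unset".
	 */
	if (lookup_bool(cfg, nr, "maintenance.autodetach", &auto_detach) &&
	    lookup_bool(cfg, nr, "gc.autodetach", &auto_detach))
		auto_detach = 1;

	return auto_detach;
}
```

An empty table resolves to 1, a table with only `gc.autodetach` set to
0 resolves to 0, and `maintenance.autodetach` wins whenever both keys
are present.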

>   	maint->git_cmd = 1;
>   	maint->close_object_store = 1;
>   	strvec_pushl(&maint->args, "maintenance", "run", "--auto", NULL);
>   	strvec_push(&maint->args, quiet ? "--quiet" : "--no-quiet");
> +	strvec_push(&maint->args, auto_detach ? "--detach" : "--no-detach");
>   
>   	return 1;
>   }
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index 2da7291e37..8415884754 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -229,7 +229,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
>   
>   	GIT_TRACE2_EVENT="$PWD/trace1.event" \
>   	git -C pc1 fetch --refetch origin &&
> -	test_subcommand git maintenance run --auto --no-quiet <trace1.event &&
> +	test_subcommand git maintenance run --auto --no-quiet --detach <trace1.event &&

And this changes because it's the new default. You don't need a config
change to the test repo to make this happen. Good.

> +for cfg in maintenance.autoDetach gc.autoDetach
> +do
> +	test_expect_success "$cfg=true config option" '
> +		test_when_finished "rm -f trace" &&
> +		test_config $cfg true &&
> +		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
> +		test_subcommand git maintenance run --auto --quiet --detach <trace
> +	'
> +
> +	test_expect_success "$cfg=false config option" '
> +		test_when_finished "rm -f trace" &&
> +		test_config $cfg false &&
> +		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
> +		test_subcommand git maintenance run --auto --quiet --no-detach <trace
> +	'
> +done
> +
> +test_expect_success "maintenance.autoDetach overrides gc.autoDetach" '
> +	test_when_finished "rm -f trace" &&
> +	test_config maintenance.autoDetach false &&
> +	test_config gc.autoDetach true &&
> +	GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
> +	test_subcommand git maintenance run --auto --quiet --no-detach <trace
>   '

I appreciate the care taken in these test cases. It verifies the logic around
the if statement that I had to read twice.

Thanks,
-Stolee



* Re: [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2024-08-15  9:12 ` [PATCH v2 " Patrick Steinhardt
@ 2024-08-15 14:04 ` Derrick Stolee
  2024-08-15 15:37   ` Junio C Hamano
  2024-08-16  8:06   ` Patrick Steinhardt
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
  10 siblings, 2 replies; 79+ messages in thread
From: Derrick Stolee @ 2024-08-15 14:04 UTC (permalink / raw)
  To: Patrick Steinhardt, git

On 8/13/24 3:17 AM, Patrick Steinhardt wrote:

> I recently configured git-maintenance(1) to not use git-gc(1) anymore,
> but instead to use git-multi-pack-index(1). I quickly noticed that the
> behaviour here is somewhat broken because instead of auto-detaching when
> `git maintenance run --auto` executes, we wait for the process to run to
> completion.
> 
> The root cause is that git-maintenance(1), probably by accident,
> continues to rely on the auto-detaching mechanism in git-gc(1). So
> instead of having git-maintenance(1) detach, it is git-gc(1) that
> detaches and thus causes git-maintenance(1) to exit early. That of
> course falls flat once any maintenance task other than git-gc(1)
> executes, because these won't detach.
> 
> Besides being a usability issue, this may also cause git-gc(1) to run
> concurrently with any other enabled maintenance tasks. This shouldn't
> lead to data loss, but it can certainly lead to processes stomping on
> each other's feet.
> 
> This patch series fixes this by wiring up new `--detach` flags for both
> git-gc(1) and git-maintenance(1). Like this, git-maintenance(1) now
> knows to execute `git gc --auto --no-detach`, while our auto-maintenance
> will execute `git maintenance run --auto --detach`.

Thank you for noticing this behavior, which is essentially an unintended
regression from when the maintenance command was first introduced. It
worked for most users because of the accidental detachment of the GC
task, but now users can correctly customize their automatic maintenance
to run in the background.

This was my oversight, as I was focused on scheduled maintenance as
being the primary way that users would customize their maintenance tasks.
Thank you for unifying the concepts.

I sprinkled in commentary, and most of it was just things I noticed
while reading the series in order but then later patches or a careful
read made my comments non-actionable.

This v1 looks good to me.

Thanks,
-Stolee



* Re: [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-15 14:04 ` [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Derrick Stolee
@ 2024-08-15 15:37   ` Junio C Hamano
  2024-08-16  8:06   ` Patrick Steinhardt
  1 sibling, 0 replies; 79+ messages in thread
From: Junio C Hamano @ 2024-08-15 15:37 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Patrick Steinhardt, git

Derrick Stolee <stolee@gmail.com> writes:

> I sprinkled in commentary, and most of it was just things I noticed
> while reading the series in order but then later patches or a careful
> read made my comments non-actionable.
>
> This v1 looks good to me.

Thanks for a "think aloud" review.  Very much appreciated.

I thought there was a minor reroll planned for phrasing, without any
other change to a significant part of the series?  Let me mark the
topic for 'next' when it happens.

Thanks, both.



* Re: [PATCH v2 7/7] run-command: fix detaching when running auto maintenance
  2024-08-15  9:12   ` [PATCH v2 7/7] run-command: fix detaching when running auto maintenance Patrick Steinhardt
@ 2024-08-15 16:13     ` Junio C Hamano
  2024-08-16  8:06       ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Junio C Hamano @ 2024-08-15 16:13 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Phillip Wood, phillip.wood, James Liu

Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/run-command.c b/run-command.c
> index 45ba544932..94f2f3079f 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
>  
>  int prepare_auto_maintenance(int quiet, struct child_process *maint)
>  {
> -	int enabled;
> +	int enabled, auto_detach;
>  
>  	if (!git_config_get_bool("maintenance.auto", &enabled) &&
>  	    !enabled)
>  		return 0;
>  
> +	/*
> +	 * When `maintenance.autoDetach` isn't set, then we fall back to
> +	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
> +	 * retain behaviour from when we used to run git-gc(1) here.
> +	 */
> +	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
> +	    git_config_get_bool("gc.autodetach", &auto_detach))
> +		auto_detach = 1;

I think this needs to be documented somehow.  Something like this,
perhaps?

 Documentation/config/gc.txt          | 2 ++
 Documentation/config/maintenance.txt | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git c/Documentation/config/gc.txt w/Documentation/config/gc.txt
index 664a3c2874..6506ccb87f 100644
--- c/Documentation/config/gc.txt
+++ w/Documentation/config/gc.txt
@@ -41,6 +41,8 @@ use, it'll affect how the auto pack limit works.
 gc.autoDetach::
 	Make `git gc --auto` return immediately and run in the background
 	if the system supports it. Default is true.
+	It also acts as a fallback setting for the `maintenance.autoDetach`
+	configuration variable.
 
 gc.bigPackThreshold::
 	If non-zero, all non-cruft packs larger than this limit are kept
diff --git c/Documentation/config/maintenance.txt w/Documentation/config/maintenance.txt
index 69a4f05153..7a481a494a 100644
--- c/Documentation/config/maintenance.txt
+++ w/Documentation/config/maintenance.txt
@@ -3,6 +3,15 @@ maintenance.auto::
 	`git maintenance run --auto` after doing their normal work. Defaults
 	to true.
 
+maintenance.autoDetach::
+	Tasks that are run via `git maintenance run --auto` by
+	default run in the background, if the system supports it.
+	Setting this configuration variable to `true` explicitly
+	asks them to run in the background, and setting it to
+	`false` forces them to run in the foreground.  If this
+	variable is not set, `gc.autoDetach` works as a fallback
+	variable and behaves the same way.
+
 maintenance.strategy::
 	This string config option provides a way to specify one of a few
 	recommended schedules for background maintenance. This only affects


* Re: [PATCH v2 5/7] builtin/gc: add a `--detach` flag
  2024-08-15  9:12   ` [PATCH v2 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
@ 2024-08-15 19:11     ` Junio C Hamano
  2024-08-15 22:29       ` Junio C Hamano
  0 siblings, 1 reply; 79+ messages in thread
From: Junio C Hamano @ 2024-08-15 19:11 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Phillip Wood, phillip.wood, James Liu

Patrick Steinhardt <ps@pks.im> writes:

> +test_expect_success '--detach overrides gc.autoDetach=false' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +
> +		# Prepare the repository such that git-gc(1) ends up repacking.
> +		test_commit "$(test_oid blob17_1)" &&
> +		test_commit "$(test_oid blob17_2)" &&
> +		git config gc.autodetach false &&
> +		git config gc.auto 2 &&
> +
> +		cat >expect <<-EOF &&
> +		Auto packing the repository in background for optimum performance.
> +		See "git help gc" for manual housekeeping.
> +		EOF
> +		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
> +		test_cmp expect actual
> +	)
> +'

If the gc/maintenance is going to background itself, it is possible
that it still is running, possibly with files under repo/.git/ open
and the process running in repo directory, when the test_when_finished
clean-up trap goes into effect?

I am wondering where this comes from:

  https://github.com/git/git/actions/runs/10408467351/job/28825980833#step:6:2000

where "rm -rf repo" dies with an unusual

  rm: can't remove 'repo/.git': Directory not empty

and my theory is that after "rm -rf" _thinks_ it removed everything
underneath, before it attempts to rmdir("repo/.git"), the repack
process in the background has created a new pack, and "rm -rf" does
not go back and try to remove such new cruft.

The most robust way to work around such a "race" is to wait for the
backgrounded process before cleaning up, or after seeing that the
message we use as a signal that the "gc" has backgrounded itself,
kill that backgrounded process before exiting the test and causing
the clean-up to trigger.



* Re: [PATCH v2 5/7] builtin/gc: add a `--detach` flag
  2024-08-15 19:11     ` Junio C Hamano
@ 2024-08-15 22:29       ` Junio C Hamano
  2024-08-16  8:06         ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Junio C Hamano @ 2024-08-15 22:29 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Phillip Wood, phillip.wood, James Liu

Junio C Hamano <gitster@pobox.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> +test_expect_success '--detach overrides gc.autoDetach=false' '
>> +	test_when_finished "rm -rf repo" &&
>> +	git init repo &&
>> +	(
>> +		cd repo &&
>> +
>> +		# Prepare the repository such that git-gc(1) ends up repacking.
>> +		test_commit "$(test_oid blob17_1)" &&
>> +		test_commit "$(test_oid blob17_2)" &&
>> +		git config gc.autodetach false &&
>> +		git config gc.auto 2 &&
>> +
>> +		cat >expect <<-EOF &&
>> +		Auto packing the repository in background for optimum performance.
>> +		See "git help gc" for manual housekeeping.
>> +		EOF
>> +		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
>> +		test_cmp expect actual
>> +	)
>> +'
>
> If the gc/maintenance is going to background itself, it is possible
> that it still is running, possibly with files under repo/.git/ open
> and the process running in repo directory, when the test_when_finished
> clean-up trap goes in effect?
>
> I am wondering where this comes from:
>
>   https://github.com/git/git/actions/runs/10408467351/job/28825980833#step:6:2000
>
> where "rm -rf repo" dies with an unusual
>
>   rm: can't remove 'repo/.git': Directory not empty
>
> and my theory is that after "rm -rf" _thinks_ it removed everything
> underneath, before it attempts to rmdir("repo/.git"), the repack
> process in the background has created a new pack, and "rm -rf" does
> not go back and try to remove such new cruft.
>
> The most robust way to work around such a "race" is to wait for the
> backgrounded process before cleaning up, or after seeing that the
> message we use as a signal that the "gc" has backgrounded itself,
> kill that backgrounded process before exiting the test and causing
> the clean-up to trigger.

There already is a clue left by those who worked on this test the
last time at the end of the script.  It says:

    # DO NOT leave a detached auto gc process running near the end of the
    # test script: it can run long enough in the background to racily
    # interfere with the cleanup in 'test_done'.

immediately before "test_done".

In the meantime, I am wondering if something simple and silly like the
attached is sufficient.  The idea is that we expect the "oops we
couldn't clean" code not to trigger most of the time, but if it
does, we just wait (with back off) a bit and retry.


 t/t6500-gc.sh | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git c/t/t6500-gc.sh w/t/t6500-gc.sh
index 737c99e0f8..4a991e087a 100755
--- c/t/t6500-gc.sh
+++ w/t/t6500-gc.sh
@@ -396,8 +396,22 @@ test_expect_success 'background auto gc respects lock for all operations' '
 	test_cmp expect actual
 '
 
+wait_to_clean () {
+	count=10 sleep=1
+	until rm -rf "$1" && ! test -d "$1"
+	do
+		if test $count = 0
+		then
+			return 1
+		fi
+		count=$(( count - 1 ))
+		sleep=$(( sleep + sleep ))
+		sleep $sleep
+	done
+}
+
 test_expect_success '--detach overrides gc.autoDetach=false' '
-	test_when_finished "rm -rf repo" &&
+	test_when_finished "wait_to_clean repo" &&
 	git init repo &&
 	(
 		cd repo &&
@@ -418,7 +432,7 @@ test_expect_success '--detach overrides gc.autoDetach=false' '
 '
 
 test_expect_success '--no-detach overrides gc.autoDetach=true' '
-	test_when_finished "rm -rf repo" &&
+	test_when_finished "wait_to_clean repo" &&
 	git init repo &&
 	(
 		cd repo &&



* Re: [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-15 14:04 ` [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Derrick Stolee
  2024-08-15 15:37   ` Junio C Hamano
@ 2024-08-16  8:06   ` Patrick Steinhardt
  1 sibling, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16  8:06 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Thu, Aug 15, 2024 at 10:04:10AM -0400, Derrick Stolee wrote:
> On 8/13/24 3:17 AM, Patrick Steinhardt wrote:
> 
> > I recently configured git-maintenance(1) to not use git-gc(1) anymore,
> > but instead to use git-multi-pack-index(1). I quickly noticed that the
> > behaviour here is somewhat broken because instead of auto-detaching when
> > `git maintenance run --auto` executes, we wait for the process to run to
> > completion.
> > 
> > The root cause is that git-maintenance(1), probably by accident,
> > continues to rely on the auto-detaching mechanism in git-gc(1). So
> > instead of having git-maintenance(1) detach, it is git-gc(1) that
> > detaches and thus causes git-maintenance(1) to exit early. That of
> > course falls flat once any maintenance task other than git-gc(1)
> > executes, because these won't detach.
> > 
> > Besides being a usability issue, this may also cause git-gc(1) to run
> > concurrently with any other enabled maintenance tasks. This shouldn't
> > lead to data loss, but it can certainly lead to processes stomping on
> > each other's feet.
> > 
> > This patch series fixes this by wiring up new `--detach` flags for both
> > git-gc(1) and git-maintenance(1). Like this, git-maintenance(1) now
> > knows to execute `git gc --auto --no-detach`, while our auto-maintenance
> > will execute `git maintenance run --auto --detach`.
> 
> Thank you for noticing this behavior, which is essentially an unintended
> regression from when the maintenance command was first introduced. It
> worked for most users because of the accidental detachment of the GC
> task, but now users can correctly customize their automatic maintenance
> to run in the background.
> 
> This was my oversight, as I was focused on scheduled maintenance as
> being the primary way that users would customize their maintenance tasks.
> Thank you for unifying the concepts.
> 
> I sprinkled in commentary, and most of it was just things I noticed
> while reading the series in order but then later patches or a careful
> read made my comments non-actionable.
> 
> This v1 looks good to me.

Thanks for your thorough review!

Patrick


* Re: [PATCH v2 5/7] builtin/gc: add a `--detach` flag
  2024-08-15 22:29       ` Junio C Hamano
@ 2024-08-16  8:06         ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16  8:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Phillip Wood, phillip.wood, James Liu

On Thu, Aug 15, 2024 at 03:29:20PM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Patrick Steinhardt <ps@pks.im> writes:
> >
> >> +test_expect_success '--detach overrides gc.autoDetach=false' '
> >> +	test_when_finished "rm -rf repo" &&
> >> +	git init repo &&
> >> +	(
> >> +		cd repo &&
> >> +
> >> +		# Prepare the repository such that git-gc(1) ends up repacking.
> >> +		test_commit "$(test_oid blob17_1)" &&
> >> +		test_commit "$(test_oid blob17_2)" &&
> >> +		git config gc.autodetach false &&
> >> +		git config gc.auto 2 &&
> >> +
> >> +		cat >expect <<-EOF &&
> >> +		Auto packing the repository in background for optimum performance.
> >> +		See "git help gc" for manual housekeeping.
> >> +		EOF
> >> +		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
> >> +		test_cmp expect actual
> >> +	)
> >> +'
> >
> > If the gc/maintenance is going to background itself, it is possible
> > that it still is running, possibly with files under repo/.git/ open
> > and the process running in repo directory, when the test_when_finished
> > clean-up trap goes in effect?
> >
> > I am wondering where this comes from:
> >
> >   https://github.com/git/git/actions/runs/10408467351/job/28825980833#step:6:2000
> >
> > where "rm -rf repo" dies with an unusual
> >
> >   rm: can't remove 'repo/.git': Directory not empty
> >
> > and my theory is that after "rm -rf" _thinks_ it removed everything
> > underneath, before it attempts to rmdir("repo/.git"), the repack
> > process in the background has created a new pack, and "rm -rf" does
> > not go back and try to remove such new cruft.
> >
> > The most robust way to work around such a "race" is to wait for the
> > backgrounded process before cleaning up, or after seeing that the
> > message we use as a signal that the "gc" has backgrounded itself,
> > kill that backgrounded process before exiting the test and causing
> > the clean-up to trigger.
> 
> There already is a clue left by those who worked on this test the
> last time at the end of the script.  It says:
> 
>     # DO NOT leave a detached auto gc process running near the end of the
>     # test script: it can run long enough in the background to racily
>     # interfere with the cleanup in 'test_done'.
> 
> immediately before "test_done".
> 
> In the meantime, I am wondering if something simple and silly like the
> attached is sufficient.  The idea is that we expect the "oops we
> couldn't clean" code not to trigger most of the time, but if it
> does, we just wait (with back off) a bit and retry.

Ah, indeed, that is a problem. We already have a better tool to fix this
with `run_and_wait_for_auto_gc()`. It creates a separate file descriptor
and waits for it to close via some shell trickery, which will only
happen once the child process of git-gc(1) has exited.

Will fix, thanks!

Patrick


* Re: [PATCH v2 7/7] run-command: fix detaching when running auto maintenance
  2024-08-15 16:13     ` Junio C Hamano
@ 2024-08-16  8:06       ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16  8:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Phillip Wood, phillip.wood, James Liu

On Thu, Aug 15, 2024 at 09:13:28AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > diff --git a/run-command.c b/run-command.c
> > index 45ba544932..94f2f3079f 100644
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
> >  
> >  int prepare_auto_maintenance(int quiet, struct child_process *maint)
> >  {
> > -	int enabled;
> > +	int enabled, auto_detach;
> >  
> >  	if (!git_config_get_bool("maintenance.auto", &enabled) &&
> >  	    !enabled)
> >  		return 0;
> >  
> > +	/*
> > +	 * When `maintenance.autoDetach` isn't set, then we fall back to
> > +	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
> > +	 * retain behaviour from when we used to run git-gc(1) here.
> > +	 */
> > +	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
> > +	    git_config_get_bool("gc.autodetach", &auto_detach))
> > +		auto_detach = 1;
> 
> I think this needs to be documented somehow.  Something like this,
> perhaps?

Indeed, I totally forgot to do that.

> --- c/Documentation/config/maintenance.txt
> +++ w/Documentation/config/maintenance.txt
> @@ -3,6 +3,15 @@ maintenance.auto::
>  	`git maintenance run --auto` after doing their normal work. Defaults
>  	to true.
>  
> +maintenance.autoDetach::
> +	Tasks that are run via `git maintenance run --auto` by
> +	default run in the background, if the system supports it.
> +	Setting this configuration variable to `true` explicitly
> +	asks them to run in the background, and setting it to
> +	`false` forces them to run in the foreground.  If this
> +	variable is not set, `gc.autoDetach` works as a fallback
> +	variable and behaves the same way.

This isn't entirely true. `git maintenance run --auto` will not
background, because that'd change preexisting behaviour. It also would
not make a lot of sense, because here the `--auto` trigger tells the
command to do maintenance as-needed. Coupling that with whether or not
to detach was a misdesign of git-gc(1), I think. What it does control is
whether we detach or not when automatically executing maintenance via
`prepare_auto_maintenance()`.

Anyway, my fault for not documenting it, not yours for getting it
slightly wrong.

Patrick


* [PATCH v3 0/7] builtin/maintenance: fix auto-detach with non-standard tasks
  2024-08-13  7:17 [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Patrick Steinhardt
                   ` (9 preceding siblings ...)
  2024-08-15 14:04 ` [PATCH 0/7] builtin/maintenance: fix auto-detach with non-standard tasks Derrick Stolee
@ 2024-08-16 10:44 ` Patrick Steinhardt
  2024-08-16 10:44   ` [PATCH v3 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
                     ` (6 more replies)
  10 siblings, 7 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:44 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

Hi,

this is the third version of my patch series that fixes how
git-maintenance(1) detaches: instead of letting its child process
git-gc(1) detach, we now optionally ask git-maintenance(1) itself to
detach when running via our auto maintenance mechanism. This fixes
behaviour of git-maintenance(1) when configured to run non-standard
tasks like the "incremental" task.

Changes compared to v2:

  - Fix leaking git-gc(1) process in t6500.

  - Add missing documentation for `maintenance.autoDetach`.

Thanks!

Patrick

Patrick Steinhardt (7):
  config: fix constness of out parameter for `git_config_get_expiry()`
  builtin/gc: refactor to read config into structure
  builtin/gc: fix leaking config values
  builtin/gc: stop processing log file on signal
  builtin/gc: add a `--detach` flag
  builtin/maintenance: add a `--detach` flag
  run-command: fix detaching when running auto maintenance

 Documentation/config/gc.txt          |   3 +-
 Documentation/config/maintenance.txt |  11 +
 Documentation/git-gc.txt             |   5 +-
 builtin/gc.c                         | 384 +++++++++++++++++----------
 config.c                             |   4 +-
 config.h                             |   2 +-
 read-cache.c                         |  12 +-
 run-command.c                        |  12 +-
 t/t5304-prune.sh                     |   1 +
 t/t5616-partial-clone.sh             |   6 +-
 t/t6500-gc.sh                        |  45 +++-
 t/t7900-maintenance.sh               |  82 +++++-
 12 files changed, 396 insertions(+), 171 deletions(-)

Range-diff against v2:
1:  040453f27f = 1:  040453f27f config: fix constness of out parameter for `git_config_get_expiry()`
2:  ff6aa9d7ba = 2:  ff6aa9d7ba builtin/gc: refactor to read config into structure
3:  310e361371 = 3:  310e361371 builtin/gc: fix leaking config values
4:  812c61c9b6 = 4:  812c61c9b6 builtin/gc: stop processing log file on signal
5:  ca78d3dc7c ! 5:  b934b23889 builtin/gc: add a `--detach` flag
    @@ builtin/gc.c: static int maintenance_run(int argc, const char **argv, const char
      	for (i = 0; i < TASK__COUNT; i++)
     
      ## t/t6500-gc.sh ##
    +@@ t/t6500-gc.sh: test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
    + 	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
    + '
    + 
    +-run_and_wait_for_auto_gc () {
    ++run_and_wait_for_gc () {
    + 	# We read stdout from gc for the side effect of waiting until the
    + 	# background gc process exits, closing its fd 9.  Furthermore, the
    + 	# variable assignment from a command substitution preserves the
    + 	# exit status of the main gc process.
    + 	# Note: this fd trickery doesn't work on Windows, but there is no
    + 	# need to, because on Win the auto gc always runs in the foreground.
    +-	doesnt_matter=$(git gc --auto 9>&1)
    ++	doesnt_matter=$(git gc "$@" 9>&1)
    + }
    + 
    + test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
    +@@ t/t6500-gc.sh: test_expect_success 'background auto gc does not run if gc.log is present and re
    + 	test-tool chmtime =-345600 .git/gc.log &&
    + 	git gc --auto &&
    + 	test_config gc.logexpiry 2.days &&
    +-	run_and_wait_for_auto_gc &&
    ++	run_and_wait_for_gc --auto &&
    + 	ls .git/objects/pack/pack-*.pack >packs &&
    + 	test_line_count = 1 packs
    + '
     @@ t/t6500-gc.sh: test_expect_success 'background auto gc respects lock for all operations' '
    + 	printf "%d %s" "$shell_pid" "$hostname" >.git/gc.pid &&
    + 
    + 	# our gc should exit zero without doing anything
    +-	run_and_wait_for_auto_gc &&
    ++	run_and_wait_for_gc --auto &&
    + 	(ls -1 .git/refs/heads .git/reftable >actual || true) &&
      	test_cmp expect actual
      '
      
    @@ t/t6500-gc.sh: test_expect_success 'background auto gc respects lock for all ope
     +		git config gc.autodetach false &&
     +		git config gc.auto 2 &&
     +
    -+		cat >expect <<-EOF &&
    -+		Auto packing the repository in background for optimum performance.
    -+		See "git help gc" for manual housekeeping.
    -+		EOF
    -+		GIT_PROGRESS_DELAY=0 git gc --auto --detach 2>actual &&
    -+		test_cmp expect actual
    ++		# Note that we cannot use `test_cmp` here to compare stderr
    ++		# because it may contain output from `set -x`.
    ++		run_and_wait_for_gc --auto --detach 2>actual &&
    ++		test_grep "Auto packing the repository in background for optimum performance." actual
     +	)
     +'
     +
6:  06dbb73425 = 6:  347d0a2002 builtin/maintenance: add a `--detach` flag
7:  6bc170ff05 ! 7:  9befef7c1f run-command: fix detaching when running auto maintenance
    @@ Commit message
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    + ## Documentation/config/gc.txt ##
    +@@ Documentation/config/gc.txt: use, it'll affect how the auto pack limit works.
    + 
    + gc.autoDetach::
    + 	Make `git gc --auto` return immediately and run in the background
    +-	if the system supports it. Default is true.
    ++	if the system supports it. Default is true. This config variable acts
    ++	as a fallback in case `maintenance.autoDetach` is not set.
    + 
    + gc.bigPackThreshold::
    + 	If non-zero, all non-cruft packs larger than this limit are kept
    +
    + ## Documentation/config/maintenance.txt ##
    +@@ Documentation/config/maintenance.txt: maintenance.auto::
    + 	`git maintenance run --auto` after doing their normal work. Defaults
    + 	to true.
    + 
    ++maintenance.autoDetach::
    ++	Many Git commands trigger automatic maintenance after they have
    ++	written data into the repository. This boolean config option
    ++	controls whether this automatic maintenance shall happen in the
    ++	foreground or whether the maintenance process shall detach and
    ++	continue to run in the background.
    +++
    ++If unset, the value of `gc.autoDetach` is used as a fallback. Defaults
    ++to true if both are unset, meaning that the maintenance process will
    ++detach.
    ++
    + maintenance.strategy::
    + 	This string config option provides a way to specify one of a few
    + 	recommended schedules for background maintenance. This only affects
    +
      ## builtin/gc.c ##
     @@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_run_opts *opts,
      		strvec_push(&child.args, "--quiet");
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 1/7] config: fix constness of out parameter for `git_config_get_expiry()`
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
@ 2024-08-16 10:44   ` Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:44 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which suggests that ownership of the returned data
is not transferred to the caller. That is not the case though: the
caller does own the returned string, so the signature is misleading.

Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c |  6 +++---
 config.c     |  4 ++--
 config.h     |  2 +-
 read-cache.c | 12 +++++++++---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 72bac2554f..e7406bf667 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -167,9 +167,9 @@ static void gc_config(void)
 	git_config_get_bool("gc.autodetach", &detach_auto);
 	git_config_get_bool("gc.cruftpacks", &cruft_packs);
 	git_config_get_ulong("gc.maxcruftsize", &max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", &prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", &gc_log_expire);
+	git_config_get_expiry("gc.pruneexpire", (char **) &prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", (char **) &prune_worktrees_expire);
+	git_config_get_expiry("gc.logexpiry", (char **) &gc_log_expire);
 
 	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
diff --git a/config.c b/config.c
index 6421894614..dfa4df1417 100644
--- a/config.c
+++ b/config.c
@@ -2766,9 +2766,9 @@ int git_config_get_pathname(const char *key, char **dest)
 	return repo_config_get_pathname(the_repository, key, dest);
 }
 
-int git_config_get_expiry(const char *key, const char **output)
+int git_config_get_expiry(const char *key, char **output)
 {
-	int ret = git_config_get_string(key, (char **)output);
+	int ret = git_config_get_string(key, output);
 	if (ret)
 		return ret;
 	if (strcmp(*output, "now")) {
diff --git a/config.h b/config.h
index 54b47dec9e..4801391c32 100644
--- a/config.h
+++ b/config.h
@@ -701,7 +701,7 @@ int git_config_get_split_index(void);
 int git_config_get_max_percent_split_change(void);
 
 /* This dies if the configured or default date is in the future */
-int git_config_get_expiry(const char *key, const char **output);
+int git_config_get_expiry(const char *key, char **output);
 
 /* parse either "this many days" integer, or "5.days.ago" approxidate */
 int git_config_get_expiry_in_days(const char *key, timestamp_t *, timestamp_t now);
diff --git a/read-cache.c b/read-cache.c
index 48bf24f87c..7f393ee687 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3176,18 +3176,24 @@ static int write_split_index(struct index_state *istate,
 	return ret;
 }
 
-static const char *shared_index_expire = "2.weeks.ago";
-
 static unsigned long get_shared_index_expire_date(void)
 {
 	static unsigned long shared_index_expire_date;
 	static int shared_index_expire_date_prepared;
 
 	if (!shared_index_expire_date_prepared) {
+		const char *shared_index_expire = "2.weeks.ago";
+		char *value = NULL;
+
 		git_config_get_expiry("splitindex.sharedindexexpire",
-				      &shared_index_expire);
+				      &value);
+		if (value)
+			shared_index_expire = value;
+
 		shared_index_expire_date = approxidate(shared_index_expire);
 		shared_index_expire_date_prepared = 1;
+
+		free(value);
 	}
 
 	return shared_index_expire_date;
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 2/7] builtin/gc: refactor to read config into structure
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
  2024-08-16 10:44   ` [PATCH v3 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
@ 2024-08-16 10:45   ` Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.

Refactor the code to use a `struct gc_config` that gets populated and
passed around. For one, this makes previously-implicit dependencies on
these config values clear. Second, it will allow us to properly manage
the lifecycle in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c | 255 +++++++++++++++++++++++++++++----------------------
 1 file changed, 143 insertions(+), 112 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index e7406bf667..eee7401647 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -49,23 +49,7 @@ static const char * const builtin_gc_usage[] = {
 	NULL
 };
 
-static int pack_refs = 1;
-static int prune_reflogs = 1;
-static int cruft_packs = 1;
-static unsigned long max_cruft_size;
-static int aggressive_depth = 50;
-static int aggressive_window = 250;
-static int gc_auto_threshold = 6700;
-static int gc_auto_pack_limit = 50;
-static int detach_auto = 1;
 static timestamp_t gc_log_expire_time;
-static const char *gc_log_expire = "1.day.ago";
-static const char *prune_expire = "2.weeks.ago";
-static const char *prune_worktrees_expire = "3.months.ago";
-static char *repack_filter;
-static char *repack_filter_to;
-static unsigned long big_pack_threshold;
-static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
 static struct strvec reflog = STRVEC_INIT;
 static struct strvec repack = STRVEC_INIT;
@@ -145,37 +129,71 @@ static int gc_config_is_timestamp_never(const char *var)
 	return 0;
 }
 
-static void gc_config(void)
+struct gc_config {
+	int pack_refs;
+	int prune_reflogs;
+	int cruft_packs;
+	unsigned long max_cruft_size;
+	int aggressive_depth;
+	int aggressive_window;
+	int gc_auto_threshold;
+	int gc_auto_pack_limit;
+	int detach_auto;
+	const char *gc_log_expire;
+	const char *prune_expire;
+	const char *prune_worktrees_expire;
+	char *repack_filter;
+	char *repack_filter_to;
+	unsigned long big_pack_threshold;
+	unsigned long max_delta_cache_size;
+};
+
+#define GC_CONFIG_INIT { \
+	.pack_refs = 1, \
+	.prune_reflogs = 1, \
+	.cruft_packs = 1, \
+	.aggressive_depth = 50, \
+	.aggressive_window = 250, \
+	.gc_auto_threshold = 6700, \
+	.gc_auto_pack_limit = 50, \
+	.detach_auto = 1, \
+	.gc_log_expire = "1.day.ago", \
+	.prune_expire = "2.weeks.ago", \
+	.prune_worktrees_expire = "3.months.ago", \
+	.max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE, \
+}
+
+static void gc_config(struct gc_config *cfg)
 {
 	const char *value;
 
 	if (!git_config_get_value("gc.packrefs", &value)) {
 		if (value && !strcmp(value, "notbare"))
-			pack_refs = -1;
+			cfg->pack_refs = -1;
 		else
-			pack_refs = git_config_bool("gc.packrefs", value);
+			cfg->pack_refs = git_config_bool("gc.packrefs", value);
 	}
 
 	if (gc_config_is_timestamp_never("gc.reflogexpire") &&
 	    gc_config_is_timestamp_never("gc.reflogexpireunreachable"))
-		prune_reflogs = 0;
+		cfg->prune_reflogs = 0;
 
-	git_config_get_int("gc.aggressivewindow", &aggressive_window);
-	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
-	git_config_get_int("gc.auto", &gc_auto_threshold);
-	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
-	git_config_get_bool("gc.autodetach", &detach_auto);
-	git_config_get_bool("gc.cruftpacks", &cruft_packs);
-	git_config_get_ulong("gc.maxcruftsize", &max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", (char **) &prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", (char **) &prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", (char **) &gc_log_expire);
+	git_config_get_int("gc.aggressivewindow", &cfg->aggressive_window);
+	git_config_get_int("gc.aggressivedepth", &cfg->aggressive_depth);
+	git_config_get_int("gc.auto", &cfg->gc_auto_threshold);
+	git_config_get_int("gc.autopacklimit", &cfg->gc_auto_pack_limit);
+	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
+	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
+	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
+	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
+	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
+	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
 
-	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
-	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
+	git_config_get_ulong("gc.bigpackthreshold", &cfg->big_pack_threshold);
+	git_config_get_ulong("pack.deltacachesize", &cfg->max_delta_cache_size);
 
-	git_config_get_string("gc.repackfilter", &repack_filter);
-	git_config_get_string("gc.repackfilterto", &repack_filter_to);
+	git_config_get_string("gc.repackfilter", &cfg->repack_filter);
+	git_config_get_string("gc.repackfilterto", &cfg->repack_filter_to);
 
 	git_config(git_default_config, NULL);
 }
@@ -206,7 +224,7 @@ struct maintenance_run_opts {
 	enum schedule_priority schedule;
 };
 
-static int pack_refs_condition(void)
+static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
 	/*
 	 * The auto-repacking logic for refs is handled by the ref backends and
@@ -216,7 +234,8 @@ static int pack_refs_condition(void)
 	return 1;
 }
 
-static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts)
+static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *opts,
+				      UNUSED struct gc_config *cfg)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
@@ -228,7 +247,7 @@ static int maintenance_task_pack_refs(MAYBE_UNUSED struct maintenance_run_opts *
 	return run_command(&cmd);
 }
 
-static int too_many_loose_objects(void)
+static int too_many_loose_objects(struct gc_config *cfg)
 {
 	/*
 	 * Quickly check if a "gc" is needed, by estimating how
@@ -247,7 +266,7 @@ static int too_many_loose_objects(void)
 	if (!dir)
 		return 0;
 
-	auto_threshold = DIV_ROUND_UP(gc_auto_threshold, 256);
+	auto_threshold = DIV_ROUND_UP(cfg->gc_auto_threshold, 256);
 	while ((ent = readdir(dir)) != NULL) {
 		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
 		    ent->d_name[hexsz_loose] != '\0')
@@ -283,12 +302,12 @@ static struct packed_git *find_base_packs(struct string_list *packs,
 	return base;
 }
 
-static int too_many_packs(void)
+static int too_many_packs(struct gc_config *cfg)
 {
 	struct packed_git *p;
 	int cnt;
 
-	if (gc_auto_pack_limit <= 0)
+	if (cfg->gc_auto_pack_limit <= 0)
 		return 0;
 
 	for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
@@ -302,7 +321,7 @@ static int too_many_packs(void)
 		 */
 		cnt++;
 	}
-	return gc_auto_pack_limit < cnt;
+	return cfg->gc_auto_pack_limit < cnt;
 }
 
 static uint64_t total_ram(void)
@@ -336,7 +355,8 @@ static uint64_t total_ram(void)
 	return 0;
 }
 
-static uint64_t estimate_repack_memory(struct packed_git *pack)
+static uint64_t estimate_repack_memory(struct gc_config *cfg,
+				       struct packed_git *pack)
 {
 	unsigned long nr_objects = repo_approximate_object_count(the_repository);
 	size_t os_cache, heap;
@@ -373,7 +393,7 @@ static uint64_t estimate_repack_memory(struct packed_git *pack)
 	 */
 	heap += delta_base_cache_limit;
 	/* and of course pack-objects has its own delta cache */
-	heap += max_delta_cache_size;
+	heap += cfg->max_delta_cache_size;
 
 	return os_cache + heap;
 }
@@ -384,30 +404,31 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 	return 0;
 }
 
-static void add_repack_all_option(struct string_list *keep_pack)
+static void add_repack_all_option(struct gc_config *cfg,
+				  struct string_list *keep_pack)
 {
-	if (prune_expire && !strcmp(prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
 		strvec_push(&repack, "-a");
-	else if (cruft_packs) {
+	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");
-		if (prune_expire)
-			strvec_pushf(&repack, "--cruft-expiration=%s", prune_expire);
-		if (max_cruft_size)
+		if (cfg->prune_expire)
+			strvec_pushf(&repack, "--cruft-expiration=%s", cfg->prune_expire);
+		if (cfg->max_cruft_size)
 			strvec_pushf(&repack, "--max-cruft-size=%lu",
-				     max_cruft_size);
+				     cfg->max_cruft_size);
 	} else {
 		strvec_push(&repack, "-A");
-		if (prune_expire)
-			strvec_pushf(&repack, "--unpack-unreachable=%s", prune_expire);
+		if (cfg->prune_expire)
+			strvec_pushf(&repack, "--unpack-unreachable=%s", cfg->prune_expire);
 	}
 
 	if (keep_pack)
 		for_each_string_list(keep_pack, keep_one_pack, NULL);
 
-	if (repack_filter && *repack_filter)
-		strvec_pushf(&repack, "--filter=%s", repack_filter);
-	if (repack_filter_to && *repack_filter_to)
-		strvec_pushf(&repack, "--filter-to=%s", repack_filter_to);
+	if (cfg->repack_filter && *cfg->repack_filter)
+		strvec_pushf(&repack, "--filter=%s", cfg->repack_filter);
+	if (cfg->repack_filter_to && *cfg->repack_filter_to)
+		strvec_pushf(&repack, "--filter-to=%s", cfg->repack_filter_to);
 }
 
 static void add_repack_incremental_option(void)
@@ -415,13 +436,13 @@ static void add_repack_incremental_option(void)
 	strvec_push(&repack, "--no-write-bitmap-index");
 }
 
-static int need_to_gc(void)
+static int need_to_gc(struct gc_config *cfg)
 {
 	/*
 	 * Setting gc.auto to 0 or negative can disable the
 	 * automatic gc.
 	 */
-	if (gc_auto_threshold <= 0)
+	if (cfg->gc_auto_threshold <= 0)
 		return 0;
 
 	/*
@@ -430,13 +451,13 @@ static int need_to_gc(void)
 	 * we run "repack -A -d -l".  Otherwise we tell the caller
 	 * there is no need.
 	 */
-	if (too_many_packs()) {
+	if (too_many_packs(cfg)) {
 		struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
-		if (big_pack_threshold) {
-			find_base_packs(&keep_pack, big_pack_threshold);
-			if (keep_pack.nr >= gc_auto_pack_limit) {
-				big_pack_threshold = 0;
+		if (cfg->big_pack_threshold) {
+			find_base_packs(&keep_pack, cfg->big_pack_threshold);
+			if (keep_pack.nr >= cfg->gc_auto_pack_limit) {
+				cfg->big_pack_threshold = 0;
 				string_list_clear(&keep_pack, 0);
 				find_base_packs(&keep_pack, 0);
 			}
@@ -445,7 +466,7 @@ static int need_to_gc(void)
 			uint64_t mem_have, mem_want;
 
 			mem_have = total_ram();
-			mem_want = estimate_repack_memory(p);
+			mem_want = estimate_repack_memory(cfg, p);
 
 			/*
 			 * Only allow 1/2 of memory for pack-objects, leave
@@ -456,9 +477,9 @@ static int need_to_gc(void)
 				string_list_clear(&keep_pack, 0);
 		}
 
-		add_repack_all_option(&keep_pack);
+		add_repack_all_option(cfg, &keep_pack);
 		string_list_clear(&keep_pack, 0);
-	} else if (too_many_loose_objects())
+	} else if (too_many_loose_objects(cfg))
 		add_repack_incremental_option();
 	else
 		return 0;
@@ -585,7 +606,8 @@ static int report_last_gc_error(void)
 	return ret;
 }
 
-static void gc_before_repack(struct maintenance_run_opts *opts)
+static void gc_before_repack(struct maintenance_run_opts *opts,
+			     struct gc_config *cfg)
 {
 	/*
 	 * We may be called twice, as both the pre- and
@@ -596,10 +618,10 @@ static void gc_before_repack(struct maintenance_run_opts *opts)
 	if (done++)
 		return;
 
-	if (pack_refs && maintenance_task_pack_refs(opts))
+	if (cfg->pack_refs && maintenance_task_pack_refs(opts, cfg))
 		die(FAILED_RUN, "pack-refs");
 
-	if (prune_reflogs) {
+	if (cfg->prune_reflogs) {
 		struct child_process cmd = CHILD_PROCESS_INIT;
 
 		cmd.git_cmd = 1;
@@ -621,14 +643,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	timestamp_t dummy;
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
 	struct maintenance_run_opts opts = {0};
+	struct gc_config cfg = GC_CONFIG_INIT;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
-		{ OPTION_STRING, 0, "prune", &prune_expire, N_("date"),
+		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
 			N_("prune unreferenced objects"),
-			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire },
-		OPT_BOOL(0, "cruft", &cruft_packs, N_("pack unreferenced objects separately")),
-		OPT_MAGNITUDE(0, "max-cruft-size", &max_cruft_size,
+			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
+		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
+		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
 			      N_("with --cruft, limit the size of new cruft packs")),
 		OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")),
 		OPT_BOOL_F(0, "auto", &opts.auto_flag, N_("enable auto-gc mode"),
@@ -651,27 +674,27 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	strvec_pushl(&rerere, "rerere", "gc", NULL);
 
 	/* default expiry time, overwritten in gc_config */
-	gc_config();
-	if (parse_expiry_date(gc_log_expire, &gc_log_expire_time))
-		die(_("failed to parse gc.logExpiry value %s"), gc_log_expire);
+	gc_config(&cfg);
+	if (parse_expiry_date(cfg.gc_log_expire, &gc_log_expire_time))
+		die(_("failed to parse gc.logExpiry value %s"), cfg.gc_log_expire);
 
-	if (pack_refs < 0)
-		pack_refs = !is_bare_repository();
+	if (cfg.pack_refs < 0)
+		cfg.pack_refs = !is_bare_repository();
 
 	argc = parse_options(argc, argv, prefix, builtin_gc_options,
 			     builtin_gc_usage, 0);
 	if (argc > 0)
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
-	if (prune_expire && parse_expiry_date(prune_expire, &dummy))
-		die(_("failed to parse prune expiry value %s"), prune_expire);
+	if (cfg.prune_expire && parse_expiry_date(cfg.prune_expire, &dummy))
+		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
 	if (aggressive) {
 		strvec_push(&repack, "-f");
-		if (aggressive_depth > 0)
-			strvec_pushf(&repack, "--depth=%d", aggressive_depth);
-		if (aggressive_window > 0)
-			strvec_pushf(&repack, "--window=%d", aggressive_window);
+		if (cfg.aggressive_depth > 0)
+			strvec_pushf(&repack, "--depth=%d", cfg.aggressive_depth);
+		if (cfg.aggressive_window > 0)
+			strvec_pushf(&repack, "--window=%d", cfg.aggressive_window);
 	}
 	if (quiet)
 		strvec_push(&repack, "-q");
@@ -680,16 +703,16 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc())
+		if (!need_to_gc(&cfg))
 			return 0;
 		if (!quiet) {
-			if (detach_auto)
+			if (cfg.detach_auto)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
 			else
 				fprintf(stderr, _("Auto packing the repository for optimum performance.\n"));
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
-		if (detach_auto) {
+		if (cfg.detach_auto) {
 			int ret = report_last_gc_error();
 
 			if (ret == 1)
@@ -701,7 +724,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 			if (lock_repo_for_gc(force, &pid))
 				return 0;
-			gc_before_repack(&opts); /* dies on failure */
+			gc_before_repack(&opts, &cfg); /* dies on failure */
 			delete_tempfile(&pidfile);
 
 			/*
@@ -716,11 +739,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (keep_largest_pack != -1) {
 			if (keep_largest_pack)
 				find_base_packs(&keep_pack, 0);
-		} else if (big_pack_threshold) {
-			find_base_packs(&keep_pack, big_pack_threshold);
+		} else if (cfg.big_pack_threshold) {
+			find_base_packs(&keep_pack, cfg.big_pack_threshold);
 		}
 
-		add_repack_all_option(&keep_pack);
+		add_repack_all_option(&cfg, &keep_pack);
 		string_list_clear(&keep_pack, 0);
 	}
 
@@ -741,7 +764,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		atexit(process_log_file_at_exit);
 	}
 
-	gc_before_repack(&opts);
+	gc_before_repack(&opts, &cfg);
 
 	if (!repository_format_precious_objects) {
 		struct child_process repack_cmd = CHILD_PROCESS_INIT;
@@ -752,11 +775,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (run_command(&repack_cmd))
 			die(FAILED_RUN, repack.v[0]);
 
-		if (prune_expire) {
+		if (cfg.prune_expire) {
 			struct child_process prune_cmd = CHILD_PROCESS_INIT;
 
 			/* run `git prune` even if using cruft packs */
-			strvec_push(&prune, prune_expire);
+			strvec_push(&prune, cfg.prune_expire);
 			if (quiet)
 				strvec_push(&prune, "--no-progress");
 			if (repo_has_promisor_remote(the_repository))
@@ -769,10 +792,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (prune_worktrees_expire) {
+	if (cfg.prune_worktrees_expire) {
 		struct child_process prune_worktrees_cmd = CHILD_PROCESS_INIT;
 
-		strvec_push(&prune_worktrees, prune_worktrees_expire);
+		strvec_push(&prune_worktrees, cfg.prune_worktrees_expire);
 		prune_worktrees_cmd.git_cmd = 1;
 		strvec_pushv(&prune_worktrees_cmd.args, prune_worktrees.v);
 		if (run_command(&prune_worktrees_cmd))
@@ -796,7 +819,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					     !quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
 					     NULL);
 
-	if (opts.auto_flag && too_many_loose_objects())
+	if (opts.auto_flag && too_many_loose_objects(&cfg))
 		warning(_("There are too many unreachable loose objects; "
 			"run 'git prune' to remove them."));
 
@@ -892,7 +915,7 @@ static int dfs_on_ref(const char *refname UNUSED,
 	return result;
 }
 
-static int should_write_commit_graph(void)
+static int should_write_commit_graph(struct gc_config *cfg)
 {
 	int result;
 	struct cg_auto_data data;
@@ -929,7 +952,8 @@ static int run_write_commit_graph(struct maintenance_run_opts *opts)
 	return !!run_command(&child);
 }
 
-static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
+static int maintenance_task_commit_graph(struct maintenance_run_opts *opts,
+					 struct gc_config *cfg)
 {
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_commit_graph)
@@ -963,7 +987,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	return !!run_command(&child);
 }
 
-static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
+static int maintenance_task_prefetch(struct maintenance_run_opts *opts,
+				     struct gc_config *cfg)
 {
 	if (for_each_remote(fetch_remote, opts)) {
 		error(_("failed to prefetch remotes"));
@@ -973,7 +998,8 @@ static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int maintenance_task_gc(struct maintenance_run_opts *opts)
+static int maintenance_task_gc(struct maintenance_run_opts *opts,
+			       struct gc_config *cfg)
 {
 	struct child_process child = CHILD_PROCESS_INIT;
 
@@ -1021,7 +1047,7 @@ static int loose_object_count(const struct object_id *oid UNUSED,
 	return 0;
 }
 
-static int loose_object_auto_condition(void)
+static int loose_object_auto_condition(struct gc_config *cfg)
 {
 	int count = 0;
 
@@ -1106,12 +1132,13 @@ static int pack_loose(struct maintenance_run_opts *opts)
 	return result;
 }
 
-static int maintenance_task_loose_objects(struct maintenance_run_opts *opts)
+static int maintenance_task_loose_objects(struct maintenance_run_opts *opts,
+					  struct gc_config *cfg)
 {
 	return prune_packed(opts) || pack_loose(opts);
 }
 
-static int incremental_repack_auto_condition(void)
+static int incremental_repack_auto_condition(struct gc_config *cfg)
 {
 	struct packed_git *p;
 	int incremental_repack_auto_limit = 10;
@@ -1230,7 +1257,8 @@ static int multi_pack_index_repack(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts)
+static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts,
+					       struct gc_config *cfg)
 {
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_multi_pack_index) {
@@ -1247,14 +1275,15 @@ static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts
 	return 0;
 }
 
-typedef int maintenance_task_fn(struct maintenance_run_opts *opts);
+typedef int maintenance_task_fn(struct maintenance_run_opts *opts,
+				struct gc_config *cfg);
 
 /*
  * An auto condition function returns 1 if the task should run
  * and 0 if the task should NOT run. See needs_to_gc() for an
  * example.
  */
-typedef int maintenance_auto_fn(void);
+typedef int maintenance_auto_fn(struct gc_config *cfg);
 
 struct maintenance_task {
 	const char *name;
@@ -1321,7 +1350,8 @@ static int compare_tasks_by_selection(const void *a_, const void *b_)
 	return b->selected_order - a->selected_order;
 }
 
-static int maintenance_run_tasks(struct maintenance_run_opts *opts)
+static int maintenance_run_tasks(struct maintenance_run_opts *opts,
+				 struct gc_config *cfg)
 {
 	int i, found_selected = 0;
 	int result = 0;
@@ -1360,14 +1390,14 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts)
 
 		if (opts->auto_flag &&
 		    (!tasks[i].auto_condition ||
-		     !tasks[i].auto_condition()))
+		     !tasks[i].auto_condition(cfg)))
 			continue;
 
 		if (opts->schedule && tasks[i].schedule < opts->schedule)
 			continue;
 
 		trace2_region_enter("maintenance", tasks[i].name, r);
-		if (tasks[i].fn(opts)) {
+		if (tasks[i].fn(opts, cfg)) {
 			error(_("task '%s' failed"), tasks[i].name);
 			result = 1;
 		}
@@ -1404,7 +1434,6 @@ static void initialize_task_config(int schedule)
 {
 	int i;
 	struct strbuf config_name = STRBUF_INIT;
-	gc_config();
 
 	if (schedule)
 		initialize_maintenance_strategy();
@@ -1468,6 +1497,7 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	struct maintenance_run_opts opts;
+	struct gc_config cfg = GC_CONFIG_INIT;
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
@@ -1496,12 +1526,13 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (opts.auto_flag && opts.schedule)
 		die(_("use at most one of --auto and --schedule=<frequency>"));
 
+	gc_config(&cfg);
 	initialize_task_config(opts.schedule);
 
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
-	return maintenance_run_tasks(&opts);
+	return maintenance_run_tasks(&opts, &cfg);
 }
 
 static char *get_maintpath(void)
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 3/7] builtin/gc: fix leaking config values
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
  2024-08-16 10:44   ` [PATCH v3 1/7] config: fix constness of out parameter for `git_config_get_expiry()` Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 2/7] builtin/gc: refactor to read config into structure Patrick Steinhardt
@ 2024-08-16 10:45   ` Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.
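
As a rough illustration of the ownership rule this relies on (not code
from this patch; all `demo_*` names are hypothetical stand-ins for
`struct gc_config` and its helpers): every string field always holds
heap memory, so both "replace" and "release" can free unconditionally.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct gc_config with one owned string. */
struct demo_config {
	char *prune_expire;
};

/* Duplicate a string, aborting on allocation failure (like xstrdup()). */
static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* The default is heap-allocated too, so release never sees a literal. */
static void demo_config_init(struct demo_config *cfg)
{
	cfg->prune_expire = demo_strdup("2.weeks.ago");
}

/* Take ownership of `value` and free whatever was stored before. */
static void demo_config_set_prune_expire(struct demo_config *cfg, char *value)
{
	free(cfg->prune_expire);
	cfg->prune_expire = value;
}

/* Free all owned strings; safe to call exactly once per init. */
static void demo_config_release(struct demo_config *cfg)
{
	free(cfg->prune_expire);
	cfg->prune_expire = NULL;
}
```

The setter mirrors the "free the old value, then store the new one"
steps that the config-reading code performs for each string field.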

Note that there is one small gotcha here with the "--prune" option.
Besides accepting a string, this option also has a negated "--no-prune"
form that overrides the default or configured value. We thus need to discern
between the option not having been passed by the user and the negative
variant of it. This is done by using a simple sentinel value that lets
us discern these cases.
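
A minimal sketch of the sentinel idea, assuming that the option parser
stores NULL for the negated form and a string otherwise (the `demo_*`
names are hypothetical and do not appear in the patch). Only the
address of the sentinel is compared, never its contents:

```c
#include <stddef.h>

/* Address-only marker; the characters it holds are never inspected. */
static const char demo_sentinel[] = "sentinel";

enum demo_prune_state {
	DEMO_PRUNE_NOT_GIVEN, /* option absent: keep the configured value */
	DEMO_PRUNE_DISABLED,  /* negated form: the parser stored NULL */
	DEMO_PRUNE_EXPLICIT   /* positive form: the parser stored a string */
};

/* Classify the variable that the option parser may have overwritten. */
static enum demo_prune_state demo_classify(const char *arg)
{
	if (arg == demo_sentinel)
		return DEMO_PRUNE_NOT_GIVEN;
	if (!arg)
		return DEMO_PRUNE_DISABLED;
	return DEMO_PRUNE_EXPLICIT;
}
```

The variable is initialized to the sentinel before parsing, so any
other value afterwards means that the user passed the option in one of
its two forms.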

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c     | 108 +++++++++++++++++++++++++++++++++++------------
 t/t5304-prune.sh |   1 +
 2 files changed, 82 insertions(+), 27 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index eee7401647..a93cfa147e 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -139,9 +139,9 @@ struct gc_config {
 	int gc_auto_threshold;
 	int gc_auto_pack_limit;
 	int detach_auto;
-	const char *gc_log_expire;
-	const char *prune_expire;
-	const char *prune_worktrees_expire;
+	char *gc_log_expire;
+	char *prune_expire;
+	char *prune_worktrees_expire;
 	char *repack_filter;
 	char *repack_filter_to;
 	unsigned long big_pack_threshold;
@@ -157,15 +157,25 @@ struct gc_config {
 	.gc_auto_threshold = 6700, \
 	.gc_auto_pack_limit = 50, \
 	.detach_auto = 1, \
-	.gc_log_expire = "1.day.ago", \
-	.prune_expire = "2.weeks.ago", \
-	.prune_worktrees_expire = "3.months.ago", \
+	.gc_log_expire = xstrdup("1.day.ago"), \
+	.prune_expire = xstrdup("2.weeks.ago"), \
+	.prune_worktrees_expire = xstrdup("3.months.ago"), \
 	.max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE, \
 }
 
+static void gc_config_release(struct gc_config *cfg)
+{
+	free(cfg->gc_log_expire);
+	free(cfg->prune_expire);
+	free(cfg->prune_worktrees_expire);
+	free(cfg->repack_filter);
+	free(cfg->repack_filter_to);
+}
+
 static void gc_config(struct gc_config *cfg)
 {
 	const char *value;
+	char *owned = NULL;
 
 	if (!git_config_get_value("gc.packrefs", &value)) {
 		if (value && !strcmp(value, "notbare"))
@@ -185,15 +195,34 @@ static void gc_config(struct gc_config *cfg)
 	git_config_get_bool("gc.autodetach", &cfg->detach_auto);
 	git_config_get_bool("gc.cruftpacks", &cfg->cruft_packs);
 	git_config_get_ulong("gc.maxcruftsize", &cfg->max_cruft_size);
-	git_config_get_expiry("gc.pruneexpire", (char **) &cfg->prune_expire);
-	git_config_get_expiry("gc.worktreepruneexpire", (char **) &cfg->prune_worktrees_expire);
-	git_config_get_expiry("gc.logexpiry", (char **) &cfg->gc_log_expire);
+
+	if (!git_config_get_expiry("gc.pruneexpire", &owned)) {
+		free(cfg->prune_expire);
+		cfg->prune_expire = owned;
+	}
+
+	if (!git_config_get_expiry("gc.worktreepruneexpire", &owned)) {
+		free(cfg->prune_worktrees_expire);
+		cfg->prune_worktrees_expire = owned;
+	}
+
+	if (!git_config_get_expiry("gc.logexpiry", &owned)) {
+		free(cfg->gc_log_expire);
+		cfg->gc_log_expire = owned;
+	}
 
 	git_config_get_ulong("gc.bigpackthreshold", &cfg->big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &cfg->max_delta_cache_size);
 
-	git_config_get_string("gc.repackfilter", &cfg->repack_filter);
-	git_config_get_string("gc.repackfilterto", &cfg->repack_filter_to);
+	if (!git_config_get_string("gc.repackfilter", &owned)) {
+		free(cfg->repack_filter);
+		cfg->repack_filter = owned;
+	}
+
+	if (!git_config_get_string("gc.repackfilterto", &owned)) {
+		free(cfg->repack_filter_to);
+		cfg->repack_filter_to = owned;
+	}
 
 	git_config(git_default_config, NULL);
 }
@@ -644,12 +673,15 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
 	struct maintenance_run_opts opts = {0};
 	struct gc_config cfg = GC_CONFIG_INIT;
+	const char *prune_expire_sentinel = "sentinel";
+	const char *prune_expire_arg = prune_expire_sentinel;
+	int ret;
 
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
-		{ OPTION_STRING, 0, "prune", &cfg.prune_expire, N_("date"),
+		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
 			N_("prune unreferenced objects"),
-			PARSE_OPT_OPTARG, NULL, (intptr_t)cfg.prune_expire },
+			PARSE_OPT_OPTARG, NULL, (intptr_t)prune_expire_arg },
 		OPT_BOOL(0, "cruft", &cfg.cruft_packs, N_("pack unreferenced objects separately")),
 		OPT_MAGNITUDE(0, "max-cruft-size", &cfg.max_cruft_size,
 			      N_("with --cruft, limit the size of new cruft packs")),
@@ -673,8 +705,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	strvec_pushl(&prune_worktrees, "worktree", "prune", "--expire", NULL);
 	strvec_pushl(&rerere, "rerere", "gc", NULL);
 
-	/* default expiry time, overwritten in gc_config */
 	gc_config(&cfg);
+
 	if (parse_expiry_date(cfg.gc_log_expire, &gc_log_expire_time))
 		die(_("failed to parse gc.logExpiry value %s"), cfg.gc_log_expire);
 
@@ -686,6 +718,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (argc > 0)
 		usage_with_options(builtin_gc_usage, builtin_gc_options);
 
+	if (prune_expire_arg != prune_expire_sentinel) {
+		free(cfg.prune_expire);
+		cfg.prune_expire = xstrdup_or_null(prune_expire_arg);
+	}
 	if (cfg.prune_expire && parse_expiry_date(cfg.prune_expire, &dummy))
 		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
@@ -703,8 +739,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc(&cfg))
-			return 0;
+		if (!need_to_gc(&cfg)) {
+			ret = 0;
+			goto out;
+		}
+
 		if (!quiet) {
 			if (cfg.detach_auto)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
@@ -713,17 +752,22 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
 		if (cfg.detach_auto) {
-			int ret = report_last_gc_error();
-
-			if (ret == 1)
+			ret = report_last_gc_error();
+			if (ret == 1) {
 				/* Last gc --auto failed. Skip this one. */
-				return 0;
-			else if (ret)
+				ret = 0;
+				goto out;
+
+			} else if (ret) {
 				/* an I/O error occurred, already reported */
-				return ret;
+				goto out;
+			}
+
+			if (lock_repo_for_gc(force, &pid)) {
+				ret = 0;
+				goto out;
+			}
 
-			if (lock_repo_for_gc(force, &pid))
-				return 0;
 			gc_before_repack(&opts, &cfg); /* dies on failure */
 			delete_tempfile(&pidfile);
 
@@ -749,8 +793,11 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 	name = lock_repo_for_gc(force, &pid);
 	if (name) {
-		if (opts.auto_flag)
-			return 0; /* be quiet on --auto */
+		if (opts.auto_flag) {
+			ret = 0;
+			goto out; /* be quiet on --auto */
+		}
+
 		die(_("gc is already running on machine '%s' pid %"PRIuMAX" (use --force if not)"),
 		    name, (uintmax_t)pid);
 	}
@@ -826,6 +873,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (!daemonized)
 		unlink(git_path("gc.log"));
 
+out:
+	gc_config_release(&cfg);
 	return 0;
 }
 
@@ -1511,6 +1560,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 			PARSE_OPT_NONEG, task_option_parse),
 		OPT_END()
 	};
+	int ret;
+
 	memset(&opts, 0, sizeof(opts));
 
 	opts.quiet = !isatty(2);
@@ -1532,7 +1583,10 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	if (argc != 0)
 		usage_with_options(builtin_maintenance_run_usage,
 				   builtin_maintenance_run_options);
-	return maintenance_run_tasks(&opts, &cfg);
+
+	ret = maintenance_run_tasks(&opts, &cfg);
+	gc_config_release(&cfg);
+	return ret;
 }
 
 static char *get_maintpath(void)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1f1f664871..e641df0116 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -7,6 +7,7 @@ test_description='prune'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 day=$((60*60*24))
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 4/7] builtin/gc: stop processing log file on signal
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-08-16 10:45   ` [PATCH v3 3/7] builtin/gc: fix leaking config values Patrick Steinhardt
@ 2024-08-16 10:45   ` Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severely broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.
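
For contrast, and explicitly not what this patch does (it removes the
handler instead), a handler that stays within async-signal-safe limits
can do little more than record that a signal arrived. A sketch under
that assumption, with hypothetical `demo_*` names:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* The only state the handler touches: a lock-free, signal-safe flag. */
static volatile sig_atomic_t demo_pending_signal;

/* No malloc(), no stdio, no path building: just remember the signal. */
static void demo_on_signal(int signo)
{
	demo_pending_signal = signo;
}

/* Install the handler for `signo`; returns 0 on success, -1 on error. */
static int demo_install_handler(int signo)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = demo_on_signal;
	sigemptyset(&sa.sa_mask);
	return sigaction(signo, &sa, NULL);
}

/* Called from normal code: fetch and clear the recorded signal. */
static int demo_take_pending_signal(void)
{
	int signo = demo_pending_signal;

	demo_pending_signal = 0;
	return signo;
}
```

Any real processing of the log file would then have to happen in normal
control flow after checking the flag, which is the kind of larger
refactoring mentioned above.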

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index a93cfa147e..f815557081 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -109,13 +109,6 @@ static void process_log_file_at_exit(void)
 	process_log_file();
 }
 
-static void process_log_file_on_signal(int signo)
-{
-	process_log_file();
-	sigchain_pop(signo);
-	raise(signo);
-}
-
 static int gc_config_is_timestamp_never(const char *var)
 {
 	const char *value;
@@ -807,7 +800,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 					  git_path("gc.log"),
 					  LOCK_DIE_ON_ERROR);
 		dup2(get_lock_file_fd(&log_lock), 2);
-		sigchain_push_common(process_log_file_on_signal);
 		atexit(process_log_file_at_exit);
 	}
 
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 5/7] builtin/gc: add a `--detach` flag
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-08-16 10:45   ` [PATCH v3 4/7] builtin/gc: stop processing log file on signal Patrick Steinhardt
@ 2024-08-16 10:45   ` Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 6/7] builtin/maintenance: " Patrick Steinhardt
  2024-08-16 10:45   ` [PATCH v3 7/7] run-command: fix detaching when running auto maintenance Patrick Steinhardt
  6 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.
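
The resulting precedence can be summarized with a small sketch. The
function name and its parameters are hypothetical; the tri-state
encoding (-1 for "not given", 0 for `--no-detach`, 1 for `--detach`)
follows the `.detach = -1` initializer introduced below:

```c
/*
 * cli_detach:      -1 if neither flag was given, 0 for --no-detach,
 *                  1 for --detach
 * auto_flag:       non-zero when running with --auto
 * cfg_detach_auto: value of gc.autoDetach (defaults to 1)
 *
 * Returns 1 if the process should detach, 0 otherwise.
 */
static int demo_resolve_detach(int cli_detach, int auto_flag,
			       int cfg_detach_auto)
{
	/* An explicit command line flag always wins. */
	if (cli_detach >= 0)
		return cli_detach;

	/* Otherwise the config only matters in auto mode. */
	return auto_flag && cfg_detach_auto;
}
```

In particular, a plain `git gc` without `--auto` keeps running in the
foreground unless `--detach` is passed explicitly.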

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-gc.txt |  5 ++-
 builtin/gc.c             | 70 ++++++++++++++++++++++------------------
 t/t6500-gc.sh            | 45 +++++++++++++++++++++++---
 3 files changed, 84 insertions(+), 36 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index b5561c458a..370e22faae 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -9,7 +9,7 @@ git-gc - Cleanup unnecessary files and optimize the local repository
 SYNOPSIS
 --------
 [verse]
-'git gc' [--aggressive] [--auto] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
+'git gc' [--aggressive] [--auto] [--[no-]detach] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
 
 DESCRIPTION
 -----------
@@ -53,6 +53,9 @@ configuration options such as `gc.auto` and `gc.autoPackLimit`, all
 other housekeeping tasks (e.g. rerere, working trees, reflog...) will
 be performed as well.
 
+--[no-]detach::
+	Run in the background if the system supports it. This option overrides
+	the `gc.autoDetach` config.
 
 --[no-]cruft::
 	When expiring unreachable objects, pack them separately into a
diff --git a/builtin/gc.c b/builtin/gc.c
index f815557081..269a77960f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -242,9 +242,13 @@ static enum schedule_priority parse_schedule(const char *value)
 
 struct maintenance_run_opts {
 	int auto_flag;
+	int detach;
 	int quiet;
 	enum schedule_priority schedule;
 };
+#define MAINTENANCE_RUN_OPTS_INIT { \
+	.detach = -1, \
+}
 
 static int pack_refs_condition(UNUSED struct gc_config *cfg)
 {
@@ -664,7 +668,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	int keep_largest_pack = -1;
 	timestamp_t dummy;
 	struct child_process rerere_cmd = CHILD_PROCESS_INIT;
-	struct maintenance_run_opts opts = {0};
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	const char *prune_expire_sentinel = "sentinel";
 	const char *prune_expire_arg = prune_expire_sentinel;
@@ -681,6 +685,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")),
 		OPT_BOOL_F(0, "auto", &opts.auto_flag, N_("enable auto-gc mode"),
 			   PARSE_OPT_NOCOMPLETE),
+		OPT_BOOL(0, "detach", &opts.detach,
+			 N_("perform garbage collection in the background")),
 		OPT_BOOL_F(0, "force", &force,
 			   N_("force running gc even if there may be another gc running"),
 			   PARSE_OPT_NOCOMPLETE),
@@ -729,6 +735,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		strvec_push(&repack, "-q");
 
 	if (opts.auto_flag) {
+		if (cfg.detach_auto && opts.detach < 0)
+			opts.detach = 1;
+
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
@@ -738,38 +747,12 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		}
 
 		if (!quiet) {
-			if (cfg.detach_auto)
+			if (opts.detach > 0)
 				fprintf(stderr, _("Auto packing the repository in background for optimum performance.\n"));
 			else
 				fprintf(stderr, _("Auto packing the repository for optimum performance.\n"));
 			fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n"));
 		}
-		if (cfg.detach_auto) {
-			ret = report_last_gc_error();
-			if (ret == 1) {
-				/* Last gc --auto failed. Skip this one. */
-				ret = 0;
-				goto out;
-
-			} else if (ret) {
-				/* an I/O error occurred, already reported */
-				goto out;
-			}
-
-			if (lock_repo_for_gc(force, &pid)) {
-				ret = 0;
-				goto out;
-			}
-
-			gc_before_repack(&opts, &cfg); /* dies on failure */
-			delete_tempfile(&pidfile);
-
-			/*
-			 * failure to daemonize is ok, we'll continue
-			 * in foreground
-			 */
-			daemonized = !daemonize();
-		}
 	} else {
 		struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
@@ -784,6 +767,33 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		string_list_clear(&keep_pack, 0);
 	}
 
+	if (opts.detach > 0) {
+		ret = report_last_gc_error();
+		if (ret == 1) {
+			/* Last gc --auto failed. Skip this one. */
+			ret = 0;
+			goto out;
+
+		} else if (ret) {
+			/* an I/O error occurred, already reported */
+			goto out;
+		}
+
+		if (lock_repo_for_gc(force, &pid)) {
+			ret = 0;
+			goto out;
+		}
+
+		gc_before_repack(&opts, &cfg); /* dies on failure */
+		delete_tempfile(&pidfile);
+
+		/*
+		 * failure to daemonize is ok, we'll continue
+		 * in foreground
+		 */
+		daemonized = !daemonize();
+	}
+
 	name = lock_repo_for_gc(force, &pid);
 	if (name) {
 		if (opts.auto_flag) {
@@ -1537,7 +1547,7 @@ static int task_option_parse(const struct option *opt UNUSED,
 static int maintenance_run(int argc, const char **argv, const char *prefix)
 {
 	int i;
-	struct maintenance_run_opts opts;
+	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
@@ -1554,8 +1564,6 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	};
 	int ret;
 
-	memset(&opts, 0, sizeof(opts));
-
 	opts.quiet = !isatty(2);
 
 	for (i = 0; i < TASK__COUNT; i++)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 1b5909d1b7..5378455968 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -338,14 +338,14 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
 	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
 '
 
-run_and_wait_for_auto_gc () {
+run_and_wait_for_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
 	# variable assignment from a command substitution preserves the
 	# exit status of the main gc process.
 	# Note: this fd trickery doesn't work on Windows, but there is no
 	# need to, because on Win the auto gc always runs in the foreground.
-	doesnt_matter=$(git gc --auto 9>&1)
+	doesnt_matter=$(git gc "$@" 9>&1)
 }
 
 test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
@@ -361,7 +361,7 @@ test_expect_success 'background auto gc does not run if gc.log is present and re
 	test-tool chmtime =-345600 .git/gc.log &&
 	git gc --auto &&
 	test_config gc.logexpiry 2.days &&
-	run_and_wait_for_auto_gc &&
+	run_and_wait_for_gc --auto &&
 	ls .git/objects/pack/pack-*.pack >packs &&
 	test_line_count = 1 packs
 '
@@ -391,11 +391,48 @@ test_expect_success 'background auto gc respects lock for all operations' '
 	printf "%d %s" "$shell_pid" "$hostname" >.git/gc.pid &&
 
 	# our gc should exit zero without doing anything
-	run_and_wait_for_auto_gc &&
+	run_and_wait_for_gc --auto &&
 	(ls -1 .git/refs/heads .git/reftable >actual || true) &&
 	test_cmp expect actual
 '
 
+test_expect_success '--detach overrides gc.autoDetach=false' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-gc(1) ends up repacking.
+		test_commit "$(test_oid blob17_1)" &&
+		test_commit "$(test_oid blob17_2)" &&
+		git config gc.autodetach false &&
+		git config gc.auto 2 &&
+
+		# Note that we cannot use `test_cmp` here to compare stderr
+		# because it may contain output from `set -x`.
+		run_and_wait_for_gc --auto --detach 2>actual &&
+		test_grep "Auto packing the repository in background for optimum performance." actual
+	)
+'
+
+test_expect_success '--no-detach overrides gc.autoDetach=true' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-gc(1) ends up repacking.
+		test_commit "$(test_oid blob17_1)" &&
+		test_commit "$(test_oid blob17_2)" &&
+		git config gc.autodetach true &&
+		git config gc.auto 2 &&
+
+		GIT_PROGRESS_DELAY=0 git gc --auto --no-detach 2>output &&
+		test_grep "Auto packing the repository for optimum performance." output &&
+		test_grep "Collecting referenced commits: 2, done." output
+	)
+'
+
 # DO NOT leave a detached auto gc process running near the end of the
 # test script: it can run long enough in the background to racily
 # interfere with the cleanup in 'test_done'.
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 6/7] builtin/maintenance: add a `--detach` flag
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-08-16 10:45   ` [PATCH v3 5/7] builtin/gc: add a `--detach` flag Patrick Steinhardt
@ 2024-08-16 10:45   ` Patrick Steinhardt
  2024-08-17  7:09     ` Jeff King
  2024-08-16 10:45   ` [PATCH v3 7/7] run-command: fix detaching when running auto maintenance Patrick Steinhardt
  6 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c           |  6 ++++++
 t/t7900-maintenance.sh | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index 269a77960f..63106e2028 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1426,6 +1426,10 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
 	}
 	free(lock_path);
 
+	/* Failure to daemonize is ok, we'll continue in foreground. */
+	if (opts->detach > 0)
+		daemonize();
+
 	for (i = 0; !found_selected && i < TASK__COUNT; i++)
 		found_selected = tasks[i].selected_order >= 0;
 
@@ -1552,6 +1556,8 @@ static int maintenance_run(int argc, const char **argv, const char *prefix)
 	struct option builtin_maintenance_run_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_BOOL(0, "detach", &opts.detach,
+			 N_("perform maintenance in the background")),
 		OPT_CALLBACK(0, "schedule", &opts.schedule, N_("frequency"),
 			     N_("run tasks based on frequency"),
 			     maintenance_opt_schedule),
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 8595489ceb..771525aa4b 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -908,4 +908,43 @@ test_expect_success 'failed schedule prevents config change' '
 	done
 '
 
+test_expect_success '--no-detach causes maintenance to not run in background' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Prepare the repository such that git-maintenance(1) ends up
+		# outputting something.
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+		git config set maintenance.incremental-repack.enabled true &&
+
+		# We have no better way to check whether or not the task ran in
+		# the background than to verify whether it output anything. The
+		# next testcase checks the reverse, making this somewhat safer.
+		git maintenance run --no-detach >out 2>&1 &&
+		test_line_count = 1 out
+	)
+'
+
+test_expect_success '--detach causes maintenance to run in background' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+		git config set maintenance.incremental-repack.enabled true &&
+
+		git maintenance run --detach >out 2>&1 &&
+		test_must_be_empty out
+	)
+'
+
 test_done
-- 
2.46.0.46.g406f326d27.dirty



* [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-16 10:44 ` [PATCH v3 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-08-16 10:45   ` [PATCH v3 6/7] builtin/maintenance: " Patrick Steinhardt
@ 2024-08-16 10:45   ` Patrick Steinhardt
  2024-08-17 12:14     ` Jeff King
  6 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-16 10:45 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each other's feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users who
use only the git-gc(1) task. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change: it will
remain in the foreground unless they pass the `--detach` option.
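
The fallback order for auto maintenance can be sketched as follows.
This is an illustration only: the `demo_*` helpers are hypothetical,
and an unset config key is modelled as a NULL pointer rather than via
the real config API:

```c
#include <stddef.h>

/*
 * Each argument points to the configured boolean, or is NULL when the
 * corresponding key is unset.
 */
static int demo_auto_detach(const int *maintenance_auto_detach,
			    const int *gc_auto_detach)
{
	if (maintenance_auto_detach)
		return *maintenance_auto_detach; /* maintenance.autoDetach */
	if (gc_auto_detach)
		return *gc_auto_detach;          /* gc.autoDetach fallback */
	return 1;                                /* default: detach */
}

/* Flag appended to `git maintenance run --auto` by the caller. */
static const char *demo_detach_arg(int auto_detach)
{
	return auto_detach ? "--detach" : "--no-detach";
}
```

So a repository that only sets `gc.autoDetach=false` keeps its auto
maintenance in the foreground, as it did when git-gc(1) was invoked
directly.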

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/gc.txt          |  3 +-
 Documentation/config/maintenance.txt | 11 +++++++
 builtin/gc.c                         |  1 +
 run-command.c                        | 12 +++++++-
 t/t5616-partial-clone.sh             |  6 ++--
 t/t7900-maintenance.sh               | 43 ++++++++++++++++++++++------
 6 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 664a3c2874..1d4f9470ea 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -40,7 +40,8 @@ use, it'll affect how the auto pack limit works.
 
 gc.autoDetach::
 	Make `git gc --auto` return immediately and run in the background
-	if the system supports it. Default is true.
+	if the system supports it. Default is true. This config variable acts
+	as a fallback in case `maintenance.autoDetach` is not set.
 
 gc.bigPackThreshold::
 	If non-zero, all non-cruft packs larger than this limit are kept
diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt
index 69a4f05153..72a9d6cf81 100644
--- a/Documentation/config/maintenance.txt
+++ b/Documentation/config/maintenance.txt
@@ -3,6 +3,17 @@ maintenance.auto::
 	`git maintenance run --auto` after doing their normal work. Defaults
 	to true.
 
+maintenance.autoDetach::
+	Many Git commands trigger automatic maintenance after they have
+	written data into the repository. This boolean config option
+	controls whether this automatic maintenance shall happen in the
+	foreground or whether the maintenance process shall detach and
+	continue to run in the background.
++
+If unset, the value of `gc.autoDetach` is used as a fallback. Defaults
+to true if both are unset, meaning that the maintenance process will
+detach.
+
 maintenance.strategy::
 	This string config option provides a way to specify one of a few
 	recommended schedules for background maintenance. This only affects
diff --git a/builtin/gc.c b/builtin/gc.c
index 63106e2028..bafee330a2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1063,6 +1063,7 @@ static int maintenance_task_gc(struct maintenance_run_opts *opts,
 		strvec_push(&child.args, "--quiet");
 	else
 		strvec_push(&child.args, "--no-quiet");
+	strvec_push(&child.args, "--no-detach");
 
 	return run_command(&child);
 }
diff --git a/run-command.c b/run-command.c
index 45ba544932..94f2f3079f 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1808,16 +1808,26 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts)
 
 int prepare_auto_maintenance(int quiet, struct child_process *maint)
 {
-	int enabled;
+	int enabled, auto_detach;
 
 	if (!git_config_get_bool("maintenance.auto", &enabled) &&
 	    !enabled)
 		return 0;
 
+	/*
+	 * When `maintenance.autoDetach` isn't set, then we fall back to
+	 * honoring `gc.autoDetach`. This is somewhat weird, but required to
+	 * retain behaviour from when we used to run git-gc(1) here.
+	 */
+	if (git_config_get_bool("maintenance.autodetach", &auto_detach) &&
+	    git_config_get_bool("gc.autodetach", &auto_detach))
+		auto_detach = 1;
+
 	maint->git_cmd = 1;
 	maint->close_object_store = 1;
 	strvec_pushl(&maint->args, "maintenance", "run", "--auto", NULL);
 	strvec_push(&maint->args, quiet ? "--quiet" : "--no-quiet");
+	strvec_push(&maint->args, auto_detach ? "--detach" : "--no-detach");
 
 	return 1;
 }
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 2da7291e37..8415884754 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -229,7 +229,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 
 	GIT_TRACE2_EVENT="$PWD/trace1.event" \
 	git -C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace1.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace1.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace1.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace1.event &&
 
@@ -238,7 +238,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 		-c gc.autoPackLimit=0 \
 		-c maintenance.incremental-repack.auto=1234 \
 		-C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace2.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace2.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"0\" trace2.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"-1\" trace2.event &&
 
@@ -247,7 +247,7 @@ test_expect_success 'fetch --refetch triggers repacking' '
 		-c gc.autoPackLimit=1234 \
 		-c maintenance.incremental-repack.auto=0 \
 		-C pc1 fetch --refetch origin &&
-	test_subcommand git maintenance run --auto --no-quiet <trace3.event &&
+	test_subcommand git maintenance run --auto --no-quiet --detach <trace3.event &&
 	grep \"param\":\"gc.autopacklimit\",\"value\":\"1\" trace3.event &&
 	grep \"param\":\"maintenance.incremental-repack.auto\",\"value\":\"0\" trace3.event
 '
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 771525aa4b..06ab43cfb5 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -49,22 +49,47 @@ test_expect_success 'run [--auto|--quiet]' '
 		git maintenance run --auto 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
 		git maintenance run --no-quiet 2>/dev/null &&
-	test_subcommand git gc --quiet <run-no-auto.txt &&
-	test_subcommand ! git gc --auto --quiet <run-auto.txt &&
-	test_subcommand git gc --no-quiet <run-no-quiet.txt
+	test_subcommand git gc --quiet --no-detach <run-no-auto.txt &&
+	test_subcommand ! git gc --auto --quiet --no-detach <run-auto.txt &&
+	test_subcommand git gc --no-quiet --no-detach <run-no-quiet.txt
 '
 
 test_expect_success 'maintenance.auto config option' '
 	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
-	test_subcommand git maintenance run --auto --quiet <default &&
+	test_subcommand git maintenance run --auto --quiet --detach <default &&
 	GIT_TRACE2_EVENT="$(pwd)/true" \
 		git -c maintenance.auto=true \
 		commit --quiet --allow-empty -m 2 &&
-	test_subcommand git maintenance run --auto --quiet  <true &&
+	test_subcommand git maintenance run --auto --quiet --detach <true &&
 	GIT_TRACE2_EVENT="$(pwd)/false" \
 		git -c maintenance.auto=false \
 		commit --quiet --allow-empty -m 3 &&
-	test_subcommand ! git maintenance run --auto --quiet  <false
+	test_subcommand ! git maintenance run --auto --quiet --detach <false
+'
+
+for cfg in maintenance.autoDetach gc.autoDetach
+do
+	test_expect_success "$cfg=true config option" '
+		test_when_finished "rm -f trace" &&
+		test_config $cfg true &&
+		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+		test_subcommand git maintenance run --auto --quiet --detach <trace
+	'
+
+	test_expect_success "$cfg=false config option" '
+		test_when_finished "rm -f trace" &&
+		test_config $cfg false &&
+		GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+		test_subcommand git maintenance run --auto --quiet --no-detach <trace
+	'
+done
+
+test_expect_success "maintenance.autoDetach overrides gc.autoDetach" '
+	test_when_finished "rm -f trace" &&
+	test_config maintenance.autoDetach false &&
+	test_config gc.autoDetach true &&
+	GIT_TRACE2_EVENT="$(pwd)/trace" git commit --quiet --allow-empty -m 1 &&
+	test_subcommand git maintenance run --auto --quiet --no-detach <trace
 '
 
 test_expect_success 'register uses XDG_CONFIG_HOME config if it exists' '
@@ -129,9 +154,9 @@ test_expect_success 'run --task=<task>' '
 		git maintenance run --task=commit-graph 2>/dev/null &&
 	GIT_TRACE2_EVENT="$(pwd)/run-both.txt" \
 		git maintenance run --task=commit-graph --task=gc 2>/dev/null &&
-	test_subcommand ! git gc --quiet <run-commit-graph.txt &&
-	test_subcommand git gc --quiet <run-gc.txt &&
-	test_subcommand git gc --quiet <run-both.txt &&
+	test_subcommand ! git gc --quiet --no-detach <run-commit-graph.txt &&
+	test_subcommand git gc --quiet --no-detach <run-gc.txt &&
+	test_subcommand git gc --quiet --no-detach <run-both.txt &&
 	test_subcommand git commit-graph write --split --reachable --no-progress <run-commit-graph.txt &&
 	test_subcommand ! git commit-graph write --split --reachable --no-progress <run-gc.txt &&
 	test_subcommand git commit-graph write --split --reachable --no-progress <run-both.txt
-- 
2.46.0.46.g406f326d27.dirty
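
The fallback order that the prepare_auto_maintenance() hunk in this patch implements can be sketched in shell. This is only an illustration: the helper name `auto_detach_flag` is hypothetical, and it assumes its two arguments have already been normalized to "true", "false", or the empty string for "unset".

```shell
# Hypothetical helper mirroring the fallback order of the patch.
# $1: value of maintenance.autoDetach ("true", "false", or empty if unset)
# $2: value of gc.autoDetach          ("true", "false", or empty if unset)
auto_detach_flag () {
	maintenance_value=$1
	gc_value=$2
	# The first value that is set wins; detaching is the default.
	value=${maintenance_value:-${gc_value:-true}}
	if test "$value" = true
	then
		echo --detach
	else
		echo --no-detach
	fi
}
```

The inputs could be obtained with `git config --type=bool --get <key>`, which prints a normalized boolean and exits non-zero when the key is unset. For instance, `auto_detach_flag false true` corresponds to the "maintenance.autoDetach overrides gc.autoDetach" test.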



* Re: [PATCH v3 6/7] builtin/maintenance: add a `--detach` flag
  2024-08-16 10:45   ` [PATCH v3 6/7] builtin/maintenance: " Patrick Steinhardt
@ 2024-08-17  7:09     ` Jeff King
  2024-08-17  7:14       ` Jeff King
  2024-08-19  6:17       ` Patrick Steinhardt
  0 siblings, 2 replies; 79+ messages in thread
From: Jeff King @ 2024-08-17  7:09 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Fri, Aug 16, 2024 at 12:45:15PM +0200, Patrick Steinhardt wrote:

> +test_expect_success '--no-detach causes maintenance to not run in background' '
> [...]
> +		# We have no better way to check whether or not the task ran in
> +		# the background than to verify whether it output anything. The
> +		# next testcase checks the reverse, making this somewhat safer.
> +		git maintenance run --no-detach >out 2>&1 &&
> +		test_line_count = 1 out
> [...]
> +test_expect_success '--detach causes maintenance to run in background' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +
> +		test_commit something &&
> +		git config set maintenance.gc.enabled false &&
> +		git config set maintenance.loose-objects.enabled true &&
> +		git config set maintenance.loose-objects.auto 1 &&
> +		git config set maintenance.incremental-repack.enabled true &&
> +
> +		git maintenance run --detach >out 2>&1 &&
> +		test_must_be_empty out
> +	)
> +'

This second test seems to fail racily (or maybe always? see below). In
CI on Windows, I saw:

  'out' is not empty, it contains:
  fc9fea69579f349e3b02e3264cffbef03e4b1852

That would make sense to me if the detached process still held the
original stdout/stderr channel open (in which case we'd racily see the
same line as in the no-detach case). But we do appear to call
daemonize(), which closes both.

Curiously, the code in gc.c does this:

          /* Failure to daemonize is ok, we'll continue in foreground. */
          if (opts->detach > 0)
                  daemonize();

and the only way for daemonize to fail is if NO_POSIX_GOODIES is set.
Which I'd expect on Windows. But then I'd expect this test to _always_
fail on Windows. Does it? If so, should it be marked with !MINGW?

While investigating that, I ran it with --stress locally (on Linux) and
got some odd (and definitely racy) results. The test itself passes, but
the "rm -rf repo" in the test_when_finished sometimes fails with:

  rm: cannot remove 'repo/.git/objects': Directory not empty

or similar (sometimes it's another directory like 'repo/.git'). My guess
is that the background process is still running and creating files in
the repository, racing with rm's call to rmdir().

Even if we remove the test_when_finished, it would mean that the final
cleanup after test_done might similarly fail, leaving a crufty trash
directory. I think to make this robust, we'd need some way of detecting
when the background process has finished. I don't think we report the
pid anywhere, and the daemonize() call means it won't even be in the
same process group. Maybe we could spin looking for the incremental pack
it will create (and timeout after N seconds)? That feels pretty hacky,
but I can't think of anything better.
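
A rough sketch of the "spin with a timeout" idea, assuming a hypothetical helper named `wait_for_glob` (it is not part of the test library) that polls once per second:

```shell
# Hypothetical helper: poll until some file matches the glob, or give up.
# $1: glob pattern to look for
# $2: maximum number of seconds to wait
wait_for_glob () {
	pattern=$1
	remaining=$2
	while :
	do
		# Left unquoted on purpose so that the shell expands the glob.
		for candidate in $pattern
		do
			test -e "$candidate" && return 0
		done
		test "$remaining" -gt 0 || return 1
		sleep 1
		remaining=$((remaining - 1))
	done
}
```

A test could call something like `wait_for_glob ".git/objects/pack/pack-*.pack" 10` after the detached run. Note that the file showing up only proves the background process got that far, not that it has exited, which is part of why this approach is hacky.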

-Peff


* Re: [PATCH v3 6/7] builtin/maintenance: add a `--detach` flag
  2024-08-17  7:09     ` Jeff King
@ 2024-08-17  7:14       ` Jeff King
  2024-08-19  6:17       ` Patrick Steinhardt
  1 sibling, 0 replies; 79+ messages in thread
From: Jeff King @ 2024-08-17  7:14 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Sat, Aug 17, 2024 at 03:09:25AM -0400, Jeff King wrote:

> While investigating that, I ran it with --stress locally (on Linux) and
> got some odd (and definitely racy) results. The test itself passes, but
> the "rm -rf repo" in the test_when_finished sometimes fails with:
> 
>   rm: cannot remove 'repo/.git/objects': Directory not empty
> 
> or similar (sometimes it's another directory like 'repo/.git'). My guess
> is that the background process is still running and creating files in
> the repository, racing with rm's call to rmdir().
> 
> Even if we remove the test_when_finished, it would mean that the final
> cleanup after test_done might similarly fail, leaving a crufty trash
> directory. I think to make this robust, we'd need some way of detecting
> when the background process has finished. I don't think we report the
> pid anywhere, and the daemonize() call means it won't even be in the
> same process group. Maybe we could spin looking for the incremental pack
> it will create (and timeout after N seconds)? That feels pretty hacky,
> but I can't think of anything better.

Ah, I just noticed that a similar problem happened in t6500, discussed
during v2 of the series (I only looked at the latest version). I think
t7900 needs a similar run_and_wait_for_auto_gc() solution.

I suspect the Windows "out is not empty" issue is separate, though.

-Peff


* Re: [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-16 10:45   ` [PATCH v3 7/7] run-command: fix detaching when running auto maintenance Patrick Steinhardt
@ 2024-08-17 12:14     ` Jeff King
  2024-08-19  6:17       ` Patrick Steinhardt
  2024-08-19 10:49       ` Patrick Steinhardt
  0 siblings, 2 replies; 79+ messages in thread
From: Jeff King @ 2024-08-17 12:14 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Fri, Aug 16, 2024 at 12:45:17PM +0200, Patrick Steinhardt wrote:

> Fix this bug by asking git-gc(1) to not detach when it is being invoked
> via git-maintenance(1). Instead, git-maintenance(1) now respects a new
> config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> detaches itself into the background when running as part of our auto
> maintenance. This should continue to behave the same for all users which
> use the git-gc(1) task, only. For others though, it means that we now
> properly perform all tasks in the background. The default behaviour of
> git-maintenance(1) when executed by the user does not change, it will
> remain in the foreground unless they pass the `--detach` option.

This patch seems to cause segfaults in t5616 when combined with the
reftable backend. Try this:

  GIT_TEST_DEFAULT_REF_FORMAT=reftable ./t5616-partial-clone.sh --run=1-16 --stress

which fails for me within a few runs. Bisecting leads to 98077d06b2
(run-command: fix detaching when running auto maintenance, 2024-08-16).
It doesn't trigger with the files ref backend.

Compiling with ASan gets me a stack trace like this:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

My guess based on what I'm seeing and what the patch does is that now
maintenance from a previous command is running in the background while
our foreground git-fetch runs, and it somehow confuses things (perhaps
by trying to compact reftables or something?). So I think there are two
problems:

  1. The reftable code needs to be more robust against whatever race is
     happening. I didn't dig further, but I'm sure it would be possible
     to produce a coredump.

  2. Having racy background maintenance doesn't seem great for test
     robustness. At the very least, it might subject us to the "rm"
     problems mentioned elsewhere, where we fail to clean up. Annotating
     individual "git gc" or "git maintenance" calls with an extra
     descriptor isn't too bad, but in this case it's all happening under
     the hood via fetch. Is it a potential problem for every script,
     then? If so, should we disable background detaching for all test
     repos, and then let the few that want to test it turn it back on?

-Peff


* Re: [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-17 12:14     ` Jeff King
@ 2024-08-19  6:17       ` Patrick Steinhardt
  2024-08-19  7:47         ` [PATCH 0/3] Fixups for git-maintenance(1) tests Patrick Steinhardt
  2024-08-19  8:46         ` [PATCH v3 7/7] run-command: fix detaching when running auto maintenance Jeff King
  2024-08-19 10:49       ` Patrick Steinhardt
  1 sibling, 2 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  6:17 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Sat, Aug 17, 2024 at 08:14:24AM -0400, Jeff King wrote:
> On Fri, Aug 16, 2024 at 12:45:17PM +0200, Patrick Steinhardt wrote:
> 
> > Fix this bug by asking git-gc(1) to not detach when it is being invoked
> > via git-maintenance(1). Instead, git-maintenance(1) now respects a new
> > config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> > detaches itself into the background when running as part of our auto
> > maintenance. This should continue to behave the same for all users which
> > use the git-gc(1) task, only. For others though, it means that we now
> > properly perform all tasks in the background. The default behaviour of
> > git-maintenance(1) when executed by the user does not change, it will
> > remain in the foreground unless they pass the `--detach` option.
> 
> This patch seems to cause segfaults in t5616 when combined with the
> reftable backend. Try this:
> 
>   GIT_TEST_DEFAULT_REF_FORMAT=reftable ./t5616-partial-clone.sh --run=1-16 --stress
> 
> which fails for me within a few runs. Bisecting leads to 98077d06b2
> (run-command: fix detaching when running auto maintenance, 2024-08-16).
> It doesn't trigger with the files ref backend.
> 
> Compiling with ASan gets me a stack trace like this:
> 
>   + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
>   AddressSanitizer:DEADLYSIGNAL
>   =================================================================
>   ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
>   ==657994==The signal is caused by a READ memory access.
>       #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
>       #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
>       #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
>       #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
>       #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
>       #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
>       #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
>       #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
>       #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
>       #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
>       #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
>       #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
>       #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
>       #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
>       #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
>       #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
>       #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
>       #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
>       #18 0x55f23dba7764 in run_builtin git.c:484
>       #19 0x55f23dba7764 in handle_builtin git.c:741
>       #20 0x55f23dbab61e in run_argv git.c:805
>       #21 0x55f23dbab61e in cmd_main git.c:1000
>       #22 0x55f23dba4781 in main common-main.c:64
>       #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>       #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
>       #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)
> 
> My guess based on what I'm seeing and what the patch does is that now
> maintenance from a previous command is running in the background while
> our foreground git-fetch runs, and it somehow confuses things (perhaps
> by trying to compact reftables or something?). So I think there are two
> problems:
> 
>   1. The reftable code needs to be more robust against whatever race is
>      happening. I didn't dig further, but I'm sure it would be possible
>      to produce a coredump.

Yes, it certainly has to be robust against this. It's also where the
recent reftable compaction fixes [1] came from. In theory, it should
work alright with a concurrent process compacting the stack while
another process is reading. In practice the backend is still
in its infancy, so I'm not entirely surprised that there are concurrency
bugs.

I will investigate this issue, thanks a lot for the backtrace.

[1]: https://lore.kernel.org/git/cover.1722435214.git.ps@pks.im/

>   2. Having racy background maintenance doesn't seem great for test
>      robustness. At the very least, it might subject us to the "rm"
>      problems mentioned elsewhere, where we fail to clean up. Annotating
>      individual "git gc" or "git maintenance" calls with an extra
>      descriptor isn't too bad, but in this case it's all happening under
>      the hood via fetch. Is it a potential problem for every script,
>      then? If so, should we disable background detaching for all test
>      repos, and then let the few that want to test it turn it back on?

Might be a good idea to set `maintenance.autoDetach=false` globally,
yes. The only downside is of course that it wouldn't cause us to detect
failures like the above, where the concurrency itself causes failure.

Anyway, for now I'll:

  - Send a patch to fix the race in t7900.

  - Investigate the reftable concurrency issue.

  - _Not_ send a patch that sets `maintenance.autoDetach=false`.

The last one requires a bit more discussion first, and we have been
running with `gc.autoDetach=true` implicitly in the past. Thinking a bit
more about it, the reason why the above bug triggers now is that
git-gc(1) itself runs git-pack-refs(1), but does that _synchronously_
before detaching itself. Now we detach at a higher level in the
hierarchy, which means that the previously-synchronous part now runs
asynchronously, as well.

I cannot think of a reason why we shouldn't do this, as the ref backends
should handle this gracefully. The fact that the reftable backend
doesn't is a separate, preexisting bug.

Patrick


* Re: [PATCH v3 6/7] builtin/maintenance: add a `--detach` flag
  2024-08-17  7:09     ` Jeff King
  2024-08-17  7:14       ` Jeff King
@ 2024-08-19  6:17       ` Patrick Steinhardt
  1 sibling, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  6:17 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Sat, Aug 17, 2024 at 03:09:24AM -0400, Jeff King wrote:
> On Fri, Aug 16, 2024 at 12:45:15PM +0200, Patrick Steinhardt wrote:
> 
> > +test_expect_success '--no-detach causes maintenance to not run in background' '
> > [...]
> > +		# We have no better way to check whether or not the task ran in
> > +		# the background than to verify whether it output anything. The
> > +		# next testcase checks the reverse, making this somewhat safer.
> > +		git maintenance run --no-detach >out 2>&1 &&
> > +		test_line_count = 1 out
> > [...]
> > +test_expect_success '--detach causes maintenance to run in background' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +
> > +		test_commit something &&
> > +		git config set maintenance.gc.enabled false &&
> > +		git config set maintenance.loose-objects.enabled true &&
> > +		git config set maintenance.loose-objects.auto 1 &&
> > +		git config set maintenance.incremental-repack.enabled true &&
> > +
> > +		git maintenance run --detach >out 2>&1 &&
> > +		test_must_be_empty out
> > +	)
> > +'
> 
> This second test seems to fail racily (or maybe always? see below). In
> CI on Windows, I saw:
> 
>   'out' is not empty, it contains:
>   fc9fea69579f349e3b02e3264cffbef03e4b1852
> 
> That would make sense to me if the detached process still held the
> original stdout/stderr channel open (in which case we'd racily see the
> same line as in the no-detach case). But we do appear to call
> daemonize(), which closes both.
> 
> Curiously, the code in gc.c does this:
> 
>           /* Failure to daemonize is ok, we'll continue in foreground. */
>           if (opts->detach > 0)
>                   daemonize();
> 
> and the only way for daemonize to fail is if NO_POSIX_GOODIES is set.
> Which I'd expect on Windows. But then I'd expect this test to _always_
> fail on Windows. Does it? If so, should it be marked with !MINGW?
> 
> While investigating that, I ran it with --stress locally (on Linux) and
> got some odd (and definitely racy) results. The test itself passes, but
> the "rm -rf repo" in the test_when_finished sometimes fails with:
> 
>   rm: cannot remove 'repo/.git/objects': Directory not empty
> 
> or similar (sometimes it's another directory like 'repo/.git'). My guess
> is that the background process is still running and creating files in
> the repository, racing with rm's call to rmdir().
> 
> Even if we remove the test_when_finished, it would mean that the final
> cleanup after test_done might similarly fail, leaving a crufty trash
> directory. I think to make this robust, we'd need some way of detecting
> when the background process has finished. I don't think we report the
> pid anywhere, and the daemonize() call means it won't even be in the
> same process group. Maybe we could spin looking for the incremental pack
> it will create (and timeout after N seconds)? That feels pretty hacky,
> but I can't think of anything better.

Oh, good catch indeed, I can reproduce this on Linux with `--stress`.

I think we can use the same workaround as we do in t6500, namely open a
new file descriptor, inherit it and then wait for the child process to
close it.
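
The descriptor trick can be sketched without git at all. In the sketch below, `detaching_command` is a made-up stand-in for a command that returns immediately while leaving work running in the background, and `run_and_wait` is a hypothetical name for the wrapper:

```shell
# Stand-in for a detaching command: it returns right away but leaves a
# background job running. The job closes its own stdout/stderr, yet it
# still inherits descriptor 9 if the caller opened it.
# $1: path of a marker file the background job creates when it is done
detaching_command () {
	( sleep 1; : >"$1" ) >/dev/null 2>&1 &
}

# Run a command and block until every process holding descriptor 9 exits.
# Inside $(...) stdout is a pipe; 9>&1 duplicates its write end, and the
# command substitution only returns once all copies of that end are closed.
run_and_wait () {
	ignored=$("$@" 9>&1)
}
```

Calling `run_and_wait detaching_command marker` returns only after the background job has exited, so `marker` is guaranteed to exist by then.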

Thanks!

Patrick


* [PATCH 0/3] Fixups for git-maintenance(1) tests
  2024-08-19  6:17       ` Patrick Steinhardt
@ 2024-08-19  7:47         ` Patrick Steinhardt
  2024-08-19  7:47           ` [PATCH 1/3] t7900: fix flaky test due to leaking background job Patrick Steinhardt
                             ` (2 more replies)
  2024-08-19  8:46         ` [PATCH v3 7/7] run-command: fix detaching when running auto maintenance Jeff King
  1 sibling, 3 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  7:47 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

Hi,

this small patch series fixes up the test flakes and issues on Windows
as reported by Peff:

  - We now wait for git-maintenance(1) to run to completion.

  - Instead of checking for the detach logic via the output, we now have
    a new trace2 region that allows us to check whether the detaching
    logic was executed.

  - Fix another bug that caused the "loose-objects" task to emit the
    packfile hash. This is a preexisting issue, but Peff made me have a
    deeper look at it.

Patrick

Patrick Steinhardt (3):
  t7900: fix flaky test due to leaking background job
  t7900: exercise detaching via trace2 regions
  builtin/maintenance: fix loose objects task emitting pack hash

 builtin/gc.c           | 11 ++++++++++-
 t/t7900-maintenance.sh | 28 +++++++++++++++++++++++++---
 2 files changed, 35 insertions(+), 4 deletions(-)


base-commit: 98077d06b28b97d508c389886ee5014056707a5e
-- 
2.46.0.164.g477ce5ccd6.dirty



* [PATCH 1/3] t7900: fix flaky test due to leaking background job
  2024-08-19  7:47         ` [PATCH 0/3] Fixups for git-maintenance(1) tests Patrick Steinhardt
@ 2024-08-19  7:47           ` Patrick Steinhardt
  2024-08-19  8:49             ` Jeff King
  2024-08-19  7:48           ` [PATCH 2/3] t7900: exercise detaching via trace2 regions Patrick Steinhardt
  2024-08-19  7:48           ` [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash Patrick Steinhardt
  2 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  7:47 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

One of the recently-added tests in t7900 exercises git-maintenance(1)
with the `--detach` flag, which causes it to perform maintenance in the
background. We do not wait for the backgrounded process to exit though,
which causes the process to leak outside of the test, leading to racy
behaviour.

Fix this by synchronizing with the process via a separate file
descriptor. This is the same workaround as we use in t6500, see the
function `run_and_wait_for_auto_gc ()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7900-maintenance.sh | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 06ab43cfb5..074eadcd1c 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -967,8 +967,13 @@ test_expect_success '--detach causes maintenance to run in background' '
 		git config set maintenance.loose-objects.auto 1 &&
 		git config set maintenance.incremental-repack.enabled true &&
 
-		git maintenance run --detach >out 2>&1 &&
-		test_must_be_empty out
+		# The extra file descriptor gets inherited to the child
+		# process, and by reading stdout we thus essentially wait for
+		# that descriptor to get closed, which indicates that the child
+		# is done, too.
+		output=$(git maintenance run --detach 2>&1 9>&1) &&
+		printf "%s" "$output" >output &&
+		test_must_be_empty output
 	)
 '
 
-- 
2.46.0.164.g477ce5ccd6.dirty



* [PATCH 2/3] t7900: exercise detaching via trace2 regions
  2024-08-19  7:47         ` [PATCH 0/3] Fixups for git-maintenance(1) tests Patrick Steinhardt
  2024-08-19  7:47           ` [PATCH 1/3] t7900: fix flaky test due to leaking background job Patrick Steinhardt
@ 2024-08-19  7:48           ` Patrick Steinhardt
  2024-08-19  8:51             ` Jeff King
  2024-08-19  7:48           ` [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash Patrick Steinhardt
  2 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  7:48 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
first, it breaks on platforms where daemonize is a no-op, like Windows.
And second, the fact that git-maintenance(1) outputs anything at all in
these tests is a bug in the first place that we'll fix in a subsequent
commit.

Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.
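
Outside of the test suite, such a region check might look roughly like the sketch below. The helper name `has_trace2_region` is hypothetical, and the pattern assumes that trace2 event lines carry the "category" field immediately followed by the "label" field:

```shell
# Hypothetical helper: succeed if the trace2 event file contains a
# region_enter event for the given category and label.
# $1: region category, $2: region label, $3: path to the trace file
has_trace2_region () {
	category=$1
	label=$2
	trace=$3
	grep -q "\"event\":\"region_enter\".*\"category\":\"$category\",\"label\":\"$label\"" "$trace"
}
```

With a trace written via GIT_TRACE2_EVENT, `has_trace2_region maintenance detach trace.txt` would then be expected to succeed only when the detach code path was entered.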

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c           |  5 ++++-
 t/t7900-maintenance.sh | 11 ++++++-----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index bafee330a2..13bc0572a3 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1428,8 +1428,11 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
 	free(lock_path);
 
 	/* Failure to daemonize is ok, we'll continue in foreground. */
-	if (opts->detach > 0)
+	if (opts->detach > 0) {
+		trace2_region_enter("maintenance", "detach", the_repository);
 		daemonize();
+		trace2_region_leave("maintenance", "detach", the_repository);
+	}
 
 	for (i = 0; !found_selected && i < TASK__COUNT; i++)
 		found_selected = tasks[i].selected_order >= 0;
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 074eadcd1c..46a61d66fb 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -950,8 +950,9 @@ test_expect_success '--no-detach causes maintenance to not run in background' '
 		# We have no better way to check whether or not the task ran in
 		# the background than to verify whether it output anything. The
 		# next testcase checks the reverse, making this somewhat safer.
-		git maintenance run --no-detach >out 2>&1 &&
-		test_line_count = 1 out
+		GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+			git maintenance run --no-detach >out 2>&1 &&
+		! test_region maintenance detach trace.txt
 	)
 '
 
@@ -971,9 +972,9 @@ test_expect_success '--detach causes maintenance to run in background' '
 		# process, and by reading stdout we thus essentially wait for
 		# that descriptor to get closed, which indicates that the child
 		# is done, too.
-		output=$(git maintenance run --detach 2>&1 9>&1) &&
-		printf "%s" "$output" >output &&
-		test_must_be_empty output
+		does_not_matter=$(GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
+			git maintenance run --detach 9>&1) &&
+		test_region maintenance detach trace.txt
 	)
 '
 
-- 
2.46.0.164.g477ce5ccd6.dirty


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  7:47         ` [PATCH 0/3] Fixups for git-maintenance(1) tests Patrick Steinhardt
  2024-08-19  7:47           ` [PATCH 1/3] t7900: fix flaky test due to leaking background job Patrick Steinhardt
  2024-08-19  7:48           ` [PATCH 2/3] t7900: exercise detaching via trace2 regions Patrick Steinhardt
@ 2024-08-19  7:48           ` Patrick Steinhardt
  2024-08-19  8:55             ` Jeff King
  2 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  7:48 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

The "loose-objects" maintenance task executes git-pack-objects(1) to
pack all loose objects into a new packfile. This command ends up
printing the hash of the packfile to stdout though, which clutters the
output of `git maintenance run`.

Fix this issue by disabling stdout of the child process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c           |  6 ++++++
 t/t7900-maintenance.sh | 16 ++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index 13bc0572a3..be75efa17a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1159,6 +1159,12 @@ static int pack_loose(struct maintenance_run_opts *opts)
 
 	pack_proc.in = -1;
 
+	/*
+	 * git-pack-objects(1) ends up writing the pack hash to stdout, which
+	 * we do not care for.
+	 */
+	pack_proc.out = -1;
+
 	if (start_command(&pack_proc)) {
 		error(_("failed to start 'git pack-objects' process"));
 		return 1;
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 46a61d66fb..7cc4eb262c 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -978,4 +978,20 @@ test_expect_success '--detach causes maintenance to run in background' '
 	)
 '
 
+test_expect_success 'repacking loose objects is quiet' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		test_commit something &&
+		git config set maintenance.gc.enabled false &&
+		git config set maintenance.loose-objects.enabled true &&
+		git config set maintenance.loose-objects.auto 1 &&
+
+		git maintenance run --quiet >out 2>&1 &&
+		test_must_be_empty out
+	)
+'
+
 test_done
-- 
2.46.0.164.g477ce5ccd6.dirty


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-19  6:17       ` Patrick Steinhardt
  2024-08-19  7:47         ` [PATCH 0/3] Fixups for git-maintenance(1) tests Patrick Steinhardt
@ 2024-08-19  8:46         ` Jeff King
  2024-08-19  9:04           ` Patrick Steinhardt
  1 sibling, 1 reply; 79+ messages in thread
From: Jeff King @ 2024-08-19  8:46 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 08:17:22AM +0200, Patrick Steinhardt wrote:

> >   2. Having racy background maintenance doesn't seem great for test
> >      robustness. At the very least, it might subject us to the "rm"
> >      problems mentioned elsewhere, where we fail to clean up. Annotating
> >      individual "git gc" or "git maintenance" calls with an extra
> >      descriptor isn't too bad, but in this case it's all happening under
> >      the hood via fetch. Is it a potential problem for every script,
> >      then? If so, should we disable background detaching for all test
> >      repos, and then let the few that want to test it turn it back on?
> 
> Might be a good idea to set `maintenance.autoDetach=false` globally,
> yes. The only downside is of course that it wouldn't cause us to detect
> failures like the above, where the concurrency itself causes failure.
> 
> Anyway, for now I'll:
> 
>   - Send a patch to fix the race in t7900.
> 
>   - Investigate the reftable concurrency issue.
> 
>   - _Not_ send a patch that sets `maintenance.autoDetach=false`.

That sounds like a good direction. I do suspect there are at least _two_
races in t7900:

  1. the detached maintenance that we run explicitly, which causes the
     "rm" cleanup to fail

  2. whatever earlier test is kicking off detached maintenance via "git
     fetch", which is causing the reftable concurrency issue.

Fixing (1) should be easy (and it looks like you've already sent a
series). Fixing the reftable code will stop us from segfaulting for (2),
but I wonder if that detached maintenance might cause similar "rm" style
problems elsewhere.

> The last one requires a bit more discussion first, and we have been
> running with `gc.autoDetach=true` implicitly in the past. Thinking a bit
> more about it, the reason why the above bug triggers now is that
> git-gc(1) itself runs git-pack-refs(1), but does that _synchronously_
> before detaching itself. Now we detach at a higher level in the
> hierarchy, which means that the previously-synchronous part now runs
> asynchronously, as well.

That makes sense. I guess we've perhaps been doing background gc for a
long time, then, just not in the refs? In practice most repos in the
test suite aren't big enough to trigger auto-gc anyway, so it may only
affect a handful of scripts.

Once the reftable issue is fixed, it's possible that the lingering
detached processes don't cause a problem in practice (because they're
not really writing much, and/or have finished by the time anybody else
gets to cleanup), and we can just live with them. But I'm worried that
sounds like wishful thinking. ;)

-Peff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 1/3] t7900: fix flaky test due to leaking background job
  2024-08-19  7:47           ` [PATCH 1/3] t7900: fix flaky test due to leaking background job Patrick Steinhardt
@ 2024-08-19  8:49             ` Jeff King
  2024-08-19  8:55               ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff King @ 2024-08-19  8:49 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 09:47:59AM +0200, Patrick Steinhardt wrote:

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 06ab43cfb5..074eadcd1c 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -967,8 +967,13 @@ test_expect_success '--detach causes maintenance to run in background' '
>  		git config set maintenance.loose-objects.auto 1 &&
>  		git config set maintenance.incremental-repack.enabled true &&
>  
> -		git maintenance run --detach >out 2>&1 &&
> -		test_must_be_empty out
> +		# The extra file descriptor gets inherited to the child
> +		# process, and by reading stdout we thus essentially wait for
> +		# that descriptor to get closed, which indicates that the child
> +		# is done, too.
> +		output=$(git maintenance run --detach 2>&1 9>&1) &&
> +		printf "%s" "$output" >output &&
> +		test_must_be_empty output
>  	)
>  '

This looks correct, but should we be doing it for all of the "git
maintenance" runs in that script? They're all going to kick off detached
gc jobs, I think.
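
As a rough illustration of the descriptor trick in the quoted hunk, here
is a minimal sketch that does not involve Git at all. The `fake_detach`
helper and the marker file are hypothetical stand-ins: the helper
backgrounds a child that drops stdout and stderr, much like a daemonized
process would, but the child still inherits descriptor 9, which `9>&1`
points at the pipe of the command substitution. The substitution should
therefore only return once the child has exited.

```shell
#!/bin/sh
# Hypothetical stand-in for a command that detaches: the background child
# redirects stdout and stderr away (as daemonizing would), but it still
# inherits any other open descriptor, such as fd 9.
fake_detach () {
	( sleep 1; echo finished >"$1" ) >/dev/null 2>&1 &
}

dir=$(mktemp -d) &&
marker="$dir/marker" &&

# Inside $(...), stdout is a pipe. "9>&1" duplicates that pipe onto fd 9,
# so the shell keeps reading until the background child exits and thereby
# closes the last write end of the pipe.
output=$(fake_detach "$marker" 9>&1) &&

# The child is done by now, so the marker exists and nothing was printed.
test -f "$marker" &&
test -z "$output" &&
echo "child finished before the substitution returned"
```

Without the `9>&1`, the substitution would return right away and the
marker check would race against the one-second sleep.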

-Peff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 2/3] t7900: exercise detaching via trace2 regions
  2024-08-19  7:48           ` [PATCH 2/3] t7900: exercise detaching via trace2 regions Patrick Steinhardt
@ 2024-08-19  8:51             ` Jeff King
  2024-08-19  8:56               ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff King @ 2024-08-19  8:51 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 09:48:02AM +0200, Patrick Steinhardt wrote:

> In t7900, we exercise the `--detach` logic by checking whether the
> command ended up writing anything to its output or not. This supposedly
> works because we close stdin, stdout and stderr when daemonizing. But
> one, it breaks on platforms where daemonize is a no-op, like Windows.
> And second, that git-maintenance(1) outputs anything at all in these
> tests is a bug in the first place that we'll fix in a subsequent commit.
> 
> Introduce a new trace2 region around the detach which allows us to more
> explicitly check whether the detaching logic was executed. This is a
> much more direct way to exercise the logic, provides a potentially
> useful signal to tracing logs and also works alright on platforms which
> do not have the ability to daemonize.

Nice, this is so much cleaner than the way the existing test worked. The
code looks good, but...

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 074eadcd1c..46a61d66fb 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -950,8 +950,9 @@ test_expect_success '--no-detach causes maintenance to not run in background' '
>  		# We have no better way to check whether or not the task ran in
>  		# the background than to verify whether it output anything. The
>  		# next testcase checks the reverse, making this somewhat safer.
> -		git maintenance run --no-detach >out 2>&1 &&
> -		test_line_count = 1 out
> +		GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> +			git maintenance run --no-detach >out 2>&1 &&
> +		! test_region maintenance detach trace.txt
>  	)
>  '

...I think this "we have no better way..." comment is now out of date
(and can probably just be dropped).
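
For readers unfamiliar with what such a region check looks for, here is
a minimal sketch. It assumes that a trace2 event file holds one JSON
object per line and that a region shows up as a `region_enter` event
carrying `category` and `label` fields. The `has_region` helper is
hypothetical and much simpler than the test suite's `test_region`, and
the sample line is hand-written and abbreviated, not real Git output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE contains a region_enter event for
# the given CATEGORY and LABEL. Usage: has_region CATEGORY LABEL FILE
has_region () {
	grep -q "\"event\":\"region_enter\".*\"category\":\"$1\",\"label\":\"$2\"" "$3"
}

trace=$(mktemp) &&

# Hand-written, abbreviated sample event (real events carry more fields).
cat >"$trace" <<\EOF &&
{"event":"region_enter","nesting":1,"category":"maintenance","label":"detach"}
EOF

has_region maintenance detach "$trace" &&
! has_region maintenance gc "$trace" &&
echo "region check behaves as expected"
```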

-Peff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 1/3] t7900: fix flaky test due to leaking background job
  2024-08-19  8:49             ` Jeff King
@ 2024-08-19  8:55               ` Patrick Steinhardt
  2024-08-19  9:12                 ` Jeff King
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  8:55 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 04:49:43AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 09:47:59AM +0200, Patrick Steinhardt wrote:
> 
> > diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> > index 06ab43cfb5..074eadcd1c 100755
> > --- a/t/t7900-maintenance.sh
> > +++ b/t/t7900-maintenance.sh
> > @@ -967,8 +967,13 @@ test_expect_success '--detach causes maintenance to run in background' '
> >  		git config set maintenance.loose-objects.auto 1 &&
> >  		git config set maintenance.incremental-repack.enabled true &&
> >  
> > -		git maintenance run --detach >out 2>&1 &&
> > -		test_must_be_empty out
> > +		# The extra file descriptor gets inherited to the child
> > +		# process, and by reading stdout we thus essentially wait for
> > +		# that descriptor to get closed, which indicates that the child
> > +		# is done, too.
> > +		output=$(git maintenance run --detach 2>&1 9>&1) &&
> > +		printf "%s" "$output" >output &&
> > +		test_must_be_empty output
> >  	)
> >  '
> 
> This looks correct, but should we be doing it for all of the "git
> maintenance" runs in that script? They're all going to kick off detached
> gc jobs, I think.

Only those that use `--detach` run in the background.

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  7:48           ` [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash Patrick Steinhardt
@ 2024-08-19  8:55             ` Jeff King
  2024-08-19  9:07               ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff King @ 2024-08-19  8:55 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 09:48:05AM +0200, Patrick Steinhardt wrote:

> The "loose-objects" maintenance task executes git-pack-objects(1) to
> pack all loose objects into a new packfile. This command ends up
> printing the hash of the packfile to stdout though, which clutters the
> output of `git maintenance run`.
> 
> Fix this issue by disabling stdout of the child process.

Ah, I wondered where that output was coming from.

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 13bc0572a3..be75efa17a 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -1159,6 +1159,12 @@ static int pack_loose(struct maintenance_run_opts *opts)
>  
>  	pack_proc.in = -1;
>  
> +	/*
> +	 * git-pack-objects(1) ends up writing the pack hash to stdout, which
> +	 * we do not care for.
> +	 */
> +	pack_proc.out = -1;
> +
>  	if (start_command(&pack_proc)) {
>  		error(_("failed to start 'git pack-objects' process"));
>  		return 1;

I have not paid much attention to the "maintenance" stuff. It is a
little weird to me that it is not building on "git repack", which
already handles this, but perhaps there are reasons. Anyway, totally
unrelated to your patch (which looks good to me).

> +++ b/t/t7900-maintenance.sh
> @@ -978,4 +978,20 @@ test_expect_success '--detach causes maintenance to run in background' '
>  	)
>  '
>  
> +test_expect_success 'repacking loose objects is quiet' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +
> +		test_commit something &&
> +		git config set maintenance.gc.enabled false &&
> +		git config set maintenance.loose-objects.enabled true &&
> +		git config set maintenance.loose-objects.auto 1 &&
> +
> +		git maintenance run --quiet >out 2>&1 &&
> +		test_must_be_empty out
> +	)
> +'

I wondered if you needed --no-detach here to avoid a race, but I guess
as a non-auto run, it would never background?

-Peff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 2/3] t7900: exercise detaching via trace2 regions
  2024-08-19  8:51             ` Jeff King
@ 2024-08-19  8:56               ` Patrick Steinhardt
  2024-08-21 18:38                 ` Junio C Hamano
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  8:56 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 04:51:05AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 09:48:02AM +0200, Patrick Steinhardt wrote:
> 
> > In t7900, we exercise the `--detach` logic by checking whether the
> > command ended up writing anything to its output or not. This supposedly
> > works because we close stdin, stdout and stderr when daemonizing. But
> > one, it breaks on platforms where daemonize is a no-op, like Windows.
> > And second, that git-maintenance(1) outputs anything at all in these
> > tests is a bug in the first place that we'll fix in a subsequent commit.
> > 
> > Introduce a new trace2 region around the detach which allows us to more
> > explicitly check whether the detaching logic was executed. This is a
> > much more direct way to exercise the logic, provides a potentially
> > useful signal to tracing logs and also works alright on platforms which
> > do not have the ability to daemonize.
> 
> Nice, this is so much cleaner than the way the existing test worked. The
> code looks good, but...
> 
> > diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> > index 074eadcd1c..46a61d66fb 100755
> > --- a/t/t7900-maintenance.sh
> > +++ b/t/t7900-maintenance.sh
> > @@ -950,8 +950,9 @@ test_expect_success '--no-detach causes maintenance to not run in background' '
> >  		# We have no better way to check whether or not the task ran in
> >  		# the background than to verify whether it output anything. The
> >  		# next testcase checks the reverse, making this somewhat safer.
> > -		git maintenance run --no-detach >out 2>&1 &&
> > -		test_line_count = 1 out
> > +		GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> > +			git maintenance run --no-detach >out 2>&1 &&
> > +		! test_region maintenance detach trace.txt
> >  	)
> >  '
> 
> ...I think this "we have no better way..." comment is now out of date
> (and can probably just be dropped).

Oops, yes, that one is definitely stale. I'll drop it in the next
version of this patch series.

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-19  8:46         ` [PATCH v3 7/7] run-command: fix detaching when running auto maintenance Jeff King
@ 2024-08-19  9:04           ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  9:04 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 04:46:14AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 08:17:22AM +0200, Patrick Steinhardt wrote:
> 
> > >   2. Having racy background maintenance doesn't seem great for test
> > >      robustness. At the very least, it might subject us to the "rm"
> > >      problems mentioned elsewhere, where we fail to clean up. Annotating
> > >      individual "git gc" or "git maintenance" calls with an extra
> > >      descriptor isn't too bad, but in this case it's all happening under
> > >      the hood via fetch. Is it a potential problem for every script,
> > >      then? If so, should we disable background detaching for all test
> > >      repos, and then let the few that want to test it turn it back on?
> > 
> > Might be a good idea to set `maintenance.autoDetach=false` globally,
> > yes. The only downside is of course that it wouldn't cause us to detect
> > failures like the above, where the concurrency itself causes failure.
> > 
> > Anyway, for now I'll:
> > 
> >   - Send a patch to fix the race in t7900.
> > 
> >   - Investigate the reftable concurrency issue.
> > 
> >   - _Not_ send a patch that sets `maintenance.autoDetach=false`.
> 
> That sounds like a good direction. I do suspect there are at least _two_
> races in t7900:
> 
>   1. the detached maintenance that we run explicitly, which causes the
>      "rm" cleanup to fail
> 
>   2. whatever earlier test is kicking off detached maintenance via "git
>      fetch", which is causing the reftable concurrency issue.
> 
> Fixing (1) should be easy (and it looks like you've already sent a
> series). Fixing the reftable code will stop us from segfaulting for (2),
> but I wonder if that detached maintenance might cause similar "rm" style
> problems elsewhere.

It certainly might. The only reason why I don't want to send that patch
now is that it feels a bit too reactionary. We haven't had issues in the
past with it to the best of my knowledge, even though it is an issue in
theory.

We can certainly revisit that though if we see that it indeed is a more
widespread issue.

> > The last one requires a bit more discussion first, and we have been
> > running with `gc.autoDetach=true` implicitly in the past. Thinking a bit
> > more about it, the reason why the above bug triggers now is that
> > git-gc(1) itself runs git-pack-refs(1), but does that _synchronously_
> > before detaching itself. Now we detach at a higher level in the
> > hierarchy, which means that the previously-synchronous part now runs
> > asynchronously, as well.
> 
> That makes sense. I guess we've perhaps been doing background gc for a
> long time, then, just not in the refs? In practice most repos in the
> test suite aren't big enough to trigger auto-gc anyway, so it may only
> affect a handful of scripts.

Yup. We should expect this to work just fine, because it can trigger
regardless of whether we run auto-compaction concurrently.
After all, the reftable backend even performs auto-compaction after
every write to the table, so any two concurrent writes may hit the bug
without even invoking git-maintenance(1) at all.

> Once the reftable issue is fixed, it's possible that the lingering
> detached processes don't cause a problem in practice (because they're
> not really writing much, and/or have finished by the time anybody else
> gets to cleanup), and we can just live with them. But I'm worried that
> sounds like wishful thinking. ;)

Could certainly be, yeah. I just want to focus on fixing the immediate
issues before we jump to conclusions too fast.

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  8:55             ` Jeff King
@ 2024-08-19  9:07               ` Patrick Steinhardt
  2024-08-19  9:17                 ` Jeff King
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  9:07 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 04:55:22AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 09:48:05AM +0200, Patrick Steinhardt wrote:
> 
> > The "loose-objects" maintenance task executes git-pack-objects(1) to
> > pack all loose objects into a new packfile. This command ends up
> > printing the hash of the packfile to stdout though, which clutters the
> > output of `git maintenance run`.
> > 
> > Fix this issue by disabling stdout of the child process.
> 
> Ah, I wondered where that output was coming from.
> 
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 13bc0572a3..be75efa17a 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -1159,6 +1159,12 @@ static int pack_loose(struct maintenance_run_opts *opts)
> >  
> >  	pack_proc.in = -1;
> >  
> > +	/*
> > +	 * git-pack-objects(1) ends up writing the pack hash to stdout, which
> > +	 * we do not care for.
> > +	 */
> > +	pack_proc.out = -1;
> > +
> >  	if (start_command(&pack_proc)) {
> >  		error(_("failed to start 'git pack-objects' process"));
> >  		return 1;
> 
> I have not paid much attention to the "maintenance" stuff. It is a
> little weird to me that it is not building on "git repack", which
> already handles this, but perhaps there are reasons. Anyway, totally
> unrelated to your patch (which looks good to me).

git-repack(1) is way less efficient than running git-pack-objects(1)
directly. I've also noticed that at one point in time when revamping how
we do housekeeping in Git.

It mostly boils down to git-repack(1) doing a connectivity check,
whereas git-pack-objects(1) doesn't. We just soak up every single loose
object, and then eventually we expire them via git-multi-pack-index(1)'s
"expire" subcommand.

> > +++ b/t/t7900-maintenance.sh
> > @@ -978,4 +978,20 @@ test_expect_success '--detach causes maintenance to run in background' '
> >  	)
> >  '
> >  
> > +test_expect_success 'repacking loose objects is quiet' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +
> > +		test_commit something &&
> > +		git config set maintenance.gc.enabled false &&
> > +		git config set maintenance.loose-objects.enabled true &&
> > +		git config set maintenance.loose-objects.auto 1 &&
> > +
> > +		git maintenance run --quiet >out 2>&1 &&
> > +		test_must_be_empty out
> > +	)
> > +'
> 
> I wondered if you needed --no-detach here to avoid a race, but I guess
> as a non-auto run, it would never background?

Even the `--auto` run does not background. That was the case for
git-gc(1), but is not the case for git-maintenance(1). You now have to
pass `--detach` explicitly to cause it to background, which I think is
the saner way to do this anyway.

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 1/3] t7900: fix flaky test due to leaking background job
  2024-08-19  8:55               ` Patrick Steinhardt
@ 2024-08-19  9:12                 ` Jeff King
  2024-08-19  9:17                   ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff King @ 2024-08-19  9:12 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 10:55:07AM +0200, Patrick Steinhardt wrote:

> > This looks correct, but should we be doing it for all of the "git
> > maintenance" runs in that script? They're all going to kick off detached
> > gc jobs, I think.
> 
> Only those that use `--detach` run in the background.

I thought since the default for maintenance.autoDetach was true, all of
the "--auto" ones would need something similar. I notice a lot of those
use "--task", though, so maybe that doesn't count. I'm not clear on all
of the rules.

-Peff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  9:07               ` Patrick Steinhardt
@ 2024-08-19  9:17                 ` Jeff King
  2024-08-19  9:26                   ` Patrick Steinhardt
  2024-08-19 17:05                   ` Junio C Hamano
  0 siblings, 2 replies; 79+ messages in thread
From: Jeff King @ 2024-08-19  9:17 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 11:07:51AM +0200, Patrick Steinhardt wrote:

> > I have not paid much attention to the "maintenance" stuff. It is a
> > little weird to me that it is not building on "git repack", which
> > already handles this, but perhaps there are reasons. Anyway, totally
> > unrelated to your patch (which looks good to me).
> 
> git-repack(1) is way less efficient than running git-pack-objects(1)
> directly. I've also noticed that at one point in time when revamping how
> we do housekeeping in Git.
> 
> It mostly boils down to git-repack(1) doing a connectivity check,
> whereas git-pack-objects(1) doesn't. We just soak up every single loose
> object, and then eventually we expire them via git-multi-pack-index(1)'s
> "expire" subcommand.

Hmph. I'd have suggested that we should teach git-repack to do the more
efficient thing. I'm a bit worried about having parallel universes of
how maintenance works making it harder to reason about when or how
things happen, and how various concurrent / racy behaviors work.

But it's probably a bit late to re-open that (and certainly it's not
part of your series).

> > I wondered if you needed --no-detach here to avoid a race, but I guess
> > as a non-auto run, it would never background?
> 
> Even the `--auto` run does not background. That was the case for
> git-gc(1), but is not the case for git-maintenance(1). You now have to
> pass `--detach` explicitly to cause it to background, which I think is
> the saner way to do this anyway.

Am I misreading the documentation? The entry for maintenance.autoDetach
on 'next' says:

  If unset, the value of `gc.autoDetach` is used as a fallback. Defaults
  to true if both are unset, meaning that the maintenance process will
  detach.

-Peff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 1/3] t7900: fix flaky test due to leaking background job
  2024-08-19  9:12                 ` Jeff King
@ 2024-08-19  9:17                   ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  9:17 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 05:12:41AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 10:55:07AM +0200, Patrick Steinhardt wrote:
> 
> > > This looks correct, but should we be doing it for all of the "git
> > > maintenance" runs in that script? They're all going to kick off detached
> > > gc jobs, I think.
> > 
> > Only those that use `--detach` run in the background.
> 
> I thought since the default for maintenance.autoDetach was true, all of
> the "--auto" ones would need something similar. I notice a lot of those
> use "--task", though, so maybe that doesn't count. I'm not clear on all
> of the rules.

`maintenance.autoDetach` only influences the auto-maintenance as
executed by `run_auto_maintenance()`.

Patrick

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  9:17                 ` Jeff King
@ 2024-08-19  9:26                   ` Patrick Steinhardt
  2024-08-19 10:26                     ` Jeff King
  2024-08-19 17:05                   ` Junio C Hamano
  1 sibling, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19  9:26 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 05:17:15AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 11:07:51AM +0200, Patrick Steinhardt wrote:
> 
> > > I have not paid much attention to the "maintenance" stuff. It is a
> > > little weird to me that it is not building on "git repack", which
> > > already handles this, but perhaps there are reasons. Anyway, totally
> > > unrelated to your patch (which looks good to me).
> > 
> > git-repack(1) is way less efficient than running git-pack-objects(1)
> > directly. I've also noticed that at one point in time when revamping how
> > we do housekeeping in Git.
> > 
> > It mostly boils down to git-repack(1) doing a connectivity check,
> > whereas git-pack-objects(1) doesn't. We just soak up every single loose
> > object, and then eventually we expire them via git-multi-pack-index(1)'s
> > "expire" subcommand.
> 
> Hmph. I'd have suggested that we should teach git-repack to do the more
> efficient thing. I'm a bit worried about having parallel universes of
> how maintenance works making it harder to reason about when or how
> things happen, and how various concurrent / racy behaviors work.
> 
> But it's probably a bit late to re-open that (and certainly it's not
> part of your series).
> 
> > > I wondered if you needed --no-detach here to avoid a race, but I guess
> > > as a non-auto run, it would never background?
> > 
> > Even the `--auto` run does not background. That was the case for
> > git-gc(1), but is not the case for git-maintenance(1). You now have to
> > pass `--detach` explicitly to cause it to background, which I think is
> > the saner way to do this anyway.
> 
> Am I misreading the documentation? The entry for maintenance.autoDetach
> on 'next' says:
> 
>   If unset, the value of `gc.autoDetach` is used as a fallback. Defaults
>   to true if both are unset, meaning that the maintenance process will
>   detach.

You've omitted the important part:

	Many Git commands trigger automatic maintenance after they have
	written data into the repository. This boolean config option
	controls whether this automatic maintenance shall happen in the
	foreground or whether the maintenance process shall detach and
	continue to run in the background.

The `maintenance.autoDetach` setting only impacts auto-maintenance as
run via `run_auto_maintenance()`. The `--auto` flag is somewhat
orthogonal: it asks the git-maintenance(1) job to do nothing in case the
repository is already optimal.

For git-gc(1) we indeed did tie the `--auto` flag to backgrounding,
which is somewhat nonsensical. There are use cases where you may want to
pass `--auto`, but still have it run in the foreground. That's why we
handle this differently for git-maintenance(1), which requires you to
pass an explicit `--detach` flag.

Also, we cannot change the behaviour of git-maintenance(1) retroactively
to make `--auto` detach. While it already essentially did detach for
git-gc(1), that was a bug. E.g. when running as part of the scheduler,
we'd always have detached and thus ended up with a bunch of concurrent
git-gc(1) processes. So even though it does make sense for the scheduler
to use `--auto`, it wouldn't want the process to detach.

Patrick

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  9:26                   ` Patrick Steinhardt
@ 2024-08-19 10:26                     ` Jeff King
  2024-08-20  7:39                       ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff King @ 2024-08-19 10:26 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 11:26:06AM +0200, Patrick Steinhardt wrote:

> > Am I misreading the documentation? The entry for maintenance.autoDetach
> > on 'next' says:
> > 
> >   If unset, the value of `gc.autoDetach` is used as a fallback. Defaults
> >   to true if both are unset, meaning that the maintenance process will
> >   detach.
> 
> You've omitted the important part:
> 
> 	Many Git commands trigger automatic maintenance after they have
> 	written data into the repository. This boolean config option
> 	controls whether this automatic maintenance shall happen in the
> 	foreground or whether the maintenance process shall detach and
> 	continue to run in the background.
> 
> The `maintenance.autoDetach` setting only impacts auto-maintenance as
> run via `run_auto_maintenance()`. The `--auto` flag is somewhat
> orthogonal: it asks the git-maintenance(1) job to do nothing in case the
> repository is already optimal.

Ah. I naively assumed that they did so by passing the "--auto" flag. But
I see now that the caller actually checks the config and passes
"--detach" or not.

That seems kind of unfriendly to scripted porcelains which want to
invoke it, since they have to reimplement that logic. The idea of "git
gc --auto" was that it provided a single API for scripts to invoke,
including respecting the user's config. Now that "maintenance --auto"
has taken that over, I'd have expected it to do the same.

To be clear, I don't feel all that strongly about it, but I'm not sure I
buy the argument that it is orthogonal, or that here:

> For git-gc(1) we indeed did tie the `--auto` flag to backgrounding,
> which is somewhat nonsensical. There are use cases where you may want to
> pass `--auto`, but still have it run in the foreground. That's why we
> handle this differently for git-maintenance(1), which requires you to
> pass an explicit `--detach` flag.

we couldn't just pass "--no-detach" for cases where you want to be sure
it is in the foreground.

> Also, we cannot change the behaviour of git-maintenance(1) retroactively
> to make `--auto` detach. While it already essentially did detach for
> git-gc(1), that was a bug. E.g. when running as part of the scheduler,
> we'd always have detached and thus ended up with a bunch of concurrent
> git-gc(1) processes. So even though it does make sense for the scheduler
> to use `--auto`, it wouldn't want the process to detach.

Backwards compatibility is a more compelling argument here, if we've had
"maintenance --auto" that didn't ever detach (though it sounds like it
did, via gc, anyway). But yes, one kicked off from a scheduler should be
using --no-detach, I'd think.

Like I said, I don't feel strongly enough to work on any changes here.
I'd hoped to never think about repository maintenance ever again. So you
can take these as just impressions of a (relatively) clueful user seeing
it for the first time. ;)

-Peff

* Re: [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-17 12:14     ` Jeff King
  2024-08-19  6:17       ` Patrick Steinhardt
@ 2024-08-19 10:49       ` Patrick Steinhardt
  2024-08-19 15:41         ` Patrick Steinhardt
  1 sibling, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19 10:49 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Sat, Aug 17, 2024 at 08:14:24AM -0400, Jeff King wrote:
> On Fri, Aug 16, 2024 at 12:45:17PM +0200, Patrick Steinhardt wrote:
> 
> > Fix this bug by asking git-gc(1) to not detach when it is being invoked
> > via git-maintenance(1). Instead, git-maintenance(1) now respects a new
> > config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> > detaches itself into the background when running as part of our auto
> > maintenance. This should continue to behave the same for all users which
> > use the git-gc(1) task, only. For others though, it means that we now
> > properly perform all tasks in the background. The default behaviour of
> > git-maintenance(1) when executed by the user does not change, it will
> > remain in the foreground unless they pass the `--detach` option.
> 
> This patch seems to cause segfaults in t5616 when combined with the
> reftable backend. Try this:
> 
>   GIT_TEST_DEFAULT_REF_FORMAT=reftable ./t5616-partial-clone.sh --run=1-16 --stress
> 
> which fails for me within a few runs. Bisecting leads to 98077d06b2
> (run-command: fix detaching when running auto maintenance, 2024-08-16).
> It doesn't trigger with the files ref backend.
> 
> Compiling with ASan gets me a stack trace like this:
> 
>   + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
>   AddressSanitizer:DEADLYSIGNAL
>   =================================================================
>   ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
>   ==657994==The signal is caused by a READ memory access.
>       #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
>       #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
>       #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
>       #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
>       #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
>       #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
>       #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
>       #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
>       #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
>       #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
>       #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
>       #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
>       #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
>       #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
>       #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
>       #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
>       #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
>       #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
>       #18 0x55f23dba7764 in run_builtin git.c:484
>       #19 0x55f23dba7764 in handle_builtin git.c:741
>       #20 0x55f23dbab61e in run_argv git.c:805
>       #21 0x55f23dbab61e in cmd_main git.c:1000
>       #22 0x55f23dba4781 in main common-main.c:64
>       #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>       #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
>       #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

I haven't yet been able to tell definitively, but I think this is a
lifetime issue. We create an iterator, eventually notice that the
reftable stack has been rewritten, and reload the stack. But the
block sources used for the old tables are still referenced by the
iterator, even though it was closed. As such, the mmapped memory of the
table has been unmapped and is now invalid, which causes the above
invalid reads.

I'll work on a patch series that introduces refcounting for block
sources, but I guess that'll take a bit.
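
As a rough sketch of the refcounting idea (not the reftable API: the
struct layout and all function names here are hypothetical, and a heap
buffer stands in for the mmapped table), the iterator takes its own
reference so that a stack reload cannot invalidate the data under it:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a block source backed by mapped table data. */
struct block_source {
	unsigned char *data;
	size_t len;
	int refcount;
};

static struct block_source *block_source_new(const void *buf, size_t len)
{
	struct block_source *src = malloc(sizeof(*src));

	if (!src)
		return NULL;
	src->data = malloc(len ? len : 1);
	if (!src->data) {
		free(src);
		return NULL;
	}
	memcpy(src->data, buf, len);
	src->len = len;
	src->refcount = 1; /* this reference belongs to the creator */
	return src;
}

static struct block_source *block_source_get(struct block_source *src)
{
	assert(src->refcount > 0);
	src->refcount++;
	return src;
}

/* Drop one reference; returns 1 if the data was actually released. */
static int block_source_put(struct block_source *src)
{
	assert(src->refcount > 0);
	if (--src->refcount)
		return 0;
	free(src->data);
	free(src);
	return 1;
}

struct table_iter {
	struct block_source *src;
	size_t pos;
};

/* The iterator holds its own reference for as long as it is in use. */
static void table_iter_init(struct table_iter *it, struct block_source *src)
{
	it->src = block_source_get(src);
	it->pos = 0;
}

/* Returns the next byte, or -1 once the data is exhausted. */
static int table_iter_next(struct table_iter *it)
{
	if (it->pos >= it->src->len)
		return -1;
	return it->src->data[it->pos++];
}

static int table_iter_release(struct table_iter *it)
{
	int freed = block_source_put(it->src);

	it->src = NULL;
	return freed;
}
```

With this shape, the reload path only drops the stack's reference; the
memory goes away once the last iterator releases its own.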

Patrick

* Re: [PATCH v3 7/7] run-command: fix detaching when running auto maintenance
  2024-08-19 10:49       ` Patrick Steinhardt
@ 2024-08-19 15:41         ` Patrick Steinhardt
  0 siblings, 0 replies; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-19 15:41 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 12:49:52PM +0200, Patrick Steinhardt wrote:
> On Sat, Aug 17, 2024 at 08:14:24AM -0400, Jeff King wrote:
> > On Fri, Aug 16, 2024 at 12:45:17PM +0200, Patrick Steinhardt wrote:
> > 
> > > Fix this bug by asking git-gc(1) to not detach when it is being invoked
> > > via git-maintenance(1). Instead, git-maintenance(1) now respects a new
> > > config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
> > > detaches itself into the background when running as part of our auto
> > > maintenance. This should continue to behave the same for all users which
> > > use the git-gc(1) task, only. For others though, it means that we now
> > > properly perform all tasks in the background. The default behaviour of
> > > git-maintenance(1) when executed by the user does not change, it will
> > > remain in the foreground unless they pass the `--detach` option.
> > 
> > This patch seems to cause segfaults in t5616 when combined with the
> > reftable backend. Try this:
> > 
> >   GIT_TEST_DEFAULT_REF_FORMAT=reftable ./t5616-partial-clone.sh --run=1-16 --stress
> > 
> > which fails for me within a few runs. Bisecting leads to 98077d06b2
> > (run-command: fix detaching when running auto maintenance, 2024-08-16).
> > It doesn't trigger with the files ref backend.
> > 
> > Compiling with ASan gets me a stack trace like this:
> > 
> >   + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
> >   AddressSanitizer:DEADLYSIGNAL
> >   =================================================================
> >   ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
> >   ==657994==The signal is caused by a READ memory access.
> >       #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
> >       #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
> >       #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
> >       #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
> >       #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
> >       #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
> >       #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
> >       #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
> >       #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
> >       #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
> >       #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
> >       #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
> >       #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
> >       #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
> >       #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
> >       #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
> >       #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
> >       #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
> >       #18 0x55f23dba7764 in run_builtin git.c:484
> >       #19 0x55f23dba7764 in handle_builtin git.c:741
> >       #20 0x55f23dbab61e in run_argv git.c:805
> >       #21 0x55f23dbab61e in cmd_main git.c:1000
> >       #22 0x55f23dba4781 in main common-main.c:64
> >       #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> >       #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
> >       #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)
> 
> I haven't yet been able to tell definitively, but I think this is a
> lifetime issue. We create an iterator, eventually notice that the
> reftable stack has been rewritten, and reload the stack. But the
> block sources used for the old tables are still referenced by the
> iterator, even though it was closed. As such, the mmapped memory of the
> table has been unmapped and is now invalid, which causes the above
> invalid reads.
> 
> I'll work on a patch series that introduces refcounting for block
> sources, but I guess that'll take a bit.

This is being handled via
https://lore.kernel.org/git/cover.1724080006.git.ps@pks.im/.

Patrick

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19  9:17                 ` Jeff King
  2024-08-19  9:26                   ` Patrick Steinhardt
@ 2024-08-19 17:05                   ` Junio C Hamano
  1 sibling, 0 replies; 79+ messages in thread
From: Junio C Hamano @ 2024-08-19 17:05 UTC (permalink / raw)
  To: Jeff King
  Cc: Patrick Steinhardt, git, Phillip Wood, phillip.wood, James Liu,
	Derrick Stolee

Jeff King <peff@peff.net> writes:

> On Mon, Aug 19, 2024 at 11:07:51AM +0200, Patrick Steinhardt wrote:
>
>> It mostly boils down to git-repack(1) doing a connectivity check,
>> whereas git-pack-objects(1) doesn't. We just soak up every single loose
>> object, and then eventually we expire them via git-multi-pack-index(1)'s
>> "expire" subcommand.
>
> Hmph. I'd have suggested that we should teach git-repack to do the more
> efficient thing. I'm a bit worried about having parallel universes of
> how maintenance works making it harder to reason about when or how
> things happen, and how various concurrent / racy behaviors work.

I'd suggest being careful before going there.

The above only explains why it is OK not to exclude unreachable
cruft, but does not address another thing we should be worried
about, which is the quality of the resulting pack.

Throwing a random set of object names at pack-objects in the order
that they are discovered by for_each_loose_file_in_objdir(), which
is what gc.c:pack_loose() does, would give no locality benefit that
walking the commits would.  If we assume that we will pack_loose()
often enough that we won't have huge number of objects in the
resulting pack, packing objects that are close in the history may
not matter much, but on the other hand, if we run pack_loose() too
often to produce a small pack, you would not have a great delta base
selection.

So we should probably monitor how much "badness" the pack_loose()
is causing, and if it turns out to be too much, we may need to
reconsider its design.  Being able to produce ultra-quickly a pack
whose layout and delta base choice would hurt runtime performance is
not a feature.

> But it's probably a bit late to re-open that (and certainly it's not
> part of your series).

True.

Thanks.

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-19 10:26                     ` Jeff King
@ 2024-08-20  7:39                       ` Patrick Steinhardt
  2024-08-20 15:58                         ` Junio C Hamano
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-20  7:39 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Phillip Wood, phillip.wood, James Liu, Derrick Stolee,
	Junio C Hamano

On Mon, Aug 19, 2024 at 06:26:02AM -0400, Jeff King wrote:
> On Mon, Aug 19, 2024 at 11:26:06AM +0200, Patrick Steinhardt wrote:
> 
> > > Am I misreading the documentation? The entry for maintenance.autoDetach
> > > on 'next' says:
> > > 
> > >   If unset, the value of `gc.autoDetach` is used as a fallback. Defaults
> > >   to true if both are unset, meaning that the maintenance process will
> > >   detach.
> > 
> > You've omitted the important part:
> > 
> > 	Many Git commands trigger automatic maintenance after they have
> > 	written data into the repository. This boolean config option
> > 	controls whether this automatic maintenance shall happen in the
> > 	foreground or whether the maintenance process shall detach and
> > 	continue to run in the background.
> > 
> > The `maintenance.autoDetach` setting only impacts auto-maintenance as
> > run via `run_auto_maintenance()`. The `--auto` flag is somewhat
> > orthogonal: it asks the git-maintenance(1) job to do nothing in case the
> > repository is already optimal.
> 
> Ah. I naively assumed that they did so by passing the "--auto" flag. But
> I see now that the caller actually checks the config and passes
> "--detach" or not.
> 
> That seems kind of unfriendly to scripted porcelains which want to
> invoke it, since they have to reimplement that logic. The idea of "git
> gc --auto" was that it provided a single API for scripts to invoke,
> including respecting the user's config. Now that "maintenance --auto"
> has taken that over, I'd have expected it to do the same.
> 
> To be clear, I don't feel all that strongly about it, but I'm not sure I
> buy the argument that it is orthogonal, or that here:
> 
> > For git-gc(1) we indeed did tie the `--auto` flag to backgrounding,
> > which is somewhat nonsensical. There are use cases where you may want to
> > pass `--auto`, but still have it run in the foreground. That's why we
> > handle this differently for git-maintenance(1), which requires you to
> > pass an explicit `--detach` flag.
> 
> we couldn't just pass "--no-detach" for cases where you want to be sure
> it is in the foreground.

We certainly could. But honestly, the scripted use case you mention
above is even more of an argument why we shouldn't do it, in my opinion.
We have long had the stance that the behaviour of plumbing tools should
_not_ be impacted by the user configuration. And detaching based on some
config to me very much sounds like the exact opposite.

Mind you, we are all quite used to `git gc --auto` detaching. But if I
were new to the project, I'd find it quite surprising that it may or may
not detach if all I want it to do is to decide for itself whether it
needs to garbage collect or not. It is much more straightforward and
way less surprising for a script writer to use `--detach` if they want
the script to detach, because now the command does what they want
without them having to worry about the user's config.

> > Also, we cannot change the behaviour of git-maintenance(1) retroactively
> > to make `--auto` detach. While it already essentially did detach for
> > git-gc(1), that was a bug. E.g. when running as part of the scheduler,
> > we'd always have detached and thus ended up with a bunch of concurrent
> > git-gc(1) processes. So even though it does make sense for the scheduler
> > to use `--auto`, it wouldn't want the process to detach.
> 
> Backwards compatibility is a more compelling argument here, if we've had
> "maintenance --auto" that didn't ever detach (though it sounds like it
> did, via gc, anyway). But yes, one kicked off from a scheduler should be
> using --no-detach, I'd think.

Yes, we did, but as mentioned it was buggy. Once the scheduler kicks
off, you'd now have N git-gc(1) processes all running in parallel to
each other. With N being large you will certainly face some issues. You
also lose the exit code, which is another issue.
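
The lost exit code can be illustrated with a small POSIX sketch. This is
not Git's daemonize code; both helper names are hypothetical, and the
"task" is reduced to a child that exits with a given code. A caller that
waits on a foreground child sees the task's exit code, whereas a child
that detaches exits 0 as soon as it has forked the real worker:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the task in the foreground; returns its exit code, or -1 on error. */
static int run_foreground(int task_exit_code)
{
	int status;
	pid_t pid;

	assert(task_exit_code >= 0 && task_exit_code <= 255);
	pid = fork();
	if (pid < 0)
		return -1;
	if (!pid)
		_exit(task_exit_code); /* the task itself finishes here */
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/*
 * Run the task detached: the child forks the actual worker and exits 0
 * right away, so the caller never learns the worker's exit code.
 */
static int run_detached(int task_exit_code)
{
	int status;
	pid_t pid;

	assert(task_exit_code >= 0 && task_exit_code <= 255);
	pid = fork();
	if (pid < 0)
		return -1;
	if (!pid) {
		pid_t worker = fork();

		if (worker < 0)
			_exit(1);
		if (!worker) {
			setsid(); /* leave the caller's session */
			_exit(task_exit_code);
		}
		_exit(0); /* report success before the worker is done */
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A scheduler that wants to act on failures therefore needs the foreground
variant, which matches the argument for not detaching there.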

But as you said, you could make the scheduler pass `--no-detach`. In
fact, the first versions of this patch series were using your approach,
where I changed `git maintenance run --auto` to detach based on the
config. But after some thought (and after seeing the negative fallout
that this had on our test suite) I decided to throw this approach away
because it just didn't feel right to me.

> Like I said, I don't feel strongly enough to work on any changes here.
> I'd hoped to never think about repository maintenance ever again. So you
> can take these as just impressions of a (relatively) clueful user seeing
> it for the first time. ;)

I certainly appreciate the discussion, thanks for chiming in! I'm still
not convinced that we should continue to couple auto-maintenance and
backgrounding to each other. In my opinion, this behaviour was a mistake
in the past and continues to surprise now, too. Making it an explicit
option feels more natural to me.

That being said, if others feel strongly about this as well, then I'm
of course happy to adapt.

Patrick

* Re: [PATCH 3/3] builtin/maintenance: fix loose objects task emitting pack hash
  2024-08-20  7:39                       ` Patrick Steinhardt
@ 2024-08-20 15:58                         ` Junio C Hamano
  0 siblings, 0 replies; 79+ messages in thread
From: Junio C Hamano @ 2024-08-20 15:58 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Jeff King, git, Phillip Wood, phillip.wood, James Liu,
	Derrick Stolee

Patrick Steinhardt <ps@pks.im> writes:

> I certainly appreciate the discussion, thanks for chiming in! I'm still
> not convinced that we should continue to couple auto-maintenance and
> backgrounding to each other. In my opinion, this behaviour was a mistake
> in the past and continues to surprise now, too. Making it an explicit
> option feels more natural to me.
>
> That being said, if others feel strongly about this as well, then I'm
> of course happy to adapt.

FWIW, I find it is a sensible approach to have a separate "run in
the background" that is not strongly tied to "do your thing if you
think the repository really needs it".

Thanks.


* Re: [PATCH 2/3] t7900: exercise detaching via trace2 regions
  2024-08-19  8:56               ` Patrick Steinhardt
@ 2024-08-21 18:38                 ` Junio C Hamano
  2024-08-22  5:41                   ` Patrick Steinhardt
  0 siblings, 1 reply; 79+ messages in thread
From: Junio C Hamano @ 2024-08-21 18:38 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Jeff King, git, Phillip Wood, phillip.wood, James Liu,
	Derrick Stolee

Patrick Steinhardt <ps@pks.im> writes:

>> ...I think this "we have no better way..." comment is now out of date
>> (and can probably just be dropped).
>
> Oops, yes, that one is definitely stale. I'll drop it in the next
> version of this patch series.

I am not sure if there is a need for "the next version"; in the
meantime, let me do this.  I'd prefer to merge the main topic down
to 'master' soonish.

Thanks.

1:  759b453f9f = 1:  759b453f9f t7900: fix flaky test due to leaking background job
2:  b64db3e437 ! 2:  51a0b8a2a7 t7900: exercise detaching via trace2 regions
    @@ Commit message
         do not have the ability to daemonize.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
    +    [jc: dropped a stale in-code comment from a test]
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## builtin/gc.c ##
    @@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts
     
      ## t/t7900-maintenance.sh ##
     @@ t/t7900-maintenance.sh: test_expect_success '--no-detach causes maintenance to not run in background' '
    - 		# We have no better way to check whether or not the task ran in
    - 		# the background than to verify whether it output anything. The
    - 		# next testcase checks the reverse, making this somewhat safer.
    + 		git config set maintenance.loose-objects.auto 1 &&
    + 		git config set maintenance.incremental-repack.enabled true &&
    + 
    +-		# We have no better way to check whether or not the task ran in
    +-		# the background than to verify whether it output anything. The
    +-		# next testcase checks the reverse, making this somewhat safer.
     -		git maintenance run --no-detach >out 2>&1 &&
     -		test_line_count = 1 out
     +		GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
3:  eac44feff3 = 3:  8311e3b551 builtin/maintenance: fix loose objects task emitting pack hash

* Re: [PATCH 2/3] t7900: exercise detaching via trace2 regions
  2024-08-21 18:38                 ` Junio C Hamano
@ 2024-08-22  5:41                   ` Patrick Steinhardt
  2024-08-22 17:22                     ` Junio C Hamano
  0 siblings, 1 reply; 79+ messages in thread
From: Patrick Steinhardt @ 2024-08-22  5:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, git, Phillip Wood, phillip.wood, James Liu,
	Derrick Stolee

On Wed, Aug 21, 2024 at 11:38:02AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> ...I think this "we have no better way..." comment is now out of date
> >> (and can probably just be dropped).
> >
> > Oops, yes, that one is definitely stale. I'll drop it in the next
> > version of this patch series.
> 
> I am not sure if there is a need for "the next version"; in the
> meantime, let me do this.  I'd prefer to merge the main topic down
> to 'master' soonish.
> 
> Thanks.

Thanks, this looks good to me.

Patrick

* Re: [PATCH 2/3] t7900: exercise detaching via trace2 regions
  2024-08-22  5:41                   ` Patrick Steinhardt
@ 2024-08-22 17:22                     ` Junio C Hamano
  0 siblings, 0 replies; 79+ messages in thread
From: Junio C Hamano @ 2024-08-22 17:22 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Jeff King, git, Phillip Wood, phillip.wood, James Liu,
	Derrick Stolee

Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Aug 21, 2024 at 11:38:02AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> 
>> >> ...I think this "we have no better way..." comment is now out of date
>> >> (and can probably just be dropped).
>> >
>> > Oops, yes, that one is definitely stale. I'll drop it in the next
>> > version of this patch series.
>> 
>> I am not sure if there is a need for "the next version"; in the
>> meantime, let me do this.  I'd prefer to merge the main topic down
>> to 'master' soonish.
>> 
>> Thanks.
>
> Thanks, this looks good to me.

OK.  Let me merge the whole thing to 'next', cook it for a few days
and then merge it together with the base topic down to 'master',
then.

Thanks.
