git.vger.kernel.org archive mirror
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Derrick Stolee <stolee@gmail.com>
Subject: [PATCH v2 6/8] rerere: provide function to collect stale entries
Date: Wed, 30 Apr 2025 12:25:10 +0200
Message-ID: <20250430-pks-maintenance-missing-tasks-v2-6-2580b7b8ca3a@pks.im>
In-Reply-To: <20250430-pks-maintenance-missing-tasks-v2-0-2580b7b8ca3a@pks.im>

We're about to add another task for git-maintenance(1) that prunes stale
rerere entries via `git rerere gc`. Whether this subcommand runs will be
configurable so that it is only executed once a certain number of stale
rerere entries exist. This requires us to know the number of stale rerere
entries in the first place, which is non-trivial to figure out.

Refactor `rerere_gc()` and `prune_one()` so that garbage collection is
split into three phases:

  1. Collect all stale rerere entries and the directories that are about
     to become empty.

  2. Prune all stale rerere entries.

  3. Remove all directories that should have become empty in (2).

By splitting out the collection of stale entries into its own function,
`rerere_collect_stale_entries()`, we can trivially expose it to external
callers and thus reuse it in later steps.

This refactoring is not expected to result in a user-visible change in
behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
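
As an illustration only, and not part of the patch: the three phases from
the commit message can be sketched in isolation as below. This is a
minimal in-memory model, not the code in rerere.c. `struct entry`,
`entry_is_stale()`, `collect_stale()` and `prune()` are hypothetical
names, directories are plain indices, and the 60-day and 15-day cutoffs
are the built-in defaults visible in the diff, with no configuration
lookup.

```c
#include <stddef.h>
#include <time.h>

#define DAY ((time_t)86400)

/* Hypothetical stand-in for one rerere entry; not git's struct rerere_id. */
struct entry {
	size_t dir;       /* index of the directory holding this entry */
	time_t last_used; /* 0 if the entry has no recorded resolution */
	time_t created;   /* 0 if unknown */
	int removed;      /* set once the entry has been pruned */
};

/* Resolved entries use one cutoff, unresolved entries the other. */
static int entry_is_stale(const struct entry *e,
			  time_t cutoff_resolve, time_t cutoff_noresolve)
{
	if (e->last_used)
		return e->last_used < cutoff_resolve;
	if (!e->created)
		return 0; /* no timestamp at all: never treated as stale */
	return e->created < cutoff_noresolve;
}

/*
 * Phase 1: record the indices of stale entries in "stale" and mark each
 * directory whose entries are all stale. Nothing is modified here, so the
 * return value can be used as a pure count. "stale" must have room for
 * "nr" indices and "dir_prunable" for "ndirs" flags.
 */
static size_t collect_stale(const struct entry *entries, size_t nr,
			    time_t now, size_t *stale,
			    int *dir_prunable, size_t ndirs)
{
	time_t cutoff_resolve = now - 60 * DAY;
	time_t cutoff_noresolve = now - 15 * DAY;
	size_t stale_nr = 0;

	for (size_t d = 0; d < ndirs; d++)
		dir_prunable[d] = 1;
	for (size_t i = 0; i < nr; i++) {
		if (entry_is_stale(&entries[i], cutoff_resolve, cutoff_noresolve))
			stale[stale_nr++] = i;
		else
			dir_prunable[entries[i].dir] = 0;
	}
	return stale_nr;
}

/*
 * Phases 2 and 3: prune the collected entries, then count the directories
 * that would be removed because phase 2 emptied them.
 */
static size_t prune(struct entry *entries, const size_t *stale,
		    size_t stale_nr, const int *dir_prunable, size_t ndirs)
{
	size_t dirs_removed = 0;

	for (size_t i = 0; i < stale_nr; i++)
		entries[stale[i]].removed = 1;
	for (size_t d = 0; d < ndirs; d++)
		if (dir_prunable[d])
			dirs_removed++;
	return dirs_removed;
}
```

A caller that only needs the count, such as a maintenance task deciding
whether to run at all, would stop after phase 1 and compare the returned
number against its threshold.
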
 rerere.c | 92 ++++++++++++++++++++++++++++++++++++++++++++--------------------
 rerere.h | 14 ++++++++++
 2 files changed, 78 insertions(+), 28 deletions(-)

diff --git a/rerere.c b/rerere.c
index 740e8ad1a0b..eb06e5f8bea 100644
--- a/rerere.c
+++ b/rerere.c
@@ -1202,8 +1202,8 @@ static void unlink_rr_item(struct rerere_id *id)
 	strbuf_release(&buf);
 }
 
-static void prune_one(struct rerere_id *id,
-		      timestamp_t cutoff_resolve, timestamp_t cutoff_noresolve)
+static int is_stale(struct rerere_id *id,
+		    timestamp_t cutoff_resolve, timestamp_t cutoff_noresolve)
 {
 	timestamp_t then;
 	timestamp_t cutoff;
@@ -1214,11 +1214,11 @@ static void prune_one(struct rerere_id *id,
 	else {
 		then = rerere_created_at(id);
 		if (!then)
-			return;
+			return 0;
 		cutoff = cutoff_noresolve;
 	}
-	if (then < cutoff)
-		unlink_rr_item(id);
+
+	return then < cutoff;
 }
 
 /* Does the basename in "path" look plausibly like an rr-cache entry? */
@@ -1229,29 +1229,35 @@ static int is_rr_cache_dirname(const char *path)
 	return !parse_oid_hex(path, &oid, &end) && !*end;
 }
 
-void rerere_gc(struct repository *r, struct string_list *rr)
+int rerere_collect_stale_entries(struct repository *r,
+				 struct string_list *prunable_dirs,
+				 struct rerere_id **prunable_entries,
+				 size_t *prunable_entries_nr)
 {
-	struct string_list to_remove = STRING_LIST_INIT_DUP;
-	DIR *dir;
-	struct dirent *e;
-	int i;
 	timestamp_t now = time(NULL);
 	timestamp_t cutoff_noresolve = now - 15 * 86400;
 	timestamp_t cutoff_resolve = now - 60 * 86400;
 	struct strbuf buf = STRBUF_INIT;
+	size_t prunable_entries_alloc;
+	struct dirent *e;
+	DIR *dir = NULL;
+	int ret;
 
-	if (setup_rerere(r, rr, 0) < 0)
-		return;
+	*prunable_entries = NULL;
+	*prunable_entries_nr = 0;
+	prunable_entries_alloc = 0;
 
-	repo_config_get_expiry_in_days(the_repository, "gc.rerereresolved",
+	repo_config_get_expiry_in_days(r, "gc.rerereresolved",
 				       &cutoff_resolve, now);
-	repo_config_get_expiry_in_days(the_repository, "gc.rerereunresolved",
+	repo_config_get_expiry_in_days(r, "gc.rerereunresolved",
 				       &cutoff_noresolve, now);
-	git_config(git_default_config, NULL);
-	dir = opendir(repo_git_path_replace(the_repository, &buf, "rr-cache"));
-	if (!dir)
-		die_errno(_("unable to open rr-cache directory"));
-	/* Collect stale conflict IDs ... */
+
+	dir = opendir(repo_git_path_replace(r, &buf, "rr-cache"));
+	if (!dir) {
+		ret = error_errno(_("unable to open rr-cache directory"));
+		goto out;
+	}
+
 	while ((e = readdir_skip_dot_and_dotdot(dir))) {
 		struct rerere_dir *rr_dir;
 		struct rerere_id id;
@@ -1266,23 +1272,53 @@ void rerere_gc(struct repository *r, struct string_list *rr)
 		for (id.variant = 0, id.collection = rr_dir;
 		     id.variant < id.collection->status_nr;
 		     id.variant++) {
-			prune_one(&id, cutoff_resolve, cutoff_noresolve);
-			if (id.collection->status[id.variant])
+			if (is_stale(&id, cutoff_resolve, cutoff_noresolve)) {
+				ALLOC_GROW(*prunable_entries, *prunable_entries_nr + 1,
+					   prunable_entries_alloc);
+				(*prunable_entries)[(*prunable_entries_nr)++] = id;
+			} else {
 				now_empty = 0;
+			}
 		}
 		if (now_empty)
-			string_list_append(&to_remove, e->d_name);
+			string_list_append(prunable_dirs, e->d_name);
 	}
-	closedir(dir);
 
-	/* ... and then remove the empty directories */
-	for (i = 0; i < to_remove.nr; i++)
-		rmdir(repo_git_path_replace(the_repository, &buf,
-					    "rr-cache/%s", to_remove.items[i].string));
+	ret = 0;
+
+out:
+	strbuf_release(&buf);
+	if (dir)
+		closedir(dir);
+	return ret;
+}
+
+void rerere_gc(struct repository *r, struct string_list *rr)
+{
+	struct string_list prunable_dirs = STRING_LIST_INIT_DUP;
+	struct rerere_id *prunable_entries;
+	struct strbuf buf = STRBUF_INIT;
+	size_t prunable_entries_nr;
+
+	if (setup_rerere(r, rr, 0) < 0)
+		return;
+
+	git_config(git_default_config, NULL);
+
+	if (rerere_collect_stale_entries(r, &prunable_dirs, &prunable_entries,
+					 &prunable_entries_nr) < 0)
+		exit(127);
+
+	for (size_t i = 0; i < prunable_entries_nr; i++)
+		unlink_rr_item(&prunable_entries[i]);
+	for (size_t i = 0; i < prunable_dirs.nr; i++)
+		rmdir(repo_git_path_replace(r, &buf, "rr-cache/%s",
+					    prunable_dirs.items[i].string));
 
-	string_list_clear(&to_remove, 0);
+	string_list_clear(&prunable_dirs, 0);
 	rollback_lock_file(&write_lock);
 	strbuf_release(&buf);
+	free(prunable_entries);
 }
 
 /*
diff --git a/rerere.h b/rerere.h
index d4b5f7c9320..fd5a2388b06 100644
--- a/rerere.h
+++ b/rerere.h
@@ -37,6 +37,20 @@ const char *rerere_path(struct strbuf *buf, const struct rerere_id *,
 int rerere_forget(struct repository *, struct pathspec *);
 int rerere_remaining(struct repository *, struct string_list *);
 void rerere_clear(struct repository *, struct string_list *);
+
+/*
+ * Collect prunable rerere entries that would be garbage collected via
+ * `rerere_gc()`. Whether or not an entry is prunable depends on both
+ * "gc.rerereResolved" and "gc.rerereUnresolved".
+ *
+ * Returns 0 on success, a negative error code in case entries could not be
+ * collected.
+ */
+int rerere_collect_stale_entries(struct repository *r,
+				 struct string_list *prunable_dirs,
+				 struct rerere_id **prunable_entries,
+				 size_t *prunable_entries_nr);
+
 void rerere_gc(struct repository *, struct string_list *);
 
 #define OPT_RERERE_AUTOUPDATE(v) OPT_UYN(0, "rerere-autoupdate", (v), \

-- 
2.49.0.987.g0cc8ee98dc.dirty

