From: "Shubham Kanodia via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com,
	ps@pks.im,
	Shubham Kanodia <shubham.kanodia10@gmail.com>
Subject: [PATCH] maintenance: add prune-remote-refs task
Date: Mon, 23 Dec 2024 09:36:06 +0000
Message-ID: <pull.1838.git.1734946566885.gitgitgadget@gmail.com>

From: Shubham Kanodia <shubham.kanodia10@gmail.com>

Remote-tracking refs can accumulate in local repositories even as branches
are deleted on remotes, impacting Git performance negatively. Existing
alternatives to keep refs pruned have a few issues:

1. The `fetch.prune` config automatically cleans up remote refs on fetch,
but also pulls in new refs from the remote, which is an undesirable
side effect.

2. `git remote prune` cleans up refs without adding to the existing list,
but requires periodic user intervention.

This adds a new maintenance task 'prune-remote-refs' that runs
'git remote prune' for each configured remote daily. This provides an
automated way to clean up stale remote-tracking refs, especially when
users may not do a full fetch.

This task is disabled by default.

Signed-off-by: Shubham Kanodia <shubham.kanodia10@gmail.com>
---
    maintenance: add prune-remote-refs task
    
    As discussed previously on:
    https://lore.kernel.org/git/xmqqwmfr112w.fsf@gitster.g/T/#t
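
The trade-off between the two existing alternatives can be reproduced
with stock Git commands. The following is a minimal sketch, not part of
the patch: the repository names (`upstream`, `pruned`, `fetched`), the
branch names, and the `count_ref` helper are hypothetical and chosen
only for illustration.

```shell
#!/bin/sh
# Sketch: compare "git remote prune" with "git fetch --prune".
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp"

# Hypothetical upstream repository with branches "main" and "topic".
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial
git -C upstream branch topic

# Two clones; each starts with origin/main and origin/topic.
git clone -q upstream pruned
git clone -q upstream fetched

# Upstream deletes one branch and creates another.
git -C upstream branch -q -D topic
git -C upstream branch newer

# Alternative 2: prune only. Stale refs go away, nothing new is fetched.
git -C pruned remote prune origin >/dev/null

# Alternative 1: fetch with pruning. Stale refs go away, new refs arrive.
git -C fetched fetch -q --prune origin

# Helper (hypothetical): count remote-tracking refs named origin/<branch>.
count_ref () {
	git -C "$1" for-each-ref "refs/remotes/origin/$2" | wc -l | tr -d ' '
}

pruned_topic=$(count_ref pruned topic)
pruned_newer=$(count_ref pruned newer)
fetched_topic=$(count_ref fetched topic)
fetched_newer=$(count_ref fetched newer)

echo "remote prune:  topic=$pruned_topic newer=$pruned_newer"
echo "fetch --prune: topic=$fetched_topic newer=$fetched_newer"
```

Both clones lose the stale `origin/topic` ref, but only the clone that
ran `git fetch --prune` gains `origin/newer`. With this patch applied,
the prune-only behavior can be scheduled by setting
`maintenance.prune-remote-refs.enabled` to true, or run on demand with
`git maintenance run --task=prune-remote-refs`, as the tests in this
patch do.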
    

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1838%2Fpastelsky%2Fsk%2Fadd-remote-prune-maintenance-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1838/pastelsky/sk/add-remote-prune-maintenance-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1838

 Documentation/git-maintenance.txt | 20 ++++++++++++++
 builtin/gc.c                      | 42 +++++++++++++++++++++++++++++
 t/t7900-maintenance.sh            | 44 +++++++++++++++++++++++++++++++
 3 files changed, 106 insertions(+)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 6e6651309d3..0c8f1e01ccd 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -158,6 +158,26 @@ pack-refs::
 	need to iterate across many references. See linkgit:git-pack-refs[1]
 	for more information.
 
+prune-remote-refs::
+	The `prune-remote-refs` task runs `git remote prune` on each remote
+	repository registered in the local repository. This task helps clean
+	up deleted remote branches, improving the performance of operations
+	that iterate through the refs. See linkgit:git-remote[1] for more
+	information. This task is disabled by default.
++
+NOTE: This task is opt-in to prevent unexpected removal of remote refs
+for users of git-maintenance. For most users, configuring `fetch.prune=true`
+is an acceptable solution, as it will automatically clean up stale remote-tracking
+branches during normal fetch operations. However, this task can be useful in
+specific scenarios:
++
+--
+* When using selective fetching (e.g., `git fetch origin +foo:refs/remotes/origin/foo`)
+  where `fetch.prune` would not affect refs outside the fetched hierarchy
+* When third-party tools might perform unexpected full fetches, and you want
+  periodic cleanup independently of fetch operations
+--
+
 OPTIONS
 -------
 --auto::
diff --git a/builtin/gc.c b/builtin/gc.c
index 4ae5196aedf..9acf1d29895 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -20,6 +20,7 @@
 #include "lockfile.h"
 #include "parse-options.h"
 #include "run-command.h"
+#include "remote.h"
 #include "sigchain.h"
 #include "strvec.h"
 #include "commit.h"
@@ -913,6 +914,40 @@ static int maintenance_opt_schedule(const struct option *opt, const char *arg,
 	return 0;
 }
 
+static int collect_remote(struct remote *remote, void *cb_data)
+{
+	struct string_list *list = cb_data;
+
+	if (!remote->url.nr)
+		return 0;
+
+	string_list_append(list, remote->name);
+	return 0;
+}
+
+static int maintenance_task_prune_remote(struct maintenance_run_opts *opts UNUSED,
+					 struct gc_config *cfg UNUSED)
+{
+	struct string_list_item *item;
+	struct string_list remotes_list = STRING_LIST_INIT_NODUP;
+	int result = 0;
+
+	for_each_remote(collect_remote, &remotes_list);
+
+	for_each_string_list_item (item, &remotes_list) {
+		const char *remote_name = item->string;
+		struct child_process child = CHILD_PROCESS_INIT;
+		child.git_cmd = 1;
+		strvec_pushl(&child.args, "remote", "prune", remote_name, NULL);
+
+		if (run_command(&child))
+			result = error(_("failed to prune '%s'"), remote_name);
+	}
+
+	string_list_clear(&remotes_list, 0);
+	return result;
+}
+
 /* Remember to update object flag allocation in object.h */
 #define SEEN		(1u<<0)
 
@@ -1375,6 +1410,7 @@ enum maintenance_task_label {
 	TASK_GC,
 	TASK_COMMIT_GRAPH,
 	TASK_PACK_REFS,
+	TASK_PRUNE_REMOTE_REFS,
 
 	/* Leave as final value */
 	TASK__COUNT
@@ -1411,6 +1447,10 @@ static struct maintenance_task tasks[] = {
 		maintenance_task_pack_refs,
 		pack_refs_condition,
 	},
+	[TASK_PRUNE_REMOTE_REFS] = {
+		"prune-remote-refs",
+		maintenance_task_prune_remote,
+	},
 };
 
 static int compare_tasks_by_selection(const void *a_, const void *b_)
@@ -1505,6 +1545,8 @@ static void initialize_maintenance_strategy(void)
 		tasks[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY;
 		tasks[TASK_PACK_REFS].enabled = 1;
 		tasks[TASK_PACK_REFS].schedule = SCHEDULE_WEEKLY;
+		tasks[TASK_PRUNE_REMOTE_REFS].enabled = 0;
+		tasks[TASK_PRUNE_REMOTE_REFS].schedule = SCHEDULE_DAILY;
 	}
 }
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 0ce4ba1cbef..60a0c3f8353 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -446,6 +446,50 @@ test_expect_success 'pack-refs task' '
 	test_subcommand git pack-refs --all --prune <pack-refs.txt
 '
 
+test_expect_success 'prune-remote-refs task not enabled by default' '
+	git clone . prune-test &&
+	(
+		cd prune-test &&
+		GIT_TRACE2_EVENT="$(pwd)/prune.txt" git maintenance run 2>err &&
+		test_subcommand ! git remote prune origin <prune.txt
+	)
+'
+
+test_expect_success 'prune-remote-refs task cleans stale remote refs' '
+	test_commit initial &&
+
+	# Create two separate remote repos
+	git clone . remote1 &&
+	git clone . remote2 &&
+
+	git clone . prune-test-clean &&
+	(
+		cd prune-test-clean &&
+		git config maintenance.prune-remote-refs.enabled true &&
+
+		# Add both remotes
+		git remote add remote1 "../remote1" &&
+		git remote add remote2 "../remote2" &&
+
+		# Create and push branches to both remotes
+		git branch -f side2 HEAD &&
+		git push remote1 side2 &&
+		git push remote2 side2 &&
+
+		# Rename branches in each remote to simulate a stale branch
+		git -C ../remote1 branch -m side2 side3 &&
+		git -C ../remote2 branch -m side2 side4 &&
+
+		GIT_TRACE2_EVENT="$(pwd)/prune.txt" git maintenance run --task=prune-remote-refs &&
+
+		# Verify pruning happened for both remotes
+		test_subcommand git remote prune remote1 <prune.txt &&
+		test_subcommand git remote prune remote2 <prune.txt &&
+		test_must_fail git rev-parse refs/remotes/remote1/side2 &&
+		test_must_fail git rev-parse refs/remotes/remote2/side2
+	)
+'
+
 test_expect_success '--auto and --schedule incompatible' '
 	test_must_fail git maintenance run --auto --schedule=daily 2>err &&
 	test_grep "at most one" err

base-commit: 063bcebf0c917140ca0e705cbe0fdea127e90086
-- 
gitgitgadget
